
"Real" link to file in Google search results? [closed]

Developer https://www.devze.com 2023-02-28 09:06 Source: the web
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

This question does not appear to be about programming within the scope defined in the help center.

Closed 8 years ago.


I often search for documents (mainly PDFs) using Google. But when I right-click a link, or just hover the mouse cursor over it, what I get is NOT the real link but something long and confusing like the following:

http://www.google.com/url?sa=t&source=web&cd=1&ved=0CCUQFjAA&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf&ei=Fai1TZq-Acugtgenw6DqDg&usg=AFQjCNFzYOTqpf68rQnuwW9K7wp39WL6Rg&sig2=z4RqvOLEEJsPohBqr1ghxQ

I have no idea what this is, but I know this nonsense is not what I want. I want the real link (for the one above: http://www.marxists.org/reference/archive/einstein/works/1910s/relative/relativity.pdf), not something with Google's intervention.

How do I get the “Real” link to file in Google search results?


Maybe this is not the best solution, but here's one way for Chrome and Firefox that doesn't require coding or add-ons. I assume there are similar ways to do this in IE and other browsers, though IE at least will usually open PDFs in the browser with the link in the URL bar at the top, which is easy enough to copy.

  1. Click on the search result, which should download the PDF.

  2. Now open the list of recent downloads in your browser:

  • Chrome: Ctrl+J
  • Firefox on Linux(?): Ctrl+Shift+Y

  3. Now copy the link:

  • Chrome: Right-click the URL listed beneath the name of the file and select "Copy Link Address"
  • Firefox: Right-click the file and select "Copy Download Link"

EDIT: As of December 2020, and probably earlier, Chrome shows you a clean, copyable URL in the search results.


I've created a simple web site that cleans Google search result URLs:

URL Clean

URLs copied from Google search results (such as links to PDFs) are more complicated than they need to be. This tool removes the unnecessary parts, leaving the page's original URL.


From a comment on @Blender's answer, I've learned how to install a User Script in Firefox and Chrome.

Now, when right clicking and copying a URL in Google search results, I get the real link instead of that rubbish (sorry, Google, I know you love us, but we don't need no stinky tracking URLs).

At first, I used googlePrivacy as suggested by @naxa, but it's buggy nowadays. The script provided in the Web Applications SE post, Turning off Google search results indirection, does the job. It comes in User Script and Extension flavors:

  • "Don't track me Google" at the Chrome Web Store.
  • "Don't track me Google" at Userscripts.org

Below is how to proceed with the User Script.

Installing the UserScript

In Chrome, I installed it using Tampermonkey.

And Greasemonkey in Firefox.

Results

Before the UserScript

After


Related post in Web Applications:

  • How to make Google search not redirect


The URL is right here:

&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf

Just unescape it with some language, like Python:

>>> from urllib.parse import unquote
>>> print(unquote('http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf'))
http://www.marxists.org/reference/archive/einstein/works/1910s/relative/relativity.pdf

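To extract the URL from a Google URL, here's a script (Python 3):

from urllib.parse import unquote

url = input('What is the Google url? ')
# Keep what follows '&url=' (empty if the marker is missing) ...
url = url.partition('&url=')[2]
# ... and drop everything from the next '&' onward, if there is one.
url = url.partition('&')[0]

print(unquote(url))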
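
String slicing depends on the url parameter not being the first one in the query string. As a minimal sketch of a sturdier variant, the standard library can parse the query string itself. The function name real_url is made up for illustration, and the sketch assumes the target always sits in a parameter called url, as in the link from the question:

```python
from urllib.parse import parse_qs, urlsplit


def real_url(google_url: str) -> str:
    """Return the target stored in the 'url' query parameter, or '' if absent."""
    query = urlsplit(google_url).query
    # parse_qs splits the query on '&' and percent-decodes each value.
    params = parse_qs(query)
    return params.get("url", [""])[0]


example = (
    "http://www.google.com/url?sa=t&source=web&cd=1&ved=0CCUQFjAA"
    "&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein"
    "%2Fworks%2F1910s%2Frelative%2Frelativity.pdf"
    "&ei=Fai1TZq-Acugtgenw6DqDg"
)
print(real_url(example))
# http://www.marxists.org/reference/archive/einstein/works/1910s/relative/relativity.pdf
```

Because the query is parsed rather than sliced, this also works when url happens to be the first parameter, and it returns an empty string instead of garbage when the parameter is missing.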


I'm using a Firefox extension named Google/Yandex search link fix. It works just great and allows direct copying of the link target.


I did a little Google searching and ran across the Firefox add-on called LinkWalker.

Simple context menu utility for links which decodes embedded and cloaked URLs, strips off query-string parameters and converts text selections to clickable link.

Sounds like that could do the trick.


It's a long link because Google wants to keep track of who found what and who actually clicked on a search result.

If you want the real link (the above is also a real link!), type this at your Linux prompt:

php -r "print urldecode('http://www.google.com/url?sa=t&source=web&cd=1&ved=0CCUQFjAA&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf&ei=Fai1TZq-Acugtgenw6DqDg&usg=AFQjCNFzYOTqpf68rQnuwW9K7wp39WL6Rg&sig2=z4RqvOLEEJsPohBqr1ghxQ');" | awk -F'&' '/url=/{ print $5 }'
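
One caveat with decoding the whole link before splitting it on '&': if the target has its own query string, its encoded ampersands become real separators and the result is cut short. The sketch below illustrates this in Python. The link is made up for the demonstration (example.com, with parameters a and b), and the helper name field is hypothetical:

```python
from urllib.parse import unquote

# Hypothetical redirect link whose target carries its own query string (a=1&b=2).
link = (
    "http://www.google.com/url?sa=t"
    "&url=http%3A%2F%2Fexample.com%2Fdoc.pdf%3Fa%3D1%26b%3D2&ei=xyz"
)


def field(parts):
    """Return the value of the first 'url=' field in a list of fields."""
    for part in parts:
        if part.startswith("url="):
            return part[len("url="):]
    return ""


# Decode first, then split: the target's own '&' now acts as a separator.
decode_then_split = field(unquote(link).split("&"))
# Split first, then decode only the extracted value.
split_then_decode = unquote(field(link.split("&")))

print(decode_then_split)  # http://example.com/doc.pdf?a=1
print(split_then_decode)  # http://example.com/doc.pdf?a=1&b=2
```

For the PDF link in the question both orders give the same result, since that target has no query string of its own.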


When I look up this search in Internet Explorer, I do indeed get this link.

But when I use Chrome, I get what you want. So it seems to be an IE feature, or at least to have something to do with the browser you are using. If you are in a position to change browsers, I would consider using Chrome (tested, gives normal URL) or Opera (tested, normal URL), but not Firefox (tested, gives funky URL).


See this tool

http://www.duvidasdeinformatica.com/blog/limpar-links-paginas-resultados-google/

It's in Portuguese, but at the bottom there is a box where you can paste the URL, and it gets "converted" to the real one.


I think I read once, while having the same frustration, that Google masks the actual URLs ONLY when you're logged into your Google account and your account settings are configured for web history tracking.

If my memory serves me correctly, you could try:

  • performing the search in a separate browser window using your browser's native "private" or "incognito" browsing feature
  • simply logging out of your Google account, getting your results, and logging back in
  • going to google.com/history and clicking "Pause", which prevents future web activity from being saved, then returning to the same page after grabbing your results and clicking "Resume" (if you intend to use Web History)

If you routinely want to grab multiple URLs from the results and the above technique doesn't work as I recall, you can try a Firefox add-on such as Copy Link URL, which copies the URLs of the links you select. You could then paste them into a text editor and replace the encoded elements with Find & Replace.
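
The Find & Replace step could also be scripted. The following is only a rough sketch, assuming the copied links were pasted into a string one per line and that each carries its target in a url parameter. The sample links (example.com, example.org) and the names URL_PARAM and real_urls are invented for illustration:

```python
import re
from urllib.parse import unquote

# Captures the percent-encoded value of a 'url' parameter,
# up to the next '&' or whitespace.
URL_PARAM = re.compile(r"[?&]url=([^&\s]+)")


def real_urls(pasted: str) -> list[str]:
    """Decode every 'url' parameter found in a block of pasted links."""
    return [unquote(value) for value in URL_PARAM.findall(pasted)]


pasted = """
http://www.google.com/url?sa=t&url=http%3A%2F%2Fexample.com%2Fa.pdf&ei=x
http://www.google.com/url?sa=t&url=http%3A%2F%2Fexample.org%2Fb.pdf&ei=y
"""

for target in real_urls(pasted):
    print(target)
# http://example.com/a.pdf
# http://example.org/b.pdf
```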

Or, you could perhaps do a little research to find a website that will decode the URL for you. I found URL Deobfuscator on webtoolhub.com that does a good job of making the main / desired URL available for copy/paste by decoding the encoded characters, removing query strings, etc.

Cheers.
