web-crawler
"Hashtable" for Python Twitter Crawler
As part of the Python Twitter crawler I'm creating, I am attempting to make a "hash-table" of sorts to ensure that I don't crawl any user more than once. It is below. However, I am ...
2023-03-26 23:08 Category: Q&A
Restricting JS links from search engine's crawling
I would like to prevent Google from following links I have in JS. I didn't find how to do that in robots.txt.
2023-03-26 08:59 Category: Q&A
How do search engines treat content shown by :target? [closed]
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
2023-03-26 02:50 Category: Q&A
Applescript: Safari Can't Save Web Page (AppleEvent Handler Fails Error)
I want Safari to download and save a web page using Apple Automator. With a Safari window open, I run the following script in AppleScript Editor:
2023-03-25 21:16 Category: Q&A
While Loops Problem with Python Twitter Crawler
I'm continuing to write my Twitter crawler and am running into more problems. Take a look at the code below:
2023-03-25 19:33 Category: Q&A
Headless Java HTTP client for crawling?
I'm looking around for a crawling tool, written in Java, to detect invalid URLs in our sites. The difficulty is that many of the URLs are generated with JavaScript, CSS3 and Ajax.
2023-03-25 19:15 Category: Q&A
How to ignore web crawlers?
I have a page that counts how many times it is visited by a user (registered, guest, every kind of user...).
2023-03-25 15:11 Category: Q&A
Perl or Python SVN Crawler
Is there an SVN crawler that can walk through an SVN repo and spit out all existing branches or tags?
2023-03-25 11:03 Category: Q&A
PHP crawler for website with Ajax content and HTTPS
I'm trying to grab the content of a website based on Ajax and HTTPS, but with no luck.
2023-03-24 20:15 Category: Q&A
How can I test my crawlability with Google using AJAX?
I've created my website so that site.com/#!/page/var1/ans1/var2/ans2 maps onto site.com/pages/page.php?var1=ans1&var2=ans2
2023-03-24 13:13 Category: Q&A