Getting all tweets of a Twitter user: rate limit problem

Developer https://www.devze.com 2023-02-25 16:23 Source: web
I've been trying to get all tweets of a public (unprotected) Twitter user. I'm using the REST API: http://api.twitter.com/1/statuses/user_timeline.json?screen_name=andy_murray&count=200&page=1

Going over the 16 pages (page param) it allows gets me 3,200 tweets, which is OK. But then I discovered that the rate limit for such calls is 150 per hour(!), which means fewer than 10 user queries in an hour (16 pages each). (350 are allowed if you authenticate, which is still a very low number.)

Any ideas on how to solve this? The streaming and search APIs don't seem appropriate(?), and there are some web services out there that do seem to have this data.

Thanks


You can either queue up the requests and make them as the rate limit allows, or you can make authenticated requests as multiple users. Each user has 350 requests/hour.
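A minimal sketch of the "queue up the requests" idea, in Python. The names `HourlyThrottle` and `drain` are made up for illustration, and treating the limit as a rolling one-hour window is an assumption (a conservative one; the API may count per fixed window instead). Each job is any zero-argument callable that performs one API request.

```python
import time
from collections import deque


class HourlyThrottle:
    """Allow at most `limit` calls in any rolling `window` seconds (assumed model)."""

    def __init__(self, limit=350, window=3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the throttle can be tested
        self.sleep = sleep
        self.stamps = deque()   # times at which recent calls were released

    def wait(self):
        """Block until one more call fits inside the rolling window."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) >= self.limit:
            # Window is full: sleep until the oldest call ages out.
            self.sleep(self.window - (now - self.stamps[0]))
            now = self.clock()
            self.stamps.popleft()
        self.stamps.append(now)


def drain(jobs, throttle):
    """Run queued jobs in order, pausing whenever the throttle says so."""
    results = []
    for job in jobs:
        throttle.wait()
        results.append(job())
    return results
```

To spread the load over several authenticated users, one throttle per set of credentials would be the natural extension, with jobs handed to whichever throttle has room.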


One approach would be to use the streaming API (or perhaps the more specific user streams, if that's better suited to your application) to start collecting all tweets as they occur from your target user(s) without having to bother with the traditional rate limits, and then use the REST API to backfill those users' historical tweets.

Granted, you only have 350 authenticated requests per hour, but if you run your harvester around the clock, that's still 1,680,000 tweets per day (350 requests/hour * 24 hours/day * 200 tweets/request).

So, for example, if you decided to pull 1,000 tweets per user per day (5 API calls @ 200 tweets per call), you could run through 1,680 user timelines per day (70 timelines per hour). Then, on the next day, begin where you left off by harvesting the next 1,000 tweets using the oldest status ID per user as the max_id parameter in your statuses/user_timeline request.
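The day-by-day resume with max_id could look roughly like the sketch below. `backfill` and `fetch_page` are hypothetical names: `fetch_page` stands for whatever function actually calls statuses/user_timeline with the given parameters and returns a list of tweets as dicts with an "id" key. Since max_id is inclusive, the sketch subtracts 1 from the oldest ID seen so the same tweet is not fetched twice.

```python
def backfill(fetch_page, screen_name, max_id=None, max_calls=5, count=200):
    """Fetch up to max_calls pages of older tweets, walking backwards by ID.

    Returns (tweets, next_max_id); store next_max_id and pass it back in
    on the next run to continue where this one stopped.
    """
    tweets = []
    for _ in range(max_calls):
        params = {"screen_name": screen_name, "count": count}
        if max_id is not None:
            params["max_id"] = max_id
        page = fetch_page(params)   # hypothetical: performs one API call
        if not page:
            break                   # nothing older is available
        tweets.extend(page)
        # max_id is inclusive, so step just below the oldest tweet seen.
        max_id = min(t["id"] for t in page) - 1
    return tweets, max_id
```

With the defaults (5 calls of 200 tweets), one run collects up to 1,000 tweets for a user, and the returned cursor is what you would persist per user for the next day's run.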

The streaming API will keep you up to date with any new statuses your target users tweet, and the REST API calls will fairly quickly (in about four days, at 1,000 tweets per user per day) start running into Twitter's fetch limit for those users' historical tweets. After that, you can stop fetching historical tweets for the users that have maxed out and start fetching a new target group's tweets, adding those new users to the streaming endpoint's follow list so their tweets are collected going forward.
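The arithmetic in this answer can be checked with a few lines of Python. The constant names are made up for illustration; the 3,200 figure is the history limit mentioned in the question (16 pages of 200).

```python
import math

REQUESTS_PER_HOUR = 350          # authenticated rate limit
TWEETS_PER_REQUEST = 200         # maximum `count` per call
TWEETS_PER_USER_PER_DAY = 1000   # chosen daily harvest per user
HISTORY_CAP = 3200               # oldest tweets reachable per user

requests_per_day = REQUESTS_PER_HOUR * 24
tweets_per_day = requests_per_day * TWEETS_PER_REQUEST
calls_per_user = math.ceil(TWEETS_PER_USER_PER_DAY / TWEETS_PER_REQUEST)
timelines_per_day = requests_per_day // calls_per_user
timelines_per_hour = REQUESTS_PER_HOUR // calls_per_user
days_to_cap = math.ceil(HISTORY_CAP / TWEETS_PER_USER_PER_DAY)

print(tweets_per_day)      # 1680000
print(calls_per_user)      # 5
print(timelines_per_day)   # 1680
print(timelines_per_hour)  # 70
print(days_to_cap)         # 4
```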


The Search API would seem to be appropriate for your needs, since you can search on screen name. The Search API rate limit is higher than the REST API rate limit.
