I am collecting tweets based on some search parameters. However, I want to EXCLUDE tweets that meet any of the following conditions:
- contains a URL (no hyperlinks!)
- is in a language other than English
- contains an @reply
- contains a hashtag (#keyword)
For you Twitter API Search gurus out there, how would you go about doing so?
There is a lang option in the Twitter API for the search method; it will take care of the tweets written in another language.
For the other conditions, I would just run a regexp over each tweet's "text" attribute and exclude the tweets that match. There is also a callback option for the search method; I suppose you could use it to filter out the tweets.
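To make the regexp idea concrete, here is a minimal sketch in Python. It assumes the search results have already been parsed into a list of dicts that each carry a "text" key. The pattern names and the helper functions `is_clean` and `filter_tweets` are hypothetical, not part of the Twitter API, and the patterns themselves are rough approximations that will likely need tuning.

```python
import re

# Hypothetical patterns (assumptions, not Twitter's own tokenization rules).
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
MENTION_RE = re.compile(r"(?<!\w)@\w+")   # @name not preceded by a word character
HASHTAG_RE = re.compile(r"(?<!\w)#\w+")   # #keyword not preceded by a word character

def is_clean(text):
    """Return True if the text contains no URL, @mention, or #hashtag."""
    return not (
        URL_RE.search(text)
        or MENTION_RE.search(text)
        or HASHTAG_RE.search(text)
    )

def filter_tweets(tweets):
    """Keep only the tweets whose "text" attribute passes is_clean."""
    return [t for t in tweets if is_clean(t.get("text", ""))]
```

Calling `filter_tweets(results)` on the parsed result list would then drop every tweet that has a link, a mention, or a hashtag. Note that the mention pattern excludes any @mention, not only replies at the start of a tweet; anchor it with `^` if only leading @replies should be excluded.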
The search API does have a lang=en option, but it doesn't work that well. It only makes sure that the user has set their language to English in their account profile. Unfortunately, English is the default setting, so search results that are not in English are still returned even when you search with lang=en. Excluding these completely is not easy. I know of no guaranteed way to do this.
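As a partial workaround, a crude client-side heuristic can discard tweets that are mostly written in a non-Latin script. The sketch below is only an assumption about what might help, not a real language detector: tweets in Spanish, French, or any other Latin-script language will still pass. The function name `looks_english` and the 0.9 threshold are hypothetical choices.

```python
def looks_english(text, threshold=0.9):
    """Crude check: share of alphabetic characters that are ASCII letters.

    This only rules out text that is mostly in a non-Latin script;
    it cannot tell English apart from other Latin-script languages.
    """
    letters = [c for c in text if c.isalpha()]
    if not letters:
        # No letters at all (e.g. only digits or emoji): treat as not English.
        return False
    ascii_count = sum(1 for c in letters if c.isascii())
    return ascii_count / len(letters) >= threshold
```

Combined with lang=en on the request, this would remove some of the obvious misses, but it still offers no guarantee.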