I am trying to write a regex that will extract a tweet ID from a Twitter URL.
I have this one, which works when the Twitter username has a number in it:
'.*?\\d+.*?(\\d+)'
ruby-1.9.2-p0 > Regexp.new('.*?\\d+.*?(\\d+)',Regexp::IGNORECASE).match('https://twitter.com/#!/sportsguy33/status/41257488166686720')[1]
=> "41257488166686720"
ruby-1.9.2-p0 > Regexp.new('.*?\\d+.*?(\\d+)',Regexp::IGNORECASE).match('http://twitter.com/#!/dailythunder/status/41382006113841153')[1]
=> "3"
And this one, which works when the Twitter username doesn't have a number in it:
'.*?(\\d+)'
ruby-1.9.2-p0 > Regexp.new('.*?(\\d+)',Regexp::IGNORECASE).match('https://twitter.com/#!/sportsguy33/status/41257488166686720')[1]
=> "33"
ruby-1.9.2-p0 > Regexp.new('.*?(\\d+)',Regexp::IGNORECASE).match('http://twitter.com/#!/dailythunder/status/41382006113841153')[1]
=> "41382006113841153"
How can I write one that will work in either case?
If the tweet ID is the last part of the URL, you can use:
'\/(\d+)$'
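The $ anchors the match at the end of the line, which for a single-line URL is the end of the string. (In Ruby, \z matches strictly at the end of the string.)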
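As a rough sketch, here is that pattern applied to the two URLs from the question. The helper name last_segment_id is made up for illustration and is not part of the answer above; it also assumes the ID really is the final path segment.

```ruby
# Pattern from the answer: a slash, then digits, anchored at the end.
LAST_SEGMENT_ID = Regexp.new('\/(\d+)$')

# Hypothetical helper: returns the trailing numeric segment as a String,
# or nil when the URL does not end in "/<digits>".
def last_segment_id(url)
  match = LAST_SEGMENT_ID.match(url)
  match && match[1]
end

puts last_segment_id('https://twitter.com/#!/sportsguy33/status/41257488166686720')
puts last_segment_id('http://twitter.com/#!/dailythunder/status/41382006113841153')
```

Because the pattern only looks at the end of the URL, digits in the username no longer matter.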
I just released a gem, tweet_url, to parse Twitter URLs.
require 'tweet_url'
tweet_url = TweetUrl.parse('https://twitter.com/yukihiro_matz/status/755950562227605504')
tweet_url.status_id #=> 755950562227605504
Heads up! Be aware that a URL can look like https://twitter.com/sferik/status/540897316908331009/photo/1, so we cannot simply extract the last numeric part.
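If you would rather not add a gem, one possible plain-regex approach is to anchor on the /status/ path segment instead of the end of the URL. This is only a sketch, not how tweet_url works internally; the name extract_tweet_id is hypothetical, and it assumes the ID always directly follows "/status/".

```ruby
# Capture the digits that directly follow "/status/", wherever that
# segment appears in the URL.
STATUS_ID = %r{/status/(\d+)}

# Hypothetical helper: returns the tweet ID as a String, or nil when
# the URL has no "/status/<digits>" segment.
def extract_tweet_id(url)
  match = STATUS_ID.match(url)
  match && match[1]
end

puts extract_tweet_id('https://twitter.com/sferik/status/540897316908331009/photo/1')
puts extract_tweet_id('https://twitter.com/#!/sportsguy33/status/41257488166686720')
```

Anchoring on the path segment means neither digits in the username nor trailing parts such as /photo/1 affect the capture.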
I would suggest you try out Rubular.
Rubular is a Ruby-based regular expression editor. It's a handy way to test regular expressions as you write them.