
Turn URLs and @* into links

Developer https://www.devze.com 2023-02-02 03:02 Source: web

I'm getting my latest tweets with HTTParty and Hashie like so.

require 'httparty'
require 'hashie'

tweet = Hashie::Mash.new(HTTParty.get('http://twitter.com/statuses/user_timeline/ethnt.json').first)
puts tweet.text

I want to turn every URL (http://*.*) and every username (@*) into a link. What would the regex for each of these be, and how would I implement it?


def link_urls_and_users s

    #regexps
    url = /( |^)http:\/\/([^\s]*\.[^\s]*)( |$)/
    user = /@(\w+)/

    #replace @usernames with links to that user
    while s =~ user
        s.sub! "@#{$1}", "<a href='http://twitter.com/#{$1}' >#{$1}</a>"
    end

    #replace urls with links
    while s =~ url
        name = $2
        # escape the captured URL so characters such as ? or + are matched literally
        s.sub!(/( |^)http:\/\/#{Regexp.escape(name)}( |$)/, " <a href='http://#{name}' >#{name}</a> ")
    end

     s

end


puts link_urls_and_users(tweet.text)

This works, so long as URLs are padded by spaces or are at the beginning and/or end of the tweet.
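
If the padding requirement gets in the way, a single gsub pass is one possible alternative. The following is only a sketch under assumptions, not the code above: the constant URL_OR_USER and the method name link_urls_and_users_single_pass are hypothetical, the URL pattern is deliberately simple (it also accepts https and drops trailing punctuation), and the tweet text is assumed to be plain text without markup.

```ruby
# Hypothetical pattern: either a URL or an @username.
# The URL part must not end in common punctuation; the lookbehind keeps
# e-mail addresses such as me@example.com from being treated as usernames.
URL_OR_USER = %r{(?<url>https?://[^\s<>"']*[^\s<>"'.,;:!?)])|(?<![\w@])@(?<user>\w+)}

def link_urls_and_users_single_pass(text)
  # gsub scans left to right once, so text inserted by a replacement
  # is never matched again
  text.gsub(URL_OR_USER) do
    match = Regexp.last_match
    if match[:url]
      url = match[:url]
      "<a href='#{url}'>#{url}</a>"
    else
      user = match[:user]
      "<a href='http://twitter.com/#{user}'>@#{user}</a>"
    end
  end
end

puts link_urls_and_users_single_pass("see http://example.com/a?b=1, thanks @jack")
```

Because the method returns a new string, the original tweet text is left unmodified.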


For finding URLs in text, why not reuse an existing wheel instead of inventing a new one?

require 'uri'
require 'open-uri'

# URI.open is required on Ruby 3.0 and later, where Kernel#open no longer fetches URLs
body = URI.open('http://stackoverflow.com/questions/4571229/turn-urls-and-into-links').read
uris = URI::extract(body)
uris.size # => 102
uris.first # => "http://www.w3.org/TR/html4/strict.dtd"
uris.last # => "http://edge.quantserve.com/quant.js"

Add that to the answer given by @stef and you're done.
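
One way the two might be combined is sketched below. This is an assumption about how to wire them together, not code from either answer: link_with_uri_extract is a hypothetical name, extraction is restricted to http and https, and the @username pattern is the simple \w+ one.

```ruby
require 'uri'

# Hypothetical helper: URI.extract finds the URLs, a regex finds the usernames.
def link_with_uri_extract(text)
  # Longest URLs first, so a URL that is a prefix of another one
  # does not win the alternation built below.
  uris = URI.extract(text, %w[http https]).uniq.sort_by { |u| -u.length }

  # Regexp.union escapes the URL strings and keeps the regex as it is.
  pattern = Regexp.union(*uris, /@\w+/)

  text.gsub(pattern) do |hit|
    if hit.start_with?('@')
      name = hit[1..-1]
      %Q{<a href="http://twitter.com/#{name}">#{hit}</a>}
    else
      %Q{<a href="#{hit}">#{hit}</a>}
    end
  end
end

puts link_with_uri_extract("Reading http://example.com/page with @jack")
```

Doing the replacement in one gsub pass means the anchors that were just inserted are not scanned again.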


This project has a method for it: https://github.com/mzsanford/twitter-text-rb

From their docs:

class MyClass
  include Twitter::Extractor
  usernames = extract_mentioned_screen_names("Mentioning @twitter and @jack")
  # usernames = ["twitter", "jack"]
end


You can try this:

# Arrays
links = []    
usernames = []

links = tweet.text.scan(/(http:\/\/\w+(\.?\w+(:\d+)?\/?)+)/i).map{|e| e[0]}
usernames = tweet.text.scan(/@(\w+)/i).map{|e| "<a href='http://twitter.com/#{e[0]}'>@#{e[0]}</a>"}

The regex for the url is not perfect, but good enough for the common ones.


Expanding on the Tin Man's answer, there's a simple one-liner for making URLs clickable.

# uniq keeps a URL that appears more than once from being wrapped twice
URI::extract(body).uniq.each { |uri| body.gsub!(uri, %Q{<a href="#{uri}">#{uri}</a>})}

You'll then need to use body.html_safe if in Rails. For the Twitter users, you should really be relying on the Twitter API to tell you what is and isn't a valid username, because they can correctly filter out "@looksvalid" when there is no user by that name.
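
Since html_safe marks the whole string as trusted, it is probably wise to escape the tweet text before adding the anchors. A minimal sketch, assuming plain Ruby with CGI.escapeHTML from the standard library; linkify_escaped is a hypothetical name and the URL pattern is simplified:

```ruby
require 'cgi'

# Hypothetical helper: escape first, then wrap URLs in anchors.
def linkify_escaped(text)
  # <, >, &, " and ' become HTML entities, so markup in the tweet is neutralized
  escaped = CGI.escapeHTML(text)

  # Simplified pattern: a URL runs until whitespace or the start of a tag
  escaped.gsub(%r{https?://[^\s<]+}) do |url|
    %Q{<a href="#{url}">#{url}</a>}
  end
end

puts linkify_escaped("<b>hi</b> http://example.com/?a=1&b=2")
```

Only the anchors added by the method are real markup in the result, which makes calling html_safe on it easier to justify.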
