I have a client request on one of my projects: they want to be able to enter a URL and have the app pull in some information from the site whose URL they entered and save it in the database.
So the user enters http://www.example.com/2342342, and my controller visits that site, gets the content of the first <h1>Tag</h1> on the page, and saves it in the database.
Is this possible? If so, how would I go about doing it? Would I use some Rails commands to do it, or something else, like jQuery?
Nokogiri is a great parser and, combined with open-uri, can work directly with a URL.
So there are two steps:
Instantiate a Nokogiri object with the content of the URL as a parameter
Parse the HTML page to get what you expect
Find instructions here: http://nokogiri.org/tutorials/parsing_an_html_xml_document.html
Because you'll be working with another website, keep two pieces of advice in mind:
wrap your requests in a rescue so that you can recover if the website is down
consider using an Ajax request, because fetching the remote page could take a long time
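As a rough sketch of the first piece of advice, the fetch can be wrapped so that a failure returns nil instead of raising. The names safe_fetch and FETCH_ERRORS and the fetcher: parameter are assumptions made for this example (the parameter only exists so the failure path can be exercised without a real network call), and the list of rescued errors is a plausible starting point rather than an exhaustive one.

```ruby
require "net/http"
require "uri"
require "timeout"

# Errors that commonly mean the remote site is down or unreachable.
# Net::OpenTimeout and Net::ReadTimeout are subclasses of Timeout::Error.
FETCH_ERRORS = [
  SocketError,
  Timeout::Error,
  Errno::ECONNREFUSED,
  Errno::ECONNRESET,
  Errno::EHOSTUNREACH,
  URI::InvalidURIError
].freeze

# Hypothetical helper: returns the response body as a String,
# or nil when the fetch fails with one of the errors above.
def safe_fetch(url, fetcher: ->(u) { Net::HTTP.get(URI.parse(u)) })
  fetcher.call(url)
rescue *FETCH_ERRORS => e
  warn "Could not fetch #{url}: #{e.class}: #{e.message}"
  nil
end

# Simulate a site that is down by injecting a fetcher that raises.
body = safe_fetch("http://www.example.com/2342342",
                  fetcher: ->(_u) { raise SocketError, "site is down" })
puts body.nil? ? "fetch failed, nothing saved" : body
```

The controller can then check for nil and show an error to the user rather than failing the whole request.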
I would check out the Railscast here:
http://railscasts.com/episodes/190-screen-scraping-with-nokogiri
It explains very well how to use Nokogiri to scrape content from other sites.