I am building a small application in RoR that has a form asking for a URL. Once the URL has been filled in and the submit button is pressed, a web-scraping plugin I downloaded, scrAPI (which is working fine), gets the title of the page at that URL and creates a record in the db with that title.
My issue right now is that the whole thing works only if the URL is valid and scrAPI is able to process it. If the URL entered does not work, it raises "Scraper::Reader::HTTPInvalidURLError", which is expected, but my limited knowledge of working in the model is preventing me from handling that error correctly.
Controller:
#controller
class ArticleController < ApplicationController
  def savearticle
    @newarticle = params[:newarticle]
    @link = @newarticle["link"]
    @id = @newarticle["id"]
    Article.getlink(@link)
    success = Article.find(:last).update_attributes( params[:newarticle] )
    if success
      render :partial => 'home/articlesuccess'
    else
      render :partial => 'home/articlebad'
    end
  end
end
# model
require 'scrapi'
class Article < ActiveRecord::Base
  attr_accessor :getlink
  def self.getlink(link)
    scraper = Scraper.define do
      process "title", :title => :text
      result :title
    end
    uri = URI.parse(link)
    Article.create(:title => scraper.scrape(uri))
  end
end
How to:
1) Handle the Scraper::Reader::HTTPInvalidURLError properly, so that a proper error message can be returned to the view.
2) I would also like to know how I can return 'uri' from the model and use it in the controller or view.
3) Also, I would like to return the ID of the Article created in the model so I can use it in the controller instead of doing find(:last), which seems like bad practice.
Something like...
class ApplicationController < ActionController::Base
  rescue_from 'Scraper::Reader::HTTPInvalidURLError', :with => :invalid_scrape_url
  private
  def invalid_scrape_url
    flash[:error] = 'The URL for scraping is invalid.'
    render :template => 'pages/invalid_scrape_url'
  end
end
rescue_from is what you need for 1).
For 2), you could just use @uri, but personally I'd create a new model called Scrape so that you can retrieve each scrape that is attempted.
For 3), I'm not quite sure of the question, but you could do
@article = Article.create(:title => scraper.scrape(uri))
and then use @article.id
Hope that helps!
(1) In Ruby, you can handle any exception as follows:
begin
  # Code that may raise an exception
rescue Scraper::Reader::HTTPInvalidURLError
  # Code to execute if Scraper::Reader::HTTPInvalidURLError is raised
rescue
  # Code to execute if any other StandardError is raised
  # (a bare rescue does not catch exceptions outside StandardError)
end
So you could check for this in your controller as follows:
begin
  Article.getlink(@link)
  # all your other code
rescue Scraper::Reader::HTTPInvalidURLError
  render :text => "Invalid URI, says scrAPI"
rescue
  render :text => "Something else horrible happened!"
end
You'll need to require 'scrapi' in your controller to have access to the Scraper::Reader::HTTPInvalidURLError constant.
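If it helps to try the rescue pattern outside Rails, here is a minimal sketch. InvalidURLError and fetch_title are hypothetical stand-ins for scrAPI's error class and scrape call (they are not part of scrAPI), and the method returns the message as a string instead of rendering it.

```ruby
# Hypothetical stand-in for Scraper::Reader::HTTPInvalidURLError.
class InvalidURLError < StandardError; end

# Hypothetical stand-in for the scrAPI call:
# raises on anything that does not look like an http(s) URL.
def fetch_title(link)
  raise InvalidURLError, "bad URL: #{link}" unless link.to_s.match?(%r{\Ahttps?://})
  "Title of #{link}"
end

# Returns a message for the view instead of letting the exception propagate.
def save_article_message(link)
  title = fetch_title(link)
  "Saved: #{title}"
rescue InvalidURLError
  # The specific error is listed first, so it is matched before the generic one.
  "Invalid URI, says scrAPI"
rescue StandardError
  "Something else horrible happened!"
end
```

In a controller action the two rescue branches would call render rather than return strings, but the ordering of the clauses works the same way.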
I would probably make the creation of the new Article and the call to scrAPI's method separate:
title = scraper.scrape(uri)
Article.create(:title => title)
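One benefit of splitting the two steps is that a failed scrape never reaches the create call, so no half-filled record is left behind. A rough sketch of that idea, assuming an in-memory FakeArticle in place of the real ActiveRecord model and a made-up scrape_title in place of scraper.scrape(uri) (both names are hypothetical):

```ruby
# Hypothetical stand-in for Scraper::Reader::HTTPInvalidURLError.
class ScrapeError < StandardError; end

# Hypothetical in-memory stand-in for the Article model.
class FakeArticle
  attr_reader :id, :title

  @records = []

  class << self
    attr_reader :records

    # Mimics Article.create: builds a record, stores it, returns it.
    def create(attrs)
      article = new(records.size + 1, attrs[:title])
      records << article
      article
    end
  end

  def initialize(id, title)
    @id = id
    @title = title
  end
end

# Hypothetical stand-in for scraper.scrape(uri).
def scrape_title(link)
  raise ScrapeError, "cannot scrape #{link}" unless link.to_s.start_with?("http")
  "Example title"
end

# Step 1 scrapes, step 2 creates; returns nil when the scrape fails.
def create_article_from(link)
  title = scrape_title(link)
  FakeArticle.create(:title => title)
rescue ScrapeError
  nil
end
```

Returning nil on failure is just one possible choice here; letting the exception propagate to the controller, as shown earlier, works equally well.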
(2) and (3) In Ruby, the value of the last expression evaluated in a method is its return value (unless you return explicitly earlier). So, in your self.getlink method, the return value is the newly created Article object. You could get the ID like this in your controller:
article = Article.getlink(@link)
article_id = article.id
You may need to refactor the code a bit to get the results you want (and make the code sample on the whole cleaner).
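For the 'uri' part of the question, one option is to return both values from the method as an array and destructure them in the caller. This is only a sketch: SavedArticle and getlink_sketch are hypothetical names, the Struct stands in for the ActiveRecord object, and the title is a placeholder for what scraper.scrape(uri) would return.

```ruby
require 'uri'

# Hypothetical stand-in for an ActiveRecord Article row.
SavedArticle = Struct.new(:id, :title)

# Returns two values: the created record and the parsed URI.
# The real code would call scraper.scrape(uri) and Article.create here.
def getlink_sketch(link)
  uri = URI.parse(link)
  article = SavedArticle.new(1, "Title from #{uri.host}")
  [article, uri] # last expression evaluated, so this is the return value
end

# In the controller, both values can be picked up in one assignment.
article, uri = getlink_sketch("http://example.com/post")
article_id = article.id
```

With this shape the controller no longer needs find(:last), since it already holds the object that was just created.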