Hpricot error parsing special characters in URI

Developer https://www.devze.com 2022-12-19 04:20 Source: Internet
I'm working on a Ruby script to grab historical stock prices from Yahoo, using Hpricot to parse the pages. This is mostly straightforward: the URL is "http://finance.yahoo.com/q/hp?s=TickerSymbol". For example, to look up Google, I would use "http://finance.yahoo.com/q/hp?s=GOOG".

Unfortunately, it breaks down when I'm looking up the price of an index. The indexes are prefixed with a caret, such as "http://finance.yahoo.com/q/hp?s=^DJI" for the Dow.

The line:

ticker_symbol = '^DJI'
doc = Hpricot(open("http://finance.yahoo.com/q/hp?s=#{ticker_symbol}"))

throws this exception:

bad URI(is not URI?): http://finance.yahoo.com/q/hp?s=^DJI

Hpricot chokes on the caret (I think because the underlying Ruby URI library does). Is there a way to escape that character or force the library to try it?


Well, don't I feel dumb. Five more minutes and I got this working:

doc = Hpricot(open(URI.encode("http://finance.yahoo.com/q/hp?s=#{ticker_symbol}")))

So if anyone else is wondering, that's how you do it. facepalm
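
For anyone on a newer Ruby: URI.encode (an alias of URI.escape) was removed in Ruby 3.0, so the line above raises NoMethodError there. A minimal sketch of one alternative is to encode only the query parameter with URI.encode_www_form from the standard library. The helper name history_url is made up for illustration, and the Yahoo URL is simply the one from the question.

```ruby
require 'uri'

# Hypothetical helper (not part of Hpricot or open-uri): builds the
# historical-prices URL from the question for a given ticker symbol.
def history_url(ticker_symbol)
  # encode_www_form percent-encodes reserved characters in the value,
  # so the caret in an index symbol becomes %5E.
  query = URI.encode_www_form('s' => ticker_symbol)
  "http://finance.yahoo.com/q/hp?#{query}"
end

url = history_url('^DJI')
puts url

# The encoded URL now parses without URI::InvalidURIError.
URI.parse(url)
```

The resulting string can then be passed to open (URI.open on Ruby 3.0 and later) in place of the raw interpolated URL.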


The escape for ^ is %5E; you could do a straight substitution on the URL.

http://finance.yahoo.com/q/hp?s=%5EDJI
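
A rough sketch of that substitution, assuming the caret is the only problematic character in the ticker symbols you use. The helper name escape_caret is hypothetical.

```ruby
# Hypothetical helper: replaces every literal caret with its percent-encoded
# form. A String pattern passed to gsub is matched literally, not as a regex.
def escape_caret(url)
  url.gsub('^', '%5E')
end

ticker_symbol = '^DJI'
puts escape_caret("http://finance.yahoo.com/q/hp?s=#{ticker_symbol}")
```

This only handles the caret; symbols containing other reserved characters would need a general-purpose encoder instead.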
