Python Parsing user defined values

Posted on 开发者 (https://www.devze.com), 2023-03-16 16:02. Source: web
I was trying to change:-

import urllib2 as urllib
... ...
file2 = urllib.urlopen(url2)
... ...
for line in file2:
    indexfrom2 = line.find('Mean Temperature')
    if indexfrom2 > -1:
        nxtLn = file2.next()
        nextLine = file2.next()
        indexfrom21 = nextLine.find('"nobr"')
        if indexfrom21 > -1:
            indexto21 = nextLine.find('</span>&nbsp;&amp;deg;C</span>',indexfrom21)
            # slice only when the marker was found, so indexto21 is always defined here
            code2 = nextLine[indexfrom21+23:indexto21]
            print code2

and make it look something like:-

class (...)  
def ....  
Temperature = parse( file2, '<span>Mean Temperature</span></td>', '<b>' )  

but I'm not sure how to do it. The block of code above is repeated in my script for every value I want to extract (mean temperature, max temperature, humidity, pressure, etc.), which looks unprofessional. I'd like to keep it short by moving the logic into a parsing function that I can call once per value, or in a loop, so I don't have to repeat the same code again and again.
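
For illustration, here is a minimal sketch of what such a `parse` helper could look like if the string-search approach is kept. It is only one reading of the call shown above: the helper takes the page text rather than the file object, and the `end_marker` parameter, the `SAMPLE` markup, and the `LABELS` list are all assumptions made for the example, not the real page.

```python
def parse(html, label, value_marker, end_marker='<'):
    """Return the text between value_marker and end_marker that follows
    the first occurrence of label, or None if any marker is missing."""
    label_pos = html.find(label)
    if label_pos == -1:
        return None
    start = html.find(value_marker, label_pos + len(label))
    if start == -1:
        return None
    start += len(value_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None
    return html[start:end].strip()


# Hypothetical markup standing in for the downloaded page (an assumption).
SAMPLE = (
    '<tr><td><span>Mean Temperature</span></td>\n'
    '<td><span class="nobr"><b>12</b>&nbsp;&deg;C</span></td></tr>\n'
    '<tr><td><span>Max Temperature</span></td>\n'
    '<td><span class="nobr"><b>17</b>&nbsp;&deg;C</span></td></tr>\n'
)

# One entry per value to extract; extend the list instead of copying code.
LABELS = ['Mean Temperature', 'Max Temperature']

readings = {}
for name in LABELS:
    readings[name] = parse(SAMPLE, '<span>%s</span></td>' % name, '<b>')

print(readings)
```

With the real page, `SAMPLE` would be replaced by the downloaded text (for example the result of calling `read()` on the opened URL), and the markers would need to match the actual markup.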


You probably want to use BeautifulSoup for this. It's the canonical way to parse HTML in Python, and it works pretty well even on badly formed markup. If you continue with your current approach, you're relying on things like which line a value appears on and fixed character offsets, so your code is pretty brittle in the face of minor changes to the document structure.

http://www.crummy.com/software/BeautifulSoup/
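
As a rough sketch of the suggestion above, the same lookup can be written against the parsed document instead of line positions. This assumes the current `beautifulsoup4` package (imported as `bs4`) on Python 3. The `SAMPLE` markup and the `read_value` helper are hypothetical, so the tag names would need adjusting to the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page (an assumption).
SAMPLE = (
    '<tr><td><span>Mean Temperature</span></td>\n'
    '<td><span class="nobr"><b>12</b>&nbsp;&deg;C</span></td></tr>\n'
    '<tr><td><span>Max Temperature</span></td>\n'
    '<td><span class="nobr"><b>17</b>&nbsp;&deg;C</span></td></tr>\n'
)


def read_value(soup, label):
    """Return the text of the first <b> tag after the <span> whose text
    is exactly `label`, or None if either tag is missing."""
    span = soup.find('span', string=label)
    if span is None:
        return None
    # find_next walks forward in document order, independent of line breaks.
    bold = span.find_next('b')
    if bold is None:
        return None
    return bold.get_text(strip=True)


soup = BeautifulSoup(SAMPLE, 'html.parser')
for label in ('Mean Temperature', 'Max Temperature'):
    print(label, read_value(soup, label))
```

Because the lookup follows the tag structure, it keeps working if the page gains or loses line breaks between the label and its value.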
