
How to parse text

开发者 (Developer) https://www.devze.com 2023-02-07 21:59 Source: web

I want to read a file and search for any line that contains a session ID (e.g. 12345). If a line matches, I want to print it and all the lines after it until a blank line is encountered. After that, how can I associate all these lines with the session ID in case I need to parse them further? I want to do this in Python.

Thanks


This answers the first part of your question:

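with open('myfile.txt') as f:
    for line in f:
        # '12345' is the session ID from the question
        if '12345' in line:
            # the line already ends with a newline, so suppress print's own
            print(line, end='')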

I didn't understand what else you were asking for. Can you rephrase: "how can i associate all these lines to the session ID if i need to further parse these lines"?
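
To also print the lines that follow a match, up to the next blank line, a flag can track whether the loop is currently inside a matching block. This is only a minimal sketch, assuming blocks are separated by blank lines; `lines_after_match` and the in-memory sample text are hypothetical names used for illustration, and the plain substring test would also match a longer ID that happens to contain 12345.

```python
import io

def lines_after_match(lines, session_id):
    """Yield the matching line and every following line up to the next blank line."""
    inside = False
    for line in lines:
        line = line.rstrip('\n')
        if not line.strip():
            inside = False          # a blank line ends the current block
        elif session_id in line:
            inside = True           # the matching line starts a block
        if inside:
            yield line

# Stand-in for an open file; any iterable of lines works the same way.
sample = io.StringIO(
    "session 321: abc de\n"
    "    567 89 abd ec\n"
    "\n"
    "session 12345: ghi lm\n"
    "    763 98 dba ce\n"
    "\n"
)
block = list(lines_after_match(sample, '12345'))
for line in block:
    print(line)
```

With a real file, pass the open file object instead of `sample`.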


I am going to assume that your log file is formatted like

session 321: abc de
    567 89 abd ec

session 12345: ghi lm
    763 98 dba ce

and that what you want to do is find the appropriate session and all following lines until you see a blank line.

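import collections
import re

sessionData = collections.defaultdict(list)
lookfor = [12345, 13981]
newSession = re.compile(r'session (\d+):')

with open('my_log_file.txt', 'r') as inf:
    session = None
    for ln in inf:
        ln = ln.rstrip()
        if ln:
            match = newSession.match(ln)
            if match:
                # group(1) is the captured ID; group(0) would be the whole "session N:" text
                s = int(match.group(1))
                # start collecting only if this is a session we are looking for
                session = s if s in lookfor else None
            if session is not None:
                print(ln)
                sessionData[session].append(ln)
        else:
            # a blank line ends the current session block
            session = None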

sessionData is now a session-keyed dict; for each session, it contains a list of all related lines. Using the above sample data, sessionData would look like

{ 12345: ["session 12345: ghi lm", "    763 98 dba ce"] }
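
Since every line is stored under its session ID, further parsing can be a loop over the dict. The question does not say what the fields mean, so this is only a rough sketch that assumes whitespace-separated fields; `parse_session` and the `header` pattern are hypothetical helpers, not part of the code above.

```python
import re

# Matches the "session N:" prefix and captures the rest of the header line.
header = re.compile(r'session (\d+):\s*(.*)')

def parse_session(lines):
    """Split each of a session's lines into whitespace-separated fields."""
    fields = []
    for ln in lines:
        m = header.match(ln)
        # Drop the "session N:" prefix on the header line, if present.
        text = m.group(2) if m else ln
        fields.append(text.split())
    return fields

# Sample data in the shape shown above.
sessionData = {12345: ["session 12345: ghi lm", "    763 98 dba ce"]}
parsed = {sid: parse_session(lines) for sid, lines in sessionData.items()}
print(parsed)
```

Each session ID then maps to a list of token lists, one per line, which can be processed further however the data requires.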