I am trying to parse an output of about a hundred lines. The output is formatted like this:
<random text> STATION "STATION_NAME_ONE": <random text>
<random text> IP Address: 0.0.0.0 <random text>
<SEVERAL LINES OF RANDOM TEXT>
<random text> STATION "STATION_NAME_TWO": <random text>
<random text> IP Address: 1.1.1.1 <random text>
<SEVERAL LINES OF RANDOM TEXT>
... and so on
I know the IP Address of the station I am looking for. Using the IP address, I am trying to construct a regex that will find the station name. The station name can be any length and can contain any number of numbers/letters/underscores. The station name will always be preceded by STATION and will always be followed by a colon. The IP Address will always be on the line following the station name and will always be preceded by IP Address:.
Note there are several stations with different station names and IP Addresses. The 'random text' can be of any length and contain any symbol/number/letter.
My attempt so far:
re.search('(?<=STATION ).*?(?=:.*IP Address: %s)' % sta_ip, output, re.DOTALL)
but obviously this will return pretty much the first station name every time.
How would you make a regex that can search for the specified station name? Is this possible?
Edit: I've got it. The key is that the station name and the IP are separated by only one newline, so we can hardcode that newline.
re.search('STATION(?P<StationName>.*?):.*?\n.*?IP Address: %s' % sta_ip, output).group("StationName")
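A minimal sketch of the same idea, under a few assumptions not stated above: the name is always wrapped in double quotes as in the sample, the IP should be matched literally (so it is passed through re.escape, otherwise each "." matches any character), and a trailing (?!\d) keeps 1.1.1.1 from matching 1.1.1.10. The helper name find_station and the sample text are made up for illustration.

```python
import re

def find_station(output, sta_ip):
    # Hypothetical helper: returns the station name for sta_ip, or None.
    pattern = (
        r'STATION\s+"(?P<StationName>[^"\n]*)":'  # quoted name, then colon
        r'[^\n]*\n'                                 # rest of the STATION line
        r'[^\n]*IP Address:\s*%s(?!\d)'             # the IP on the very next line
        % re.escape(sta_ip)
    )
    match = re.search(pattern, output)
    return match.group("StationName") if match else None

# Made-up sample in the shape described in the question.
sample = (
    'xx STATION "STATION_NAME_ONE": yy\n'
    'xx IP Address: 0.0.0.0 yy\n'
    'filler line\n'
    'xx STATION "STATION_NAME_TWO": yy\n'
    'xx IP Address: 1.1.1.1 yy\n'
)
print(find_station(sample, "1.1.1.1"))
```

Using [^\n]* instead of .*? makes the "one newline only" rule explicit, so the pattern cannot drift from one station's header to a later station's IP line even if re.DOTALL is added later.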
STATION\s*"(.*?)":\s*.*?(?:\r|\n)<.*?>\s*IP Address:\s*IPHERE\s*<
Replace IPHERE with the IP address (escape its dots so they match literally). The station name is then in the first capture group.
Try /STATION\s*?"(.*?)"\s*?:.*?IP Address:\s*?%s/
The trick is not to be greedy about matching. After matching this regex, the name you want will be in the first capture.
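Lazy quantifiers limit how far a match extends, not where it starts, so any such pattern is worth checking against the second station in the output and not only the first. One way to sidestep the issue, sketched here under the assumption that the name is double-quoted and the IP sits on the very next line, is to capture every (name, IP) pair and look the IP up afterwards. The names PAIR and stations_by_ip and the sample text are hypothetical.

```python
import re

# Assumed layout: quoted station name, then the IP on the following line.
PAIR = re.compile(
    r'STATION\s*"([^"\n]*)"\s*:[^\n]*\n'      # group 1: station name
    r'[^\n]*IP Address:\s*(\d+(?:\.\d+){3})'  # group 2: dotted-quad IP
)

def stations_by_ip(output):
    # Hypothetical helper: maps each IP address to its station name.
    return {ip: name for name, ip in PAIR.findall(output)}

# Made-up sample in the shape described in the question.
sample = (
    'xx STATION "STATION_NAME_ONE": yy\n'
    'xx IP Address: 0.0.0.0 yy\n'
    'filler line\n'
    'xx STATION "STATION_NAME_TWO": yy\n'
    'xx IP Address: 1.1.1.1 yy\n'
)
lookup = stations_by_ip(sample)
print(lookup.get("1.1.1.1"))
```

Building the dictionary once also avoids recompiling a pattern for every IP when several stations need to be looked up.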