Python: Matching & Stripping port number from socket data

Developer https://www.devze.com 2022-12-23 01:03 Source: Internet
I have data coming in to a Python server via a socket. Within this data is the string '<port>80</port>', or whichever port is being used.

I wish to extract the port number into a variable. The data coming in is not XML; I just used the tag approach to identify data, for future XML use if needed. I do not wish to use a Python XML library, but simply use something like regexp and strings.

What would you recommend is the best way to match and strip this data?

I am currently using this code with no luck:

p = re.compile('<port>\w</port>')
m = p.search(data)
print m

Thank you :)


Regex can't parse XML and shouldn't be used to parse fake XML. You should do one of the following:

  • Use a serialization method that is nicer to work with to start with, such as JSON or an ini file with the ConfigParser module.
  • Really use XML and not something that just sort of looks like XML and really parse it with something like lxml.etree.
  • Just store the number in a file if this is the entirety of your configuration. This solution isn't really easier than just using JSON or something, but it's better than the current one.
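As a rough illustration of the first two options, here is a minimal sketch in Python 3. The message shapes ('{"port": 80}' and a `<config>` root element) and the function names are assumptions made for the example, and the standard library's `xml.etree.ElementTree` stands in for lxml.etree so that nothing extra needs installing.

```python
import json
import xml.etree.ElementTree as ET


def port_from_json(data):
    """Parse a JSON message such as '{"port": 80}' and return the port as an int."""
    message = json.loads(data)
    return int(message["port"])


def port_from_xml(data):
    """Parse a well-formed XML message with a <port> child under the root element."""
    root = ET.fromstring(data)
    # findtext returns None when the root has no <port> child
    text = root.findtext("port")
    if text is None:
        raise ValueError("no <port> element in message")
    return int(text)


# Hypothetical sample messages, not taken from the question
json_port = port_from_json('{"port": 80}')
xml_port = port_from_xml("<config><port>8080</port></config>")
```

Either way, malformed input raises an exception from the parser instead of silently producing a partial match.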

Implementing a bad solution now for future needs that you have no way of defining or accurately predicting is always a bad approach. You will be kept busy enough trying to write and maintain software now that there is no good reason to try to satisfy unknown future needs. I have never seen a case where "I'll put this in for later" has led to less headache later on, especially when I put it in by doing something completely wrong. YAGNI!

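As to what's wrong with your snippet, other than using an entirely wrong approach: angle brackets are literal in Python's re module (outside group syntax such as (?P<name>...)), so they are not the problem. \w matches exactly one word character, so the pattern cannot match a multi-digit port like 80, and print m shows the match object rather than the port number.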
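A minimal sketch for checking how the question's pattern behaves on sample input. The `data` string is a made-up sample, Python 3 syntax is assumed, and the corrected pattern is one possible fix rather than the only one.

```python
import re

# Hypothetical sample of what might arrive on the socket
data = "header <port>80</port> trailer"

# The pattern from the question: \w consumes exactly one character,
# so a two-digit port leaves a stray '0' before '</port>'
original = re.compile(r"<port>\w</port>")
original_match = original.search(data)

# One or more digits, captured in a group so the number can be pulled out
fixed = re.compile(r"<port>(\d+)</port>")
fixed_match = fixed.search(data)
port = int(fixed_match.group(1)) if fixed_match else None
```

With this sample, `original_match` is `None` while `port` holds the integer 80.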


Though Mike Graham is correct that using regex for XML is not 'recommended', the following will work:

(I have defined searchType as 'd' for numerals)
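import re

searchStr = 'port'
searchType = 'd'  # 'd' for numerals; anything else matches any text

# Raw strings keep the backslash in \d from being treated as a string escape
if searchType == 'd':
    retPattern = r'(<%s>)(\d+)(</%s>)'
else:
    retPattern = r'(<%s>)(.+?)(</%s>)'

searchPattern = re.compile(retPattern % (searchStr, searchStr))
# Search the incoming socket data (the data variable from the question), not the tag name
found = searchPattern.search(data)
retVal = found.group(2)  # the text between the tags, e.g. '80'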

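(Note the complete lack of error checking: search returns None when nothing matches, so found.group(2) would raise an AttributeError. Handling that is left as an exercise for the user.)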
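For what it's worth, the missing error checking could be sketched roughly as follows, in Python 3. The function name `extract_tag` and its parameters are invented for this example and do not come from the answer above.

```python
import re


def extract_tag(data, tag, numeric=True):
    """Return the text inside the first <tag>...</tag> in data, or None if absent."""
    # Digits only for numeric fields, otherwise a non-greedy match of any text
    inner = r"\d+" if numeric else r".+?"
    # re.escape guards against tag names containing regex metacharacters
    escaped = re.escape(tag)
    pattern = re.compile(r"<%s>(%s)</%s>" % (escaped, inner, escaped))
    found = pattern.search(data)
    if found is None:
        return None
    return found.group(1)


# Hypothetical sample input
port_text = extract_tag("header <port>80</port> trailer", "port")
missing = extract_tag("no tags here", "port")
```

Returning `None` lets the caller decide whether a missing port is an error; raising an exception instead would be an equally reasonable design.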
