This is my pseudocode, but I don't see that regex has this functionality, at least not in the way I am thinking of it:
#!/usr/bin/env python
import sys
import os
import re

def main():
    wantedchars = re.match([ANY CHAR THAT APPEARS LESS THAN 8 TIMES], <text will be pasted here>)
    print wantedchars

if __name__=='__main__':
    main()
I would like to match any ASCII character, not just alphanumerics and less meaningful symbols. For example, I want to match brackets and backslashes as well if they appear fewer than 8 times; the only characters I don't care to match or return are whitespace.

The reason for this whole thing, and why I am not trying to pass the text as an argument, is that it is a one-time thing that I will expand on later as part of a learning process I am trying to organize. I mainly would like to know if I am going about this in the right way. The other option that came to mind is to iterate over each character in the text, incrementing a counter for each unique character, and then maybe print the counters with the lowest values.

If the 8 occurrences aren't contiguous, here is a way to do it using Counter (Python 2.7+):
>>> from collections import Counter
# You can get the letters as a list...
>>> [k for k,v in Counter("<text xx will xx be xx pasted xx here>").items() if v<8]
['a', 'b', 'e', 'd', 'i', 'h', 'l', 'p', 's', 'r', 't', 'w', '<', '>']
# ...or a string
>>> "".join(k for k,v in Counter("<text xx will xx be xx pasted xx here>").items() if v<8)
'abedihlpsrtw<>'
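Note that the space is missing from the output above only because it happens to occur exactly 8 times in the sample string. Since the question asks to ignore whitespace regardless of how often it appears, a minimal sketch with an explicit filter might look like the following. The function name `rare_chars` and its `limit` parameter are made up for illustration and are not part of any library:

```python
from collections import Counter

def rare_chars(text, limit=8):
    """Return the non-whitespace characters that occur fewer than `limit` times."""
    # Skip whitespace up front so it can never show up in the result.
    counts = Counter(c for c in text if not c.isspace())
    # Sort so the result does not depend on dictionary ordering.
    return sorted(c for c, n in counts.items() if n < limit)

sample = "<text xx will xx be xx pasted xx here>"
print("".join(rare_chars(sample)))  # prints <>abdehilprstw
```

Sorting is optional; it is only there to make the output reproducible across Python versions.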
There's a recipe for doing counters in older versions of Python too. Here's one for Python 2.5/2.6:
>>> from collections import defaultdict
>>> counter = defaultdict(int)
>>> for c in "<text xx will xx be xx pasted xx here>":
...     counter[c] += 1
...
>>> "".join(k for k,v in counter.items() if v<8)
'abedihlpsrtw<>'
Here's one for Python 2.4:
>>> counter={}
>>> for c in "<text xx will xx be xx pasted xx here>":
...     counter[c] = counter.get(c, 0) + 1
...
>>> "".join(k for k,v in counter.items() if v<8)
'abedihlpsrtw<>'