I have this as my input
content = 'abc.zip'\n
I want to extract abc from it. How do I do that using a regex in Python?
Edit:
No, this is not a homework question. I am trying to automate something and I am stuck at one point: I want the automation to be generic, so that it works with any zip file I have.
os.system('python unzip.py -z data/ABC.zip -o data/')
After I take in the zip file, I unzip it. I plan to make this generic by getting the file name from the directory the zip file was put in, and then passing that file name to the command above to unzip it.
As I implied in my comment, regular expressions are unlikely to be the best tool for the job (unless there is some artificial restriction on the problem, or it is far more complex than your example). The standard string and/or path libraries provide functions which should do what you are after. To better illustrate how these work, I'll use the following definition of content instead:
>>> content = 'abc.def.zip'
If it's a file, and you want the name and extension:
>>> import os.path
>>> filename, extension = os.path.splitext(content)
>>> print(filename)
abc.def
>>> print(extension)
.zip
If it is a string, and you want to remove the substring 'abc':
>>> noabc = content.replace('abc', '')
>>> print(noabc)
.def.zip
If you want to break it up on each occurrence of a period:
>>> broken = content.split('.')
>>> print(broken)
['abc', 'def', 'zip']
If it has multiple periods, and you want to break it on the first or last one:
>>> broken = content.split('.', 1)
>>> print(broken)
['abc', 'def.zip']
>>> broken = content.rsplit('.', 1)
>>> print(broken)
['abc.def', 'zip']
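For the path in the question (data/ABC.zip), the pieces above can be combined. This is only a sketch: zip_stem is a hypothetical helper name, and it assumes you want the file name without its directory and without its final extension.

```python
import os.path

def zip_stem(path):
    """Return the file name without its directory or final extension.

    zip_stem is a hypothetical name used for illustration only.
    """
    name = os.path.basename(path)               # 'data/ABC.zip' -> 'ABC.zip'
    stem, extension = os.path.splitext(name)    # 'ABC.zip' -> ('ABC', '.zip')
    return stem

print(zip_stem('data/ABC.zip'))   # ABC
print(zip_stem('abc.def.zip'))    # abc.def
```

Only the last extension is removed, so a name with several periods keeps everything before the final one.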
Edit: Changed the regexp to match for "content = 'abc.zip\n'" instead of the string "abc.zip".
import re
# Matching for "content = 'abc.zip\n'" (re.search skips the leading "content = ")
matches = re.search(r"'(?P<filename>.*)\.zip\n'$", "content = 'abc.zip\n'")
matches = matches.groupdict()
print(matches)
# Matching for "abc.zip" (the dot is escaped so it matches only a literal period)
matches = re.match(r"(?P<filename>.*)\.zip$", "abc.zip")
matches = matches.groupdict()
print(matches)
Output:
{'filename': 'abc'}
{'filename': 'abc'}
This prints everything matched before .zip. You can access the result like a regular dictionary.
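As a rough sketch of that dictionary-style access, the named group can also be read directly from the match object. The names ZIP_NAME and filename_from are made up for this example, and the dot is escaped here so that it matches only a literal period. re.match returns None when the text does not match, so that case is checked before reading the group.

```python
import re

# ZIP_NAME and filename_from are hypothetical names for this sketch.
# \. matches a literal period; an unescaped . would match any character.
ZIP_NAME = re.compile(r"(?P<filename>.*)\.zip$")

def filename_from(text):
    match = ZIP_NAME.match(text)
    if match is None:                  # no match: text does not end in .zip
        return None
    # Same value as match.groupdict()['filename']
    return match.group('filename')

print(filename_from('abc.zip'))   # abc
print(filename_from('abc.txt'))   # None
```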
If you're trying to break up parts of a path, you may find the os.path module to be useful. It has nice abstractions with clear semantics that are easy to use.
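A minimal sketch of how os.path could cover the use case in the question. The helper names describe and unzip_command are hypothetical; unzip.py and its -z and -o flags are taken from the command in the question, not from any standard tool. The command is built as an argument list, which you could pass to subprocess.call instead of os.system.

```python
import os.path

def describe(path):
    """Split a path into (directory, stem, extension). Hypothetical helper."""
    directory, name = os.path.split(path)       # 'data/ABC.zip' -> ('data', 'ABC.zip')
    stem, extension = os.path.splitext(name)    # 'ABC.zip' -> ('ABC', '.zip')
    return directory, stem, extension

def unzip_command(path):
    """Build the argument list for the asker's own unzip.py script."""
    directory, stem, extension = describe(path)
    # -z and -o are the flags shown in the question; output goes to the
    # directory that holds the zip file (or '.' if the path has none).
    return ['python', 'unzip.py', '-z', path, '-o', directory or '.']

print(describe('data/ABC.zip'))
print(unzip_command('data/ABC.zip'))
```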