I'm writing a Python script that uses this awkward glob syntax:
import glob
F = glob.glob('./www.dmoz.org/Science/Environment/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/*/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/*/*/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/*/*/*/index.html')
Seems like there ought to be a way to wrap this in one line:
F = glob.glob('./www.dmoz.org/Science/Environment/[super_wildcard]/index.html')
But I don't know what the appropriate super wildcard would be. Does such a thing exist?
Sorry, it does not. You will probably have to write a few lines of code using os.walk:
import os

for root, dirs, files in os.walk('/starting/path/'):
    for myFile in files:
        # Print the full path of every index.html found at any depth
        if myFile == "index.html":
            print(os.path.join(root, myFile))
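To end up with the same list F that the question builds, the walk above can be wrapped in a small helper. This is only a minimal sketch: the function name find_named_files and its parameters are made up for illustration.

```python
import os

def find_named_files(start, name="index.html"):
    """Return the paths of every file called `name` under `start`, at any depth."""
    matches = []
    for root, dirs, files in os.walk(start):
        # `files` holds the plain file names found directly inside `root`
        if name in files:
            matches.append(os.path.join(root, name))
    return matches

# An empty list is returned if the starting directory does not exist
F = find_named_files('./www.dmoz.org/Science/Environment')
```

Unlike the stacked glob calls, this is not limited to a fixed number of directory levels.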
glob CAN do this now: the recursive argument was added in Python 3.5.
For example,
F = glob.glob('./www.dmoz.org/Science/Environment/**/index.html', recursive=True)
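With recursive=True, '**' matches zero or more directory levels, so the single pattern also covers the top-level index.html from the first call in the question. Here is a small self-contained check of that behaviour. It uses a throwaway temporary directory in place of the real tree, and the directory names (Water, Rivers) and the function name are made up.

```python
import glob
import os
import tempfile

def demo_recursive_glob():
    with tempfile.TemporaryDirectory() as base:
        # Create index.html at depth 0, 1 and 2 (made-up directory names)
        for sub in ("", "Water", os.path.join("Water", "Rivers")):
            os.makedirs(os.path.join(base, sub), exist_ok=True)
            open(os.path.join(base, sub, "index.html"), "w").close()
        pattern = os.path.join(base, "**", "index.html")
        found = glob.glob(pattern, recursive=True)
        # Report paths relative to base so the result is easy to read
        return sorted(os.path.relpath(p, base) for p in found)

print(demo_recursive_glob())
```

All three files should be listed, including the one that sits directly in the starting directory.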
I have just released Formic which implements exactly the wildcard you need - '**' - in an implementation of Apache Ant's FileSet and Globs.
The search can be implemented:
import formic
fileset = formic.FileSet(include="/www.dmoz.org/Science/Environment/**/index.html")
for file_name in fileset.qualified_files():
    # Do something with file_name
    print(file_name)
This will search from the current directory. I hope this helps.
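It's not perfect, but it works for me:
import glob
import os

def find_index(max_depth):  # find_index is a hypothetical name for the enclosing function
    # Try 0 wildcard levels first, then 1, 2, ... up to max_depth - 1
    for i in range(max_depth):
        components = ['./www.dmoz.org/Science/Environment'] + ['*'] * i + ['index.html']
        fsearch = os.path.join(*components)
        fs_res = glob.glob(fsearch)
        # Stop at the first depth that yields exactly one match
        if len(fs_res) == 1:
            return fs_res[0]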
10 years later ... pathlib solution
from pathlib import Path
F = Path("./www.dmoz.org/Science/Environment/").glob('**/*index.html')
Where [super_wildcard] = **.
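Two things to keep in mind with the pathlib version: Path.glob returns a lazy generator rather than a list, and the '*index.html' pattern above also matches names such as old_index.html. Here is a sketch that returns a real list and matches the exact file name. The helper name index_files is made up for illustration.

```python
from pathlib import Path

def index_files(base):
    """Return every file named exactly index.html under base, at any depth."""
    # rglob(pattern) is shorthand for glob("**/" + pattern)
    return sorted(p for p in Path(base).rglob("index.html") if p.is_file())

F = index_files("./www.dmoz.org/Science/Environment/")
```

Each element of F is a Path object; use str(p) if a plain string path is needed.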