I know how to use DirectoryInfo.GetFiles(), but I do not think it is the fastest way to go. The FileInfo objects it returns seem a bit heavyweight...
Why do I need it? Well, I tried to build my search tool on Windows Desktop Search (WDS), but I gave up: the OleDB connection is horrible, producing strange errors without any explanation. So what I am going to do is rebuild the file index in SQL Server 2008.
Currently there are a few open points to check, mostly regarding maintenance:
- How do I get all the files into the DB?
- How do I keep the DB in sync with the file system?
I will test how many resources the FileSystemWatcher needs later; for now I am looking for the fastest way to get all files from a drive. The full path as a string would be sufficient.
So, assume I give you this:
List<string> allFiles =
How would you fill it really fast? :-) And by the way,
new DirectoryInfo(@"D:\").GetFiles("*", SearchOption.AllDirectories)
is not the best way, I think. Reason 1: possible overhead. Reason 2, more severe: it throws as soon as it hits an inaccessible path (which will almost surely happen somewhere among 1.5 million files).
I fire off a separate thread for each subdirectory and throttle the threads with wait objects. That keeps memory usage manageable, because the file names are streamed straight to a database (or to a file if you prefer) while a handful of threads do the work in parallel. A minimal sketch of that idea is below.
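This is only a sketch, not the exact code I run: it uses a SemaphoreSlim as the wait object that throttles the scans, a CountdownEvent to detect completion, and a BlockingCollection that a single consumer drains to a file (swap in a database writer there if you want). The path C:\garb\Files.txt and the limit of 8 concurrent scans are placeholders; note also that threads blocked on the semaphore still consume stack, so for very deep trees a work queue scales better than a thread per directory.

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

class ThrottledScanner
{
    private static readonly SemaphoreSlim sThrottle = new SemaphoreSlim(8); // at most 8 scans run at once
    private static readonly CountdownEvent sDone = new CountdownEvent(1);
    private static readonly BlockingCollection<string> sOutput = new BlockingCollection<string>();

    static void Main()
    {
        // Single consumer drains file names to a file; replace with a DB writer if you like.
        var writer = new Thread(() =>
        {
            using (var sw = new StreamWriter(@"C:\garb\Files.txt"))
                foreach (var path in sOutput.GetConsumingEnumerable())
                    sw.WriteLine(path);
        });
        writer.Start();

        QueueScan(@"C:\");
        sDone.Signal();   // release the initial count
        sDone.Wait();     // block until every directory has been visited

        sOutput.CompleteAdding();
        writer.Join();
    }

    private static void QueueScan(string dir)
    {
        sDone.AddCount();                     // one outstanding signal per directory
        new Thread(() => Scan(dir)).Start();  // one thread per subdirectory
    }

    private static void Scan(string dir)
    {
        sThrottle.Wait();  // the wait object that throttles the threads
        try
        {
            foreach (var file in Directory.GetFiles(dir))
                sOutput.Add(file);
            foreach (var child in Directory.GetDirectories(dir))
                QueueScan(child);
        }
        catch (UnauthorizedAccessException) { } // skip protected directories
        finally
        {
            sThrottle.Release();
            sDone.Signal();
        }
    }
}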
Take a look at this question; it has several alternatives for recursively getting files in a lazy way, which considerably reduces overhead.
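For illustration, here is a sketch of one such lazy walker: a hand-rolled iterator that only reads a directory when the caller asks for its files, and that skips the access errors that would otherwise abort the whole walk. (In .NET 4, Directory.EnumerateFiles is also lazy, but it still throws on the first protected directory.)

using System.Collections.Generic;
using System.IO;

static class LazyFileWalker
{
    // Walks the tree lazily: nothing is buffered beyond one directory listing,
    // and protected directories are skipped instead of aborting the walk.
    public static IEnumerable<string> EnumerateFilesSafe(string root)
    {
        var pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            var dir = pending.Pop();
            string[] files = null;
            try
            {
                files = Directory.GetFiles(dir);
                foreach (var sub in Directory.GetDirectories(dir))
                    pending.Push(sub);
            }
            catch (UnauthorizedAccessException) { } // skip and carry on
            if (files != null)
                foreach (var file in files)
                    yield return file;
        }
    }
}

Usage is then simply: foreach (var path in LazyFileWalker.EnumerateFilesSafe(@"D:\")) { ... }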
What about this? It uses the ThreadPool and recursion. Sending the output directly to a database took far too long, but once the list is in a file you can find an efficient way to get it into a database if you want (see the SqlBulkCopy sketch after the code).
Output...
56337/379104 (directories/files)
Elapsed seconds: 13.0
Code...
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

namespace FileCacher
{
    class Program
    {
        public static void Main()
        {
            try
            {
                CacheFiles();
            }
            finally
            {
                Console.WriteLine();
                Console.WriteLine("Press any key to exit.");
                Console.ReadKey();
            }
        }

        private static List<string> sFiles = new List<string>();

        private static void AddFiles(params string[] files)
        {
            lock (sFiles)
            {
                sFiles.AddRange(files);
            }
        }

        private static List<string> sDirectories = new List<string>();

        private static void AddDirectories(params string[] dirs)
        {
            lock (sDirectories)
            {
                sDirectories.AddRange(dirs);
            }
        }

        private static void CacheFiles()
        {
            AddDirectories(@"C:\");
            CacheDirectory(@"C:\");

            // Crude completion check: if neither count has changed for a
            // whole second, assume all queued work items have finished.
            var numFiles = 0;
            var numDirs = 0;
            while (true)
            {
                Thread.Sleep(1000);
                var newNumDirs = sDirectories.Count;
                var newNumFiles = sFiles.Count;
                if (newNumDirs == numDirs && newNumFiles == numFiles)
                {
                    Console.WriteLine();
                    break;
                }
                numDirs = newNumDirs;
                numFiles = newNumFiles;
                Console.CursorLeft = 0;
                Console.Write(string.Format("{0}/{1}", numDirs, numFiles));
            }

            using (var fs = new FileStream(@"C:\garb\Dirs.txt", FileMode.Create, FileAccess.Write))
            using (var sw = new StreamWriter(fs)) // dispose the writer so it flushes
            {
                sDirectories.Sort();
                foreach (var dir in sDirectories)
                    sw.WriteLine(dir);
            }

            using (var fs = new FileStream(@"C:\garb\Files.txt", FileMode.Create, FileAccess.Write))
            using (var sw = new StreamWriter(fs))
            {
                sFiles.Sort();
                foreach (var file in sFiles)
                    sw.WriteLine(file);
            }
        }

        private static void CacheDirectory(object dir)
        {
            try
            {
                var dirPath = (string)dir;
                var dirs = Directory.GetDirectories(dirPath);
                AddDirectories(dirs);
                AddFiles(Directory.GetFiles(dirPath));

                // Queue each subdirectory as its own work item so the
                // ThreadPool fans the scan out across the tree.
                foreach (var childDir in dirs)
                    ThreadPool.QueueUserWorkItem(new WaitCallback(CacheDirectory), childDir);
            }
            catch (UnauthorizedAccessException)
            {
                // Skip directories we are not allowed to read.
            }
        }
    }
}
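As a follow-up on getting the list into the database: inserting row by row is what took way too long for me; SqlBulkCopy pushes the file in large batches instead. This is only a sketch, assuming a target table dbo.Files(Path nvarchar(260)); the table name, column, and batch size of 10000 are placeholders you would adjust.

using System.Data;
using System.Data.SqlClient;
using System.IO;

static class FileIndexLoader
{
    // Assumes a table like: CREATE TABLE dbo.Files (Path nvarchar(260) NOT NULL)
    public static void BulkLoad(string filesTxt, string connectionString)
    {
        var batch = new DataTable();
        batch.Columns.Add("Path", typeof(string));

        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.Files";
            foreach (var line in File.ReadLines(filesTxt))
            {
                batch.Rows.Add(line);
                if (batch.Rows.Count == 10000) // flush in chunks so memory stays bounded
                {
                    bulk.WriteToServer(batch);
                    batch.Clear();
                }
            }
            if (batch.Rows.Count > 0)
                bulk.WriteToServer(batch);     // final partial batch
        }
    }
}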