Monitoring a directory for new file creation without FileSystemWatcher

Developer https://www.devze.com 2023-01-14 20:02 Source: Internet
I have to create a Windows service that monitors a specified folder for new files, processes them, and moves them to another location.

I started with FileSystemWatcher. My boss doesn't like FileSystemWatcher and wants me to poll using a Timer or some other mechanism instead.

How can you monitor a directory without using FileSystemWatcher in the .NET Framework?


Actually, in my experience over the years, the FileSystemWatcher component is not 100% "stable". Push enough files into a folder and you will lose some events. This is especially true if you monitor a file share, even if you increase the buffer size.

So, for all practical purposes, the best solution is to use FileSystemWatcher together with a Timer that periodically scans the folder for changes.

Examples of Timer code are easy to find online. Keep track of the DateTime at which the timer last ran, then check the modified date of each file and compare it to that time. The logic is fairly simple.

The timer interval depends on how urgent the changes are for your system, but checking every minute should be fine for many scenarios.
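
The timestamp comparison described above could look roughly like this. It is only a minimal sketch: `TimestampPoller`, `FindChangedSince`, and the choice to work in UTC are assumptions made for illustration, not part of any existing API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TimestampPoller
{
    // Returns the files directly inside 'folder' whose last-write time
    // is later than 'lastRunUtc' (the time of the previous timer run).
    public static List<string> FindChangedSince(string folder, DateTime lastRunUtc)
    {
        return Directory.GetFiles(folder)
            .Where(f => File.GetLastWriteTimeUtc(f) > lastRunUtc)
            .ToList();
    }
}
```

In the timer handler you would record `DateTime.UtcNow` before scanning, call `FindChangedSince(folder, lastRunUtc)`, and then store the recorded time as the new `lastRunUtc`. Be aware that a file copied into the folder usually keeps its original modified date, so comparing `File.GetCreationTimeUtc` instead may suit "new arrival" detection better.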


Using @Petoj's answer, I've included a full Windows service that polls every five minutes for new files. It's constrained so that only one thread polls, it accounts for processing time, and it supports pause and timely stopping. It also makes it easy to attach a debugger when the service starts.

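 using System;
 using System.Collections.Generic;
 using System.Diagnostics;
 using System.ServiceProcess;

 public partial class Service : ServiceBase{

    // Paths already seen; a HashSet keeps lookups cheap as the set grows.
    HashSet<string> fileList = new HashSet<string>();

    // Time at which the current poll started.
    DateTime LastChecked;

    System.Timers.Timer timer;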


    public Service()
    {
        timer = new System.Timers.Timer();
        //When autoreset is True there are reentrancy problems.
        timer.AutoReset = false;

        timer.Elapsed += new System.Timers.ElapsedEventHandler(DoStuff);
    }


    private void DoStuff(object sender, System.Timers.ElapsedEventArgs e)
    {
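       LastChecked = DateTime.Now;

       // "c:\\" is a placeholder; point this at the folder you actually want to watch.
       // Scanning a whole drive recursively is slow and can throw UnauthorizedAccessException.
       string[] files = System.IO.Directory.GetFiles("c:\\", "*", System.IO.SearchOption.AllDirectories);

       foreach (string file in files)
       {
           if (!fileList.Contains(file))
           {
               fileList.Add(file);

               // Placeholder for your own work, e.g. process the file and move it.
               do_some_processing();
           }
       }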


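       // Wait for whatever is left of the five minutes, allowing for processing time.
       TimeSpan ts = DateTime.Now.Subtract(LastChecked);
       TimeSpan MaxWaitTime = TimeSpan.FromMinutes(5);
       TimeSpan remaining = MaxWaitTime.Subtract(ts);

       // Interval must be greater than zero, so fall back to 1 ms when no time is left.
       if (remaining > TimeSpan.Zero)
           timer.Interval = remaining.TotalMilliseconds;
       else
           timer.Interval = 1;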

       timer.Start();
    }

    protected override void OnPause()
    {
        base.OnPause();
        this.timer.Stop();
    }

    protected override void OnContinue()
    {
        base.OnContinue();
        this.timer.Interval = 1;
        this.timer.Start();
    }

    protected override void OnStop()
    {
        base.OnStop();
        this.timer.Stop();
    }

    protected override void OnStart(string[] args)
    {
        foreach (string arg in args)
        {
            if (arg == "DEBUG_SERVICE")
                DebugMode();
        }

        #if DEBUG
            DebugMode();
        #endif

        timer.Interval = 1;
        timer.Start();
    }

    private static void DebugMode()
    {
        Debugger.Break();
    }

 }


At program startup, use Directory.GetFiles(path) to get the list of files.

Then create a timer, and in its elapsed event call hasNewFiles:

    static List<string> hasNewFiles(string path, List<string> lastKnownFiles)
    {
        List<string> files = Directory.GetFiles(path).ToList();
        List<string> newFiles = new List<string>();

        foreach (string s in files)
        {
            if (!lastKnownFiles.Contains(s))
                newFiles.Add(s);
        }

        return newFiles;
    }

In the calling code, you'll have new files if:

    List<string> newFiles = hasNewFiles(path, lastKnownFiles);
    if (newFiles.Count > 0)
    {
        processFiles(newFiles);
        // Add to the known list rather than replacing it, or older files will look new next time.
        lastKnownFiles.AddRange(newFiles);
    }

Edit: if you want a more LINQ-style solution:

    static IEnumerable<string> hasNewFiles(string path, List<string> lastKnownFiles)
    {
        return from f in Directory.GetFiles(path) 
               where !lastKnownFiles.Contains(f) 
               select f;
    }

    // ToList() runs the query now, before lastKnownFiles is modified below.
    List<string> newFiles = hasNewFiles(path, lastKnownFiles).ToList();
    if (newFiles.Count > 0)
    {
        processFiles(newFiles);
        lastKnownFiles.AddRange(newFiles);
    }


You could use Directory.GetFiles():

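using System.Collections.Generic;
using System.IO;

// Keep this list alive between polls so that known files are not reported again.
var fileList = new List<string>();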

foreach (var file in Directory.GetFiles(@"c:\", "*", SearchOption.AllDirectories))
{
    if (!fileList.Contains(file))
    {
        fileList.Add(file);
        //do something
    }
}

Note that this only checks for new files, not changed files. If you need that, use FileInfo.
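
If changed files matter too, one possible approach is to remember each file's last-write time between polls. The sketch below assumes that a changed timestamp is a good enough signal; `ChangeTracker` and `Scan` are hypothetical names, and deletions are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class ChangeTracker
{
    // Last-write time seen for each path on the previous scan.
    private readonly Dictionary<string, DateTime> _seen =
        new Dictionary<string, DateTime>(StringComparer.OrdinalIgnoreCase);

    // Returns paths that are new, or whose last-write time differs
    // from the value recorded on the previous scan.
    public List<string> Scan(string folder)
    {
        var changed = new List<string>();
        foreach (FileInfo info in new DirectoryInfo(folder).GetFiles())
        {
            DateTime previous;
            bool known = _seen.TryGetValue(info.FullName, out previous);
            if (!known || previous != info.LastWriteTimeUtc)
            {
                changed.Add(info.FullName);
                _seen[info.FullName] = info.LastWriteTimeUtc;
            }
        }
        return changed;
    }
}
```

Create one `ChangeTracker` when the service starts and call `Scan` from each poll; the first call reports every file, later calls report only additions and modifications.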


I would question why you shouldn't use FileSystemWatcher. It registers with the OS and is notified immediately when the event finishes in the file system.

If you really have to poll, then just create a System.Timers.Timer, create a method for it to call, and check for the file in this method.


Yes, you can create a Timer, and plug a handler into the Elapsed event that will instantiate a DirectoryInfo class for the directory you're watching, and call either GetFiles() or EnumerateFiles(). GetFiles() returns a FileInfo[] array, while EnumerateFiles() returns a "streaming" IEnumerable. EnumerateFiles() will be more efficient if you expect a lot of files to be in that folder when you look; you can start working with the IEnumerable before the method has retrieved all the FileInfos, while GetFiles will make you wait.
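
As a rough illustration of the streaming approach, the sketch below walks `DirectoryInfo.EnumerateFiles()` (available from .NET Framework 4) and hands each `FileInfo` to a callback as it arrives. `FolderScanner`, `ProcessAll`, and the `handle` callback are made-up names for this example.

```csharp
using System;
using System.IO;

public static class FolderScanner
{
    // Passes each file to 'handle' as the directory listing is streamed in,
    // instead of waiting for the complete array that GetFiles() would build.
    // Returns the number of files handled.
    public static int ProcessAll(string folder, Action<FileInfo> handle)
    {
        int count = 0;
        foreach (FileInfo info in new DirectoryInfo(folder).EnumerateFiles())
        {
            handle(info);
            count++;
        }
        return count;
    }
}
```

From the Elapsed handler you might call something like `FolderScanner.ProcessAll(folder, info => Console.WriteLine(info.Name))`, replacing the lambda with your real processing.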

As to why this may actually be better than FileSystemWatcher, it depends on the architecture behind the scenes. Take, for example, a basic Extract/Transform/Validate/Load workflow. First, such a workflow may have to create expensive instances of objects (DB connections, instances of a rules engine, etc.). This one-time overhead is significantly mitigated if the workflow is structured to handle everything available to it in one go. Second, FileSystemWatcher would require anything called by the event handlers, like this workflow, to be thread-safe, since many events can be running at once if files are constantly flowing in. If that is not feasible, a Timer can very easily be configured to restrict the system to one running workflow: the event handlers examine a thread-safe "process running" flag and simply return if another handler thread has set it and not yet finished. The files in the folder at that time will be picked up the next time the Timer fires. With FileSystemWatcher, by contrast, if you terminate the handler, the information about the existence of that file is lost.
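
The "process running" flag could be sketched with `Interlocked` as shown below. This is one possible way to do it, not the only one; `SingleRunGuard` and `TryRun` are hypothetical names.

```csharp
using System;
using System.Threading;

public class SingleRunGuard
{
    private int _running; // 0 = idle, 1 = a workflow is in progress

    // Runs 'work' unless another call is already in progress.
    // Returns true if the work ran, false if this call was skipped.
    public bool TryRun(Action work)
    {
        // Atomically set the flag to 1 only if it is currently 0.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
            return false; // another handler is still working; skip this tick

        try
        {
            work();
            return true;
        }
        finally
        {
            // Clear the flag even if the work throws.
            Interlocked.Exchange(ref _running, 0);
        }
    }
}
```

A timer handler could then be wired up along the lines of `timer.Elapsed += (s, e) => guard.TryRun(ProcessFolder);`, where `ProcessFolder` stands in for your own workflow. Ticks that arrive while a run is in progress simply return, and the files are picked up on the next tick.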


1) Sounds like your boss is an idiot
2) You will have to use functions like Directory.GetFiles, File.GetLastAccessTime, etc., and keep the results in memory to check whether anything changed.


It is a little odd that you cannot use FileSystemWatcher or presumably any of the Win32 APIs that do the same thing, but that is irrelevant at this point. The polling method might look like this.

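using System;
using System.Threading;

public class WorseFileSystemWatcher : IDisposable
{
  private ManualResetEvent m_Stop = new ManualResetEvent(false);

  public event EventHandler Change;

  public WorseFileSystemWatcher(TimeSpan pollingInterval)
  {
    var thread = new Thread(
      () =>
      {
        // WaitOne returns false on timeout, so the body runs once per polling
        // interval and the loop exits as soon as Dispose signals m_Stop.
        while (!m_Stop.WaitOne(pollingInterval))
        {
          // Add your code to check for changes here.
          bool changeDetected = false; // placeholder: set this from your own check
          if (changeDetected)
          {
            if (Change != null)
            {
              Change(this, new EventArgs());
            }
          }
        }
      });
    thread.Start();
  }

  public void Dispose()
  {
    // Signal the polling thread to stop.
    m_Stop.Set();
  }
}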