I'm writing a Windows service in Ruby using the win32-utils gems. The service currently works, but a large part of its function requires it to know when a file has been modified. I'm currently doing this with a large hash containing data about each file. That works great for relatively small directories, but on a folder containing ~50,000 files it eats a lot of memory and takes a long time to check for updates.
The code looks like this:
First run (setting up the hash):
require 'find'
require 'digest/md5'

Find.find(@local_base) do |path|
  # Don't keep any directories in the hash
  if not FileTest.directory?(path)
    # Block form of File.open closes the handle once the file has been read
    File.open(path) do |f|
      @files[path.gsub(@local_base, "")] = DataFile.new(@local_base,
                                                        path.gsub(@local_base, ""),
                                                        Digest::MD5.hexdigest(f.read.gsub("\n", "\r\n")),
                                                        f.mtime.to_i,
                                                        @last_checked)
    end
  end
end
Subsequent runs (checking for updates):
def check_for_updates
  # can't/shouldn't modify a hash while iterating, so set up temp storage
  tempHash = Hash.new
  Find.find(@local_base) do |path|
    # Ignore directories
    if not FileTest.directory?(path)
      File.open(path) do |f|
        #...and the file is already in the hash...
        if not @files[path.gsub(@local_base, "")].nil?
          # If it's been modified since the last scan...
          if f.mtime.to_i > @last_checked
            #...and the contents are modified...
            if @files[path.gsub(@local_base, "")].modified?
              #...update the hash with the new mtime and checksum
              @files[path.gsub(@local_base, "")].update
            end
          end # mtime check
        else
          # If it's a new file stick it in the temporary hash.
          # Key it by the relative path, the same key @files is looked up with,
          # so the file is not treated as new again on the next scan.
          tempHash[path.gsub(@local_base, "")] = DataFile.new(@local_base,
                                                              path.gsub(@local_base, ""),
                                                              Digest::MD5.hexdigest(f.read.gsub("\n", "\r\n")),
                                                              f.mtime.to_i,
                                                              @last_scan)
        end # nil check
      end # File.open block
    end # directory check
  end # Find.find block
  # If any new files are in the tempHash, add them to @files
  if not tempHash.empty?
    tempHash.each do |k, v|
      @files[k] = v
    end
  end
  # clear tempHash and update registry
  tempHash = nil
  update_last_checked
end
Is there a faster or more efficient way to notify my program of modified files? Even better if I can do it without recursively searching the whole directory.
You could leave it to Windows to notify you: NTFS records file changes in its change journal, so your service would not have to scan the directory itself. There is a gem which "listens" to that service.
Check out rstakeout.rb. It will recursively watch directories, but it looks like it checks for the file modification criteria differently. I'm unsure of the speed on large file sets, but maybe it will give you some ideas.
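For what it's worth, here is a rough sketch of the mtime-only polling idea, which as far as I can tell is roughly how rstakeout.rb decides a file has changed. It still walks the tree, but it only calls File.stat, so no file is opened or checksummed unless its mtime actually differs. The helper names snapshot_mtimes and diff_snapshots are made up for this example and are not part of any gem.

```ruby
require 'find'

# Hypothetical helper: map every non-directory path under base to its mtime
# (as an Integer). Only File.stat is used, so no file is opened or read.
def snapshot_mtimes(base)
  snapshot = {}
  Find.find(base) do |path|
    begin
      stat = File.stat(path)
    rescue Errno::ENOENT
      next # the file disappeared between listing and stat
    end
    next if stat.directory?
    snapshot[path] = stat.mtime.to_i
  end
  snapshot
end

# Hypothetical helper: compare two snapshots and report which paths were
# added, removed, or have a different mtime.
def diff_snapshots(old_snap, new_snap)
  {
    added:   new_snap.keys - old_snap.keys,
    removed: old_snap.keys - new_snap.keys,
    changed: new_snap.keys.select { |p| old_snap.key?(p) && old_snap[p] != new_snap[p] }
  }
end
```

You would take one snapshot at startup, take another on each scan, and compute the MD5 only for the paths listed under :added and :changed. Storing a single integer per path should also use far less memory than one object with a checksum per file.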