How to get around batch file processing limit

Developer https://www.devze.com 2022-12-22 10:49 Source: Internet
I have a Windows batch file that processes all the files in a given directory. I have 206,783 files I need to process:

for %%f in (*.xml) do call :PROCESS %%f
goto :STOP

:PROCESS
:: do something with the file
program.exe %1 > %1.new
set /a COUNTER=%COUNTER%+1
goto :EOF

:STOP
@echo %COUNTER% files processed

When I run the batch file, the following output is written:

65535 files processed

As part of the processing, an output file with a .new extension is created for each file processed. When I run dir *.new, it reports that 65,535 files exist.

So, it appears my command environment has a hard limit on the number of files it can recognize, and that limit is 64K - 1.

  1. Is there a way to extend the command environment to manage more than 64K - 1 files?
  2. If not, would a VBScript or JavaScript be able to process all 206,783 files?

I'm running Windows Server 2003, Enterprise Edition, 32-bit.


UPDATE

It looks like the root cause of my issue was with the built-in Windows "extract" command for ZIP files.

The files I have to process were copied from another system in a ZIP file. My server doesn't have a ZIP utility installed, just the native Windows commands. I right-clicked the ZIP file and chose "Extract all...", which apparently extracted only the first 65,535 files.

I downloaded and installed 7-Zip on my server, unzipped all the files, and my batch script worked as intended.
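
For anyone who hits the same thing, a quick sanity check is to compare the entries in the archive against what actually landed on disk. Below is a minimal sketch in Python (not something used in this thread; the function name and any paths you pass in are placeholders), assuming the archive was extracted with its internal folder structure preserved:

```python
import zipfile
from pathlib import Path


def missing_after_extract(zip_path, extract_dir):
    """Return archive member names that have no matching file under extract_dir."""
    root = Path(extract_dir)
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries; only regular files are expected on disk.
        members = [info.filename for info in archive.infolist() if not info.is_dir()]
    # Keep the archive order so the first missing entry is easy to spot.
    return [name for name in members if not (root / name).is_file()]
```

Calling `missing_after_extract("files.zip", "extracted")` with your own paths returns an empty list when everything was extracted. A non-empty list means the extraction stopped short, which is what seems to have happened here.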


Another option might be to iterate over the output of dir instead of directly over the files. I usually hate it when people do this, but apparently there are limitations to standard iterating idioms.

for /f "delims=" %%f in ('dir /b *.xml') do call :PROCESS %%f 

I'm currently trying this out, but it might take a while; just filled a directory with 100k files.

But keep in mind that using the output of a command has problems with Unicode if your console uses raster fonts, so make sure the console window is set to Lucida Console or another TrueType font. Otherwise Unicode characters get converted to question marks or to their closest equivalent in the current codepage, and the program then won't find the file.

ETA: Apparently this can't be the issue. Both your code and my test code, which iterates over the dir output, process 300k files on Windows Server 2003 R2 (32-bit) as well as on Windows 7.


  1. If program.exe is your own (in-house) program, you can refactor it to take arguments so that you can do away with the for loop.
  2. You can store your output files in different directories instead of creating them all in the same directory.
  3. You can group your outputs into categories, so you have fewer output files to deal with.
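
To illustrate the second suggestion, here is a rough Python sketch (Python is not used in this thread) that spreads output files across numbered subdirectories. The function name, the `per_dir` value of 10,000 and the four-digit folder names are arbitrary assumptions for illustration; only the `.new` suffix comes from the question:

```python
from pathlib import Path


def sharded_output_path(out_root, source_name, index, per_dir=10000):
    """Build an output path that puts at most per_dir files in each subdirectory."""
    # index is the running file counter: files 0..9999 go to "0000", the next 10000 to "0001".
    bucket = Path(out_root) / f"{index // per_dir:04d}"
    bucket.mkdir(parents=True, exist_ok=True)
    # Mirror the question's naming: original file name plus ".new".
    return bucket / (Path(source_name).name + ".new")
```

With the defaults, the file at index 0 would land in `0000` and the file at index 10000 in `0001`, so no single directory ever holds more than 10,000 outputs.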


Two options:

1) I suggest adding a "move" after the .exe processing, so that your batch file can be relaunched and will process only the files still in the original directory. This is a good idea regardless of the actual size limit, because you don't risk having to reprocess everything if your batch is interrupted, the power goes off, etc.

2) Use another scripting language, like a Windows Perl interpreter, or maybe WSH.
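
Both options can be combined. The following is a minimal sketch in Python (my choice of language, not one named above), where `process_all`, `done_dir` and the `process_one` callback are hypothetical names and `process_one` stands in for whatever program.exe does:

```python
import shutil
from pathlib import Path


def process_all(src_dir, done_dir, process_one):
    """Process every .xml file in src_dir, moving each into done_dir once it succeeds.

    process_one(source, target) is a caller-supplied stand-in for the
    program.exe step; it should write its result to target.
    """
    src = Path(src_dir)
    done = Path(done_dir)
    done.mkdir(parents=True, exist_ok=True)
    count = 0
    # Take a full listing up front so moving files does not disturb the iteration.
    for source in sorted(src.glob("*.xml")):
        target = source.with_name(source.name + ".new")
        process_one(source, target)
        # Move only after success: a rerun then sees just the unprocessed files.
        shutil.move(str(source), str(done / source.name))
        count += 1
    return count
```

In real use, `process_one` could open the target file and call `subprocess.run(["program.exe", str(source)], stdout=fh, check=True)`, assuming program.exe writes its result to standard output as in the question. If the run is interrupted, launching it again picks up where it left off, because finished inputs are no longer in the source directory.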
