Let's say that I have a batch file that reads arbitrary integers from a file. The file is structured such that each line contains one integer, like so:
24
17
43
103
...
I need to calculate the average of the top 20 numbers in the file. To do that, I need some sort of data structure that stores the top 20 numbers. However, as far as I know, there are no arrays in batch files. I may have to resort to temporary files or some other method that I am not aware of. So my ultimate goal is to determine the best approach for implementing some sort of sorting algorithm in the batch file and calculating the average of the top 20 integers.
There is one constraint I need to place on the problem. The file is fairly large (around 500 lines), so I would rather not use temporary files because of the number of read/write operations involved (unless you can convince me otherwise, of course).
You can mimic arrays in batch. Take a look at Using Arrays in Batch Files.
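As a rough illustration of the idea (not taken from the linked article), one common convention is to build the index into the variable name and read the elements back with delayed expansion. The names arr, len and last are made up for this sketch:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical names: arr holds the elements, len the element count.
set len=0
for %%V in (24 17 43 103) do (
    set "arr[!len!]=%%V"
    set /a len+=1
)
rem Walk the elements by index. %%I is substituted first,
rem then delayed expansion resolves the element itself.
set /a last=len-1
for /L %%I in (0,1,%last%) do echo arr[%%I] = !arr[%%I]!
endlocal
```

The brackets have no special meaning to cmd; arr[0], arr[1] and so on are just ordinary variables whose names happen to look like array elements.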
The following solution uses arrays as described in the article linked in @Stoney's answer. It also uses zero-padding for correct sorting, which is a nice idea by @jeb, although this solution doesn't use the sort command. Instead, the sorting is done automatically by the SET command, whose output is used for iterating over the 'array'.
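@ECHO OFF
REM Number of top values to average (use 20 for the case in the question)
SET top=5
SET cnt=0
REM Count every number; the variable name carries the offset value
FOR /F %%N IN (datafile) DO CALL :insert %%N
IF %cnt%==0 GOTO :EOF
REM threshold is how many of the smallest numbers to skip
IF %cnt% LSS %top% (SET threshold=0) ELSE SET /A threshold=cnt-top
SET s=0
SET i=0
REM SET lists the variables sorted by name, i.e. ascending by number
FOR /F "tokens=2 delims=.=" %%A IN ('SET __number.') DO CALL :calc %%A
REM Remove the offset that :insert added to every number
SET /A res=s/(cnt-threshold)-1000000
ECHO Average is %res%
PAUSE
GOTO :EOF

:insert
REM Offset by 1000000 so that all variable names have the same length
REM (assumes non-negative numbers below 9000000)
SET /A n=1000000+%1
SET /A __number.%n%+=1
SET /A cnt+=1
GOTO :EOF

:calc
REM i is the running count of numbers seen so far, in ascending order
SET /A i_prev=i
SET /A i+=__number.%1
IF %i% LEQ %threshold% GOTO :EOF
IF %i_prev% GEQ %threshold% (
SET /A s+=%1*__number.%1
) ELSE (
SET /A "s+=%1*(i_prev+__number.%1-threshold)"
)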
Basically, the solution implements the following algorithm:
1. Pick numbers from the file one by one:
1.1. If the number is encountered for the first time, add it to the array with the count of 1.
1.2. If the number is a duplicate of an already added number, increase the corresponding count value by 1.
1.3. Increase the total count of numbers by 1.
2. Calculate the threshold value, which is the total count minus the top quantity of numbers whose average is to be calculated.
3. Iterate through the array items like this:
3.1. Increase the index by the current number's count value.
3.2. If the index exceeds the threshold:
  - If it has exceeded the threshold at the current iteration, increase the total sum by the product of the number and that part of its count that has exceeded the threshold.
  - If the threshold was exceeded earlier, increase the total sum by the product of the number and its count.
3.3. If the index doesn't exceed the threshold, omit the item.
4. Calculate the average as the total sum divided by the difference between total count and threshold.
You can use the sort command to sort the numbers, but there is a problem: sort compares strings rather than numbers, so 2 is treated as greater than 10.
But this can be solved by writing all the numbers, formatted to the same length, into a temporary file.
So you get:
024
017
043
103
...
Sort them with the /R (Reverse) option, to begin the output with the biggest number.
Then you can simply read the first 20 lines and sum them to compute the average.
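A minimal sketch of that approach might look like the following. It is an illustration rather than the answerer's own code, and it assumes the input is called datafile, the scratch file padded.tmp may be created in the current directory, every value is a non-negative integer of at most 9 digits, and the sum fits in a signed 32-bit integer:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Assumed names: datafile is the input, padded.tmp is a scratch file.
set top=20
rem Left-pad each number with zeros to a fixed width of 9 characters.
(for /f %%N in (datafile) do (
    set "n=00000000%%N"
    echo !n:~-9!
)) > padded.tmp
set /a sum=0, cnt=0
rem sort /R lists the largest padded value first; only the first lines count.
rem Prefixing a 1 and subtracting 1000000000 strips the padding without
rem letting SET /A read the leading zeros as an octal number.
for /f %%P in ('sort /R padded.tmp') do if !cnt! lss %top% (
    set /a "sum+=1%%P-1000000000, cnt+=1"
)
del padded.tmp
if %cnt% gtr 0 (
    set /a avg=sum/cnt
    echo Average of the top %cnt% numbers is !avg!
)
endlocal
```

The file is written once and read once, so for around 500 lines the extra I/O should be small. SET /A does integer division, so the result is truncated.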
For small files:
@Echo oFF
for /f %%a in (f1.txt) do Call :Append %%a
call :sort %sort%
pause
goto :EOF
:Append
call set sort=%%sort%% %*
goto :eof
:Sort
Setlocal EnableDelayedExpansion
Set/A n=1,s=0,c=s,r=s
for %%: In (%*) do (
Set /a c+=1
Set "nm.!c!=%%:")
:LP.1
if %s% EQU %c% Set/A n+=1,s=0
Set/A s+=1
Call :SPL %n% %s%
If %n% LEQ %c% goto :LP.1
:LP.2
Echo:!nm.%c%!
Set/A c-=1
If %c% GTR 0 goto :LP.2
Endlocal & goto :EOF
:SPL
If !nm.%1! GTR !nm.%2! (
Set "t=!nm.%2!"&Set "nm.%2=!nm.%1!"
Set "nm.%1=!t!"
)
goto :EOF
Or a variant:
@echo off
for /f %%# in (f1.txt) do (
set x=##########%%#
call set #%%x:~-0xa%%==)
for /f "delims=#=" %%a in ('set ##') do echo(%%a
pause
There is also GNU sort, whose -n option sorts numerically.