开发者

Junk Generator Speed Problem

开发者 https://www.devze.com 2023-04-02 20:03 出处:网络
I am looking into generating a file ( 750 MB ) full of random byt开发者_开发技巧es. The code I am using in a separate thread looks like this:

I am looking into generating a file ( 750 MB ) full of random byt开发者_开发技巧es. The code I am using in a separate thread looks like this:

I allocated a buffer of that size since writing on the disk consumes more time:

function Generate(buf:Pointer):DWORD;stdcall;
var
i:DWORD;
begin
      for i := 0 to keysize -1 do
            PByte(DWORD(buf) + i)^ := Random(256);
      Result:=0;
end;

The problem is that it takes ages until the process completes. Any ideas for a faster method? I will try to implement it in assembly if there isn't any alternative.


This sounded like a nice practice problem so I went ahead and implemented a parallel solution. It uses slightly over 3 seconds to generate 750 MB file and uses over 90% CPU during its work. (SSD disk helps, too. 3,5 seconds were needed to generate the file on a RAID0 disk pair and 4 seconds to generate a file on a slower 512 GB disk.)

All reused code is available with the OpenBSD license (which is almost "use however you wish"): DSiWin32, GpStuff, GpRandomGen, Otl*.

uses
  DSiWin32,
  GpStuff,
  GpRandomGen,
  OtlCommon,
  OtlCollections,
  OtlParallel;

{$R *.dfm}

procedure FillBuffer(buf: pointer; bufSize: integer; randomGen: TGpRandom);
var
  buf64: PInt64;
  buf8 : PByte;
  i    : integer;
  rnd  : int64;
begin
  buf64 := buf;
  for i := 1 to bufSize div SizeOf(int64) do begin
    buf64^ := randomGen.Rnd64;
    Inc(buf64);
  end;
  rnd := randomGen.Rnd64;
  buf8 := PByte(buf64);
  for i := 1 to bufSize mod SizeOf(int64) do begin
    buf8^ := rnd AND $FF;
    rnd := rnd SHR 8;
    Inc(buf8);
  end;
end; { FillBuffer }

procedure CreateRandomFile(fileSize: integer; output: TStream);
const
  CBlockSize = 1 * 1024 * 1024 {1 MB};
var
  buffer        : TOmniValue;
  lastBufferSize: integer;
  memStr        : TMemoryStream;
  numBuffers    : integer;
  outQueue      : IOmniBlockingCollection;
begin
  outQueue := TOmniBlockingCollection.Create;
  numBuffers := (fileSize - 1) div CBlockSize + 1;
  lastBufferSize := (fileSize - 1) mod CBlockSize + 1;
  Parallel.ForEach(1, numBuffers).NoWait
    .NumTasks(Environment.Process.Affinity.Count)
    .OnStop(
      procedure
      begin
        outQueue.CompleteAdding;
      end)
    .Initialize(
      procedure(var taskState: TOmniValue)
      begin
        taskState := TGpRandom.Create;
      end)
    .Finalize(
      procedure(const taskState: TOmniValue)
      begin
        taskState.AsObject.Free;
      end)
    .Execute(
      procedure(const value: integer; var taskState: TOmniValue)
      var
        buffer      : TMemoryStream;
        bytesToWrite: integer;
      begin
        if value = numBuffers then
          bytesToWrite := lastBufferSize
        else
          bytesToWrite := CBlockSize;
        buffer := TMemoryStream.Create;
        buffer.Size := bytesToWrite;
        FillBuffer(buffer.Memory, bytesToWrite, taskState.AsObject as TGpRandom);
        outQueue.Add(buffer);
      end);
  for buffer in outQueue do begin
    memStr := buffer.AsObject as TMemoryStream;
    output.CopyFrom(memStr, 0);
    FreeAndNil(memStr);
  end;
end;

procedure TForm43.btnRandomClick(Sender: TObject);
var
  fileStr: TFileStream;
  time   : int64;
begin
  time := DSiTimeGetTime64;
  try
    fileStr := TFileStream.Create('e:\0\random.dat', fmCreate);
    try
      CreateRandomFile(750*1024*1024, fileStr);
    finally FreeAndNil(fileStr); end;
  finally Caption := Format('Completed in %d ms', [DSiElapsedTime64(time)]); end;
end;

EDIT: Using ForEach in this case was not really elegant solution so I enhanced OmniThreadLibrary with Parallel.ParallelTask and with better IOmniCounter. Using release 993 (or newer) from the SVN you can solve this multiple-producer-single-consumer problem as follows.

procedure CreateRandomFile(fileSize: integer; output: TStream);
const
  CBlockSize = 1 * 1024 * 1024 {1 MB};
var
  buffer   : TOmniValue;
  memStr   : TMemoryStream;
  outQueue : IOmniBlockingCollection;
  unwritten: IOmniCounter;
begin
  outQueue := TOmniBlockingCollection.Create;
  unwritten := CreateCounter(fileSize);
  Parallel.ParallelTask.NoWait
    .NumTasks(Environment.Process.Affinity.Count)
    .OnStop(Parallel.CompleteQueue(outQueue))
    .Execute(
      procedure
      var
        buffer      : TMemoryStream;
        bytesToWrite: integer;
        randomGen   : TGpRandom;
      begin
        randomGen := TGpRandom.Create;
        try
          while unwritten.Take(CBlockSize, bytesToWrite) do begin
            buffer := TMemoryStream.Create;
            buffer.Size := bytesToWrite;
            FillBuffer(buffer.Memory, bytesToWrite, randomGen);
            outQueue.Add(buffer);
          end;
        finally FreeAndNil(randomGen); end;
      end
    );
  for buffer in outQueue do begin
    memStr := buffer.AsObject as TMemoryStream;
    output.CopyFrom(memStr, 0);
    FreeAndNil(memStr);
  end;
end;

EDIT2: A longer blog post about this problem: Life after 2.1: Parallel data production (Introducing Parallel.Task)


I don't know about Delphi, but it might be wasting time on the Random(256) call. Why don't you handcode something pseudorandom to the effect of

n = (n * 1103515245 + 12345) & 0xff;

Let n start somewhere and use the recursion, such as this one, to generate next n. It's not really that random, but it should do for creating random files.

EDIT Some food for thought. If you are creating this file in hopes that it will not be easily compressible, then the method outlined above isn't that good, because of the & 0xff part. It's better then to do

n = (n * 1103515245 + 12345) & 0x7fffffff;

as 0x7fffffff = 2147483647 is a prime number. And store the exact larger value of n, and do a n % 256 on assignment. I've had some good runs with this choice of constants, and much prefer it as entropy source to the in-built .NET alternative, as it's many times faster, and you seldom need truly random or better pseudorandom numbers anyway.


The problem is that Random() has limited entropy. And if you generate 750MiB of data, you will get only one of the 2^31 possible different strings (since that is the period of the RNG), not 2^(750*1024*1024*8), which would be the case if generator was perfect. This is a huge disparity.

In short, if you use Random(), your data is not random at all. Anyone could guess all 750MiB of data from a 4MB sample / piece of the file.

You have to do it different way. If you have linux machine, execute this command from your program:

dd if=/dev/urandom of=file.img bs=1M count=750

It finishes in under a half a minute on my old laptop.


As the Random function has no good distribution anyway, you may reduce your code by nearly a factor of four with the following:

function Generate(buf: Pointer): DWORD; stdcall;
var
  i: DWORD;
  p: PInteger;
begin
  p := buf;
  for i := 0 to (keysize div 4) - 1 do begin
    p^ := Random(MaxInt);
    Inc(p);
  end;
  Result := 0;
end;

Update: The above code needs about 650ms on my system, while the original code needs about 3s.


You could try RandomRange(Low(Integer), High(Integer)) and see if it works. This will generate 4 bytes of random data at a time (be aware that it's signed, and that I'm supposing the Integer is 4 bytes, but The Integer type is an Integer whose size is not guaranteed (http://www.delphibasics.co.uk/RTL.asp?Name=Integer).


    var
  F: TFileStream;
  I: Cardinal;
  index: integer;

  a: array[1..10240] of Cardinal;
  IndexA: integer;

  T1: TDateTime;
begin
  T1 := Now;

  F := TFileStream.Create( 'D:\filler.fil', fmCreate);
  try
    for index := 1 to (650 * MByte) div (sizeof( A)) do begin

      for indexA := 1 to 10240 do begin
        a[ IndexA] := Random( 4294967295    );
      end;
      F.WriteBuffer( A, SizeOf( A));
    end;
  finally
    F.Free;
  end;

  ShowMessage( SecondsBetween( T1, Now));
end;

Works in 3~4 seconds on an SSD drive. Way easier.


Aside from do your own Random() function and/or use aditional CPU's, for loops a fast approach is:

procedure Generate(p: pointer; size: integer);
type
  TCardinalArray = array[0..0] of cardinal;
  PCardinalArray = ^TCardinalArray;

var
  i: integer;

begin
  i := (size div 4) - 1;
  while i >= 0 do
  begin
    PCardinalArray(p)[i] := Random(MaxInt) * 2;
    Dec(i);
  end;
end;

Since there is no need to increment the pointer and the loop index is compared with a TEST op.

Unit6.pas.46: i := (size div 4) - 1;
0045209C 8BD9             mov ebx,ecx
0045209E 85DB             test ebx,ebx
004520A0 7903             jns $004520a5
004520A2 83C303           add ebx,$03
004520A5 C1FB02           sar ebx,$02
004520A8 4B               dec ebx
Unit6.pas.47: while i >= 0 do
004520A9 85DB             test ebx,ebx
004520AB 7C14             jl $004520c1
Unit6.pas.49: PCardinalArray(p)[i] := Random(MaxInt) * 2;
004520AD B8FFFFFF7F       mov eax,$7fffffff
004520B2 E8C50EFBFF       call Random
004520B7 03C0             add eax,eax
004520B9 89049E           mov [esi+ebx*4],eax
Unit6.pas.50: Dec(i);
004520BC 4B               dec ebx
Unit6.pas.47: while i >= 0 do
004520BD 85DB             test ebx,ebx
004520BF 7DEC             jnl $004520ad

Of course there is no big difference, but it is something...


Excepting other factors, the major speed issues I see with the code in the original post are:

1) running Random for every byte. This function counts for the majority of the processing. Processing every four bytes will be advantageous. 2) minimize the calculations within the loop. I would establish the pointer bounds and then run a while loop (inc or dec by 4) until the difference between the upper bound and the lower bound is less than 4, then inc or dec by 1 the rest of the way. I probably wouldn't consider a for loop at any point in this. 3) I wouldn't run this against a huge amount of data - I wouldn't do 750MB all at once because the speed degradation for handling that amount of data tends to outweigh any performance enhancements through code.

Very lightly tested, and probably much to be improved upon, but the basic idea I had is here:

function Generate(buf: Pointer): DWord; stdcall;
  var
    inbuf, uplimit: Cardinal;
  begin
    inbuf := Cardinal(buf);
    uplimit := inbuf + keysize - 1;
    while (uplimit - inbuf) >= 4 do
      begin
        PDWord(inbuf)^ := Random(MAXINT);
        inc(inbuf, 4);
      end;
    while inbuf <= uplimit do
      begin
        PByte(inbuf)^ := Random(256);
        inc(inbuf, 1);
      end;
    Result := 0;
  end;
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号