
Calculating CRC in awk

Developer (https://www.devze.com), 2023-02-10 15:38. Source: the web.

Has anyone implemented the POSIX 1003.2-compliant CRC algorithm (as output by cksum) in awk/gawk? I need to checksum an in-memory string (not a whole file), and shelling out to call cksum is slow and expensive.

My overall need is to generate a numeric checksum that fits within 10 digits or fewer. Other hash/CRC functions could work too; does anyone have anything handy?

A Google search and a scan of awk.info turned up nothing interesting.


EDIT:

I ended up using the external cksum command, but caching the results into an awk associative array. Performance was good enough and I didn't need to reinvent the wheel.
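
The caching approach described in this edit can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the function name cksum_cached, the array cache, and the shell wrapper cksum_lines are made up for the example, and it assumes a POSIX shell with printf and cksum on the PATH.

```shell
# Hypothetical wrapper: prints the cksum CRC of every input line,
# shelling out only once per distinct line.
cksum_lines() {
    awk '
    # Return the CRC of string s as printed by cksum; results are memoized in cache[].
    function cksum_cached(s,    parts, n, i, q, cmd, out, f) {
        if (s in cache)
            return cache[s]
        # Quote s for the shell: wrap it in single quotes (\047) and write each
        # embedded single quote as the sequence quote, backslash, quote, quote.
        n = split(s, parts, "\047")
        q = parts[1]
        for (i = 2; i <= n; i++)
            q = q "\047\\\047\047" parts[i]
        cmd = "printf %s \047" q "\047 | cksum"
        cmd | getline out          # cksum prints "<crc> <byte count>"
        close(cmd)
        split(out, f, " ")
        return cache[s] = f[1]
    }
    { print cksum_cached($0) }
    '
}
```

With this sketch, `printf 'abc\na\nabc\n' | cksum_lines` runs cksum twice rather than three times, because the second "abc" is answered from the array. Whether that is fast enough depends on how often strings repeat in the input.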


gawk/awk implementation of crc32 (compatible with the POSIX cksum command)

Here is an awk (gawk) implementation of crc32. Note that we use T and X as lookup tables: T holds the crc32_table, and X maps each raw data character to its integer value.

If you wish, you can compute the crc32_table at runtime. However, that was a bit slow on startup, so there is a tradeoff between small code size with slow startup and reasonably fast crc32 calculation with large code size. I would recommend the version with a precomputed crc_table, as the increase in code size was well justified when I compared the speed.

If your version of awk does not support and(), xor(), compl(), lshift() and rshift(), do not forget to load a library that provides the bitwise operations.
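
For awks without the gawk bit functions, and for the compute-the-table-at-runtime variant mentioned above, a minimal sketch in plain awk arithmetic might look like the following. The names xor32, make_table and the shell wrapper crc_table_entries are illustrative and do not come from any library. The polynomial 0x04C11DB7 is written in decimal (79764919) because hexadecimal constants in program text are not portable across awks.

```shell
# Hypothetical wrapper: prints the requested entries of a CRC table built at startup.
crc_table_entries() {
    awk -v want="$*" '
    # 32-bit XOR using arithmetic only; a and b must be integers in 0 .. 2^32-1.
    function xor32(a, b,    r, p, i) {
        r = 0; p = 1
        for (i = 0; i < 32; i++) {
            if ((a % 2) != (b % 2)) r += p    # bits differ, so the result bit is 1
            a = int(a / 2); b = int(b / 2); p *= 2
        }
        return r
    }
    # Fill T[0..255] for the generator polynomial 0x04C11DB7, most significant bit first.
    function make_table(    i, k, c) {
        for (i = 0; i < 256; i++) {
            c = i * 16777216                  # i shifted left by 24 bits
            for (k = 0; k < 8; k++) {
                if (c >= 2147483648)          # top bit set: shift it out, then XOR
                    c = xor32((c - 2147483648) * 2, 79764919)
                else
                    c *= 2
            }
            T[i] = c
        }
    }
    BEGIN {
        make_table()
        n = split(want, idx, " ")
        for (j = 1; j <= n; j++)
            printf "T[%d]=%.0f\n", idx[j], T[idx[j]]   # %.0f: some awks clamp %d to 32 bits
    }'
}
```

For example, `crc_table_entries 1 255` prints T[1]=79764919 and T[255]=2985771188, which are 0x04c11db7 and 0xb1f740b4 in the hard-coded table. The other bit functions have similar arithmetic stand-ins for non-negative 32-bit values: and(v,0xffffffff) becomes v % 4294967296, lshift(v,8) becomes v * 256, rshift(v,24) becomes int(v / 16777216), and the 32-bit complement becomes 4294967295 - v.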

BEGIN{
 # Initialize CRC32 table
  T[0]=0x00000000;
  T[1]=0x04c11db7;T[2]=0x09823b6e;T[3]=0x0d4326d9;T[4]=0x130476dc;T[5]=0x17c56b6b;
  T[6]=0x1a864db2;T[7]=0x1e475005;T[8]=0x2608edb8;T[9]=0x22c9f00f;T[10]=0x2f8ad6d6;
  T[11]=0x2b4bcb61;T[12]=0x350c9b64;T[13]=0x31cd86d3;T[14]=0x3c8ea00a;T[15]=0x384fbdbd;
  T[16]=0x4c11db70;T[17]=0x48d0c6c7;T[18]=0x4593e01e;T[19]=0x4152fda9;T[20]=0x5f15adac;
  T[21]=0x5bd4b01b;T[22]=0x569796c2;T[23]=0x52568b75;T[24]=0x6a1936c8;T[25]=0x6ed82b7f;
  T[26]=0x639b0da6;T[27]=0x675a1011;T[28]=0x791d4014;T[29]=0x7ddc5da3;T[30]=0x709f7b7a;
  T[31]=0x745e66cd;T[32]=0x9823b6e0;T[33]=0x9ce2ab57;T[34]=0x91a18d8e;T[35]=0x95609039;
  T[36]=0x8b27c03c;T[37]=0x8fe6dd8b;T[38]=0x82a5fb52;T[39]=0x8664e6e5;T[40]=0xbe2b5b58;
  T[41]=0xbaea46ef;T[42]=0xb7a96036;T[43]=0xb3687d81;T[44]=0xad2f2d84;T[45]=0xa9ee3033;
  T[46]=0xa4ad16ea;T[47]=0xa06c0b5d;T[48]=0xd4326d90;T[49]=0xd0f37027;T[50]=0xddb056fe;
  T[51]=0xd9714b49;T[52]=0xc7361b4c;T[53]=0xc3f706fb;T[54]=0xceb42022;T[55]=0xca753d95;
  T[56]=0xf23a8028;T[57]=0xf6fb9d9f;T[58]=0xfbb8bb46;T[59]=0xff79a6f1;T[60]=0xe13ef6f4;
  T[61]=0xe5ffeb43;T[62]=0xe8bccd9a;T[63]=0xec7dd02d;T[64]=0x34867077;T[65]=0x30476dc0;
  T[66]=0x3d044b19;T[67]=0x39c556ae;T[68]=0x278206ab;T[69]=0x23431b1c;T[70]=0x2e003dc5;
  T[71]=0x2ac12072;T[72]=0x128e9dcf;T[73]=0x164f8078;T[74]=0x1b0ca6a1;T[75]=0x1fcdbb16;
  T[76]=0x018aeb13;T[77]=0x054bf6a4;T[78]=0x0808d07d;T[79]=0x0cc9cdca;T[80]=0x7897ab07;
  T[81]=0x7c56b6b0;T[82]=0x71159069;T[83]=0x75d48dde;T[84]=0x6b93dddb;T[85]=0x6f52c06c;
  T[86]=0x6211e6b5;T[87]=0x66d0fb02;T[88]=0x5e9f46bf;T[89]=0x5a5e5b08;T[90]=0x571d7dd1;
  T[91]=0x53dc6066;T[92]=0x4d9b3063;T[93]=0x495a2dd4;T[94]=0x44190b0d;T[95]=0x40d816ba;
  T[96]=0xaca5c697;T[97]=0xa864db20;T[98]=0xa527fdf9;T[99]=0xa1e6e04e;T[100]=0xbfa1b04b;
  T[101]=0xbb60adfc;T[102]=0xb6238b25;T[103]=0xb2e29692;T[104]=0x8aad2b2f;T[105]=0x8e6c3698;
  T[106]=0x832f1041;T[107]=0x87ee0df6;T[108]=0x99a95df3;T[109]=0x9d684044;T[110]=0x902b669d;
  T[111]=0x94ea7b2a;T[112]=0xe0b41de7;T[113]=0xe4750050;T[114]=0xe9362689;T[115]=0xedf73b3e;
  T[116]=0xf3b06b3b;T[117]=0xf771768c;T[118]=0xfa325055;T[119]=0xfef34de2;T[120]=0xc6bcf05f;
  T[121]=0xc27dede8;T[122]=0xcf3ecb31;T[123]=0xcbffd686;T[124]=0xd5b88683;T[125]=0xd1799b34;
  T[126]=0xdc3abded;T[127]=0xd8fba05a;T[128]=0x690ce0ee;T[129]=0x6dcdfd59;T[130]=0x608edb80;
  T[131]=0x644fc637;T[132]=0x7a089632;T[133]=0x7ec98b85;T[134]=0x738aad5c;T[135]=0x774bb0eb;
  T[136]=0x4f040d56;T[137]=0x4bc510e1;T[138]=0x46863638;T[139]=0x42472b8f;T[140]=0x5c007b8a;
  T[141]=0x58c1663d;T[142]=0x558240e4;T[143]=0x51435d53;T[144]=0x251d3b9e;T[145]=0x21dc2629;
  T[146]=0x2c9f00f0;T[147]=0x285e1d47;T[148]=0x36194d42;T[149]=0x32d850f5;T[150]=0x3f9b762c;
  T[151]=0x3b5a6b9b;T[152]=0x0315d626;T[153]=0x07d4cb91;T[154]=0x0a97ed48;T[155]=0x0e56f0ff;
  T[156]=0x1011a0fa;T[157]=0x14d0bd4d;T[158]=0x19939b94;T[159]=0x1d528623;T[160]=0xf12f560e;
  T[161]=0xf5ee4bb9;T[162]=0xf8ad6d60;T[163]=0xfc6c70d7;T[164]=0xe22b20d2;T[165]=0xe6ea3d65;
  T[166]=0xeba91bbc;T[167]=0xef68060b;T[168]=0xd727bbb6;T[169]=0xd3e6a601;T[170]=0xdea580d8;
  T[171]=0xda649d6f;T[172]=0xc423cd6a;T[173]=0xc0e2d0dd;T[174]=0xcda1f604;T[175]=0xc960ebb3;
  T[176]=0xbd3e8d7e;T[177]=0xb9ff90c9;T[178]=0xb4bcb610;T[179]=0xb07daba7;T[180]=0xae3afba2;
  T[181]=0xaafbe615;T[182]=0xa7b8c0cc;T[183]=0xa379dd7b;T[184]=0x9b3660c6;T[185]=0x9ff77d71;
  T[186]=0x92b45ba8;T[187]=0x9675461f;T[188]=0x8832161a;T[189]=0x8cf30bad;T[190]=0x81b02d74;
  T[191]=0x857130c3;T[192]=0x5d8a9099;T[193]=0x594b8d2e;T[194]=0x5408abf7;T[195]=0x50c9b640;
  T[196]=0x4e8ee645;T[197]=0x4a4ffbf2;T[198]=0x470cdd2b;T[199]=0x43cdc09c;T[200]=0x7b827d21;
  T[201]=0x7f436096;T[202]=0x7200464f;T[203]=0x76c15bf8;T[204]=0x68860bfd;T[205]=0x6c47164a;
  T[206]=0x61043093;T[207]=0x65c52d24;T[208]=0x119b4be9;T[209]=0x155a565e;T[210]=0x18197087;
  T[211]=0x1cd86d30;T[212]=0x029f3d35;T[213]=0x065e2082;T[214]=0x0b1d065b;T[215]=0x0fdc1bec;
  T[216]=0x3793a651;T[217]=0x3352bbe6;T[218]=0x3e119d3f;T[219]=0x3ad08088;T[220]=0x2497d08d;
  T[221]=0x2056cd3a;T[222]=0x2d15ebe3;T[223]=0x29d4f654;T[224]=0xc5a92679;T[225]=0xc1683bce;
  T[226]=0xcc2b1d17;T[227]=0xc8ea00a0;T[228]=0xd6ad50a5;T[229]=0xd26c4d12;T[230]=0xdf2f6bcb;
  T[231]=0xdbee767c;T[232]=0xe3a1cbc1;T[233]=0xe760d676;T[234]=0xea23f0af;T[235]=0xeee2ed18;
  T[236]=0xf0a5bd1d;T[237]=0xf464a0aa;T[238]=0xf9278673;T[239]=0xfde69bc4;T[240]=0x89b8fd09;
  T[241]=0x8d79e0be;T[242]=0x803ac667;T[243]=0x84fbdbd0;T[244]=0x9abc8bd5;T[245]=0x9e7d9662;
  T[246]=0x933eb0bb;T[247]=0x97ffad0c;T[248]=0xafb010b1;T[249]=0xab710d06;T[250]=0xa6322bdf;
  T[251]=0xa2f33668;T[252]=0xbcb4666d;T[253]=0xb8757bda;T[254]=0xb5365d03;T[255]=0xb1f740b4;

# Init raw data to int lookup table
  for(i=0;i<=255;i++)X[sprintf("%c",i)]=i;
}

Then calculate the crc32

# Limit var size to 32bit
function u32(v){return and(v,0xffffffff)}
{
  # Let's try with $0 as buf in this example.
  buf = $0;

  # Step 1) Start CRC32 calculation.
  len = 0;      #// Total size. 
  crc=u32( 0 ); #// Initial seed. POSIX compatible crc32 uses 0

  # Step 2) Repeat this for as many buffers as necessary; we assume "buf" contains data.
  A[0]=split(buf,A,"");
  len += A[0]
  for(i=1;i<=A[0];i++)crc=u32(xor(u32(lshift(crc,8)),T[u32(and(xor(rshift(crc,24),X[A[i]]),0xFF))]));

  # Step 3) End the CRC32 calculation: feed the total length of the data read into the CRC, low byte first.
  while(len){crc=u32(xor(u32(lshift(crc,8)),T[u32(and(xor(rshift(crc,24),and(len,0xFF)),0xFF))]));len=rshift(len,8);}
  crc=u32(compl(crc));

  print "crc=["crc"]";
}

Result

crc=[4294967295]  <-- ""
crc=[1220704766]  <-- "a"
crc=[1219131554]  <-- "abc"
crc=[3644109718]  <-- "message digest"

All crc32 values match exactly what the cksum command produces :-)


Since cksum uses a large table, it's probably impractical to re-implement it in AWK. You might be able to calculate it on the fly without using a table, but that's likely to be slower than calling cksum.
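
To make the table-free option concrete, here is a rough bit-at-a-time sketch in plain awk arithmetic (no table and no bit functions). The names crc_lines, xor_poly, crc_byte and cksum_str are invented for this example. It assumes single-byte characters, which is why it forces LC_ALL=C, and it makes no claim about how its speed compares with calling cksum.

```shell
# Hypothetical wrapper: prints the cksum-style CRC of each input line (newline excluded).
crc_lines() {
    LC_ALL=C awk '
    BEGIN {
        # Character-to-byte-value lookup, like X in the table-driven version.
        for (i = 1; i <= 255; i++) ORD[sprintf("%c", i)] = i
        # Place values of the 14 bits set in the polynomial 0x04C11DB7.
        NP = split("1 2 4 16 32 128 256 1024 2048 4096 65536 4194304 8388608 67108864", P, " ")
    }
    # XOR crc with the polynomial by toggling each of its bits arithmetically.
    function xor_poly(crc,    i) {
        for (i = 1; i <= NP; i++) {
            if (int(crc / P[i]) % 2) crc -= P[i]
            else                     crc += P[i]
        }
        return crc
    }
    # Feed one byte (0..255) into the CRC, most significant bit first.
    function crc_byte(crc, byte,    k, top, bit) {
        for (k = 128; k >= 1; k /= 2) {
            top = (crc >= 2147483648)        # bit 31 of the register
            bit = int(byte / k) % 2          # next input bit
            crc = (crc * 2) % 4294967296     # shift left, keep 32 bits
            if (top != bit) crc = xor_poly(crc)
        }
        return crc
    }
    # Data bytes first, then the length (low byte first), then the 32-bit complement.
    function cksum_str(s,    crc, n, i, len) {
        crc = 0
        n = length(s)
        for (i = 1; i <= n; i++) crc = crc_byte(crc, ORD[substr(s, i, 1)])
        for (len = n; len > 0; len = int(len / 256)) crc = crc_byte(crc, len % 256)
        return 4294967295 - crc
    }
    { printf "%.0f\n", cksum_str($0) }
    '
}
```

Running `printf 'a\nabc\nmessage digest\n' | crc_lines` should reproduce the values listed in the results of the table-driven answer (1220704766, 1219131554, 3644109718). Each input bit costs a handful of arithmetic operations, so this trades the 256-entry table for more work per byte.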

References:

  • POSIX
  • GNU cksum source

Translating it from C to AWK should be fairly trivial, however, if someone were so inclined.

By the way, gawk has coprocesses:

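gawk 'BEGIN {
    cmd="cksum"
    print "hello" |& cmd              # send the data to the coprocess
    close(cmd, "to")                  # close the write end so cksum sees EOF
    while ((cmd |& getline a) > 0)    # read back what cksum printed
        print a
    close(cmd)
    }'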