
Parallelization of a for loop using pthreads

Developer https://www.devze.com 2023-02-13 22:43 Source: the web
void rijndael_enc(RIJNDAEL_context *ctx,
       UINT8 *input, int inputlen, UINT8 *output)
{
    int i, nblocks;

    /* Number of whole blocks; a trailing partial block is ignored here. */
    nblocks = inputlen / RIJNDAEL_BLOCKSIZE;
    for (i = 0; i < nblocks; i++)
    {
        rijndael_encrypt(ctx, input, output);
        input += RIJNDAEL_BLOCKSIZE;
        output += RIJNDAEL_BLOCKSIZE;
    }
}

I would like to execute this for loop in parallel using pthreads, i.e. each block is processed by its own thread, so the number of threads equals nblocks. I am not sure how to proceed. Would putting a mutex around the increments of input and output do the trick?


Starting a thread per block is probably not the right approach: creating and destroying threads is expensive (how expensive depends a lot on the operating system), and encrypting a single block is too little work to amortize that cost.

A strategy like this could work:

  • determine how many blocks total you need to encrypt
  • determine how many threads you want to use

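Dividing the first number by the second gives how many blocks each thread should process. You would then process the first n blocks with thread 1, the next n blocks with thread 2, and so on. If the division leaves a remainder, spread the leftover blocks across some of the threads.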
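As a rough sketch of that split, assuming you want the leftover blocks handed out one each to the first few threads (the struct and function names here are made up for illustration, they are not part of any Rijndael library):

```c
#include <stddef.h>

/* Hypothetical helper type: the contiguous run of blocks given to one thread. */
struct block_range {
    size_t first;   /* index of the first block in the run */
    size_t count;   /* number of blocks in the run */
};

/*
 * Split nblocks across nthreads (nthreads must be > 0).
 * Every thread gets nblocks / nthreads blocks, and the first
 * (nblocks % nthreads) threads get one extra block each.
 */
static struct block_range range_for_thread(size_t nblocks, size_t nthreads, size_t t)
{
    size_t base  = nblocks / nthreads;
    size_t extra = nblocks % nthreads;
    struct block_range r;

    r.count = base + (t < extra ? 1 : 0);
    /* Threads before t hold t * base blocks plus one extra for each of the
       first min(t, extra) threads. */
    r.first = t * base + (t < extra ? t : extra);
    return r;
}
```

For example, with 10 blocks and 4 threads the runs would be 3, 3, 2 and 2 blocks long, starting at blocks 0, 3, 6 and 8.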

The arguments a thread would receive when it starts can simply be:

  • the offset into the input buffer to read from
  • the offset into the output buffer to write to
  • the number of blocks it should process

Once all the threads are started, your main thread should join all the worker threads and you're good to go.

Since all the threads would operate on different areas of memory, you don't need to worry about synchronisation between those accesses.
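Putting the pieces together, something along these lines might work. It is only a sketch: the context type and rijndael_encrypt below are placeholders so that the example compiles on its own (the placeholder is NOT real encryption), MAX_THREADS, worker_args and rijndael_enc_parallel are names invented here, and it assumes that rijndael_encrypt only reads the context, so sharing one ctx between threads is safe. Check that against your actual implementation.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define RIJNDAEL_BLOCKSIZE 16
#define MAX_THREADS 16              /* arbitrary cap chosen for this sketch */
typedef uint8_t UINT8;

/* Placeholder context and cipher so the sketch is self-contained.
   Replace with the real RIJNDAEL_context / rijndael_encrypt. */
typedef struct { UINT8 key_byte; } RIJNDAEL_context;

static void rijndael_encrypt(RIJNDAEL_context *ctx, const UINT8 *in, UINT8 *out)
{
    for (int i = 0; i < RIJNDAEL_BLOCKSIZE; i++)
        out[i] = in[i] ^ ctx->key_byte;     /* NOT real encryption */
}

/* What each worker receives: where to read, where to write, how much. */
struct worker_args {
    RIJNDAEL_context *ctx;
    const UINT8 *input;
    UINT8 *output;
    size_t nblocks;
};

static void *encrypt_worker(void *p)
{
    struct worker_args *a = p;
    for (size_t i = 0; i < a->nblocks; i++)
        rijndael_encrypt(a->ctx,
                         a->input  + i * RIJNDAEL_BLOCKSIZE,
                         a->output + i * RIJNDAEL_BLOCKSIZE);
    return NULL;
}

/* Encrypts the whole blocks of input using up to nthreads workers.
   Returns 0 on success, -1 if a thread could not be created. */
static int rijndael_enc_parallel(RIJNDAEL_context *ctx, const UINT8 *input,
                                 size_t inputlen, UINT8 *output, size_t nthreads)
{
    pthread_t tids[MAX_THREADS];
    struct worker_args args[MAX_THREADS];
    size_t nblocks = inputlen / RIJNDAEL_BLOCKSIZE;
    size_t started = 0, offset = 0;
    int rc = 0;

    if (nthreads == 0) nthreads = 1;
    if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;
    if (nthreads > nblocks) nthreads = nblocks;   /* no idle threads */

    for (size_t t = 0; t < nthreads; t++) {
        /* The first (nblocks % nthreads) threads take one extra block. */
        size_t count = nblocks / nthreads + (t < nblocks % nthreads ? 1 : 0);
        args[t] = (struct worker_args){
            .ctx = ctx,
            .input = input + offset * RIJNDAEL_BLOCKSIZE,
            .output = output + offset * RIJNDAEL_BLOCKSIZE,
            .nblocks = count,
        };
        if (pthread_create(&tids[t], NULL, encrypt_worker, &args[t]) != 0) {
            rc = -1;
            break;
        }
        started++;
        offset += count;
    }

    /* Wait for every worker that was actually started. */
    for (size_t t = 0; t < started; t++)
        pthread_join(tids[t], NULL);
    return rc;
}
```

No mutex appears anywhere: each worker gets its own pointers computed up front, so nothing shared is ever incremented. Build with your compiler's pthread option (for example -pthread with gcc or clang).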

There's a catch though: if your input is not exactly a multiple of the block size, the last block probably needs to be treated with care (padding). I'd suggest in that case dealing with the last block in the main thread before waiting for the workers to finish.
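The padding scheme is not specified above, so purely as an assumption, here is what building the final block could look like with PKCS#7-style padding, where every pad byte holds the number of pad bytes. pad_last_block is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RIJNDAEL_BLOCKSIZE 16

/*
 * Copy the trailing (inputlen % RIJNDAEL_BLOCKSIZE) bytes of input into
 * `last` and fill the rest with PKCS#7-style padding. If the input is an
 * exact multiple of the block size, `last` becomes a full block of padding,
 * as that scheme requires.
 */
static void pad_last_block(const uint8_t *input, size_t inputlen,
                           uint8_t last[RIJNDAEL_BLOCKSIZE])
{
    size_t tail = inputlen % RIJNDAEL_BLOCKSIZE;   /* leftover bytes: 0..15 */
    size_t pad  = RIJNDAEL_BLOCKSIZE - tail;       /* pad bytes: 1..16 */

    memcpy(last, input + (inputlen - tail), tail);
    memset(last + tail, (int)pad, pad);
}
```

The main thread would then encrypt `last` with rijndael_encrypt and write the result right after the last whole block, which means the output buffer needs room for one block more than inputlen / RIJNDAEL_BLOCKSIZE.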

As noted in the comments, ECB should be avoided in most cases, but since this is for educational purposes it is not a problem. Suggestion: once you have this working for ECB, try CTR mode, which can also be parallelized.
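To give an idea of why CTR parallelizes just as well: the keystream for block number k depends only on the nonce and on k, so each thread can work on its own range of block numbers without talking to the others. The sketch below is illustrative only. The counter layout (8-byte nonce followed by a 64-bit big-endian block number) is an assumption, toy_encrypt is a placeholder and NOT a real cipher, and all names are made up. With the real library you would pass a small wrapper around rijndael_encrypt instead.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCKSIZE 16

/* Hypothetical signature for "encrypt one block"; wrap your real cipher in it. */
typedef void (*block_encrypt_fn)(void *ctx, const uint8_t *in, uint8_t *out);

/* Placeholder cipher so the sketch compiles on its own. NOT real encryption. */
static void toy_encrypt(void *ctx, const uint8_t *in, uint8_t *out)
{
    uint8_t key = *(const uint8_t *)ctx;
    for (int i = 0; i < BLOCKSIZE; i++)
        out[i] = in[(i + 1) % BLOCKSIZE] ^ key;
}

/* Counter block = 8-byte nonce || 64-bit big-endian block number (assumed layout). */
static void make_counter_block(const uint8_t nonce[8], uint64_t index,
                               uint8_t ctr[BLOCKSIZE])
{
    for (int i = 0; i < 8; i++)
        ctr[i] = nonce[i];
    for (int i = 0; i < 8; i++)
        ctr[8 + i] = (uint8_t)(index >> (56 - 8 * i));
}

/*
 * Encrypt (or decrypt, the operation is the same) nblocks blocks whose
 * global block numbers start at first_block. `in` and `out` point at the
 * start of this range, not at the start of the whole message.
 */
static void ctr_crypt_range(block_encrypt_fn enc, void *ctx,
                            const uint8_t nonce[8], uint64_t first_block,
                            size_t nblocks, const uint8_t *in, uint8_t *out)
{
    uint8_t ctr[BLOCKSIZE], keystream[BLOCKSIZE];

    for (size_t b = 0; b < nblocks; b++) {
        make_counter_block(nonce, first_block + b, ctr);
        enc(ctx, ctr, keystream);
        for (int i = 0; i < BLOCKSIZE; i++)
            out[b * BLOCKSIZE + i] = in[b * BLOCKSIZE + i] ^ keystream[i];
    }
}
```

A worker thread would call ctr_crypt_range with its own first_block and nblocks, in the same way the ECB workers receive their offsets. Note that a nonce must never be reused with the same key in CTR mode.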
