How come Solution 2 is more efficient than Solution 1?
(The time is the average of 100 runs, and in total they go through 13217 folders.)
// Solution 1 (2608,9ms)
let rec folderCollector path =
    async { let! dirs = Directory.AsyncGetDirectories path
            do! [for z in dirs -> folderCollector z]
                |> Async.Parallel |> Async.Ignore }
// Solution 2 (2510,9ms)
let rec folderCollector path =
    let dirs = Directory.GetDirectories path
    for z in dirs do folderCollector z
I would have thought that Solution 1 would be faster because it's async and I run it in parallel. What am I missing?
As Daniel and Brian already clearly explained, your solution is probably creating too many short-lived asynchronous computations, so the overhead outweighs the gains from parallelism. The AsyncGetDirectories operation also probably isn't really non-blocking, as it is not doing much work. I don't see a truly async version of this operation anywhere - how is it defined?
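I'd guess it is defined roughly like this (an assumption - the question doesn't show the definition):

module Directory =
    // Assumed definition: wrapping the synchronous call in async only
    // *looks* non-blocking - the directory read still blocks a
    // thread-pool thread, so there is no real overlapped IO to gain from.
    let AsyncGetDirectories path =
        async { return System.IO.Directory.GetDirectories path }

If it is defined like that, every recursive call pays the async scheduling cost without any genuinely asynchronous IO underneath.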
Anyway, using the ordinary GetDirectories, I tried the following version (which creates only a small number of parallel asyncs):
// Synchronous version
let rec folderCollectorSync path =
    let dirs = Directory.GetDirectories path
    for z in dirs do folderCollectorSync z

// Asynchronous version that uses the synchronous one when 'nesting <= 0'
let rec folderCollector path nesting =
    async { if nesting <= 0 then return folderCollectorSync path
            else let dirs = Directory.GetDirectories path
                 do! [for z in dirs -> folderCollector z (nesting - 1)]
                     |> Async.Parallel |> Async.Ignore }
Calling a simple synchronous version after a certain number of recursive calls is a common trick - it is used when parallelizing any tree-like structure that is very deep. Using folderCollector path 2, this will start only tens of parallel tasks (as opposed to thousands), so it will be more efficient.
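For example, a call from a script might look like this (the root path is made up):

// Hypothetical usage: fan out in parallel for the top two levels,
// then walk each remaining subtree synchronously.
folderCollector @"C:\data" 2 |> Async.RunSynchronously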
On a sample directory I used (with 4800 sub-dirs and 27000 files), I get:
- folderCollectorSync path takes 1 second
- folderCollector path 2 takes 600ms (the result is similar for any nesting between 1 and 4)
From the comments:

Your function incurs the cost of async without any of the benefits, because:
- you're creating too many asyncs for the short amount of work to be done
- your function is not CPU-bound, but rather IO-bound
I expect that for a problem like this, you may get the best results if you do the async/parallel work at the top level but make the sub-calls synchronous. (Or, if the trees are very deep, maybe make the first two levels async and then go synchronous after that.)
The keys are load-balancing and granularity. With too tiny a piece of work, the overhead of async outweighs the benefits of parallelism, so you want chunks of work big enough to leverage parallelism and overcome the overheads. But if the pieces of work are too large and unbalanced (e.g. one top-level dir has 10000 files, and 3 other top-level dirs have 1000 each), you also suffer: one task stays busy while the rest finish quickly, and you don't maximize parallelism.
If you can estimate the work for each sub-tree beforehand, you can do even better scheduling.
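As a minimal sketch (my own illustration, not from the original answers), one cheap heuristic is longest-first scheduling: estimate each top-level subtree by its immediate entry count and queue the largest ones first, so a big subtree isn't the last thing left running. This reuses folderCollectorSync from above.

// Sketch of a longest-first heuristic (assumption: the immediate entry
// count is a usable proxy for the total work in a subtree).
let folderCollectorBalanced path =
    Directory.GetDirectories path
    // Cheap work estimate: number of immediate files + dirs.
    |> Array.sortByDescending (fun d -> Directory.GetFileSystemEntries(d).Length)
    // Queue the biggest subtrees first so they start earliest.
    |> Array.map (fun d -> async { return folderCollectorSync d })
    |> Async.Parallel
    |> Async.Ignore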
Apparently, your code is IO-bound. Keep in mind how HDDs work: when you use Async to issue multiple reads, the read heads of the HDD have to jump back and forth to serve the different read commands at the same time, which introduces latency. This will likely become much worse if the data on disk is heavily fragmented.