So using the regular MongoDB library in Ruby I have the following query to find the average filesize across a set of 5001 documents:
    avg = 0
    total = collection.count()
    Rails.logger.info "#{total} asset creation stats in the system"
    collection.find().each {|row| avg += (row["filesize"] * (1/total.to_f)) if row["filesize"]}
It's pretty simple, so I'm trying to do the same using map/reduce as a learning exercise. This is what I came up with:
    map = 'function(){emit("filesizes", {size: this.filesize, num: 1});}'
    reduce = 'function(k, vals){
            var result = {size: 0, num: 0};
            for(var x in vals) {
              var new_total = result.num + vals[x].num;
              result.num = new_total;
              result.size = result.size + (vals[x].size * (vals[x].num / new_total));
            }
            return result;
    }'
    @results = collection.map_reduce(map, reduce)
However the two queries come back with two different results!
What am I doing wrong?
You're weighting the results by doing the division in every reduce function.
Say you had [{size : 5, num : 1}, {size : 5, num : 1}, {size : 5, num : 1}].  Your reduce would calculate:
result.size = 0 + (5*(1/1)) = 5
result.size = 5 + (5*(1/2)) = 7.5
result.size = 7.5 + (5*(1/3)) = 9.17
As you can see, this weights the results towards the earliest elements: the true average is 5, but the running result keeps growing.
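To see the skew concretely, the reduce from the question can be run as plain JavaScript outside MongoDB. This is only a sketch: `buggyReduce` is the question's function given a name for illustration, and `sample` holds the three example values from above.

```javascript
// The reduce function from the question, unchanged apart from being named.
function buggyReduce(k, vals) {
  var result = {size: 0, num: 0};
  for (var x in vals) {
    var new_total = result.num + vals[x].num;
    result.num = new_total;
    // Each value is divided by the running count, but the earlier
    // contributions are never scaled back down, so they stay overweighted.
    result.size = result.size + (vals[x].size * (vals[x].num / new_total));
  }
  return result;
}

var sample = [{size: 5, num: 1}, {size: 5, num: 1}, {size: 5, num: 1}];
var skewed = buggyReduce("filesizes", sample);
// skewed.num is 3, but skewed.size is about 9.17 instead of the true average 5.
```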
Fortunately, there's a simple solution. Just add a finalize function, which will be run once after the reduce step is finished.
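A minimal sketch of that approach, assuming the reduce is also changed to keep plain running sums (MongoDB may call reduce several times on partial results, so it has to give the same answer however the values are grouped). The `avg` field name and the local check at the bottom are illustrative assumptions, not part of the original code; with the Ruby driver, the function bodies would be passed as strings the same way `map` and `reduce` are in the question.

```javascript
// Reduce only accumulates sums, so re-reducing partial results is safe.
function reduce(k, vals) {
  var result = {size: 0, num: 0};
  for (var i = 0; i < vals.length; i++) {
    result.size += vals[i].size;
    result.num += vals[i].num;
  }
  return result;
}

// Finalize runs once per key after reducing is finished,
// so the division happens exactly once.
function finalize(k, val) {
  val.avg = val.num > 0 ? val.size / val.num : 0; // "avg" is a hypothetical field name
  return val;
}

// Local check outside MongoDB: reduce two values, then re-reduce
// that partial result together with a third value.
var partial = reduce("filesizes", [{size: 5, num: 1}, {size: 5, num: 1}]);
var total = reduce("filesizes", [partial, {size: 5, num: 1}]);
var averaged = finalize("filesizes", total);
// averaged is {size: 15, num: 3, avg: 5}
```

Because the sums are order-independent, the finalized average matches what the Ruby loop computes over the same documents that have a numeric filesize.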
 
         
                                         
                                         
                                         