Clojure: reduce vs. apply

I understand the conceptual difference between reduce and apply:

(reduce + (list 1 2 3 4 5))
; translates to: (+ (+ (+ (+ 1 2) 3) 4) 5)

(apply + (list 1 2 3 4 5))
; translates to: (+ 1 2 3 4 5)

However, which one is more idiomatic Clojure? Does it make much difference one way or the other? From my (limited) performance testing, it seems reduce is a bit faster.

reduce and apply are of course only equivalent (in terms of the ultimate result returned) for associative functions which need to see all their arguments in the variable-arity case. When they are result-wise equivalent, I'd say that apply is always perfectly idiomatic, while reduce is equivalent -- and might shave off a fraction of a blink of an eye -- in a lot of the common cases. What follows is my rationale for believing this.
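To make the "need to see all their arguments" caveat concrete, here is a small sketch using < (my choice of example, not one from the question). With apply the whole chain is compared in one call; with reduce the boolean from the first comparison gets fed into the second one.

```clojure
;; apply hands every element to < at once, so the whole chain is compared.
(def chain-ok? (apply < [1 2 3]))        ; same as (< 1 2 3) => true

;; reduce calls (< 1 2) first, then (< true 3), which is not a numeric comparison.
(def reduce-outcome
  (try
    (reduce < [1 2 3])
    (catch ClassCastException _ :class-cast-exception)))
```

So the two are only interchangeable when folding pairwise gives the same answer as seeing everything at once.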

+ is itself implemented in terms of reduce for the variable-arity case (more than 2 arguments). Indeed, this seems like an immensely sensible "default" way to go for any variable-arity, associative function: reduce has the potential to perform some optimisations to speed things up -- perhaps through something like internal-reduce, a 1.2 novelty recently disabled in master, but hopefully to be reintroduced in the future -- which it would be silly to replicate in every function which might benefit from them in the vararg case. In such common cases, apply will just add a little overhead. (Note it's nothing to be really worried about.)

On the other hand, a complex function might take advantage of some optimisation opportunities which aren't general enough to be built into reduce; then apply would let you take advantage of those while reduce might actually slow you down. A good example of the latter scenario occurring in practice is provided by str: it uses a StringBuilder internally and will benefit significantly from the use of apply rather than reduce.
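A minimal sketch of the str case, with a made-up input vector named words. Both forms return the same string; the difference is only in how much intermediate work happens.

```clojure
(def words ["re" "du" "ce"])

;; One call to str with all the pieces: a single StringBuilder is filled once.
(def joined-with-apply (apply str words))

;; Pairwise concatenation: every step builds a new intermediate string.
(def joined-with-reduce (reduce str words))
```

On three short strings you won't notice anything; the gap should only show up on long inputs, and I'd measure before relying on it.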

So, I'd say use apply when in doubt; and if you happen to know that it's not buying you anything over reduce (and that this is unlikely to change very soon), feel free to use reduce to shave off that diminutive unnecessary overhead if you feel like it.

Newbies looking at this answer, be careful; they are not the same:

(apply hash-map [:a 5 :b 6])
;= {:a 5, :b 6}
(reduce hash-map [:a 5 :b 6])
;= {{{:a 5} :b} 6}
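If you do want to build the map with reduce, the reducing function has to take the accumulated map plus one key/value pair. A sketch of one way to do that (the names kvs and built-with-reduce are just for illustration):

```clojure
(def kvs [:a 5 :b 6])

;; Pair up keys and values, then fold each pair into an (initially empty) map.
(def built-with-reduce
  (reduce (fn [m [k v]] (assoc m k v))
          {}
          (partition 2 kvs)))
;; => {:a 5, :b 6}
```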

It doesn't make a difference in this case, because + is a special case that can apply to any number of arguments. Reduce is a way to apply a function that expects a fixed number of arguments (2) to an arbitrarily long list of arguments.

Opinions vary. In the greater Lisp world, reduce is definitely considered more idiomatic. First, there are the variadic issues already discussed. Also, some Common Lisp compilers will actually fail when apply is used on very long lists because of how they handle argument lists.

Amongst Clojurists in my circle, though, using apply in this case seems more common. I find it easier to grok and prefer it also.

I normally find myself preferring reduce when acting on any kind of collection - it performs well, and is a pretty useful function in general.

The main reason I would use apply is if the parameters mean different things in different positions, or if you have a couple of initial parameters but want to get the rest from a collection, e.g.

(apply + 1 2 other-number-list)
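To spell out what the leading parameters do, here is a sketch with an assumed value for other-number-list. The second form builds the full argument list by hand with list*, which gives the same call.

```clojure
(def other-number-list [3 4 5])

;; The leading arguments are prepended to the final collection...
(def via-leading-args (apply + 1 2 other-number-list))          ; 15

;; ...which amounts to building the full argument list first.
(def via-list* (apply + (list* 1 2 other-number-list)))         ; 15
```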

In this specific case I prefer reduce because it's more readable: when I read

(reduce + some-numbers)

I know immediately that you're turning a sequence into a value.

With apply I have to consider which function is being applied: "ah, it's the + function, so I'm getting... a single number". Slightly less straightforward.

When using a simple function like +, it really doesn't matter which one you use.

In general, the idea is that reduce is an accumulating operation. You present the current accumulation value and one new value to your accumulating function. The result of the function is the cumulative value for the next iteration. So, your iterations look like:

cum-val[i+1] = F( cum-val[i], input-val[i] )    ; please forgive the java-like syntax!
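You can watch those cumulative values directly with reductions, which returns every intermediate result instead of only the last one. A small sketch with F being + and an example input of my own:

```clojure
(def inputs [1 2 3 4 5])

;; Every intermediate cum-val, in order.
(def running-totals (reductions + inputs))   ; (1 3 6 10 15)

;; reduce keeps only the last one.
(def total (reduce + inputs))                ; 15
```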

For apply, the idea is that you are attempting to call a function expecting a number of scalar arguments, but they are currently in a collection and need to be pulled out. So, instead of saying:

(let [vals [val1 val2 val3]]
  (some-fn (vals 0) (vals 1) (vals 2)))

we can say:

(apply some-fn vals)

and it is converted to be equivalent to:

(some-fn val1 val2 val3)

So, using "apply" is like "removing the parentheses" around the sequence.
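A quick sketch of what goes wrong if you forget the apply, using max and a made-up vector of scores:

```clojure
(def scores [3 7 2])

;; Without apply, max receives ONE argument (the vector) and returns it unchanged.
(def without-apply (max scores))        ; [3 7 2]

;; With apply, the vector is spread into three arguments.
(def with-apply (apply max scores))     ; 7
```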

A bit late to the topic, but I did a simple experiment after reading this example. Here is the result from my REPL. I can't deduce anything from the output, but it seems some sort of caching kicks in between reduce and apply.

user=> (time (reduce + (range 1e3)))
"Elapsed time: 5.543 msecs"
499500
user=> (time (apply + (range 1e3))) 
"Elapsed time: 5.263 msecs"
499500
user=> (time (apply + (range 1e4)))
"Elapsed time: 19.721 msecs"
49995000
user=> (time (reduce + (range 1e4)))
"Elapsed time: 1.409 msecs"
49995000
user=> (time (reduce + (range 1e5)))
"Elapsed time: 17.524 msecs"
4999950000
user=> (time (apply + (range 1e5)))
"Elapsed time: 11.548 msecs"
4999950000

Looking at the source code of Clojure's reduce, it's a pretty clean recursion with internal-reduce; I didn't find anything on the implementation of apply, though. Clojure's implementation of + internally invokes reduce when called through apply, which I guess is cached by the REPL, and that seems to explain the 4th call. Can someone clarify what's really happening here?
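I can't say for sure what produced these numbers, but a single time call on a cold JVM tends to be noisy (JIT compilation, class loading), so one assumption worth testing is that warm-up explains the swings. A rough sketch of how the experiment could be repeated, where warm-then-time is a hypothetical helper and 1000 warm-up runs is an arbitrary choice:

```clojure
(defn warm-then-time
  "Hypothetical helper: calls f warm-ups times untimed, then times one more call.
  Returns the result of the timed call."
  [warm-ups f]
  (dotimes [_ warm-ups] (f))
  (time (f)))

;; Same expressions as in the transcript, each measured only after warming up.
(def reduce-sum (warm-then-time 1000 #(reduce + (range 1e4))))
(def apply-sum  (warm-then-time 1000 #(apply + (range 1e4))))
```

If the two timings end up close to each other after warm-up, the earlier differences were probably measurement noise rather than anything reduce or apply did.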

The beauty of apply is that the given function (+ in this case) can be applied to an argument list formed by prepending the intervening arguments to a final collection. reduce is an abstraction for processing the items of a collection, applying the function to each in turn, and it doesn't handle the variable-args case.

(apply + 1 2 3 [3 4])
=> 13
(reduce + 1 2 3 [3 4])
ArityException Wrong number of args (5) passed to: core/reduce  clojure.lang.AFn.throwArity (AFn.java:429)
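For completeness, a sketch of how the same sum could still be written with reduce, since it accepts one optional initial value before the collection:

```clojure
;; reduce takes at most one extra value, the initial accumulator,
;; so fold the leading numbers into it first...
(def with-init (reduce + (+ 1 2 3) [3 4]))           ; 13

;; ...or join everything into one collection before reducing.
(def with-concat (reduce + (concat [1 2 3] [3 4])))  ; 13
```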

A bit late, but...

In this case, there is not a big difference. But in general they are not equivalent. Furthermore, reduce can be more performant. Why?

reduce checks whether a collection or type implements the IReduce interface. That means a type knows how to provide its values to the reducing function in the most performant way. A reduce can also be stopped prematurely by returning a Reduced value.
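Here is a small sketch of the early-stop part. sum-until-over is a hypothetical helper written for this example; the point is only the call to reduced.

```clojure
(defn sum-until-over
  "Hypothetical helper: adds up xs, stopping as soon as the running total exceeds limit."
  [limit xs]
  (reduce (fn [acc x]
            (if (> acc limit)
              (reduced acc)      ; wrap the value to stop the reduction here
              (+ acc x)))
          0
          xs))

;; (range) is infinite, so this only returns because reduced cuts the walk short.
(def stopped-at (sum-until-over 10 (range)))   ; 15
```

There is no equivalent with apply: it has to realize the whole argument sequence before the function runs.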

apply, on the other hand, goes through applyToHelper, which dispatches to the right arity by counting the args and unpacking the values from the collection.
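From the caller's side, the effect of that dispatch looks like this. describe-args is a hypothetical multi-arity function made up for the sketch; it says nothing about the Java internals, only about which body ends up running.

```clojure
(defn describe-args
  "Hypothetical multi-arity function used only to show which body apply reaches."
  ([] :no-args)
  ([a] :one-arg)
  ([a b] :two-args)
  ([a b & more] :three-or-more))

;; The number of elements in the collection decides which arity is called.
(def picked
  (mapv #(apply describe-args %) [[] [1] [1 2] [1 2 3 4]]))
;; => [:no-args :one-arg :two-args :three-or-more]
```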

Is it a big performance impact? Probably not. My opinion is the same as what others have already pointed out: use reduce if you want to semantically "reduce" a collection to a single value. Otherwise use apply.