Function syntax puzzler in scalaz_问答_开发者_运维开发者技术经验分享

Following watching Nick Partidge's presentation on deriving scalaz, I got to looking at this example, which is just awesome:

import scalaz._
import Scalaz._
def even(x: Int) : Validation[NonEmptyList[String], Int] 
    = if (x % 2 ==0) x.success else "not even: %d".format(x).wrapNel.fail

println( even(3) <|*|> even(5) ) //prints: Failure(NonEmptyList(not even: 3, not even: 5))

I was trying to understand what the <|*|> method was doing, here is the source code:

def <|*|>[B](b: M[B])(implicit t: Functor[M], a: Apply[M]): M[(A, B)] 
    = <**>(b, (_: A, _: B))

OK, that is fairly confusing (!) - but it references the <**> method, which is declared thus:

def <**>[B, C](b: M[B], z: (A, B) => C)(implicit t: Functor[M], a: Apply[M]): M[C] 
    = a(t.fmap(value, z.curried), b)

So I have a few questions:

How come the method appea开发者_开发技巧rs to take a higher-kinded type of one type parameter (M[B]) but can get passed a Validation (which has two type paremeters)?
The syntax (_: A, _: B) defines the function (A, B) => Pair[A,B] which the 2nd method expects: what is happening to the Tuple2/Pair in the failure case? There's no tuple in sight!

Type Constructors as Type Parameters

M is a type parameter to one of Scalaz's main pimps, MA, that represents the Type Constructor (aka Higher Kinded Type) of the pimped value. This type constructor is used to look up the appropriate instances of Functor and Apply, which are implicit requirements to the method <**>.

trait MA[M[_], A] {
   val value: M[A]
   def <**>[B, C](b: M[B], z: (A, B) => C)(implicit t: Functor[M], a: Apply[M]): M[C] = ...
}

What is a Type Constructor?

From the Scala Language Reference:

We distinguish between first-order types and type constructors, which take type parameters and yield types. A subset of first-order types called value types represents sets of (first-class) values. Value types are either concrete or abstract. Every concrete value type can be represented as a class type, i.e. a type designator (§3.2.3) that refers to a class1 (§5.3), or as a compound type (§3.2.7) representing an intersection of types, possibly with a refinement (§3.2.7) that further constrains the types of itsmembers. Abstract value types are introduced by type parameters (§4.4) and abstract type bindings (§4.3). Parentheses in types are used for grouping. We assume that objects and packages also implicitly define a class (of the same name as the object or package, but inaccessible to user programs).

Non-value types capture properties of identifiers that are not values (§3.3). For example, a type constructor (§3.3.3) does not directly specify the type of values. However, when a type constructor is applied to the correct type arguments, it yields a first-order type, which may be a value type. Non-value types are expressed indirectly in Scala. E.g., a method type is described by writing down a method signature, which in itself is not a real type, although it gives rise to a corresponding function type (§3.3.1). Type constructors are another example, as one can write type Swap[m[_, _], a,b] = m[b, a], but there is no syntax to write the corresponding anonymous type function directly.

List is a type constructor. You can apply the type Int to get a Value Type, List[Int], which can classify a value. Other type constructors take more than one parameter.

The trait scalaz.MA requires that it's first type parameter must be a type constructor that takes a single type to return a value type, with the syntax trait MA[M[_], A] {}. The type parameter definition describes the shape of the type constructor, which is referred to as its Kind. List is said to have the kind '* -> *.

Partial Application of Types

But how can MA wrap a values of type Validation[X, Y]? The type Validation has a kind (* *) -> *, and could only be passed as a type argument to a type parameter declared like M[_, _].

This implicit conversion in object Scalaz converts a value of type Validation[X, Y] to a MA:

object Scalaz {
    implicit def ValidationMA[A, E](v: Validation[E, A]): MA[PartialApply1Of2[Validation, E]#Apply, A] = ma[PartialApply1Of2[Validation, E]#Apply, A](v)
}

Which in turn uses a trick with a type alias in PartialApply1Of2 to partially apply the type constructor Validation, fixing the type of the errors, but leaving the type of the success unapplied.

PartialApply1Of2[Validation, E]#Apply would be better written as [X] => Validation[E, X]. I recently proposed to add such a syntax to Scala, it might happen in 2.9.

Think of this as a type level equivalent of this:

def validation[A, B](a: A, b: B) = ...
def partialApply1Of2[A, B C](f: (A, B) => C, a: A): (B => C) = (b: B) => f(a, b)

This lets you combine Validation[String, Int] with a Validation[String, Boolean], because the both share the type constructor [A] Validation[String, A].

Applicative Functors

<**> demands the the type constructor M must have associated instances of Apply and Functor. This constitutes an Applicative Functor, which, like a Monad, is a way to structure a computation through some effect. In this case the effect is that the sub-computations can fail (and when they do, we accumulate the failures).

The container Validation[NonEmptyList[String], A] can wrap a pure value of type A in this 'effect'. The <**> operator takes two effectful values, and a pure function, and combines them with the Applicative Functor instance for that container.

Here's how it works for the Option applicative functor. The 'effect' here is the possibility of failure.

val os: Option[String] = Some("a")
val oi: Option[Int] = Some(2)

val result1 = (os <**> oi) { (s: String, i: Int) => s * i }
assert(result1 == Some("aa"))

val result2 = (os <**> (None: Option[Int])) { (s: String, i: Int) => s * i }
assert(result2 == None)

In both cases, there is a pure function of type (String, Int) => String, being applied to effectful arguments. Notice that the result is wrapped in the same effect (or container, if you like), as the arguments.

You can use the same pattern across a multitude of containers that have an associated Applicative Functor. All Monads are automatically Applicative Functors, but there are even more, like ZipStream.

Option and [A]Validation[X, A] are both Monads, so you could also used Bind (aka flatMap):

val result3 = oi flatMap { i => os map { s => s * i } }
val result4 = for {i <- oi; s <- os} yield s * i

Tupling with `<|**|>`

<|**|> is really similar to <**>, but it provides the pure function for you to simply build a Tuple2 from the results. (_: A, _ B) is a shorthand for (a: A, b: B) => Tuple2(a, b)

And beyond

Here's our bundled examples for Applicative and Validation. I used a slightly different syntax to use the Applicative Functor, (fa ⊛ fb ⊛ fc ⊛ fd) {(a, b, c, d) => .... }

UPDATE: But what happens in the Failure Case?

what is happening to the Tuple2/Pair in the failure case?

If any of the sub-computations fails, the provided function is never run. It only is run if all sub-computations (in this case, the two arguments passed to <**>) are successful. If so, it combines these into a Success. Where is this logic? This defines the Apply instance for [A] Validation[X, A]. We require that the type X must have a Semigroup avaiable, which is the strategy for combining the individual errors, each of type X, into an aggregated error of the same type. If you choose String as your error type, the Semigroup[String] concatenates the strings; if you choose NonEmptyList[String], the error(s) from each step are concatenated into a longer NonEmptyList of errors. This concatenation happens below when two Failures are combined, using the ⊹ operator (which expands with implicits to, for example, Scalaz.IdentityTo(e1).⊹(e2)(Semigroup.NonEmptyListSemigroup(Semigroup.StringSemigroup)).

implicit def ValidationApply[X: Semigroup]: Apply[PartialApply1Of2[Validation, X]#Apply] = new Apply[PartialApply1Of2[Validation, X]#Apply] {
  def apply[A, B](f: Validation[X, A => B], a: Validation[X, A]) = (f, a) match {
    case (Success(f), Success(a)) => success(f(a))
    case (Success(_), Failure(e)) => failure(e)
    case (Failure(e), Success(_)) => failure(e)
    case (Failure(e1), Failure(e2)) => failure(e1 ⊹ e2)
  }
}

Monad or Applicative, how shall I choose?

Still reading? (Yes. Ed)

I've shown that sub-computations based on Option or [A] Validation[E, A] can be combined with either Apply or with Bind. When would you choose one over the other?

When you use Apply, the structure of the computation is fixed. All sub-computations will be executed; the results of one can't influence the the others. Only the 'pure' function has an overview of what happened. Monadic computations, on the other hand, allow the first sub-computation to influence the later ones.

If we used a Monadic validation structure, the first failure would short-circuit the entire validation, as there would be no Success value to feed into the subsequent validation. However, we are happy for the sub-validations to be independent, so we can combine them through the Applicative, and collect all the failures we encounter. The weakness of Applicative Functors has become a strength!