
Haskell: Prefer pattern-matching or member access?

Developer https://www.devze.com 2023-03-25 13:16 Source: Internet

Suppose I have a Vector datatype defined as follows:

data Vector = Vector { x :: Double
                     , y :: Double
                     , z :: Double
                     }

Would it be more usual to define functions against it using member access:

vecAddA v w
    = Vector (x v + x w)
             (y v + y w)
             (z v + z w)

Or using pattern-matching:

vecAddB (Vector vx vy vz) (Vector wx wy wz)
    = Vector (vx + wx)
             (vy + wy)
             (vz + wz)

(Apologies if I've got any of the terminology incorrect).


I would normally use pattern matching, especially since you're using all of the constructor's arguments and there aren't a lot of them. Also, in this example it's not an issue, but consider the following:

data Foo = A {a :: Int} | B {b :: String}

fun x = a x + 1

If you use pattern matching to do work on the Foo type, you're safe; it's not possible to access a member that doesn't exist. If you use accessor functions on the other hand, some operations such as calling fun (B "hi!") here will result in a runtime error.

EDIT: It is of course quite possible to forget to match on some constructor, but pattern matching makes it explicit that the result depends on which constructor is used (and you can tell the compiler to detect and warn you about incomplete patterns), whereas using an accessor function suggests that any constructor will do, IMO.

Accessors are best saved for cases when you want to get at just one or a few of the constructor's (potentially many) arguments and you know that it's safe to use them (no risk of using an accessor on the wrong constructor, as in the example.)
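To make the Foo example concrete, here is a minimal sketch contrasting the partial accessor with a total pattern match. The names funPartial and funTotal, and the choice of Maybe Int as the result type, are assumptions made for illustration and are not part of the original answer.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

data Foo = A { a :: Int } | B { b :: String }

-- Partial: the accessor 'a' only exists for the A constructor, so
-- 'funPartial (B "hi!")' compiles but fails when the result is evaluated.
funPartial :: Foo -> Int
funPartial foo = a foo + 1

-- Total: each constructor is handled explicitly. If the B case were
-- removed, -Wincomplete-patterns would warn at compile time.
-- (Returning Maybe is an illustrative choice, not the only option.)
funTotal :: Foo -> Maybe Int
funTotal (A n) = Just (n + 1)
funTotal (B _) = Nothing
```

Loaded into GHCi, funTotal (B "hi!") evaluates to Nothing, while funPartial (B "hi!") raises a runtime exception about the missing field.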


Another minor "real world" argument: In general, it isn't a good idea to give record fields such short names, since short names like x and y often end up being used for local variables.

So the "fair" comparison here would be:

vecAddA v w 
  = Vector (vecX v + vecX w) (vecY v + vecY w) (vecZ v + vecZ w)
vecAddB (Vector vx vy vz) (Vector wx wy wz) 
  = Vector (vx + wx) (vy + wy) (vz + wz)

I think pattern matching wins out in most cases of this type. Some notable exceptions:

  • You only need to access (or change!) one or two fields in a larger record
  • You want to remain flexible to change the record later, such as by adding more fields.
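A rough sketch of both exceptions, using a hypothetical Particle record (the type, its field names, and both functions are invented here for illustration). Record update syntax touches only the field it names, whereas the positional pattern has to mention every field and would need editing whenever a field is added.

```haskell
-- Hypothetical larger record, used only for this illustration.
data Particle = Particle
  { pName :: String
  , pMass :: Double
  , pPosX :: Double
  , pPosY :: Double
  , pPosZ :: Double
  }

-- Record update: names only the field being changed and keeps
-- compiling unchanged if more fields are added to Particle later.
setMass :: Double -> Particle -> Particle
setMass m p = p { pMass = m }

-- Positional pattern: must bind every field and rebuild the whole
-- value, so adding a field to Particle breaks this definition.
setMassByPattern :: Double -> Particle -> Particle
setMassByPattern m (Particle n _ px py pz) = Particle n m px py pz
```

Both functions compute the same result; the difference is only in how much of the record's layout each definition depends on.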


This is an aesthetic preference since the two are semantically equivalent. Well, I suppose in a naive compiler the first one would be slower because of the function calls, but I have a hard time believing that would not be optimized away in real life.

Still, with only three elements in the record, since you're using all three anyway and there is presumably some significance to their order, I would use the second one. A second (albeit weaker) argument is that this way you're using the order for both composition and decomposition, rather than a mixture of order and field access.


(Alert, may be wrong. I am still a Haskell newbie, but here's my understanding)

One thing that other people have not mentioned is that pattern matching will make the function "strict" in its argument. (http://www.haskell.org/haskellwiki/Lazy_vs._non-strict)

To choose which pattern to use, the program must reduce the argument to WHNF as soon as the function body is entered, whereas a record-syntax accessor only evaluates the argument inside the function, at the point where the field's value is actually demanded.

I can't really give any concrete examples (still being a newbie) but this can have performance implications where huge piles of "thunks" can build up in recursive, non-strict functions. (That is to say, for simple functions like extracting values, there should be no performance difference).

(Concrete examples very much welcome)

In short

f (Just x) = x

is actually (using BangPatterns)

f !jx = fromJust jx

Edit: The above is not a good example of strictness, because both are actually strict from definition (f bottom = bottom), just to illustrate what I meant from the performance side.


As kizzx2 pointed out, there is a subtle difference in strictness between vecAddA and vecAddB:

vecAddA ⊥ ⊥ = Vector ⊥ ⊥ ⊥
vecAddB ⊥ ⊥ = ⊥

To get the same semantics when using pattern matching, one would have to use irrefutable patterns.

vecAddB' ~(Vector vx vy vz) ~(Vector wx wy wz)
    = Vector (vx + wx)
             (vy + wy)
             (vz + wz)
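The difference can be observed directly. The sketch below repeats the three definitions with the original lazy fields and adds a helper, reachesWHNF, which is a hypothetical name introduced here: it evaluates a Vector to weak head normal form and reports whether that succeeded without an exception.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Lazy (non-strict) fields, as in the original question.
data Vector = Vector { x :: Double, y :: Double, z :: Double }

vecAddA :: Vector -> Vector -> Vector
vecAddA v w = Vector (x v + x w) (y v + y w) (z v + z w)

vecAddB :: Vector -> Vector -> Vector
vecAddB (Vector vx vy vz) (Vector wx wy wz) =
  Vector (vx + wx) (vy + wy) (vz + wz)

vecAddB' :: Vector -> Vector -> Vector
vecAddB' ~(Vector vx vy vz) ~(Vector wx wy wz) =
  Vector (vx + wx) (vy + wy) (vz + wz)

-- Hypothetical helper: force a Vector to WHNF (outermost constructor
-- only) and report whether that worked without throwing.
reachesWHNF :: Vector -> IO Bool
reachesWHNF v = do
  r <- try (evaluate v) :: IO (Either SomeException Vector)
  pure (either (const False) (const True) r)
```

In GHCi, reachesWHNF (vecAddA undefined undefined) and reachesWHNF (vecAddB' undefined undefined) should both report True, because the Vector constructor is produced before any argument is inspected, while reachesWHNF (vecAddB undefined undefined) should report False, because the pattern match forces the undefined argument first.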

However, in this case, the fields of Vector should probably be strict to begin with for efficiency:

data Vector = Vector { x :: !Double
                     , y :: !Double
                     , z :: !Double
                     }

With strict fields, vecAddA and vecAddB are semantically equivalent.
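As a quick check of that claim, here is the same kind of experiment with strict fields. The helper reachesWHNF is again a hypothetical name used only for this sketch.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Strict fields: building a Vector forces all three components.
data Vector = Vector { x :: !Double, y :: !Double, z :: !Double }

vecAddA :: Vector -> Vector -> Vector
vecAddA v w = Vector (x v + x w) (y v + y w) (z v + z w)

vecAddB :: Vector -> Vector -> Vector
vecAddB (Vector vx vy vz) (Vector wx wy wz) =
  Vector (vx + wx) (vy + wy) (vz + wz)

-- Hypothetical helper: force a Vector to WHNF and report whether
-- that worked without throwing.
reachesWHNF :: Vector -> IO Bool
reachesWHNF v = do
  r <- try (evaluate v) :: IO (Either SomeException Vector)
  pure (either (const False) (const True) r)
```

With this definition both reachesWHNF (vecAddA undefined undefined) and reachesWHNF (vecAddB undefined undefined) should report False: the strict constructor forces x v + x w and the other sums, so the accessor version now fails on undefined arguments just as the pattern-matching version does.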


The Hackage package vect solves both these problems by allowing matching like f (Vec3 x y z) and indexing like:

get1 :: Vec3 -> Float
get1 v = _1 v

Look up the HasCoordinates class.

http://hackage.haskell.org/packages/archive/vect/0.4.7/doc/html/Data-Vect-Float-Base.html
