Update: I've added an answer that describes my final solution (hint: the single Expr
data type wasn't sufficient).
I'm writing an evaluator for a little expression language, 开发者_运维知识库but I'm stuck on the LetRec
construct.
This is the language:
type Var = String
type Binds = [(Var, Expr)]
data Expr
= Var Var
| Lam Var Expr
| App Expr Expr
| Con Int
| Sub Expr Expr
| If Expr Expr Expr
| Let Var Expr Expr
| LetRec Binds Expr
deriving (Show, Eq)
And this this the evaluator so far:
data Value
= ValInt Int
| ValFun Env Var Expr
deriving (Show, Eq)
type Env = [(Var, Value)]
eval :: Env -> Expr -> Either String Value
eval env (Var x) = maybe (throwError $ x ++ " not found")
return
(lookup x env)
eval env (Lam x e) = return $ ValFun env x e
eval env (App e1 e2) = do
v1 <- eval env e1
v2 <- eval env e2
case v1 of
ValFun env1 x e -> eval ((x, v2):env1) e
_ -> throwError "First arg to App not a function"
eval _ (Con x) = return $ ValInt x
eval env (Sub e1 e2) = do
v1 <- eval env e1
v2 <- eval env e2
case (v1, v2) of
(ValInt x, ValInt y) -> return $ ValInt (x - y)
_ -> throwError "Both args to Sub must be ints"
eval env (If p t f) = do
v1 <- eval env p
case v1 of
ValInt x -> if x /= 0
then eval env t
else eval env f
_ -> throwError "First arg of If must be an int"
eval env (Let x e1 e2) = do
v1 <- eval env e1
eval ((x, v1):env) e2
eval env (LetRec bs e) = do
env' <- evalBinds
eval env' e
where
evalBinds = mfix $ \env' -> do
env'' <- mapM (\(x, e') -> eval env' e' >>= \v -> return (x, v)) bs
return $ nub (env'' ++ env)
This is my test function I want to evaluate:
test3 :: Expr
test3 = LetRec [ ("even", Lam "x" (If (Var "x")
(Var "odd" `App` (Var "x" `Sub` Con 1))
(Con 1)
))
, ("odd", Lam "x" (If (Var "x")
(Var "even" `App` (Var "x" `Sub` Con 1))
(Con 0)
))
]
(Var "even" `App` Con 5)
EDIT:
Based on Travis' answer and Luke's comment, I've updated my code to use the MonadFix instance for the Error monad. The previous example works fine now! However, the example bellow doesn't work correctly:
test4 :: Expr
test4 = LetRec [ ("x", Con 3)
, ("y", Var "x")
]
(Con 0)
When evaluating this, the evaluator loops, and nothing happens. I'm guessing I've made something a bit too strict here, but I'm not sure what it is. Am I violating one of the MonadFix laws?
When Haskell throws a fit, that's usually an indication that you have not thought clearly about a core issue of your problem. In this case, the question is: which evaluation model do you want to use for your language? Call-by-value or call-by-need?
Your representation of environments as [(Var,Value)]
suggests that you want to use call-by-value, since every Expr
is evaluated to a Value
right away before storing it in the environment. But letrec
does not go well with that, and your second example shows!
Furthermore, note that the evaluation model of the host language (Haskell) will interfere with the evaluation model of the language you want to implement; in fact, that's what you are currently making use of for your examples: despite their purpose, your Value
s are not evaluated to weak head normal form.
Unless you have a clear picture of the evaluation model of your little expression language, you won't make much progress on letrec
or on the error checking facilities.
Edit:
For an example specification of letrec
in a call-by-value language, have a look at the Ocaml Manual. On the simplest level, they only allow right-hand sides that are lambda expressions, i.e. things that are syntactically known to be values.
Maybe I'm missing something, but doesn't the following work?
eval env (LetRec bs ex) = eval env' ex
where
env' = env ++ map (\(v, e) -> (v, eval env' e)) bs
For your updated version: What about the following approach? It works as desired on your test case, and doesn't throw away errors in LetRec
expressions:
data Value
= ValInt Int
| ValFun EnvWithError Var Expr
deriving (Show, Eq)
type Env = [(Var, Value)]
type EnvWithError = [(Var, Either String Value)]
eval :: Env -> Expr -> Either String Value
eval = eval' . map (second Right)
where
eval' :: EnvWithError -> Expr -> Either String Value
eval' env (Var x) = maybe (throwError $ x ++ " not found")
(join . return)
(lookup x env)
eval' env (Lam x e) = return $ ValFun env x e
eval' env (App e1 e2) = do
v1 <- eval' env e1
v2 <- eval' env e2
case v1 of
ValFun env1 x e -> eval' ((x, Right v2):env1) e
_ -> throwError "First arg to App not a function"
eval' _ (Con x) = return $ ValInt x
eval' env (Sub e1 e2) = do
v1 <- eval' env e1
v2 <- eval' env e2
case (v1, v2) of
(ValInt x, ValInt y) -> return $ ValInt (x - y)
_ -> throwError "Both args to Sub must be ints"
eval' env (If p t f) = do
v1 <- eval' env p
case v1 of
ValInt x -> if x /= 0
then eval' env t
else eval' env f
_ -> throwError "First arg of If must be an int"
eval' env (Let x e1 e2) = do
v1 <- eval' env e1
eval' ((x, Right v1):env) e2
eval' env (LetRec bs ex) = eval' env' ex
where
env' = env ++ map (\(v, e) -> (v, eval' env' e)) bs
Answering my own question; I wanted to share the final solution I came up with.
As Heinrich correctly pointed out, I didn't really think through the impact the evaluation order has.
In a strict (call-by-value) language, an expression that is already a value (weak head normal form) is different from an expression that still needs some evaluation. Once I encoded this distinction in my data type, everything fell into place:
type Var = String
type Binds = [(Var, Val)]
data Val
= Con Int
| Lam Var Expr
deriving (Show, Eq)
data Expr
= Val Val
| Var Var
| App Expr Expr
| Sub Expr Expr
| If Expr Expr Expr
| Let Var Expr Expr
| LetRec Binds Expr
deriving (Show, Eq)
The only difference with my my original Expr
data type, is that I pulled out two constructors (Con
and Lam
) into their own data type Val
. The Expr
data type has a new constructor Val
, this represents the fact that a value is also a valid expression.
With values in their own data type, they can be handled separately from other expression, for example letrec
bindings can only contain values, no other expressions.
This distinction is also made in other strict languages like C, where only functions and constants can be defined in global scope.
See the complete code for the updated evaluator function.
精彩评论