开发者

Traversing and filtering a tree in haskell

开发者 https://www.devze.com 2022-12-21 02:42 出处:网络
I am pretty new to Haskell (still working on totally understanding monads).I have a problem where I have a tree like structure

I am pretty new to Haskell (still working on totally understanding monads). I have a problem where I have a tree like structure

type Tree = [DataA]

data DataA =  DataA1 [DataB] 
            | DataA2 String 
            | DataA3 String [DataA]
               deriving Show

data DataB =  DataB1 [DataA] 
            | DataB2 String 
            | DataB3 String [DataB]
               deriving Show

What I would like to do is be able to traverse this and generate a new tree with a filter. For example I may want to change all DataB2 in the tree t开发者_高级运维o "foo".

I have seen examples of tree when they are in the same data section, and the constructors are similar.

In the python world, I would just traverse the list, match to whatever I needed, and replace the value.

In Haskell I am guessing that I need to be able to copy my tree, but how do you deal with lists buried in constructors and different data types?


You can use generic programming for this.

One such generic programming library is called Scrap Your Boilerplate. At the very top of your module, enable Scrap Your Boilerplate by writing:

{-# LANGUAGE DeriveDataTypeable #-}

Import module Data.Generics. Then besides Show, also derive Typeable and Data instances for your datatypes.

Now you can write the function you requested like this:

toFoo :: Data a => a -> a
toFoo = everywhere (mkT step)
  where
    step (DataA2 _)  = DataA2 "foo"
    step x           = x

That's all you need to do to make this work. For example, when you call toFoo [DataA1 [], DataA2 "hi", DataA3 "yo" []], the answer is [DataA1 [],DataA2 "foo",DataA3 "yo" []].

Hope this helps!


I don't know a general answer to your question. The data type is quite contrived, and I would probably choose to implement a fold rather than a filter. Here, however, are some filter functions that can update strings in all four positions. I have put the code through the compiler, so it typechecks, but I haven't run it.

type SFilter = String -> String

-- to filter a tree, say how A2, A3, B2, and B3 should be changed

type Filter tree = SFilter -> SFilter -> SFilter -> SFilter -> (tree -> tree)

afilter :: Filter DataA
bfilter :: Filter DataB
tfilter :: Filter Tree

tfilter a2 a3 b2 b3 = map (afilter a2 a3 b2 b3)
afilter a2 a3 b2 b3 = fil
  where fil (DataA1 bs)   = DataA1 $ map (bfilter a2 a3 b2 b3) bs
        fil (DataA2 s)    = DataA2 (a2 s)
        fil (DataA3 s as) = DataA3 (a3 s) (map fil as)

bfilter a2 a3 b2 b3 = fil
  where fil (DataB1 as)   = DataB1 $ map (afilter a2 a3 b2 b3) as
        fil (DataB2 s)    = DataB2 (b2 s)
        fil (DataB3 s bs) = DataB3 (b3 s) (map fil bs)


You want to traverse the whole data structure and change some items here and there. This is usually done by a function that takes the data structure as a parameter and returns the new, changed version of the structure.

For every case of input this function defines how the new value that is returned should look like.

The basic function that modifies a Tree (which is just a list of DataA values) probably should just return a new list of modified values. If we defer those modifications of the values to a modifyA function, the main modification function looks like this:

-- # function to change a |Tree|
mutate :: Tree -> Tree
mutate as = map mutateA as
     -- # (The |map| function applies the |mutateA| function to every
     -- #  element of |as|, creating a list of all the return values)

Now the mutateA function needs to be defined to change all possible DataA values, and best it is accompanied by a mutateB function that handles DataB values.

These functions look at the different possible cases of values and return the appropriate new values:

-- # function to change |DataA| items
mutateA :: DataA -> DataA
     -- # A |DataA1| is a |DataA1| with modified values
mutateA (DataA1 bs)   = DataA1 (map mutateB bs)
     -- # A |DataA3| is a |DataA3| with modified values
mutateA (DataA3 s as) = DataA3 s (map mutateA as)
     -- # In the remaining case(s) the value stays the same
mutateA d             = d

-- # function to change |DataB| items
mutateB :: DataB -> DataB
mutateB (DataB1 as) = DataB1 (map mutateA as)
mutateB (DataB3 s bs) = DataB3 s (map mutateB bs)
     -- # Here comes a real change
mutateB (DataB2 _)  = DataB2 "foo"

This way for every element in the tree a new element is computed, where all the DataB2 values anywhere in the tree are replaced by "foo".

It's relatively verbose because you have five different cases that contain a list of values that needs to be walked through, but that is not specific to Haskell. In an imperative language you would usually have five for loops in place of the five calls to map.

Maybe you could simplify your data structure to reduce this "overhead". This of course depends on your actual use case, but maybe, for example, you don't need the Data2 cases: Is there a difference between DataA2 "abc" and DataA3 "abc" []?


You may want to take a look at the multirec library for working with mutually recursive data types. I've not used it, but from what you've described it sounds like it's aimed at precisely the kind of problem that you're working with. It uses generic programming like the other answers here have suggested, but might save you the time of implementing everything yourself.

0

精彩评论

暂无评论...
验证码 换一张
取 消