test :: [String] -> [String]
test = foldr step []
where step x ys
| elem x ys = x : ys
| otherwise = ys
I am trying to build a new list consisting of all the distinct strings being input. My test data is:
test ["one", "one", "two", "two", "three"]
expected result:
["one", "two", "three"]
I am new to Haskell, and I am sure that I am missing something very fundamental and obvious, but have run out of ways to explore this. Could you provide pointers to where my thinking is deficient?
The actual response is []
. It seems that the first guard condition is never met (if I replace it with True
, the original list is replicated), so the output list is never built.
My understanding was that the fold would accumulate the result of step on each item of the list, adding it to the empty list. I anticipated that step would test each item for its inclusion in the output list (the first element tested not being开发者_C百科 there) and would add anything that was not already included to the output list. Obviously not :-)
Your reasoning is correct: you just need to switch = x : ys
and = ys
so that you add the x
when it's not an element of ys
. Also, Data.List.nub
does this exact thing.
Think about it: your code is saying "when x is in the remainder, prepend x to the result", i.e. creating a duplicate. You just need to change it to "when x is not in the remainder, prepend x to the result" and you get the correct function.
This function differs from Data.List.nub
in an important way: this function is more strict. Thus:
test [1..] = _|_ -- infinite loop (try it)
nub [1..] = [1..]
nub
gives the answer correctly for infinite lists -- this means that it doesn't need the whole list to start computing results, and thus it is a nice player in the stream processing game.
The reason it is strict is because elem
is strict: it searches the whole list (presuming it doesn't find a match) before it returns a result. You could write that like this:
nub :: (Eq a) => [a] -> [a]
nub = go []
where
go seen [] = []
go seen (x:xs) | x `elem` seen = go seen xs
| otherwise = x : go (x:seen) xs
Notice how seen
grows like the output so far, whereas yours grows like the remainder of the output. The former is always finite (starting at []
and adding one at a time), whereas the latter may be infinite (eg. [1..]
). So this variant can yield elements more lazily.
This would be faster (O(n log n) instead of O(n^2)) if you used a Data.Set
instead of a list for seen
. But it adds an Ord
constraint.
精彩评论