开发者

Haskell Space Overflow

开发者 https://www.devze.com 2023-03-29 21:30 出处:网络
I\'ve compiled this program and am trying to run it. import Data.List import Data.Ord import qualified Data.MemoCombinators as Memo

I've compiled this program and am trying to run it.

import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo

collatzLength :: Int -> Int
collatzLength = Memo.arrayRange (1, 1000000) collatzLength'
  where
    collatzLength' 1 = 1
    collatzLength' n | odd n  = 1 + collatzLength (3 * n + 1)
                     | even n = 1 + collatzLength (n `quot` 2)

main = print $ maximumBy (comparing fst) $ [(collatzLength n, n) | n <- [1..1000000]]

I'm getting the following from GHC

Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksiz开发者_运维知识库e -RTS' to increase it.

I assume this is one of the "space overflow" things I've been hearing about. (I'm pretty new to Haskell.) How do I fix it? Do I have to rewrite collatzLength to be tail recursive?


As the author of the code in question, I am now slightly embarrassed because it has not one but two possible stack overflow bugs.

  1. It uses Int. On a 32-bit system this will overflow, as the Collatz sequence can go quite a bit higher than the starting number. This overflow can cause infinite recursion as the function jumps back and forth between negative and positive values.

    In the case of numbers between one and a million, the worst starting point is 704511, which goes as high as 56,991,483,520 before coming back down towards 1. This is well outside the 32-bit range.

  2. It uses maximumBy. This function is not strict, so it will cause a stack overflow when used on long lists. One million elements is more than enough for this to happen with the default stack size. It still works with optimizations enabled, though, due to the strictness analysis performed by GHC.

    The solution is to use a strict version. Since none is available in the standard libraries, we can use the strict left fold ourselves.

Here is an updated version which should (hopefully) be stack overflow-free, even without optimizations.

import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo

collatzLength :: Integer -> Integer
collatzLength = Memo.arrayRange (1,1000000) collatzLength'
  where
    collatzLength' 1 = 1
    collatzLength' n | odd n  = 1 + collatzLength (3 * n + 1)
                     | even n = 1 + collatzLength (n `quot` 2)

main = print $ foldl1' max $ [(collatzLength n, n) | n <- [1..1000000]]


Here's a shorter program that fails in the same way:

main = print (maximum [0..1000000])

Yep.

$ ghc --make harmless.hs && ./harmless
[1 of 1] Compiling Main             ( harmless.hs, harmless.o )
Linking harmless ...
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

With -O2 it works. What do I make of it? I don't know :( These space mysteries are a serious gotcha.

Edit:

Thx to hammar for pointing out the culprit.

Changing your program to use

maximum' = foldl1' max

Makes it work without -O2. The implementation of Prelude's maximum is lazy and so doesn't quite work for long lists without compiler magic dust.


I think it's most likely that you're hitting integer overflow with some of the Collatz sequences, and then ending up in an "artificial" cycle that contains overflows but never hits 1. That would produce an infinite recursion.

Remember that some Collatz sequences get very much larger than their starting number before they finally (?) end up at 1.

Try to see if it fixes your problem to use Integer instead of Int.


Use the optimizer (via the -O2 flag) any time you are concerned about performance. GHC's optimizations are hugely important not just to run time but to stack use. I've tested this with GHC 7.2 and optimization takes care of your issue.

EDIT: In addtion, if you're on a 32 bit machine be sure to use Int64 or Word64. You'll overflow the size of a 32 bit int and cause non-termination otherwise (thanks to Henning for this, upvote his answer).

0

精彩评论

暂无评论...
验证码 换一张
取 消