开发者

Stackoverflow with specialized Hashtbl (via Hashtbl.make)

开发者 https://www.devze.com 2023-03-21 07:28 出处:网络
I am using this piece of code and a stackoverflow will be triggered, if I use Extlib\'s Hashtbl the error does not occur. Any hints to use specialized Hashtbl without stackoverflow?

I am using this piece of code and a stackoverflow will be triggered, if I use Extlib's Hashtbl the error does not occur. Any hints to use specialized Hashtbl without stackoverflow?

module ColorIdxHash = Hashtbl.Make(
  struct
    type t = Img_types.rgb_t
    let equal = (==)
    let hash = Hashtbl.hash
  开发者_运维技巧end
)
(* .. *)
let (ctable: int ColorIdxHash.t) = ColorIdxHash.create 256 in
    for x = 0 to width -1 do
      for y = 0 to height -1 do
        let c = Img.get img x y in
        let rgb = Color.rgb_of_color c in
          if not (ColorIdxHash.mem ctable rgb) then ColorIdxHash.add ctable rgb (ColorIdxHash.length ctable)
      done
    done;
(* .. *)

The backtrace points to hashtbl.ml:

Fatal error: exception Stack_overflow Raised at file "hashtbl.ml", line 54, characters 16-40 Called from file "img/write_bmp.ml", line 150, characters 52-108 ...

Any hints?


Well, you're using physical equality (==) to compare the colors in your hash table. If the colors are structured values (I can't tell from this code), none of them will be physically equal to each other. If all the colors are distinct objects, they will all go into the table, which could really be quite a large number of objects. On the other hand, the hash function is going to be based on the actual color R,G,B values, so there may well be a large number of duplicates. This will mean that your hash buckets will have very long chains. Perhaps some internal function isn't tail recursive, and so is overflowing the stack.

Normally the length of the longest chain will be 2 or 3, so it wouldn't be surprising that this error doesn't come up often.

Looking at my copy of hashtbl.ml (OCaml 3.12.1), I don't see anything non-tail-recursive on line 54. So my guess might be wrong. On line 54 a new internal array is allocated for the hash table. So another idea is just that your hashtable is just getting too big (perhaps due to the unwanted duplicates).

One thing to try is to use structural equality (=) and see if the problem goes away.


One reason you may have non-termination or stack overflows is if your type contains cyclic values. (==) will terminates on cyclic values (while (=) may not), but Hash.hash is probably not cycle-safe. So if you manipulate cyclic values of type Img_types.rgb_t, you have to devise your one cycle-safe hash function -- typically, calling Hash.hash on only one of the non-cyclic subfields/subcomponents of your values.

I've already been bitten by precisely this issue in the past. Not a fun bug to track down.

0

精彩评论

暂无评论...
验证码 换一张
取 消