开发者

Finding shortest unique prefix length using map/reduce

开发者 https://www.devze.com 2023-02-21 18:30 出处:网络
I have a list of strings (from documents in CouchDB). I want to find the minimum prefix length so that all shortened strings (taking the first LEN characters) are unique.

I have a list of strings (from documents in CouchDB).

I want to find the minimum prefix length so that all shortened strings (taking the first LEN characters) are unique.

For example:

  • aabb
  • aabc
  • abcd

should give: LEN is three.

Is it possible to write this as 开发者_运维百科a map/reduce function ?


Doing it the brute force way:

MAP: Create for each input record "ABCDE" records with the keys - "A" - "AB" - "ABC" - "ABCD" - "ABCDE"

REDUCE:

  • IF you have 1 value in the iterator output: "length(key)" "true"
  • If you have more than one 1 value in the iterator output: "length(key)" "false"

MAP: Identity mapper

REDUCE: Output "true" if all input values are true. Else output false (or nothing);

That should result in a true for all lengths that are "unique"

0

精彩评论

暂无评论...
验证码 换一张
取 消