Efficient MapReduce when dealing with streams of queries to the same dataset

开发者 https://www.devze.com 2022-12-18 23:08 Source: web
I have a massive, static dataset and a function to apply to it.

The computation has the form reduce(map(f, dataset)), so I would use the MapReduce skeleton. However, I don't want to scatter the data on each request (and ideally I would like to take advantage of indexing to speed up f). Is there a MapReduce implementation that addresses this general case?
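To make the shape of the problem concrete, here is a minimal single-machine sketch in Python. The dataset, the query function f, and the helper names run_query and index are all hypothetical and chosen only for illustration. It shows the reduce(map(f, dataset)) form, and how an index built once over the static data could let a query touch only the relevant records.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical static dataset of (key, value) records.
dataset = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]

def run_query(f, combine, initial, data):
    """Evaluate reduce(map(f, data)) for a single query."""
    return reduce(combine, map(f, data), initial)

def add(x, y):
    return x + y

# Example query: sum of the values whose key is "a".
def f(record):
    key, value = record
    return value if key == "a" else 0

# Full scan: f is applied to every record.
full_scan_result = run_query(f, add, 0, dataset)

# Index built once, since the dataset never changes.
index = defaultdict(list)
for record in dataset:
    index[record[0]].append(record)

# Indexed query: f is applied only to the records under key "a".
indexed_result = run_query(f, add, 0, index["a"])
```

Both variants return the same value; the indexed one simply skips records that cannot contribute to the result.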

I've taken a look at IterativeMapReduce, and maybe it does the job, but it seems to address a slightly different case, and the code isn't available yet.


Hadoop's MapReduce (like all the other MapReduce skeletons inspired by Google's) doesn't scatter the data every time.
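In Hadoop the input already lives in HDFS, split into blocks spread over the nodes, and the scheduler tries to run each map task on a node that holds its block. The sketch below imitates that idea in plain Python; it is not Hadoop's API, and Worker, scatter, and run_query are hypothetical names. The data is partitioned once, and each incoming query only ships the function to the partitions. It assumes combine is associative and initial is its identity element, so that partial results can be merged safely.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

class Worker:
    """Holds one partition of the dataset for its whole lifetime."""

    def __init__(self, partition):
        self.partition = partition

    def map_reduce_local(self, f, combine, initial):
        # Map and reduce over the local partition only.
        return reduce(combine, map(f, self.partition), initial)

def scatter(dataset, n_workers):
    """Split the dataset round-robin, once, up front."""
    return [Worker(dataset[i::n_workers]) for i in range(n_workers)]

def run_query(workers, f, combine, initial):
    """Send one query to every worker and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        partials = list(
            pool.map(lambda w: w.map_reduce_local(f, combine, initial), workers)
        )
    return reduce(combine, partials, initial)

def add(a, b):
    return a + b

# The data is scattered a single time...
workers = scatter(list(range(1, 101)), n_workers=4)

# ...and then reused by a stream of different queries.
total = run_query(workers, lambda x: x, add, 0)
sum_of_squares = run_query(workers, lambda x: x * x, add, 0)
```

Only the small function f travels per query; the partitions stay where they were placed by scatter.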
