
Group By (Aggregate Map Reduce Functions) in MongoDB using Scala (Casbah/Rogue)


Here's a specific query I'm having trouble with. I'm using Lift-mongo-records so that I can use Rogue. I'm happy to use Rogue-specific syntax, or whatever works.

While there are good examples of using JavaScript strings via the Java driver (noted below), I'd like to know what the best practices might be.

Imagine a collection like

comments {
 _id
 topic
 title
 text
 created
}

The desired output is a list of topics and their count, for example

  • cats (24)
  • dogs (12)
  • mice (5)

So a user can see a list, ordered by count, of distinct topics (a group-by).

Here's some pseudo-SQL:

SELECT [DISTINCT] topic, count(topic) as topic_count
FROM comments
GROUP BY topic
ORDER BY topic_count DESC
LIMIT 10
OFFSET 10

One approach is using some DBObject DSL like

val cursor = coll.group( MongoDBObject(
  "key" -> MongoDBObject( "topic" -> true ),
  "initial" -> MongoDBObject( "count" -> 0 ),
  "reduce" -> "function( obj, prev ) { prev.count += obj.c; }",
  "out" -> "topic_list_result"
))

[...].sort( MongoDBObject( "created" -> -1 ) ).skip( offset ).limit( limit );

Variations of the above do not compile.
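For comparison, Casbah's group() helper (as of 2.x) takes separate key/cond/initial/reduce arguments rather than a single command document, which may be why the single-DBObject form above fails to compile. A minimal sketch, reusing coll from above and assuming Casbah 2.x:

// group( key, cond, initial, reduce ) returns the grouped rows
// directly; there is no "out" collection and no cursor to chain.
val rows = coll.group(
  MongoDBObject( "topic" -> true ),             // key: group by topic
  MongoDBObject(),                              // cond: match all documents
  MongoDBObject( "count" -> 0 ),                // initial accumulator
  "function( obj, prev ) { prev.count += 1; }"  // reduce: bump the count
)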

I could just ask "what am I doing wrong?", but I thought I could make my confusion more acute:

  • can I chain the results directly, or do I need "out"?
  • what kind of output can I expect - do I iterate over a cursor, or over the "out" collection?
  • is "cond" required?
  • should I be using count() or distinct()?
  • some examples contain a "map" param...

A recent post I found covering the Java driver implies I should use strings instead of a DSL: http://blog.evilmonkeylabs.com/2011/02/28/MongoDB-1_8-MR-Java/

Would this be the preferred method in either Casbah or Rogue?

Update: 9/23

This fails in Scala/Casbah (it compiles, but produces the error {MapReduceError 'None'}):

val map = "function() { emit({ this.topic }, { count: 1 }); }"
val reduce = "function(key, values) { var count = 0; values.forEach(function(v) { count += v['count']; }); return { count: count }; }"
val out = coll.mapReduce( map, reduce, MapReduceInlineOutput )
ConfiggyObject.log.debug( out.toString() )

I settled on the above after seeing https://github.com/mongodb/casbah/blob/master/casbah-core/src/test/scala/MapReduceSpec.scala

Guesses:

  • am I misunderstanding the toString method and what the out object is?
  • missing finalize?
  • missing output specification?
  • https://jira.mongodb.org/browse/SCALA-43 ?

This works as desired from the command line:

    map = function() {
        emit({ this.topic }, { count: 1 });
    }

    reduce = function(key, values) {
        var count = 0;
        values.forEach(function(v) { count += v['count']; });
        return { count: count };
    };

    db.tweets.mapReduce( map, reduce, { out: "results" } );
    db.results.ensureIndex( { count: 1 } );
    db.results.find().sort( { count: 1 } );
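For completeness, the same map-reduce can be written against Casbah with a named output collection instead of MapReduceInlineOutput. A minimal sketch, assuming Casbah 2.x's MapReduceStandardOutput target; the database name "mydb", and the emit key written as plain this.topic rather than the object literal above, are my assumptions:

import com.mongodb.casbah.Imports._
import com.mongodb.casbah.map_reduce.MapReduceStandardOutput

val coll = MongoConnection()("mydb")("tweets")  // "mydb" is a placeholder db name
val map = "function() { emit( this.topic, { count: 1 } ); }"  // plain key, not { this.topic }
val reduce = """function(key, values) {
  var count = 0;
  values.forEach(function(v) { count += v['count']; });
  return { count: count };
}"""
// Writes { _id: <topic>, value: { count: n } } documents into "results".
val result = coll.mapReduce( map, reduce, MapReduceStandardOutput("results") )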

Update: The issue has now been filed as a bug at Mongo: https://jira.mongodb.org/browse/SCALA-55


The following worked for me:

val coll = MongoConnection()("mydb")("comments")  // ("mydb") added: group() lives on a collection, not a database
val reduce = """function(obj, prev) { prev.csum += 1; }"""
val res = coll.group( MongoDBObject( "topic" -> true ),
                      MongoDBObject(), MongoDBObject( "csum" -> 0 ), reduce )

res was an ArrayBuffer full of coll.T (i.e. DBObject), which can be handled in the usual ways.
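Since group() returns all grouped rows in memory, the ORDER BY / LIMIT / OFFSET part of the pseudo-SQL has to happen client-side. A minimal sketch of that, assuming com.mongodb.casbah.Imports._ is in scope and that offset and limit are defined elsewhere:

// group() accumulators come back as JavaScript numbers, i.e. Doubles.
val sorted = res.toSeq.sortBy( o => -o.as[Double]("csum") )
val page = sorted.slice( offset, offset + limit )  // offset/limit assumed defined
page.foreach { o =>
  println( o.as[String]("topic") + " (" + o.as[Double]("csum").toInt + ")" )
}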


Appears to be a bug - somewhere.

For now, I have a less-than-ideal workaround using eval() (slower, less safe):

db.eval( "map = function (){ emit( { topic: this.topic } , { count: 1 }); } ; ");
db.eval( "reduce = function(key, values) { var count = 0; values.forEach(function(v) { count += v['count']; }); return {count: count}; }; ");
db.eval( " db.tweets.mapReduce( map, reduce, { out: \"tweetresults\" } ); ");
db.eval( " db.tweetresults.ensureIndex( {count : 1}); ");

Then I query the output collection normally via Casbah.
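A minimal sketch of that query, mirroring the pseudo-SQL's ORDER BY / LIMIT / OFFSET; offset and limit are assumed to be defined, and note that map-reduce output documents have the shape { _id, value }, so the sort key is value.count:

val out = db("tweetresults")
out.find()
   .sort( MongoDBObject( "value.count" -> -1 ) )  // map-reduce stores counts under value
   .skip( offset )
   .limit( limit )
   .foreach( println )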
