开发者

Ravendb mapreduce grouping by multiple fields

开发者 https://www.devze.com 2023-02-19 12:01 出处:网络
We have a site that contains streaming video and we want to display three reports of most watched videos in the last week, month and year (a rolling window).

We have a site that contains streaming video and we want to display three reports of most watched videos in the last week, month and year (a rolling window).

We store a document in ravendb each time a video is watched:

public class ViewedContent
{
    public开发者_StackOverflow社区 string Id { get; set; }
    public int ProductId { get; set; }
    public DateTime DateViewed { get; set; }
}

We're having trouble figuring out how to define the indexes / mapreduces that would best support generating those three reports.

We have tried the following map / reduce.

public class ViewedContentResult
{
    public int ProductId { get; set; }
    public DateTime DateViewed { get; set; }
    public int Count { get; set; }
}

public class ViewedContentIndex :
        AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
    public ViewedContentIndex()
    {
        Map = docs => from doc in docs
                      select new
                                 {
                                     doc.ProductId,
                                     DateViewed = doc.DateViewed.Date,
                                     Count = 1
                                 };

        Reduce = results => from result in results
                            group result by result.DateViewed
                            into agg
                            select new
                                       {
                                           ProductId = agg.Key,
                                           Count = agg.Sum(x => x.Count)
                                       };
    }
}

But, this query throws an error:

var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
                .Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) );

Error: "DateViewed is not indexed"

Ultimately, we want to query something like:

var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
                .Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) )
                .GroupBy( x => x.ProductId )
                .OrderBy( x => x.Count )

This doesn't actually compile, because the OrderBy is wrong; Count is not a valid property here.

Any help here would be appreciated.


Each report is a different GROUP BY if you're in SQL land, that tells you that you need three indexes - one with just the month, one with entries by week, one by month, and one by year (or maybe slightly different depending on how you're actually going to do the query.

Now, you have a DateTime there - that presents some problems - what you actually want to do is index the Year component of the DateTime, the Month component of the date time and Day component of that date time. (Or just one or two of these depending on which report you want to generate.

I'm only para-quoting your code here so obviously it won't compile, but:

public class ViewedContentIndex :
    AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
public ViewedContentIndex()
{
    Map = docs => from doc in docs
                  select new
                             {
                                 doc.ProductId,
                                 Day = doc.DateViewed.Day,
                                 Month = doc.DateViewed.Month,
                                 Year = doc.DateViewed.Year
                                 Count = 1
                             };

    Reduce = results => from result in results
                        group result by new {
                             doc.ProductId,
                             doc.DateViewed.Day,
                             doc.DateViewed.Month,
                             doc.DateViewed.Year
                        }
                        into agg
                        select new
                                   {
                                       ProductId = agg.Key.ProductId,
                                       Day = agg.Key.Day,
                                       Month = agg.Key.Month,
                                       Year = agg.Key.Year  
                                       Count = agg.Sum(x => x.Count)
                                   };
}

}

Hopefully you can see what I'm trying to achieve by this - you want ALL the components in your group by, as they are what make your grouping unique.

I can't remember if RavenDB lets you do this with DateTimes and I haven't got it on this computer so can't verify this, but the theory remains the same.

So, to re-iterate

You want an index for your report by week + product id You want an index for your report by month + product id You want an index for your report by year + product id

I hope this helps, sorry I can't give you a compilable example, lack of raven makes it a bit difficult :-)

0

精彩评论

暂无评论...
验证码 换一张
取 消