Amazon SimpleDB Query to Find "Post By Friends"

开发者 https://www.devze.com 2023-01-10 22:35 Source: web

I have been developing an iPhone app which queries a server that relays data I store in Amazon SimpleDB. I have a database table of "Submissions" by various users. I am interfacing with Facebook to retrieve Facebook Friends and wish to make a query to "Submissions" to find posts by friends - like:

SELECT * FROM submissions WHERE userID = '00123' OR userID = '00124' OR .... 

(through complete list of friends)

I think this will run into an Amazon query limit with this kind of select statement -

[Maximum number of comparisons per Select expression: 20]

Can you think of a way to elegantly pull this off with SimpleDB? I'd rather not have to do a bunch of 20 person queries. Or, do I need to move to a different database package and then do cross-table queries?

Thanks!


There is a way to do it with SimpleDB, but it isn't elegant. It's more of a hack, since it requires you to artificially duplicate the userID attribute in your submission items.

It's based on the fact that although you can only have 20 comparisons per IN predicate, you can have 20 IN predicates if they each name different attributes. So add additional synthetic attributes to your submission items of the form:

userID='00123' userID_2='00123' userID_3='00123' userID_4='00123' ... userID_20='00123'

They all have the identical value for a given submission. Then you can fetch the submissions of up to 400 friends (20 predicates of 20 values each) with a single query:

SELECT * FROM submissions 
WHERE userID IN('00120','00121',...,'00139') OR
    `userID_2` IN('00140','00141',...,'00159') OR
    `userID_3` IN('00160','00161',...,'00179') OR
    `userID_4` IN('00180','00181',...,'00199') OR
    ...
    `userID_20` IN('00300','00301',...,'00319')

You can populate the 19 extra attributes at the time the submission is created (if you have the attributes to spare) and it doesn't sound like a submission's user would ever change. Also you may want to explicitly name the attributes to be returned (instead of using * ) since you would now have 19 of them that you don't care about in the return data set.
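
To make the hack concrete, here is a minimal Python sketch of both halves: duplicating the userID under the synthetic attribute names when a submission is created, and building the single select expression. It only builds strings and dictionaries and does not talk to SimpleDB. The function names are hypothetical, and the limits of 20 values per IN and 20 IN predicates are taken from this thread rather than verified against the service.

```python
MAX_PER_IN = 20       # values per IN predicate (limit quoted in this thread)
MAX_PREDICATES = 20   # IN predicates on distinct attributes (this answer's assumption)


def user_id_attribute_names(count=MAX_PREDICATES):
    """Return ['userID', 'userID_2', ..., 'userID_<count>']."""
    return ["userID"] + [f"userID_{i}" for i in range(2, count + 1)]


def submission_attributes(user_id, extra=None):
    """Attributes for a new submission, with the user ID stored under every name."""
    attrs = {name: user_id for name in user_id_attribute_names()}
    attrs.update(extra or {})
    return attrs


def quote(value):
    """Quote a string literal; embedded single quotes are doubled."""
    return "'" + str(value).replace("'", "''") + "'"


def build_friend_query(friend_ids, domain="submissions", columns="*"):
    """Build one select expression covering up to 400 friend IDs."""
    if not friend_ids:
        raise ValueError("friend_ids must not be empty")
    if len(friend_ids) > MAX_PER_IN * MAX_PREDICATES:
        raise ValueError("too many friends for a single query")
    names = user_id_attribute_names()
    predicates = []
    for start in range(0, len(friend_ids), MAX_PER_IN):
        chunk = friend_ids[start:start + MAX_PER_IN]
        name = names[start // MAX_PER_IN]  # each chunk gets its own attribute
        values = ",".join(quote(v) for v in chunk)
        predicates.append(f"`{name}` IN({values})")
    return f"SELECT {columns} FROM {domain} WHERE " + " OR ".join(predicates)
```

Passing an explicit `columns` value such as "userID, title" (where title stands in for whatever attributes you actually store) keeps the 19 duplicates out of the result set.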

From the data model point of view, this is clearly a hack. But having said that, it gives you exactly what you would want for users with 400 friends or fewer: a single query, so you can restrict by date or other criteria, sort by most recent, page through results, etc. Unfortunately, a capacity of 400 won't accommodate the friend lists of all Facebook users. So you may still need to implement a multi-query solution for large friend lists just the same.
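
For the large-friend-list case, one possible shape of the multi-query fallback is to split the list into batches of 400, run one query per batch, and merge the already-sorted results. This is only a sketch: `run_batch_query` is a hypothetical callable you would supply (it takes a list of friend IDs and returns that batch's submissions as dictionaries, newest first), and the "created" sort attribute is an assumed name, not something from the question.

```python
import heapq
from operator import itemgetter


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_for_many_friends(friend_ids, run_batch_query,
                           batch_size=400, sort_key="created"):
    """Run one query per batch and merge the results newest-first.

    Each list returned by run_batch_query must already be sorted
    newest-first on sort_key, otherwise the merge order is meaningless.
    """
    per_batch = [run_batch_query(batch)
                 for batch in batches(friend_ids, batch_size)]
    # heapq.merge lazily interleaves sorted inputs; reverse=True means
    # the inputs are ordered from largest to smallest key.
    return list(heapq.merge(*per_batch, key=itemgetter(sort_key), reverse=True))
```

Paging across the merged list is then up to your server, since each underlying query pages independently.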

My suggestion is that if SimpleDB suits the needs of your app with the exception of this issue, then consider using the hack. But if you need to do things like this repeatedly then SimpleDB is probably not the best choice.


You need either an IN clause or a join to a temp table. Unfortunately, Amazon SimpleDB has its limitations. We abandoned it on a promising project for this very reason. We went down the path of multithreading and using the NextToken functionality before we switched gears.

You could execute parallel (multithreaded) queries to SimpleDB to get submissions, each query looking for up to 20 user IDs and then merging the results into one list. Still, it's probably time to consider a switch to MySQL or SQL Server to be able to upload a list of IDs as a temp table and then do a simple join to get the results.
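
A rough Python sketch of the parallel approach, using a thread pool from the standard library. Nothing here calls SimpleDB directly: `run_select` is a hypothetical callable you would write around your SimpleDB client, taking one select expression and returning a list of items (following NextToken itself if the result spans several pages).

```python
from concurrent.futures import ThreadPoolExecutor

MAX_COMPARISONS = 20  # per-query limit quoted in the question


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_select(user_ids, domain="submissions"):
    """Build a select expression matching any of the given user IDs."""
    values = ",".join("'" + uid.replace("'", "''") + "'" for uid in user_ids)
    return f"SELECT * FROM {domain} WHERE userID IN({values})"


def fetch_in_parallel(friend_ids, run_select, max_workers=8):
    """Run one query per group of 20 IDs concurrently and merge the results."""
    expressions = [build_select(chunk)
                   for chunk in chunked(friend_ids, MAX_COMPARISONS)]
    if not expressions:
        return []
    merged = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order the expressions were submitted,
        # and re-raises any exception thrown by run_select.
        for rows in pool.map(run_select, expressions):
            merged.extend(rows)
    return merged
```

The merged list is grouped by chunk rather than globally ordered, so any sorting by date has to happen after the merge.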


I created the Simple Savant .NET library for SimpleDB, and I happen to have some utility code lying around for splitting and running in parallel multiple select queries, while limiting the IN clause of each select to 20 values. I'll probably roll this code into the next Savant release, but here it is for anyone who finds it useful:

    /// <summary>
    /// Invokes select queries that use parameter lists (with IN clauses) by splitting the parameter list
    /// across multiple invocations that are invoked in parallel.
    /// </summary>
    /// <typeparam name="T">The item type</typeparam>
    /// <typeparam name="P">The select parameter type</typeparam>
    /// <param name="savant">The savant instance.</param>
    /// <param name="command">The command.</param>
    /// <param name="paramValues">The param values.</param>
    /// <param name="paramName">Name of the param.</param>
    /// <returns>The matching items from all invocations, converted to type T.</returns>
    public static List<T> SelectWithList<T,P>(ISimpleSavantU savant, SelectCommand<T> command, List<P> paramValues, string paramName)
    {
        var allValues = SelectAttributesWithList(savant, command, paramValues, paramName);
        var typedValues = new List<T>();
        foreach (var values in allValues)
        {
            typedValues.Add((T)PropertyValues.CreateItem(typeof (T), values));
        }
        return typedValues;
    }

    /// <summary>
    /// Invokes select queries that use parameter lists (with IN clauses) by splitting the parameter list
    /// across multiple invocations that are invoked in parallel.
    /// </summary>
    /// <typeparam name="P">The select parameter type</typeparam>
    /// <param name="savant">The savant instance.</param>
    /// <param name="command">The command.</param>
    /// <param name="paramValues">The param values.</param>
    /// <param name="paramName">Name of the param.</param>
    /// <returns>The attribute values of the matching items, merged across all invocations.</returns>
    public static List<PropertyValues> SelectAttributesWithList<P>(ISimpleSavantU savant, SelectCommand command, List<P> paramValues, string paramName)
    {
        Arg.CheckNull("savant", savant);
        Arg.CheckNull("command", command);
        Arg.CheckNull("paramValues", paramValues);
        Arg.CheckNullOrEmpty("paramName", paramName);

        var allValues = new List<PropertyValues>();
        if (paramValues.Count == 0)
        {
            return allValues;
        }

        var results = new List<IAsyncResult>();
        do
        {
            var currentParams = paramValues.Skip(results.Count * MaxValueTestsPerSimpleDbQuery).Take(MaxValueTestsPerSimpleDbQuery).ToList();
            if (!currentParams.Any())
            {
                break;
            }
            var currentCommand = Clone(command);
            currentCommand.Reset();
            var parameter = currentCommand.GetParameter(paramName);
            parameter.Values.Clear();
            parameter.Values.AddRange(currentParams.Select(e => (object)e));
            var result = savant.BeginSelectAttributes(currentCommand, null, null);
            results.Add(result);
        } while (true);

        foreach (var result in results)
        {
            var values = ((ISimpleSavant2)savant).EndSelectAttributes(result);
            allValues.AddRange(values);
        }

        return allValues;
    }

    private static SelectCommand Clone(SelectCommand command)
    {
        var newParameters = new List<CommandParameter>();
        foreach (var parameter in command.Parameters)
        {
            var newParameter = new CommandParameter(parameter.Name, parameter.PropertyName, null);
            newParameter.Values.Clear();
            newParameters.Add(newParameter);
        }
        var newCommand = new SelectCommand(command.Mapping, command.CommandText, newParameters.ToArray())
            {
                MaxResultPages = command.MaxResultPages
            };
        return newCommand;
    }