
What is the best (scalable, fast, reliable) approach to implementing an activity feed: a message queue, an RDBMS, or a NoSQL database?

Source: https://www.devze.com, 2023-02-05 08:29 (origin: web)
I need to build an activity feed (or, more accurately, a "lifestream") for a system much like those of many popular social networking platforms. My first attempt used an RDBMS, but I quickly dropped the idea because of the large number of JOINs needed. While looking for other (and better-suited) approaches, I came across the following post:

How do social networking websites compute friend updates?

Taking the advice to use a message queue, I have spent some time studying RabbitMQ and its PubSubHubbub protocol, and I came up with the following approach:

1) Each user has a "topic"

2) Other users subscribe to the topic

3) When the user performs some action, a message is published, which a PHP script then relates (references resolved), formats (human-friendly language, links, etc.) and aggregates (X, Y and Z have commented on post P).
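The three steps above can be sketched roughly as follows. This is a minimal in-memory illustration in Python rather than PHP, not a RabbitMQ or PubSubHubbub implementation; `ActivityFeeds`, `aggregate` and the activity fields (`actor`, `verb`, `target`) are hypothetical names chosen for the example.

```python
from collections import defaultdict


class ActivityFeeds:
    """Hypothetical in-memory model: one topic per user, delivered to followers."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # topic owner -> users following them
        self.feeds = defaultdict(list)       # user -> activities delivered to them

    def subscribe(self, follower, topic_owner):
        # Step 2: another user subscribes to the owner's topic.
        self.subscribers[topic_owner].add(follower)

    def publish(self, actor, verb, target):
        # Step 3: an action becomes a message copied into each follower's feed.
        activity = {"actor": actor, "verb": verb, "target": target}
        for follower in self.subscribers[actor]:
            self.feeds[follower].append(activity)


def aggregate(activities):
    """Group activities by (verb, target) and render one line per group."""
    groups = {}
    for activity in activities:
        key = (activity["verb"], activity["target"])
        groups.setdefault(key, []).append(activity["actor"])

    lines = []
    for (verb, target), actors in groups.items():
        unique = list(dict.fromkeys(actors))  # drop repeats, keep first-seen order
        if len(unique) == 1:
            who, has = unique[0], "has"
        else:
            who, has = ", ".join(unique[:-1]) + " and " + unique[-1], "have"
        lines.append(f"{who} {has} {verb} {target}")
    return lines
```

For example, if "me" subscribes to X, Y and Z and each of them publishes `("commented on", "post P")`, then `aggregate(feeds.feeds["me"])` yields the single line "X, Y and Z have commented on post P".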

However, I would still have to go through each message and process it (unless my approach is completely wrong). So, what would the difference be between storing everything in an RDBMS and using a message queue (other than the implementation of the PubSubHubbub protocol)?

Are there more efficient ways to build such a system? (If so, please specify)

Comments / Suggestions / Criticisms are welcome. :)

Thank you in advance!

P.S.: There is an interesting article on how FriendFeed implements it ( http://bret.appspot.com/entry/how-friendfeed-uses-mysql ). However, I feel the "hackery" pushes MySQL out of its comfortable domain (which is simply relational data; what would be the point of using an RDBMS without relational data?).

P.P.S.: Another issue I see with using a message queue (perhaps because I am new to this technology) is that once a message is fetched by the "Consumer", it is removed from the queue. I want it to persist for an arbitrary amount of time.
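To make the persistence concern concrete, here is a small Python sketch under the assumption that the queue is only a transport and the consumer copies each message into a separate store with a retention window. `FeedStore` and `retention_seconds` are hypothetical names, and the store is a plain in-memory list rather than any particular database.

```python
import time


class FeedStore:
    """Hypothetical store that keeps consumed messages for a fixed time window."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.items = []  # (timestamp, message) pairs, oldest first

    def append(self, message, now=None):
        # Called by the consumer after it has taken the message off the queue.
        timestamp = time.time() if now is None else now
        self.items.append((timestamp, message))

    def recent(self, now=None):
        # Drop entries older than the retention window, then return the rest.
        now = time.time() if now is None else now
        cutoff = now - self.retention
        self.items = [(ts, msg) for ts, msg in self.items if ts >= cutoff]
        return [msg for _, msg in self.items]
```

In this arrangement the message leaving the queue does not matter, because the feed is read from the store and not from the queue.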


Some tips I would like to give you:

  • Don't use an RDBMS; use a fast in-memory database such as Redis instead. As I hope you will agree after looking at the Redis benchmarks, Redis is pretty fast. As a side note, installing Redis is child's play :).

    make

There is a Redis client for PHP that is written in C, so that is also going to be very fast.

  • If I understand you correctly, you think that PubSubHubbub is the same as a message queue, but it isn't:

Parties (servers) speaking the PubSubHubbub protocol can get near-instant notifications (via webhook callbacks) when a topic (feed URL) they're interested in is updated.

Versus message queue:

In computer science, message queues and mailboxes are software-engineering components used for interprocess communication, or for inter-thread communication within the same process. They use a queue for messaging – the passing of control or of content.

You might think they are the same (they do have some similarities), but they aren't. For my message queue I would use Redis (Redis is very powerful because it also has a basic message queue :)). You can put a message (a unit of work) onto a queue using rpush.

rpush <name of queue> <message>

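Then your worker processes can receive messages from the queue using blpop (blocking left pop :)). Because rpush appends to the tail of the list and blpop takes from the head, the oldest message is handled first.

blpop <name of queue> 0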

The worker processes are started from the CLI and stay in memory, so they don't incur the overhead of loading PHP into memory again and again.

php worker.php
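As a rough illustration of the push/blocking-pop worker pattern described above, here is a minimal sketch in Python rather than PHP. `InMemoryQueue`, `enqueue_activity` and `drain` are hypothetical names, and the in-memory class only mimics two Redis list commands so that the example runs without a server. It pairs RPUSH with BLPOP so that messages are handled oldest first.

```python
import json
from collections import defaultdict, deque


class InMemoryQueue:
    """Hypothetical stand-in for the two Redis list commands used here."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, name, message):
        # Append to the tail, like Redis RPUSH; returns the new list length.
        self._lists[name].append(message)
        return len(self._lists[name])

    def blpop(self, name, timeout=0):
        # Pop from the head, like Redis BLPOP. A real client blocks while the
        # list is empty; this stand-in returns None instead of blocking.
        if not self._lists[name]:
            return None
        return name, self._lists[name].popleft()


def enqueue_activity(client, queue_name, actor, verb, target):
    # Producer side: serialize one unit of work and push it onto the queue.
    payload = json.dumps({"actor": actor, "verb": verb, "target": target})
    return client.rpush(queue_name, payload)


def drain(client, queue_name, handle):
    """Worker side: handle messages until the queue is empty; return the count."""
    handled = 0
    while True:
        item = client.blpop(queue_name, timeout=1)
        if item is None:
            break
        _, raw = item
        handle(json.loads(raw))
        handled += 1
    return handled
```

With a real server, a client object that exposes `rpush` and `blpop` (for example redis-py's `redis.Redis`) could be passed in place of `InMemoryQueue`. A long-running worker would loop forever with a timeout of 0 instead of stopping when the queue is empty.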

I hope this is helpful. If you have any questions, I am happy to answer them ;)
