Does anyone have any ideas on how to filter untrustworthy user-submitted content?
Take Yelp, for instance: they would need to prevent competitors from writing reviews of rival businesses. They would need to prevent business owners from favourably reviewing their own business, or pressuring friends and family to do so. They would need to prevent poor-quality reviews from affecting a business's rating, and so on.
A few things I can imagine they might use:
- Prevent multiple users from the same IP from reviewing certain things
- Prevent business owners reviewing their own business (maybe even other businesses in the same categories as their own?)
- Somehow determine what a review is about and what the actual intent behind it is
Other than the first and second points, I can't think of any clever or easy way to stop potentially harmful reviews from being made available, other than a human doing it. Obviously for a site the size of Yelp this wouldn't be feasible, so what parameters could they take into consideration? Even with human intervention, how would anyone know it was the owner's best buddy writing a fake review without knowing the people involved?
I'm using this as an example in a larger study on the subject of filtering user content automatically. Does anyone have any ideas how these systems may work and what they take into consideration?
Thanks!
I would second Zachary: you can't really prevent people from posting things for any particular reason.
The best thing is to expect there to be some bad or dodgy reviews, some spam, some idiots trying to spoil it for the rest of us, but also that the majority of people are well-intentioned. Stack Overflow was built on these ideas. So:
- Keep a dictionary of IP addresses and give each a rating. Limit the frequency with which a given IP can post reviews, and if it tries to flood the system, ban the IP for a period of time. This way, the worse they behave, the harder it is for them (see the first sketch after this list).
- Let users of the site rate each review - Amazon does this with 'was this review helpful?'.
- Alongside 2., keep a score for each user (publicly or privately), like the SO reputation score, and use it to limit the actions of new or badly behaved users. If your reputation is too low, you cannot rate others' reviews (see the second sketch after this list). Slashdot lets you choose whether to filter out low-scored responses.
- Let the business put forward their side of things in a special review that sits at the top of the list (and mark it as such), so that they have somewhere to say all the fluff they have to say.
- Take note of the principle of punishing and rewarding the behaviour, not the person. That way, people who mildly misbehave can be corrected and turned around into productive contributors, since they're often after attention anyway.
- Bury low-scored responses at the bottom of the list, just like SO orders answers. That way attention-seeking impulses drive users to produce good quality reviews, not post FAKE!!!!11!!
- Read Jeff Atwood's Coding Horror blog, and listen to the SO podcast episodes in order. There is a mine of experience there.
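As a rough illustration of the IP throttling in the first bullet, here is a minimal Python sketch. The window size, post limit, and ban durations are invented parameters for the example, not anything a real site is known to use:

```python
import time
from collections import defaultdict, deque

# Hypothetical parameters -- tune for your own site.
WINDOW_SECONDS = 3600      # look at posts within the last hour
MAX_POSTS_PER_WINDOW = 3   # allow at most 3 reviews per IP per hour
BASE_BAN_SECONDS = 600     # first ban lasts 10 minutes, doubling each offence

class IpThrottle:
    """Tracks recent posts per IP and escalates bans for repeat flooders."""

    def __init__(self):
        self.recent_posts = defaultdict(deque)  # ip -> timestamps of recent posts
        self.offences = defaultdict(int)        # ip -> number of times banned
        self.banned_until = {}                  # ip -> timestamp the ban expires

    def allow_post(self, ip, now=None):
        now = time.time() if now is None else now

        # Still serving a ban?
        if self.banned_until.get(ip, 0) > now:
            return False

        # Drop timestamps that have fallen out of the window.
        posts = self.recent_posts[ip]
        while posts and posts[0] <= now - WINDOW_SECONDS:
            posts.popleft()

        if len(posts) >= MAX_POSTS_PER_WINDOW:
            # Flooding: ban, doubling the duration each time, so the
            # worse they behave, the harder it is for them.
            self.offences[ip] += 1
            self.banned_until[ip] = now + BASE_BAN_SECONDS * 2 ** (self.offences[ip] - 1)
            return False

        posts.append(now)
        return True
```

The review handler would then just check `throttle.allow_post(request_ip)` before saving anything.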
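And a minimal sketch of the reputation gating in the third bullet. The threshold, the `User` and `Review` classes, and the reputation reward are all made up for illustration; they are not SO's actual mechanics:

```python
# Hypothetical threshold -- not Stack Overflow's actual value.
RATE_REVIEWS_THRESHOLD = 15  # reputation required before your votes count

class User:
    def __init__(self, name, reputation=1):
        self.name = name
        self.reputation = reputation

class Review:
    def __init__(self, author, text):
        self.author = author
        self.text = text
        self.helpful_votes = 0

def record_helpful_vote(review, voter):
    """Count a 'was this review helpful?' vote only from established users."""
    if voter.reputation < RATE_REVIEWS_THRESHOLD:
        return False  # new or badly behaved users can't influence scores yet
    review.helpful_votes += 1
    review.author.reputation += 2  # reward the behaviour: good reviews build reputation
    return True

# A brand-new account's vote is ignored; an established user's counts.
alice, bob, newbie = User("alice", 120), User("bob"), User("newbie")
review = Review(bob, "Great coffee, slow service.")
record_helpful_vote(review, newbie)   # returns False, no effect
record_helpful_vote(review, alice)    # returns True, bob gains reputation
```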
The third one sounds completely impossible without a computer capable of reading the user's mind, and even then it would be an invasion of privacy. Whatever their reasons, people should be free to review something based on whatever criteria they want.
I think a review-based website like IMDB or Yelp would do a couple of other things instead:
- Require the user to either rate a certain number of items or be a member for a certain period of time before their ratings actually count for anything.
- Hope that the number of reviews is high enough that a few outliers at either extreme don't affect the average. You may consider using a different algorithm than a pure average to calculate the final score, such as the median (see the sketch below).
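To illustrate the second point, here is a small Python sketch comparing a plain average with two outlier-resistant alternatives, the median and a trimmed mean. The 10% trim fraction is an arbitrary choice for the example:

```python
import statistics

def robust_score(ratings, trim_fraction=0.1):
    """Aggregate star ratings while damping a few extreme outliers.

    Returns the plain mean alongside the median and a trimmed mean
    (dropping the top and bottom 10% of ratings by default).
    """
    ordered = sorted(ratings)
    k = int(len(ordered) * trim_fraction)
    trimmed = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "trimmed_mean": statistics.mean(trimmed),
    }

# Ten honest 4-star reviews plus two fake 1-star hits:
print(robust_score([4] * 10 + [1, 1]))
# The median stays at 4 while the plain mean drops to 3.5.
```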