开发者

Calculating spam probability

开发者 https://www.devze.com 2023-01-02 08:34 出处:网络
I am building a website in python/django and want to predict wether a user submission is valid or wether it is spam.

I am building a website in python/django and want to predict wether a user submission is valid or wether it is spam.

Users have an accept rate on their submissions, like this website has.

Users can moderate other users' submissions; and these moderations are later metamoderated by an admin.

Given this:

  • the registered user A with an submission accept rate of 60% submits something.
  • user B moderates A's post as a valid submission. However, user B is wrong 70% of the time.
  • user C moderates A's post as spam. User C is usually right. If user C says something is spam/ no spam, this will be correct 80% of the time.

How can I predict the chance of A's post being spam?

Edit: I made a python script simulating this scenario:

#!/usr/bin/env python

import random

def submit(p):
    """Return 'ham' with (p*100)% probability"""
    return 'ham' if random.random() < p else 'spam'

def moderate(p, ham_or_spam):
    """Moderate ham as ham and spam as spam with (p*100)% probability"""
    if ham_or_spam == 'spam':
        return 'spam' if random.random() < p else 'ham'
    if ham_or_spam == 'ham':
        return 'ham' if random.random() < p else 'spam'

NUMBER_OF_SUBMISSIONS = 100000 
USER_A_HAM_RATIO = 0.6 # Will submit 60% ham
USER_B_PRECISION = 0.3 # Will moderate a submission correctly 30% of the time
USER_C_PRECISION = 0.8 # Will moderate a submission correctly 80% of the time

user_a_submissions = [submit(USER_A_HAM_RATIO) \
                        for i in xrange(NUMBER_OF_SUBMISSIONS)]

print "User A has made %d submissions. %d of them are 'ham'." \
        % ( len(user_a_submissions), user_a_submissions.count('ham'))

user_b_moderations = [ moderate( USER_B_PRECISION, ham_or_spam) \
                        for ham_or_spam in user_a_submissions]

user_b_moderations_which_are_correct = \
    [i for i, j in zip(user_a_submissions, user_b_moderations) if i == j]

print "User B has correctly moderated %d submissions." % \
    len(user_b_moderations_which_are_cor开发者_StackOverflow中文版rect)

user_c_moderations = [ moderate( USER_C_PRECISION, ham_or_spam) \
                        for ham_or_spam in user_a_submissions]

user_c_moderations_which_are_correct = \
    [i for i, j in zip(user_a_submissions, user_c_moderations) if i == j]

print "User C has correctly moderated %d submissions." % \
    len(user_c_moderations_which_are_correct)

i = 0
j = 0    
k = 0 
for a, b, c in zip(user_a_submissions, user_b_moderations, user_c_moderations):
    if b == 'spam' and c == 'ham':
        i += 1
        if a == 'spam':
            j += 1
        elif a == "ham":
            k += 1

print "'spam' was identified as 'spam' by user B and 'ham' by user C %d times." % j
print "'ham' was identified as 'spam' by user B and 'ham' by user C %d times." % k
print "If user B says it's spam and user C says it's ham, it will be spam \
        %.2f percent of the time, and ham %.2f percent of the time." % \
         ( float(j)/i*100, float(k)/i*100)

Running the script gives me this output:

  • User A has made 100000 submissions. 60194 of them are 'ham'.
  • User B has correctly moderated 29864 submissions.
  • User C has correctly moderated 79990 submissions.
  • 'spam' was identified as 'spam' by user B and 'ham' by user C 2346 times.
  • 'ham' was identified as 'spam' by user B and 'ham' by user C 33634 times.
  • If user B says it's spam and user C says it's ham, it will be spam 6.52 percent of the time, and ham 93.48 percent of the time.

Is the probability here reasonable? Would this be the correct way to simulate the scenario?


Bayes' Theorem tells us:

Calculating spam probability

Let's change the letters for the events A and B to X and Y resp. because you're using A, B and C to stand for people, and it would make things confusing:

P(X|Y) = P(Y|X) P(X) / P(Y)

Edit: the following is slightly wrong because X should be this post _by A_ is spam, not just "this post is spam" (and thus Y should just be "B accepts A's post, C rejects it"). I'm not redoing the math right here because the numbers are changed anyway -- see other edit below for the right number and correct arithmetic.

You want X to mean "this post is spam", Y to stand for the combination of circumstances A has posted it, B approved it, C rejected it (and let's assume conditional independence of the circumstances in question).

We need P(X), the a priori probability that any post (no matter who makes it or approves it) is spam; P(Y), the a priori probability that a post would be made by A, approved by B, rejected by C (whether it's spam or not); and P(Y | X), same as the latter but with the post being spam.

As you may be noticing, you haven't really given us all the bits and pieces we need for the computation. You three points tell us: a given post by A is spam with a probability of 0.4 (that seems to be how the first point reads); B's acceptance probabilty is 0.3 but we have no idea how that differs for spam and non-spam, except that there should be "little" difference (low accuracy); C's is 0.8 and again we don't know how that's influenced by spam vs non-spam, except that there should be "large" difference (high accuracy).

So we need some more numbers! The fact that C has high accuracy while accepting 80% of the posts tells us that overall spam must be surprisingly low -- if spam overall was the same 40% as for A, then C would have to accept half of it (even if he was perfect on always accepting non-spam) to get the overall 80% accept rate, and that would be hardly "high accuracy". So say spam overall is just 20% and C only accepts 1/4 of it (and rejects 1/16 of non-spam), pretty good accuracy indeed and overall matching the numbers you're giving.

Guessing for B, who accepts at 30% overall, and now "knowing" that spam overall is 20%, we could guess that B accepts 1/4 of the spam and only 5/16 of non-spam.

So: P(X)=0.2; P(Y)=0.3*0.2=0.06 (B's overall acceptance times C's rejection prob); P(Y|X)=0.4*0.25*0.75=0.075 (A's prob of spamming time B's prob of accepting spam time C's prob of rejecting spam).

So P(X|Y)=0.075*0.2/0.06=0.25 -- unless I've made some arithmetic error (quite possible, the point is mostly to show you how one can reason in such cases;-), the probability of this particular post being spam is 0.25 -- a bit higher than the probability of any random post being spam, lower than the probability of a random post by A being spam.

But of course (even under the simplifying hypothesis of conditional independence all over hte place;=) this computation is highly sensitive to my guesses/hypotheses about the ratios of false positives vs false negatives for B and C, and the overall spam ratio. There are five numbers of this kind involved (overall spam prob, conditional prob for each of B and C for spam and non-spam) and you only give us two relevant (linear) constraints (unconditional prob of acceptance for B and C) and two vague "handwaving" statements (about low and high accuracy), so there's plenty of degrees of freedom there.

If you can better estimate the five key numbers, the computation can be made more precise.

And, BTW, Python (and a fortiori Django) have absolutely nothing to do with the case -- I recommend you remove those irrelevant tags to get a broader range of responses!

Edit: the user clarifies (in a comment -- shd really edit his Q!):

When I said "B's moderations' accept rate is a mere 30%" I mean that for every ten times B moderates something spam/no spam he makes the wrong decision 7 times. So there is a 70% chance he will tag something spam/no spam when it is not. For user C, "His moderations' accept rate is 80%" means that if C says something is spam or no spam, he is right 80% of the time. Overall chance of a registered user spamming is 20%.

...and asks me to redo the math (I'm assuming false positives and negatives are equally likely for each of B and C). Note that B's an excellent "contrarian indicator", since he's wrong 70% of the time!-).

Anyway: B's overall acceptance rate of A's posts must then be 0.6*0.3 (for when he accepts A's nonspam) + 0.4*0.7 (for when he accepts A's spam) = 0.18 + 0.28 = 0.46; C's must be 0.8*0.4 + 0.2*0.6 = 0.32 + 0.12 = 0.44. So we have...:

P(X)=0.4 (I had it wrong at 0.2 before since I was ignoring the fact that A's probability of spamming is 0.4 -- the overall prob of spam is not relevant, since we know that the post is A's!); P(Y)=0.46*0.56=0.2576 (B's overall acceptance rate for A times C's rejection prob for A); P(Y|X)=0.7*0.8=0.56 (B's prob of accepting spam time C's prob of rejecting spam).

So P(X|Y)=0.56*0.4/0.2576=0.87 (rounding). IOW: while a priori the probability that a post of A's is spam is 0.4, both B's acceptance and C's rejection heighten it, so this specific post of A's has about 87% chance of being spam.


One could use bayesean classification to detect spam and select your training sets for spam and ham based on the results of modification. Results could possibly also be weighted by the rate of accepted posts by the user.

Results with a high probability of being spam could be pushed into a moderation workflow (if you have dedicated moderators). Similarly, a sample of results of former moderations could be submitted into a meta-moderation workflow to get a view on the quality of the classification (i.e. have we got an unacceptably high rate of false positives and negatives).

Finally, an 'appeal' where a user could complain about postings being unfairly classified could also push postings into a meta-moderation workflow. If the user has a history of rejection on appeal or excessively high rates of submitting appeals (maybe an attempt at a DOS attack) their postings might get assigned progressively lower priority in the appeals workflow.


We go at it more empirically.

We've found one of the best indicators of spam is the number of external links in a post/comment, since the whole point of the spam is to get you to go somewhere and buy something and/or make the friendly googlebot think the linked-to page is more interesting.

Our general rules for unregistered users are: 1 link might be OK, 2 is 80%+ probably spam, 3 or more and they're toast. We keep a list of the main domain names that appear in rejected posts and those become triggers even if in a 1 or 2 linker. You can also use a RBL, but be careful since they can be really draconian.

Something this simple might not work for you, but it drastically reduced the load on our moderators and we have had no complaints from real humans.

0

精彩评论

暂无评论...
验证码 换一张
取 消