Script to find if a similar question exists

Developer https://www.devze.com 2022-12-18 18:10 Source: Internet
G'day,

I'm creating an FAQ system, and the user needs to be able to see whether a similar question has already been asked. Does anyone know of any scripts (PHP or JavaScript preferably, or possibly ActionScript) that use some kind of AI to do this? I've noticed that on Stack Overflow, as a question is typed, related questions are listed underneath.

Any advice would be appreciated.

Thank you.


This question isn't easily answered without knowing what your database looks like (assuming you have one) or how your site operates.

You could base similarity on many things:

  1. They share a common category
  2. They share common tags
  3. They share common keywords within their body
    These keywords are usually what remains after common words ('and', 'is', 'the', 'it', etc.) are stripped from the string, leaving uncommon words ('C#', 'database', 'questions') to perform lookups with.
  4. Users have explicitly declared them similar
  5. etc.

These are all the kinds of things you should consider when determining similarity. I hope this helps! Come back with more specific questions to receive more specific answers.
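
As a rough illustration of the keyword idea in item 3, here is a minimal Python sketch (the same logic ports directly to PHP or JavaScript). The stop-word list, the tokenising pattern, and the function names `keywords` and `rank_similar` are all assumptions made for this example, not part of any library.

```python
import re

# Assumed stop-word list for illustration; extend it for your own content.
STOP_WORDS = {"a", "an", "and", "are", "do", "how", "i", "in",
              "is", "it", "of", "the", "to", "what"}

def keywords(text):
    """Lowercase the text, split it into tokens, and drop stop words."""
    # Keep '#' and '+' so that terms such as 'c#' or 'c++' survive tokenising.
    tokens = re.findall(r"[a-z0-9#+]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}

def rank_similar(new_question, existing_questions):
    """Return (shared_keyword_count, question) pairs, best match first."""
    new_kw = keywords(new_question)
    scored = []
    for question in existing_questions:
        shared = len(new_kw & keywords(question))
        if shared:  # skip questions with no keyword in common
            scored.append((shared, question))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_similar("How to query a database in C#", existing)` returns only those entries of `existing` that share at least one keyword with the new question, ordered by how many they share.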


I think the best you can hope for is a simple search engine: split the question into words and record each word against the question in an RDBMS, e.g.

Table questions (id, text, ....)

Table words (question_id, word)

Then to get questions similar to a new question with id $x:

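-- Count the words each earlier question shares with question $x
SELECT prev.id, prev.text, COUNT(*) AS common_words
FROM words curr_words
JOIN words prev_words ON prev_words.word = curr_words.word
JOIN questions prev ON prev.id = prev_words.question_id
WHERE curr_words.question_id = $x
AND prev.id <> $x  -- do not match the new question against itself
GROUP BY prev.id, prev.text
ORDER BY common_words DESC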
LIMIT.....?
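
To try the idea end to end, here is a small Python sketch that uses the standard-library `sqlite3` module with an in-memory database. It keeps the two-table layout described above. The helper names `build_index` and `similar_questions`, the `UNIQUE` constraint, the filter that skips the new question itself, the tie-break on `id`, and the default `limit` of 5 are all assumptions added for illustration.

```python
import re
import sqlite3

def build_index(questions):
    """Create both tables in an in-memory database and index each question's words."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, text TEXT)")
    conn.execute(
        "CREATE TABLE words (question_id INTEGER, word TEXT, "
        "UNIQUE (question_id, word))"
    )
    for qid, text in questions:
        conn.execute("INSERT INTO questions (id, text) VALUES (?, ?)", (qid, text))
        # One row per distinct word, so a repeated word is not counted twice.
        for word in set(re.findall(r"\w+", text.lower())):
            conn.execute(
                "INSERT INTO words (question_id, word) VALUES (?, ?)", (qid, word)
            )
    return conn

def similar_questions(conn, question_id, limit=5):
    """Return (id, text, common_words) rows for questions sharing words with question_id."""
    return conn.execute(
        """
        SELECT prev.id, prev.text, COUNT(*) AS common_words
        FROM words curr_words
        JOIN words prev_words ON prev_words.word = curr_words.word
        JOIN questions prev ON prev.id = prev_words.question_id
        WHERE curr_words.question_id = ?
          AND prev.id <> ?            -- assumed: skip the question itself
        GROUP BY prev.id, prev.text
        ORDER BY common_words DESC, prev.id
        LIMIT ?
        """,
        (question_id, question_id, limit),
    ).fetchall()
```

No stop words are removed here, so very common words such as 'how' or 'i' count towards the score. Filtering them out before inserting into `words` would probably give better matches.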

You could apply more elaborate comparison methods to the shortlist this returns, but this should certainly be the first step.
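
As one possible example of a more elaborate comparison, the shortlist could be re-ordered by overall string similarity. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. Choosing this measure is an assumption made for illustration (the answer does not name a method), and `rerank` is a hypothetical helper name.

```python
from difflib import SequenceMatcher

def rerank(new_question, shortlist):
    """Sort shortlisted question texts by similarity to new_question, highest first."""
    def score(candidate):
        # ratio() returns a float in [0, 1]; 1.0 means the strings are identical.
        return SequenceMatcher(None, new_question.lower(), candidate.lower()).ratio()
    return sorted(shortlist, key=score, reverse=True)
```

Because this compares every candidate against the new question character by character, it is best applied only to the handful of rows the word query returns, not to the whole table.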

C.
