I am looking to run a search on my database at set intervals for a list of words I consider offensive (because I am an authoritarian dictator and I hate free speech - I rule with an iron fist).
How would I most efficiently search my database for a list of keywords? The two columns I intend to search are indexed as Fulltext.
If anyone knows of a list of offensive words, that would be useful too.
A note to those who ridicule my attempts at censorship
I will have two systems in place. The first is a report function that admins check daily. The second tool to combat the dissenters is this one. All it needs to be is a word search, so that an admin can look through the results and decide whether the content is offensive.
MySQL won't give you the tools for an accurate search. Take this example: suppose your word list includes:
freedom
Since you are a dictator, you don't want it to appear, so a search should flag it. But clever users will write fr33dom, which means the same thing. You have three ways to deal with this:
- You put each word in your list along with every derivation you can imagine
- You search with LIKE in your MySQL query, but it will be slow once you hit thousands of rows, because LIKE with a leading wildcard cannot use the fulltext indexes
- You index your content using Lucene
I would go for the third. Lucene is built for text search, and since you are looking for words I assume you are dealing with text, so it may help more than you think. Lucene can also find words that are similar to freedom but not identical, so you shouldn't miss much. And your rule is guaranteed!
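To make the "similar but not identical" idea concrete, here is a minimal sketch in Python. It is not Lucene: it uses the standard library's `difflib` as a stand-in for a fuzzy query, and the substitution table `LEET_MAP`, the `0.85` cutoff, and the function names are assumptions made up for illustration.

```python
import difflib
import re

# Hypothetical substitution table; extend it for the obfuscations you actually see.
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)


def normalize(token):
    """Lowercase a token and undo common character substitutions."""
    return token.lower().translate(LEET_MAP)


def flag_words(text, blocklist, cutoff=0.85):
    """Return (original token, matched blocklist word) pairs for an admin to review."""
    hits = []
    for token in re.findall(r"[\w@$]+", text):
        candidate = normalize(token)
        # get_close_matches returns the blocklist entries whose similarity
        # ratio to the candidate is at least `cutoff`, best match first.
        match = difflib.get_close_matches(candidate, blocklist, n=1, cutoff=cutoff)
        if match:
            hits.append((token, match[0]))
    return hits


hits = flag_words("We want fr33dom now", ["freedom"])
```

This only produces candidates for review. A lower cutoff catches more variants but also flags more innocent words, so the admin still makes the final call.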
There are Lucene extensions for Zend Framework; you can find them easily on Google.
Best of luck in your dictatorial efforts!
Here's your starting list: http://onlineslangdictionary.com/lists/most-vulgar-words/ Check the site for more.
Idea: load their list into a database table, then screen your content against it. Or load their list, treat every entry as a keyword, and block any submission that contains one. Then use SQL wildcards within words to catch variants such as freedom or Fr**dom.
The problem with this technique is that the possible derivations are endless.
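One way to try the wildcard idea is to let every vowel of a listed word match any single character. The sketch below is only an illustration of that: the vowel rule and the function names `wildcard_pattern` and `like_pattern` are my own assumptions, and it assumes the listed words are plain letters (no `%` or `_` that would need escaping in SQL).

```python
import re

VOWELS = set("aeiou")


def wildcard_pattern(word):
    """Build a regex in which each vowel of `word` may be any single non-space character."""
    parts = [r"\S" if ch in VOWELS else re.escape(ch) for ch in word.lower()]
    # The lookarounds keep the match from starting or ending inside a longer word.
    return re.compile(r"(?<!\w)" + "".join(parts) + r"(?!\w)", re.IGNORECASE)


def like_pattern(word):
    """Build the matching SQL LIKE pattern, where '_' stands for exactly one character."""
    return "%" + "".join("_" if ch in VOWELS else ch for ch in word.lower()) + "%"


pattern = wildcard_pattern("freedom")
found = pattern.search("Give us Fr**dom!") is not None
```

This catches freedom, fr33dom and Fr**dom with one pattern per listed word, but not variants that add, drop, or swap consonants, which is exactly the "endless derivations" problem.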
The link below leads to a list of 2200 bad words in 12 languages. It is available as a MySQL dump, JSON, XML, or CSV.
https://github.com/turalus/openDB
Import the dump into your own database, then query your content for any occurrence.
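A rough sketch of the "query for any occurrence" step, using Python's built-in `sqlite3` with an in-memory database so it runs anywhere. The table and column names (`bad_words.word`, `posts.body`) and the sample rows are hypothetical; the actual dump's schema may differ, and in MySQL the join condition would more likely use `LIKE CONCAT('%', w.word, '%')` or a fulltext `MATCH ... AGAINST`.

```python
import sqlite3


def find_flagged_posts(conn):
    """Return (post id, matched word) pairs using a case-insensitive substring join."""
    query = """
        SELECT p.id, w.word
        FROM posts AS p
        JOIN bad_words AS w
          ON instr(lower(p.body), lower(w.word)) > 0
        ORDER BY p.id, w.word
    """
    return conn.execute(query).fetchall()


# Hypothetical schema and sample data, standing in for the imported word list
# and your own content table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bad_words (word TEXT NOT NULL)")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO bad_words (word) VALUES (?)",
    [("freedom",), ("dissent",)],
)
conn.executemany(
    "INSERT INTO posts (id, body) VALUES (?, ?)",
    [
        (1, "All hail the leader"),
        (2, "Freedom for everyone"),
        (3, "Dissent means freedom"),
    ],
)

flagged = find_flagged_posts(conn)
```

A substring join like this compares every post with every word, so it fits a scheduled batch job better than a check on every page load.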