Is it possible to find repeating patterns in the text?
My table looks like this:
CREATE TABLE `textanalysis` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`abstract` text,
UNIQUE KEY `ID` (`ID`),
FULLTEXT KEY `abstract` (`abstract`)
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=latin1;
I would like to find the words or groups of words that repeat in the text and then compute statistics on them.
Here is a trick (not very optimized).
Take "apple" as an example;
the length of "apple" is 5:
SELECT
(LENGTH(abstract)-LENGTH(REPLACE(LOWER(abstract), 'apple', '')))/5
AS occurrences
FROM
textanalysis
WHERE
MATCH (abstract) AGAINST ('+apple' IN BOOLEAN MODE);
What it does is remove every "apple" from the abstract (which makes the text shorter),
then compare the new length with the original one: the difference divided by 5 is the number of occurrences.
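The arithmetic behind this trick can be checked outside the database. Below is a minimal Python sketch that mirrors the SQL expression; the function name count_occurrences and the sample sentence are made up for illustration and are not part of the original answer.

```python
def count_occurrences(text: str, word: str) -> int:
    """Mirror (LENGTH(t) - LENGTH(REPLACE(LOWER(t), word, ''))) / LENGTH(word)."""
    lowered = text.lower()
    removed = lowered.replace(word, "")
    # Each removed occurrence shortens the text by len(word) characters.
    return (len(lowered) - len(removed)) // len(word)


# Hypothetical sample text, not taken from the real table.
sample = "Apple pie needs an apple; pineapple is not an apple."
print(count_occurrences(sample, "apple"))  # prints 4
```

Note that the count is 4, not 3: like REPLACE in SQL, this matches substrings, so the "apple" inside "pineapple" is counted as well.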
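I'm not entirely clear on your requirement, but if you want to count occurrences of each distinct value, you can try the query below. Note that it groups on the whole abstract column, so it counts rows with identical text rather than individual words: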
select count(id) as total_word, abstract from textanalysis group by abstract;
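If the goal is statistics on individual words or groups of words, one possible approach is to fetch the abstracts and count them on the client side. The following is a rough Python sketch under stated assumptions: the abstracts are already loaded into a list of strings, a "word" is any run of letters, digits, or apostrophes, and the name ngram_counts and the sample data are hypothetical.

```python
import re
from collections import Counter


def ngram_counts(abstracts, n=1):
    """Count every run of n consecutive words across all abstracts."""
    counts = Counter()
    for text in abstracts:
        # Assumption: a "word" is a run of letters, digits, or apostrophes.
        words = re.findall(r"[a-z0-9']+", text.lower())
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


# Hypothetical rows standing in for SELECT abstract FROM textanalysis.
abstracts = [
    "Green apple pie and green apple juice",
    "A green apple a day",
]
# Keep only the two-word phrases that actually repeat.
repeated = {p: c for p, c in ngram_counts(abstracts, n=2).items() if c > 1}
print(repeated)  # prints {'green apple': 3}
```

With n=1 the same function counts single words; larger values of n count longer groups of words.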