
Is there a list of the most common English words for indexing text for search?

Developer https://www.devze.com 2022-12-19 17:59 Source: web

Is there a freely available list of the most common English words to remove from text when creating a search index?


Wikipedia gives the 100 most frequent lemmas: http://en.wikipedia.org/wiki/Most_common_words_in_English

That might be good for a start; the article provides some good references.
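As a rough illustration of how such a list is typically applied, here is a minimal Python sketch that drops stop words before building a tiny inverted index. The `STOP_WORDS` set is a small hand-picked sample for demonstration, not the Wikipedia list or any official one, and the names `tokenize`, `index_terms` and `build_index` are hypothetical helpers invented for this example.

```python
import re

# Assumption: a small illustrative sample of very common English words.
# In practice you would load a full list (e.g. one word per line) from a file.
STOP_WORDS = frozenset({
    "the", "be", "to", "of", "and", "a", "in", "that",
    "have", "i", "is", "it", "for",
})


def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def index_terms(text, stop_words=STOP_WORDS):
    """Return the tokens of `text` that are not stop words, in order."""
    return [tok for tok in tokenize(text) if tok not in stop_words]


def build_index(docs, stop_words=STOP_WORDS):
    """Map each remaining term to the sorted ids of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in index_terms(text, stop_words):
            index.setdefault(term, set()).add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


if __name__ == "__main__":
    docs = {1: "The cat sat", 2: "A cat and a dog"}
    print(build_index(docs))
```

The tokenizer here is deliberately crude (it splits contractions such as "don't" into two tokens), so treat it as a starting point. Apply the same filtering to search queries as to indexed text, otherwise a query containing a stop word will look for a term that was never indexed.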


Here are the ones (plus characters) used in the SQL Server 2005 noise-word list; I assume the 2008 stopwords are similar.

And the MSDN documentation on it is here.

Hope this helps

