information-retrieval
Removing common words while still returning understandable content?
I was wondering if somehow (maybe with an algorithm) a submitted text like the one below can be summarized (removing the common words)...
2023-04-12 07:09 Category: Q&A
How to smooth unigrams
I have a unigram language model and I want to smooth the counts. Is add-one smoothing the only way, or can I use some other smoothing method as well? I don't think we can use Kneser-Ney, as that is for n-grams with...
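For reference, add-one smoothing is the k = 1 case of additive (Lidstone) smoothing, so a smaller k is one alternative for a unigram model. Below is a minimal sketch, assuming a fixed vocabulary that contains every observed token. The function name `smoothed_unigram_probs`, the parameter `k`, and the toy sentence are illustrative and not from the question.

```python
from collections import Counter

def smoothed_unigram_probs(tokens, vocab, k=1.0):
    """Additive smoothing: P(w) = (count(w) + k) / (N + k * |V|).

    k = 1.0 gives add-one (Laplace) smoothing; 0 < k < 1 moves less
    probability mass to unseen words.
    Assumes every token in `tokens` is also in `vocab`.
    """
    counts = Counter(tokens)
    denom = len(tokens) + k * len(vocab)
    # Counter returns 0 for words that never occurred in `tokens`.
    return {w: (counts[w] + k) / denom for w in vocab}

# Toy data (hypothetical): "dog" is in the vocabulary but unseen.
tokens = "the cat sat on the mat".split()
vocab = set(tokens) | {"dog"}
probs = smoothed_unigram_probs(tokens, vocab, k=1.0)
```

With k = 1 the unseen word "dog" gets a non-zero probability and the distribution still sums to 1 over the vocabulary.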
2023-04-12 02:55 Category: Q&A
Personalized Search with Lucene
I'd like to ask questions about personalized search. I'm about to design/implement a personalized search with Lucene. I did some googling about that, but didn't seem to find module/t...
2023-04-11 13:08 Category: Q&A
Unsupervised named entity recognition (NER) with custom controlled vocabulary for crosslink suggestions in Java
I'm looking for a Java library that can do named entity recognition (NER) with a custom controlled vocabulary, without needing labeled training data first. I searched some on SE, but most questions a...
2023-04-11 12:39 Category: Q&A
Python script to find word frequencies of a given document
I am looking for a simple script that can find frequencies of words for a given document (probably by using the Porter stemmer).
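One way to approach this with only the standard library is sketched below. The function name `word_frequencies` and its optional `stem` argument are assumptions made for illustration. Any callable that maps a word to its stem can be passed in, for example `nltk.stem.PorterStemmer().stem` if NLTK is installed.

```python
import re
from collections import Counter

def word_frequencies(text, stem=None):
    """Count words in `text`, optionally normalizing each with `stem`."""
    # Lowercase, then keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if stem is not None:
        words = [stem(w) for w in words]
    return Counter(words)

# Toy input (hypothetical); pass a real stemmer via `stem` if needed.
freqs = word_frequencies("The cat sat. The cat ran!")
top_two = freqs.most_common(2)
```

For a file, the same function could be called on the result of `open(path, encoding="utf-8").read()`.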
2023-04-06 05:19 Category: Q&A
The best IR software for my use?
I want to take what people chat about in a chat room and do the following information retrieval: Get the keywords...
2023-04-05 18:17 Category: Q&A
Ranking search keywords
The question is: how can I rank keywords that have been used in search queries in my web application, based on time and number of searches?
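One possible scheme that combines both signals is an exponentially decayed count: each search contributes a weight that halves after a chosen period, and the weights are summed per keyword. This is only a sketch of that idea, not a method from the question. The name `rank_keywords`, the `(keyword, timestamp)` input shape, and the one-week default half-life are all assumptions.

```python
import math
from collections import defaultdict

def rank_keywords(searches, now, half_life_seconds=7 * 24 * 3600):
    """Rank keywords by time-decayed search count.

    `searches` is an iterable of (keyword, unix_timestamp) pairs.
    A search made `half_life_seconds` ago counts half as much as one
    made right now.
    """
    decay = math.log(2) / half_life_seconds
    scores = defaultdict(float)
    for keyword, timestamp in searches:
        age = max(0.0, now - timestamp)  # treat future timestamps as "now"
        scores[keyword] += math.exp(-decay * age)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy log (hypothetical): two recent searches vs. three old ones.
now = 1_000_000
log = [("lucene", now), ("lucene", now - 100),
       ("solr", now - 1000), ("solr", now - 1000), ("solr", now - 1000)]
ranked = rank_keywords(log, now, half_life_seconds=100)
```

Here "lucene" ranks above "solr" even though it was searched fewer times, because its searches are much more recent relative to the half-life.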
2023-04-05 02:09 Category: Q&A
How to calculate "OnTopicness" of documents using Lucene.NET
Imagine I have a huge database of threads and posts (about 10,000,000 records) from different forum sites, including several subforums, that serve as my Lucene documents.
2023-04-02 12:54 Category: Q&A
Any ideas on what other web page meta information I can use to classify a page's relevance to some theme? [closed]
Closed. This question is opinion-based. It is not currently accepting answers. Want to improve this question? Update the question so it can be answered with facts and citations by editing th...
2023-04-02 09:10 Category: Q&A
How can I select the div elements that do not have other divs inside them?
I'm using Java and Jsoup to parse HTML pages, and I want to get all the divs that do not contain other divs so I can print the text they contain.
2023-03-28 15:53 Category: Q&A