information-retrieval
Question Answering with Lucene
For a toy project, I want to implement an automated question answering system with Lucene and I'm trying to figure out a reasonable way to implement it. The basic operation is as follo[Details]
2023-02-06 11:53 Category: Q&A
Wikipedia deletion log download
I need the Wikipedia deletion log for my project. I was able to find deletion logs here http://en.wikipedia.org/w/index.php?title=Special:Log&type=delete&user=&page=&year=&month=-1&am[Details]
2023-02-03 07:15 Category: Q&A
Java word counter
I am having a problem counting words in Java. I have a Map: Map<String,StringBuilder> files_and_text = new TreeMap<String,StringBuilder>();[Details]
2023-01-31 11:41 Category: Q&A
Hashtag filtering with Ruby on Rails
New Rails programmer here. There is probably a pretty simple solution to this, but I just can't figure it out. Here's the deal: let's say I have many posts on a single page. Each post has a content fie[Details]
2023-01-29 08:34 Category: Q&A
How to evaluate a search/retrieval engine using trec_eval?
Has anybody used TREC_EVAL? I need a "Trec_EVAL for dummies". I'm trying to evaluate a few search engines to compare parameters like Recall-Precision, ranking quality, etc. for my the[Details]
2023-01-26 17:13 Category: Q&A
Eclipse - Number of times compiled over project?
My team just finished a huge project. We're going to present it to class in a week, and want to add some interesting stats.[Details]
2023-01-26 01:03 Category: Q&A
What is proper Tokenization algorithm? & Error: TypeError: coercing to Unicode: need string or buffer, list found
I'm doing an information retrieval task. As part of pre-processing I want to do: stopword removal[Details]
2023-01-22 14:15 Category: Q&A
Searching a normal query in an inverted index
I have a full inverted index in the form of a nested Python dictionary. Its structure is: {word : { doc_name : [location_list] } }[Details]
2023-01-20 08:05 Category: Q&A
Web information extraction
I want to create a shopping search engine that shows products from many websites, and I wonder how I can retrieve information about products from those sites.[Details]
2023-01-19 21:25 Category: Q&A
C# algorithm for N-gram
I intend to use the n-gram code from this article. The algorithm produces these tri-gram results:[Details]
2023-01-18 09:00 Category: Q&A