
Question on multi-probe Locality-Sensitive Hashing

Developer https://www.devze.com 2022-12-25 08:52 Source: web

Sorry for asking such a beginner question, but I urgently need some guidance on how to use multi-probe LSH, so I have not done much research myself. I know there is a library called LSHKIT that implements the algorithm, but I am having trouble figuring out how to use it. Right now I have a few thousand 296-dimensional feature vectors, each representing an image. The vectors are searched with a user's input image as the query, to retrieve the most similar image. I use Euclidean distance to measure the distance between vectors.
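For a collection of only a few thousand vectors, an exact linear scan is a useful baseline to compare any LSH result against. The following is a minimal sketch using only the Python standard library; `nearest_images` is an illustrative name, not part of LSHKIT.

```python
import math

def nearest_images(query, vectors, k=5):
    """Return the indices of the k vectors closest to `query`.

    Exact brute-force search by Euclidean distance: every stored
    vector is compared against the query.
    """
    scored = [(math.dist(query, v), i) for i, v in enumerate(vectors)]
    scored.sort()  # smallest distance first; ties broken by index
    return [i for _, i in scored[:k]]

if __name__ == "__main__":
    features = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    print(nearest_images([0.9, 0.9], features, k=2))
```

The cost grows linearly with the number of vectors, which is the reason an approximate index such as multi-probe LSH becomes attractive for larger collections.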

I know this might be a rather basic question, but does anyone know how I should implement multi-probe LSH? I would be very grateful for any answer or response.

-- update --

I tried to create a model for my data with the provided tool fitdata, but it doesn't seem to accept my file. The input file has this layout: float size: 4, number of data: 20, dimension: 297, followed by my array of 297-dimensional float vectors. However, it gives me this error:

gsl: init_source.c:29: ERROR: matrix dimension n1 must be positive integer
Default GSL error handler invoked.
Aborted

Does anyone have any idea how to create an input file for fitdata?
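The layout described above (element size, number of vectors, dimension, then the raw floats) could be written from Python roughly as follows. This is a sketch under assumptions: that the three header fields are 32-bit unsigned integers in little-endian byte order, and that the vectors follow row by row as 32-bit floats. The post does not state the field widths or byte order, so they should be checked against the LSHKIT documentation. `write_matrix` and `read_header` are illustrative names, not LSHKIT functions.

```python
import struct

def write_matrix(path, vectors):
    """Write vectors as: [elem_size, count, dim] header + float32 rows.

    Assumed layout (not confirmed by the post): three little-endian
    uint32 header fields, then count * dim little-endian float32 values.
    """
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    with open(path, "wb") as f:
        f.write(struct.pack("<III", 4, len(vectors), dim))
        for v in vectors:
            f.write(struct.pack(f"<{dim}f", *v))

def read_header(path):
    """Read back (elem_size, count, dim) to sanity-check a file."""
    with open(path, "rb") as f:
        return struct.unpack("<III", f.read(12))
```

Reading the header back before passing the file to a tool is a cheap check: a dimension that comes back as zero or as an implausibly large number suggests the header fields were written with the wrong width or order.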

-- update --

Sorry for the late update; I have now tried out LSH. You can use text2bin to format the data for fitdata. The text file contains the feature vectors of the image or audio files, with each row representing one vector. After that, use mplsh-tune to get the M and W parameters. To construct the index, you can use the scan tool to sample the required set of queries, and then use mplsh-run to get the index. Right now I am trying to figure out how to use the index and how to link the library into my own code. Does anybody have any idea about this?
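To make the roles of M and W concrete, here is a simplified, self-contained sketch of the multi-probe idea for Euclidean distance. It is not LSHKIT's API and not the full published algorithm. Each table uses M hash functions of the form floor((a·v + b) / W), and a query also probes the neighbouring buckets whose boundaries lie closest to it. The class name, the number of tables `L`, and the probe count `T` are assumptions made for illustration, and only single-coordinate perturbations are generated here.

```python
import math
import random
from collections import defaultdict

class MultiProbeLSH:
    def __init__(self, dim, M=8, W=4.0, L=4, seed=0):
        rng = random.Random(seed)
        self.W = W
        self.data = []
        self.tables = []  # one (hash functions, buckets) pair per table
        for _ in range(L):
            funcs = [([rng.gauss(0, 1) for _ in range(dim)], rng.uniform(0, W))
                     for _ in range(M)]
            self.tables.append((funcs, defaultdict(list)))

    def _project(self, funcs, v):
        # Un-floored hash values (a.v + b) / W, one per hash function.
        return [(sum(x * y for x, y in zip(a, v)) + b) / self.W
                for a, b in funcs]

    def add(self, v):
        idx = len(self.data)
        self.data.append(v)
        for funcs, buckets in self.tables:
            key = tuple(math.floor(p) for p in self._project(funcs, v))
            buckets[key].append(idx)

    def _probes(self, proj, T):
        base = [math.floor(p) for p in proj]
        steps = []
        for j, p in enumerate(proj):
            frac = p - base[j]
            steps.append((frac, j, -1))        # distance to lower boundary
            steps.append((1.0 - frac, j, +1))  # distance to upper boundary
        steps.sort()
        yield tuple(base)                      # the query's own bucket
        for _, j, delta in steps[:T]:          # T nearest neighbouring buckets
            key = list(base)
            key[j] += delta
            yield tuple(key)

    def query(self, q, k=1, T=4):
        candidates = set()
        for funcs, buckets in self.tables:
            for key in self._probes(self._project(funcs, q), T):
                candidates.update(buckets.get(key, ()))
        # Rank the candidates exactly by Euclidean distance.
        return sorted(candidates, key=lambda i: math.dist(q, self.data[i]))[:k]
```

A larger W or smaller M makes buckets coarser (more candidates, higher recall, slower queries), while probing extra buckets with T lets fewer tables reach a similar recall. The result is approximate: a true nearest neighbour that lands in none of the probed buckets is missed.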


Let me instead point you to spectral hashing, which kicks LSH's butt big time. Bonus: the authors have MATLAB code on their website, which you can either use directly or use to verify your own implementation. It's also much easier to implement.


This implementation of Multi-probe LSH is much easier to use than the C++ library. It also implements LSH Forest.
