My goal is to let a person with a mobile phone snap a picture of a local landmark (a building or otherwise, e.g. a gazebo or statue) on our college campus, and have the app identify the landmark and tell them what it is.
For instance, they are walking around and they see a large building with a metal dome. They don't know what it is, but it looks interesting, so they snap a picture and the app tells them that it's the basketball center (and other relevant info).
My limited knowledge in this particular field led me to think of using neural networks and training the program to recognize particular places. If this is the case, please also give me resources for this option, as the extent of my knowledge of NN is that they can be used to recognize things if they are trained. :)
I know of the OpenCV library, but as I am not a C developer, I'd like to know if I need to go down that road before I start. I primarily work in Java, but I'm not opposed to getting my hands dirty.
Thanks!
This is in response to your original question. The best resource would be the O'Reilly book Learning OpenCV.
You can read it on Google Books for free; it uses C along with OpenCV, but you can adapt the ideas to Python or Java to suit your work.
The OpenCV library includes Haar training, plus sample programs for training it on face/text recognition. After that you'll basically have to figure things out. Another useful resource I just stumbled upon is Intel's reference manual for OpenCV. So, good luck!
Well, using your second method is the much easier one, since you know where you are from the GPS coordinates and you know which way you're facing (most mobile devices have an integrated compass and accelerometer). This is already used by several augmented-reality browsers; if you use Android you might want to have a look at "Layar"...
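To make the GPS-plus-compass idea concrete, here is a minimal Python sketch: given the user's coordinates and compass heading, compute the great-circle bearing to each landmark in a database and report the one the user appears to be facing. The landmark names, coordinates, and the 20-degree tolerance below are all made up for illustration; over short campus distances this bearing formula is more than accurate enough.

```python
import math

# Hypothetical campus landmark database: name -> (latitude, longitude) in degrees.
LANDMARKS = {
    "Basketball Center": (40.1055, -88.2284),
    "Old Main": (40.1020, -88.2272),
}

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360

def facing_landmark(lat, lon, heading_deg, tolerance=20):
    """Return the landmark whose bearing best matches the compass heading,
    if it falls within +/- tolerance degrees; otherwise None."""
    best, best_diff = None, tolerance
    for name, (llat, llon) in LANDMARKS.items():
        # Smallest signed angular difference between bearing and heading.
        diff = abs((bearing_deg(lat, lon, llat, llon) - heading_deg + 180) % 360 - 180)
        if diff <= best_diff:
            best, best_diff = name, diff
    return best
```

In practice you'd also filter by distance (so you don't match a landmark behind a nearer one) and account for compass noise, but this is the core of what the AR browsers do.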
The more user-friendly way would be via photography, since not every phone has GPS, and users always need to turn it on first...
First of all, you'd need to get the most salient structures and features of the buildings; OpenCV has some methods for that. Feature extraction is a big topic in image processing. You should probably extract edges from your image, take the prominent features/points, and compare these to a database of the features of all the buildings you have.
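As a minimal sketch of that compare-to-a-database step: each building is stored as a set of feature descriptors, and the query image's descriptors vote for whichever building gives them the most close nearest neighbours. The descriptors below are toy 2-D vectors; in a real system they would come from a feature extractor (e.g. OpenCV), be much higher-dimensional, and be searched with a proper index rather than brute force.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_building(query_features, database, max_dist=0.5):
    """Count, per building, how many query descriptors find a close enough
    nearest neighbour among that building's stored descriptors, and return
    the building with the most matches (or None if nothing matches)."""
    scores = {}
    for name, stored in database.items():
        hits = 0
        for q in query_features:
            nearest = min(euclidean(q, s) for s in stored)
            if nearest <= max_dist:
                hits += 1
        scores[name] = hits
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

The `max_dist` threshold is a made-up tuning knob; real matchers usually use a ratio test (nearest vs. second-nearest) instead of an absolute cutoff.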
You could use a neural network for training, but you'd still need a lot of reference pictures to extract data from to get a learning process.
(For comparing against the whole database of other objects, you might even want to have a look at doing the calculation server-side instead of on the phone.)
Hope that helps...
Doing this as a computer vision task would be very difficult for someone with little computer vision experience - 10 years ago it was an entirely unsolved problem. But to get you started:
Neural networks (or more precisely, NNs with back-propagation-style training) are rather old hat, and no longer the method of choice. Random forests are popular, mostly because they are quite flexible, reasonably easy to implement, and have, on average, no worse performance than the other classification methods around. Criminisi et al. 2011 is the standard paper: http://research.microsoft.com/pubs/155552/decisionForests_MSR_TR_2011_114.pdf
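To show the ensemble idea behind random forests (this is a toy, not the method from the paper): train many randomized trees, each on a bootstrap sample and a random subset of features, then take a majority vote. The sketch below shrinks the trees to single-split stumps so it fits in a few lines; a real forest grows much deeper trees, and in practice you'd reach for an existing library.

```python
import random
from collections import Counter

def best_stump(X, y, feature):
    """Find the threshold split on one feature that minimizes
    misclassifications, predicting the majority class on each side."""
    best = None
    for t in sorted(set(row[feature] for row in X)):
        left = [lab for row, lab in zip(X, y) if row[feature] <= t]
        right = [lab for row, lab in zip(X, y) if row[feature] > t]
        pred_l = Counter(left).most_common(1)[0][0] if left else None
        pred_r = Counter(right).most_common(1)[0][0] if right else None
        errors = (sum(1 for lab in left if lab != pred_l)
                  + sum(1 for lab in right if lab != pred_r))
        if best is None or errors < best[0]:
            best = (errors, t, pred_l, pred_r)
    return best[1], best[2], best[3]

def train_forest(X, y, n_trees=25, seed=0):
    """Each 'tree' is a stump fit on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        bx, by = [X[i] for i in idx], [y[i] for i in idx]
        f = rng.randrange(d)                         # random feature per tree
        t, pl, pr = best_stump(bx, by, f)
        forest.append((f, t, pl, pr))
    return forest

def predict(forest, x):
    """Majority vote over all stumps."""
    votes = [pl if x[f] <= t else pr for f, t, pl, pr in forest]
    return Counter(v for v in votes if v is not None).most_common(1)[0][0]
```

For image recognition the feature vectors would be things like the Haar wavelet responses mentioned below, not raw pixels.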
Last time I checked the literature (a few years ago now) there appeared to be two good first choices of image feature: SIFT or sparse Haar wavelets.
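For intuition on the Haar wavelet part: the 1-D Haar transform repeatedly replaces a signal with pairwise averages plus pairwise differences (the detail coefficients), and 2-D image features apply the same idea along rows and columns of pixel intensities. A minimal sketch, assuming the input length is a power of two:

```python
def haar_step(signal):
    """One level of the 1-D Haar transform: pairwise averages
    followed by pairwise differences (detail coefficients)."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages + details

def haar_transform(signal):
    """Full Haar decomposition of a length-2^n signal: keep re-transforming
    the average half until a single overall average remains."""
    out = list(signal)
    n = len(out)
    while n > 1:
        out[:n] = haar_step(out[:n])
        n //= 2
    return out
```

The "sparse" part of sparse Haar features is that only a handful of the resulting coefficients (the large ones) are kept as the feature vector.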
Have a look at Criminisi et al. 2008 (http://research.microsoft.com/pubs/72423/Criminisi_bmvc2008.pdf) for a random-forest and Haar-wavelet based object recognition system.
An alternative approach, from Fergus et al. 2007 (http://cs.nyu.edu/~fergus/papers/fergus_ijcv.pdf), uses a simple image-patch model tied together with a Bayesian network.
OpenCV is probably as good a place as any to start looking for existing code. MATLAB also claims to have good support for these tasks.