I was having trouble coming up with a way to describe the problem area that I want to understand better, so I set up the following scenario to help illustrate it.
Given the following image, how would I go about programming something that could find all of the happy faces that match the image in position 1 (call it the template image) and disregard sad face images like those in positions 2 and 5?
...I'm not looking for anyone to solve it for me, I just need an insightful first step to get me started as it's uncharted territory for me.
What would this be called? What should I be searching Google and Stack Overflow for in order to find helpful information? Does anyone have a library or code snippet that can help get me started?
Also, I'm a .NET / C# programmer by trade so anything that happens to be in my native language is especially appreciated but not a deal-breaker.
Thanks in advance... Mike
The right technique depends on the actual scenario. The problem goes by several names, including content-based retrieval, template matching and image description.
My suggestions:
If your scenario is like the faces, rotated at known angles with known sizes, look for simpler techniques, such as the correlation of two images. Compute the correlation for each known angle and keep the best score.
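As a rough illustration of the correlation idea, here is a minimal sketch in plain Python (chosen only to keep the example short; the same logic ports directly to C#). It assumes square grayscale images stored as lists of lists and rotations limited to multiples of 90 degrees. The function names `ncc` and `best_match` are made up for this example and do not come from any library.

```python
import math

def rotate90(img):
    """Rotate a square list-of-lists image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def ncc(a, b):
    """Normalized cross-correlation of two equally sized grayscale images.

    The result lies in [-1, 1]; values near 1 mean the images agree up to
    a change in brightness and contrast."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # flat images carry no information

def best_match(template, candidate):
    """Correlate the candidate with the template at 0, 90, 180 and 270
    degrees and return (best_score, angle_of_best_score)."""
    best = (-1.0, 0)
    rotated = template
    for angle in (0, 90, 180, 270):
        score = ncc(rotated, candidate)
        if score > best[0]:
            best = (score, angle)
        rotated = rotate90(rotated)
    return best
```

A candidate would then be accepted as a match when the best score exceeds some threshold. The threshold is something to tune on real images; since the happy and sad faces differ only in the mouth, it will probably need to be fairly strict.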
If you know that the only variation between images is the rotation, that is, you have only the happy and sad faces rotated, without other distortions, you can look for rotation-invariant matching methods. Fourier theory may help you there, as may mappings to polar coordinates combined with correlation.
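To make the polar-coordinate idea concrete: once an image is resampled on a (radius, angle) grid around its centre, a rotation of the image becomes a circular shift along the angle axis, so the rotation can be found by trying every shift. The sketch below is plain Python under some simplifying assumptions: square images with an odd side length, nearest-neighbour sampling, and `n_radii` no larger than half the side minus one. The names `to_polar` and `best_rotation` are hypothetical, and a sum of squared differences stands in for a proper correlation.

```python
import math

def to_polar(img, n_radii, n_angles):
    """Resample a square grayscale image on a (radius, angle) grid around
    its centre using nearest-neighbour lookup."""
    c = (len(img) - 1) / 2.0
    polar = []
    for r in range(1, n_radii + 1):
        row = []
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            y = int(round(c + r * math.sin(theta)))
            x = int(round(c + r * math.cos(theta)))
            row.append(img[y][x])
        polar.append(row)
    return polar

def best_rotation(template, candidate, n_radii, n_angles):
    """Return (mismatch, angle_in_degrees) for the circular shift of the
    template's polar map that best matches the candidate's polar map.

    With image rows running downwards, a positive angle corresponds to a
    clockwise rotation of the template."""
    pt = to_polar(template, n_radii, n_angles)
    pc = to_polar(candidate, n_radii, n_angles)
    best = None
    for shift in range(n_angles):
        # Sum of squared differences after shifting the angle axis.
        ssd = sum((rt[(k - shift) % n_angles] - rc[k]) ** 2
                  for rt, rc in zip(pt, pc)
                  for k in range(n_angles))
        if best is None or ssd < best[0]:
            best = (ssd, shift * 360.0 / n_angles)
    return best
```

A small mismatch at the best shift suggests the candidate is a rotated copy of the template; a large one suggests a different face. The Fourier route mentioned above would replace the explicit loop over shifts with a transform along the angle axis, which this sketch does not attempt.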
In the worst case, where you have several variations, you will need to look into image descriptors and pattern matching techniques. These also depend on the image type, and there are several of them. If you end up with these, you'll have a scheme with some libraries/code to extract features from the images and a classifier to tell you which are the same and which are not, with some kind of confidence (such as a distance measure between the feature vectors).
The simplest technique would probably be template matching. The difference between your example images is pretty small though, so it might be hard to distinguish, say, images 1 and 5 in your example.
A possible algorithm is:
- Compute gradient of the image
- For each gradient vector, compute the gradient direction
- Compute the orientation histogram (angle vs frequency) of the gradient vectors
This orientation histogram will be distinct for the "happy" vs the "sad" smiley.
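The three steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the image is a grayscale list of lists, gradients are central differences on interior pixels, and the bin count and the `min_magnitude` cut-off are arbitrary choices made for this example.

```python
import math

def orientation_histogram(img, bins=8, min_magnitude=1e-9):
    """Histogram of gradient directions for a grayscale image.

    Returns `bins` counts covering [0, 360) degrees. Pixels whose
    gradient magnitude is at most `min_magnitude` are skipped because
    their direction is undefined."""
    hist = [0] * bins
    height, width = len(img), len(img[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            if math.hypot(gx, gy) <= min_magnitude:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(angle / (360.0 / bins)) % bins] += 1
    return hist
```

Two images can then be compared through some distance between their histograms (Euclidean distance is one simple option). Note that rotating the image shifts the histogram along the angle axis, so with rotated faces the comparison would presumably have to try each circular shift as well.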
have fun.
A simple poor man's algorithm, just to get the job done in this case, could be:
- Determine the bounding box of the image and assume its centre is the centre of the circle.
- Within the circle, search for the two eyes as blobs, i.e. objects that contain 20 or so pixels in total and fit within a small defined rectangle.
- Once you have the location of the two eyes, you can determine the slope of the line joining them and hence the orientation of the face.
- The distance from the point midway between the two eyes, straight down through the centre of the circle to the mouth, will take one of two possible values, i.e. sad or happy.
Quick and dirty and hardcoded to this particular image but it would do the job quickly.
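A minimal sketch of those steps in plain Python follows, assuming the face has already been cropped and thresholded into a binary image (1 for dark feature pixels, 0 for background, with the face outline removed). The helper names and the `min_pixels` and `max_side` parameters are hypothetical and would need tuning for real images.

```python
import math

def find_blobs(binary):
    """Group foreground pixels into 8-connected blobs (lists of (y, x))."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            stack, pixels = [(y, x)], []
            while stack:
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            blobs.append(pixels)
    return blobs

def is_eye(pixels, min_pixels, max_side):
    """An eye is a blob that is large enough and fits in a small square."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return (len(pixels) >= min_pixels
            and max(ys) - min(ys) < max_side
            and max(xs) - min(xs) < max_side)

def centroid(pixels):
    """Mean (y, x) position of a blob."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)

def face_angle(eye_a, eye_b):
    """Angle in degrees of the line joining the two eye centroids."""
    return math.degrees(math.atan2(eye_b[0] - eye_a[0], eye_b[1] - eye_a[1]))

def mouth_distance(eye_a, eye_b, mouth_pixels):
    """Distance from the eye midpoint to the mouth, measured perpendicular
    to the eye line, using the mouth pixel closest to that perpendicular."""
    my, mx = (eye_a[0] + eye_b[0]) / 2.0, (eye_a[1] + eye_b[1]) / 2.0
    dy, dx = eye_b[0] - eye_a[0], eye_b[1] - eye_a[1]
    norm = math.hypot(dy, dx)
    uy, ux = dy / norm, dx / norm  # unit vector along the eye line
    py, px = min(mouth_pixels,
                 key=lambda p: abs((p[0] - my) * uy + (p[1] - mx) * ux))
    return abs((py - my) * ux - (px - mx) * uy)
```

With the eyes and mouth separated by `is_eye`, a smile should give a larger `mouth_distance` than a frown, because the middle of the mouth sits further from the eyes. Where to put the cut-off between the two distances is something to measure on the actual images.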
The AForge option is probably a better generalised approach.