I need to write an algorithm that can detect which state an application (used to fill out forms) is in, based on screenshots.
It has two inputs. A: approximately 2-10 screenshots of the application, each with a different tab selected. These are made by the user, so I can instruct them to do things like "select the upper area of the program" or "select the whole window", but I cannot expect pixel-perfect precision. B: a screenshot of one of these states. The forms are filled with different data. The goal is to determine which screenshot from "A" shows the same state as "B".
An example screenshot:
An example based on this screenshot:
Input A: 10 screenshots from this program with the "Menu", "Sale Order", "Purchase order", ... tabs selected. Input B: the screenshot above. The task is to determine which of the 10 screenshots matches this image.
I have tried an image descriptor algorithm (SURF), but it has a really high error rate, since it is not made for such tasks.
Does anyone have an idea how to do such a classification? Should I apply some filter (e.g. median or blur) to the screenshots and then run them through some classification algorithm? Or extract some other feature to classify on (FFT, histogram, ...)?
I guess you can use the width of the selected tab instead of its label, since the width is much easier to calculate. For example, {"Menu", "Sale Order", "Purchase Order"} all have different widths.
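A rough sketch of the width idea, in plain Python so it runs without extra libraries. It assumes (this is not stated in the question) that each screenshot has been reduced to one row of grayscale values taken across the tab strip on a line without text, and that the selected tab has a known, distinct background value. The names `selected_tab_span`, `match_by_tab_width`, `selected_value` and `tolerance` are made up for illustration.

```python
def selected_tab_span(row, selected_value, tolerance=10):
    """Return (start, width) of the longest run of pixels close to selected_value."""
    best_start, best_len = 0, 0
    run_start = None
    # A trailing None acts as a sentinel so a run touching the right edge is closed.
    for x, value in enumerate(list(row) + [None]):
        inside = value is not None and abs(value - selected_value) <= tolerance
        if inside and run_start is None:
            run_start = x
        elif not inside and run_start is not None:
            if x - run_start > best_len:
                best_start, best_len = run_start, x - run_start
            run_start = None
    return best_start, best_len


def match_by_tab_width(reference_rows, query_row, selected_value):
    """Return the index of the reference whose selected-tab width is closest to the query's."""
    query_width = selected_tab_span(query_row, selected_value)[1]
    widths = [selected_tab_span(row, selected_value)[1] for row in reference_rows]
    return min(range(len(widths)), key=lambda i: abs(widths[i] - query_width))


# Hypothetical data: background value 200, selected-tab value 255.
references = [
    [200] * 3 + [255] * 4 + [200] * 20,    # narrow tab selected
    [200] * 7 + [255] * 10 + [200] * 10,   # medium tab selected
    [200] * 12 + [255] * 14 + [200] * 1,   # wide tab selected
]
query = [200] * 9 + [255] * 10 + [200] * 8  # same medium tab, shifted by 2 px
print(match_by_tab_width(references, query, 255))
```

Because only the width is compared, a horizontal shift between the user's screenshots does not matter. Two tabs with nearly equal widths would be ambiguous, in which case the start position of the run could be used as a second feature.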
If you have to look inside the tab, you can attempt some template matching.
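As an illustration of template matching, here is a minimal brute-force version using the sum of absolute differences (SAD), written in plain Python. It assumes images are lists of rows of grayscale values and that one small template (for example, the label area of the selected tab) has been cropped from each reference screenshot. The function names are hypothetical, and a real implementation would more likely use an image library's optimized matcher.

```python
def sad_at(image, template, top, left):
    """Sum of absolute differences between the template and the image patch at (top, left)."""
    return sum(
        abs(image[top + dy][left + dx] - template[dy][dx])
        for dy in range(len(template))
        for dx in range(len(template[0]))
    )


def best_match(image, template):
    """Slide the template over the image and return (lowest SAD, top, left).

    Assumes the template is no larger than the image in either dimension.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    return min(
        (sad_at(image, template, y, x), y, x)
        for y in range(ih - th + 1)
        for x in range(iw - tw + 1)
    )


def classify(templates, query):
    """Return the index of the template that matches the query image best."""
    scores = [best_match(query, template)[0] for template in templates]
    return scores.index(min(scores))


# Hypothetical 2x2 templates cropped from two reference screenshots.
template_a = [[10, 200], [200, 10]]
template_b = [[200, 200], [10, 10]]

# Hypothetical query: blank 6x8 image containing template_a at row 2, column 3.
query = [[0] * 8 for _ in range(6)]
for dy in range(2):
    for dx in range(2):
        query[2 + dy][3 + dx] = template_a[dy][dx]

print(best_match(query, template_a))
print(classify([template_a, template_b], query))
```

Searching the whole query image for each template tolerates the imprecise cropping mentioned in the question, at the cost of speed. Restricting the search to a band around the expected tab position would cut the work considerably.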
Detect the text of each tab, then look at the background color. Alternatively, locate one of the menu icons for pixel-level registration, then do a pointwise sampling to determine which tab is selected.
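The pointwise sampling step could look roughly like this, once an anchor (such as the top-left corner of a menu icon) has been located in the screenshot. The sketch assumes the anchor position is already known, that each tab has a sample point at a fixed offset from the anchor, and that the selected tab has a known background value. All of the names and numbers here are hypothetical.

```python
def selected_tab(image, anchor, tab_offsets, selected_value):
    """Return the name of the tab whose sample pixel is closest to selected_value.

    image          -- list of rows of grayscale values
    anchor         -- (row, col) of the registered reference point, e.g. an icon corner
    tab_offsets    -- {tab name: (d_row, d_col)} sample point relative to the anchor
    selected_value -- assumed background value of a selected tab
    """
    anchor_row, anchor_col = anchor

    def distance(name):
        d_row, d_col = tab_offsets[name]
        return abs(image[anchor_row + d_row][anchor_col + d_col] - selected_value)

    return min(tab_offsets, key=distance)


# Hypothetical 5x30 screenshot strip: background 200, selected tab drawn with 255.
image = [[200] * 30 for _ in range(5)]
for row in (1, 2):
    for col in range(12, 20):
        image[row][col] = 255

# Hypothetical offsets of one background pixel per tab, relative to the anchor.
offsets = {"Menu": (0, 3), "Sale Order": (0, 12), "Purchase Order": (0, 22)}
print(selected_tab(image, (1, 2), offsets, 255))
```

Since every sample is taken relative to the anchor, a screenshot that is shifted by a few pixels still yields the same result as long as the anchor is found correctly.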