Recognizing image within image in C#_问答_开发者

I'd like to find an image (needle) within an image (haystack).

To keep things simple I take two screenshots of my desktop. One full size (haystack) and a tiny one (needle). I then loop through the haystack image and try to find the needle image.

capture needle and haystack screenshot
loop through haystack, looking out for haystack[i] == first pixel of needle
[if 2. is true:] loop through the 2nd to last pixel of needle and compare it to haystack[i]

Expected result: the needle image is found at the correct location.

I already got it working for some coordinates/widths/heights (A).

But sometimes bits seem to be "off" and therefore no match is found (B).

What could I be doing wrong? Any suggestions are welcome. Thanks.

var needle_height = 25;
var needle_width = 25;
var haystack_height = 400;
var haystack_width = 500;

A. example input - match

var needle = screenshot(5, 3, needle_width, needle_height); 
var haystack = screenshot(0, 0, haystack_width, haystack_height);
var result = findmatch(haystack, needle);

B. example input - NO match

var needle = screenshot(5, 5, needle_width, needle_height); 
var haystack = screenshot(0, 0, haystack_width, haystack_height);
var result = findmatch(haystack, needle);

1. capture needle and haystack image

private int[] screenshot(int x, int y, int width, int height)
{
Bitmap bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb);
Graphics.FromImage(bmp).CopyFromScreen(x, y, 0, 0, bmp.Size);

var bmd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), 
  ImageLockMode.ReadOnly, bmp.PixelFormat);
var ptr = bmd.Scan0;

var bytes = bmd.Stride * bmp.Height / 4;
var result = new int[bytes];

Marshal.Copy(ptr, result, 0, bytes);
bmp.UnlockBits(bmd);

return result;
}

2. try to find a match

public Point findmatch(int[] haystack, int[] needle)
{
var firstpixel = needle[0];

for (int i = 0; i < haystack.Length; i++)
{
    if (haystack[i] == firstpixel)
    {
    var y = i / haystack_height;
    var x = i % haystack_width;

    var matched = checkmatch(haystack, needle, x, y);
    if (matched)
        return (new Point(x,y));
    }
}    
return new Point();
}

3. verify full match

public bool checkmatch(int[] haystack, int[] needle, int startx, int starty)
{
    for (int y = starty; y < starty + needle_height; y++)
    {
        for (int x = startx; x < startx + needle_width; x++)
        {
            int haystack_index = y * haystack_width + x;
            int needle_index = (y - starty) * needle_width + x - startx;
            if (haystack[haystack_index] != needle[needle_index])
                return false;
        }
    }
    ret开发者_C百科urn true;
}

Instead of making two screenshots of your desktop with a time interval between them, I would take a screenshot once and cut "needle" and "haystack" from those same bitmap source. Otherwise you have the risk of a change of your desktop contents between the two moments where the screenshots are taken.

EDIT: And when your problem still occurs after that, I would try to save the image to a file and try again with that file using your debugger, giving you a reproducible situation.

First, there is a problem with the findmatch loop. You shouldn't just use the haystack image as an array, because you need to subtract needle's width and height from right and bottom respectively:

public Point? findmatch(int[] haystack, int[] needle)
{
    var firstpixel = needle[0];

    for (int y = 0; y < haystack_height - needle_height; y++)
        for (int x = 0; x < haystack_width - needle_width; x++)
        {
            if (haystack[y * haystack_width + x] == firstpixel)
            {
                var matched = checkmatch(haystack, needle, x, y);
                if (matched)
                    return (new Point(x, y));
            }
        }

    return null;
}

That should probably solve the problem. Also, keep in mind that there might be multiple matches. For example, if "needle" is a completely white rectangle portion of a window, there will most likely be many matches in the entire screen. If this is a possibility, modify your findmatch method to continue searching for results after the first one is found:

public IEnumerable<Point> FindMatches(int[] haystack, int[] needle)
{
    var firstpixel = needle[0];
    for (int y = 0; y < haystack_height - needle_height; y++)
        for (int x = 0; x < haystack_width - needle_width; x++)
        {
            if (haystack[y * haystack_width + x] == firstpixel)
            {
                if (checkmatch(haystack, needle, x, y))
                    yield return (new Point(x, y));
            }
        }
}

Next, you need to keep a habit of manually disposing all objects which implement IDisposable, which you have created yourself. Bitmap and Graphics are such objects, meaning that your screenshot method needs to be modified to wrap those objects in using statements:

private int[] screenshot(int x, int y, int width, int height)
{
    // dispose 'bmp' after use
    using (var bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb))
    {
        // dispose 'g' after use
        using (var g = Graphics.FromImage(bmp))
        {
            g.CopyFromScreen(x, y, 0, 0, bmp.Size);

            var bmd = bmp.LockBits(
                new Rectangle(0, 0, bmp.Width, bmp.Height),
                ImageLockMode.ReadOnly,
                bmp.PixelFormat);

            var ptr = bmd.Scan0;

            // as David pointed out, "bytes" might be
            // a bit misleading name for a length of
            // a 32-bit int array (so I've changed it to "len")

            var len = bmd.Stride * bmp.Height / 4;
            var result = new int[len];
            Marshal.Copy(ptr, result, 0, len);

            bmp.UnlockBits(bmd);

            return result;
        }
    }
}

The rest of the code seems ok, with the remark that it won't be very efficient for certain inputs. For example, you might have a large solid color as your desktop's background, which might result in many checkmatch calls.

If performance is of interest to you, you might want to check different ways to speed up the search (something like a modified Rabin-Karp comes to mind, but I am sure there are some existing algorithms which ensure that invalid candidates are skipped immediately).

I don't think your equations for haystack_index or needle_index are correct. It looks like you take the Scan0 offset into account when you copy the bitmap data, but you need to use the bitmap's Stride when calculating the byte position.

Also, the Format32bppArgb format uses 4 bytes per pixel. It looks like you are assuming 1 byte per pixel.

Here's the site I used to help with those equations: https://web.archive.org/web/20141229164101/http://bobpowell.net/lockingbits.aspx

Format32BppArgb: Given X and Y coordinates, the address of the first element in the pixel is Scan0+(y * stride)+(x*4). This Points to the blue byte. The following three bytes contain the green, red and alpha bytes.

Here are the class reference with example code that works great for my C# application for finding needle in haystack for each frame from a USB camera, in year 2018... I believe Accord is mostly a bunch of C# wrappers for fast C++ code.

Also check out the C# wrapper for Microsoft C++ DirectShow that I use to search for needle within each frame from a USB camera