Skewing an image using Perspective Transforms

Developer https://www.devze.com 2022-12-22 19:11 Source: Internet
I'm trying to perform a skew on an image, like the one shown here:

[image: Skewing an image using Perspective Transforms] (source: microsoft.com)

I have an array of pixels representing my image and am unsure of what to do with them.


A much better way to do this is by inverse mapping.

Essentially, you want to "warp" the image, right? That means every pixel in the source image goes to a predefined point. The predefinition is a transformation matrix that tells you how to rotate, scale, translate, shear, etc. the image. It takes some coordinate (x,y) on your image and says, "OK, the new position for this pixel is (f(x,y),g(x,y))."

That's essentially what "warping" does.

Now, think about scaling an image ... say, to ten times the size. That means the pixel at (1,1) becomes the pixel at (10,10), and the next pixel, (1,2), becomes the pixel at (10,20) in the new image. But if you keep doing this, you will have no value for a pixel such as (13,13), because (1.3,1.3) is not defined in your original image, and you will have a bunch of holes in your new image. You'll have to interpolate that value from the four filled pixels around it in the new image, i.e. (10,10), (10,20), (20,10), (20,20) - this is called bilinear interpolation.
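To make the "holes" concrete, here is a minimal sketch of forward mapping for the ten-times scale. The function name `forward_scale` and the tiny 2x2 test image are made up for illustration, and indices are zero-based, so only every tenth row and column receives a value.

```python
def forward_scale(src, factor):
    """Forward-map each source pixel (x, y) to (x*factor, y*factor).

    Output pixels that no source pixel lands on stay None (the holes).
    """
    h, w = len(src), len(src[0])
    dst = [[None] * (w * factor) for _ in range(h * factor)]
    for y in range(h):
        for x in range(w):
            dst[y * factor][x * factor] = src[y][x]
    return dst


src = [[1, 2],
       [3, 4]]
dst = forward_scale(src, 10)

# Count how many output pixels actually received a value.
filled = sum(v is not None for row in dst for v in row)
total = len(dst) * len(dst[0])
print(filled, "of", total, "output pixels received a value")
```

Almost every output pixel is left empty, which is why forward mapping needs a second hole-filling pass.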

But here's another problem. Suppose your transformation wasn't simple scaling but a more general affine transform (like the sample image you've posted). Then (1,1) would become something like (2.34,4.21), you'd have to round that to (2,4) in the output image, and then you'd still have to do bilinear (or some more complicated) interpolation on the new image to fill in the holes - messy, right?

Now, there's no way to get out of interpolation, but we can get away with doing bilinear interpolation just once. How? Simple: inverse mapping.

Instead of looking at it as the source image going to the new image, think of where the data for the new image will come from in the source image! So, (1,1) in the new image will come from some reverse mapping in the source image, say, (3.4, 2.1) and then do bilinear interpolation on the source image to figure out the corresponding value!
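A minimal sketch of that idea in plain Python, assuming a grayscale image stored as a list of rows of floats. The names `bilinear` and `warp_inverse` and the edge-clamping behaviour are my own choices for illustration, not something from the question.

```python
import math


def bilinear(src, x, y):
    """Sample src at a real-valued (x, y), clamping to the image edge."""
    h, w = len(src), len(src[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally on the two neighbouring rows, then vertically.
    top = src[y0][x0] * (1 - fx) + src[y0][x1] * fx
    bottom = src[y1][x0] * (1 - fx) + src[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_inverse(src, inverse_map, out_w, out_h):
    """For every output pixel, look up where it comes from in src."""
    return [[bilinear(src, *inverse_map(x, y)) for x in range(out_w)]
            for y in range(out_h)]


src = [[0.0, 10.0],
       [20.0, 30.0]]
# Inverse of a 2x upscale: output (x, y) comes from source (x/2, y/2).
out = warp_inverse(src, lambda x, y: (x / 2.0, y / 2.0), 4, 4)
for row in out:
    print(row)
```

Every output pixel gets a value in a single pass, so there are no holes to patch afterwards.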

Transformation matrix

Ok, so how do you define a transformation matrix for an affine transformation? This website tells you how to do it by compositing different transformation matrices for rotation, shearing, etc.

Transformations: [image not preserved]

Compositing: [image not preserved]

The final matrix is obtained by compositing the individual matrices in order. You then invert it to get the inverse mapping, and use that to compute the positions of the pixels in the source image and interpolate.
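As a rough sketch of compositing and inverting, assuming 3x3 homogeneous matrices whose last row is [0, 0, 1]. The helper names (`matmul3`, `invert_affine`, and so on) and the particular angle, scale and offset are illustrative only.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def scaling(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def invert_affine(m):
    """Invert an affine matrix [A t; 0 0 1] as [A^-1, -A^-1 t; 0 0 1]."""
    a, b, tx = m[0]
    c, d, ty = m[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]


def apply(m, x, y):
    """Apply an affine 3x3 matrix to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Scale first, then rotate, then translate (rightmost matrix acts first).
forward = matmul3(translation(5.0, -2.0),
                  matmul3(rotation(math.radians(30)), scaling(2.0, 3.0)))
inverse = invert_affine(forward)

x2, y2 = apply(forward, 1.0, 1.0)   # source -> output
x1, y1 = apply(inverse, x2, y2)     # output -> back to source
print((x2, y2), (x1, y1))
```

In a warp loop, `inverse` is the matrix you would apply to each output pixel to find where to sample the source.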


If you don't feel like re-inventing the wheel, check out the OpenCV library. It implements many useful image processing functions, including perspective transformations. Check out cvWarpPerspective, which I've used to accomplish this task quite easily.


As KennyTM commented, you just need an affine transform: a linear mapping followed by a translation, obtained by multiplying every pixel position by a matrix M and adding a translation vector V. It's simple math:

end_pixel_position = M*start_pixel_position + V

where M is a composition of simple transformations such as rotations or scalings, and V is a vector that translates every point of your image by adding fixed offsets to its coordinates.

For example if you want to rotate the image you can have a rotation matrix defined as:

    | cos(a) -sin(a) |
M = |                |
    | sin(a)  cos(a) |

where a is the angle you want to rotate your image by.

While scaling uses a matrix of the form:

    | s1   0 |
M = |        |
    | 0   s2 |

where s1 and s2 are the scaling factors along the two axes.

For translation you just have the vector V:

    | t1 |
V = |    |
    | t2 |

that adds t1 and t2 to pixel coordinates.

You then combine the matrices into a single transformation. For example, if you have scaling, rotation and translation, you'll end up with something like:

| x2 |             | x1 |
|    | = M1 * M2 * |    | + T
| y2 |             | y1 |

where:

  • x1 and y1 are the pixel coordinates before applying the transform,
  • x2 and y2 are the pixel coordinates after the transform,
  • M1 and M2 are the matrices used for scaling and rotation (REMEMBER: the composition of matrices is not commutative! Usually M1 * M2 * Vect != M2 * M1 * Vect),
  • T is a translation vector used to translate every pixel.
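To see the non-commutativity in numbers, here is a small sketch using 2x2 matrices as nested lists. The helper names `mat_vec` and `transform`, the 90-degree angle, the non-uniform scale and the offset are all arbitrary choices for illustration.

```python
import math


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def transform(m1, m2, t, v):
    """Compute M1 * M2 * v + T; M2 is applied to v first, then M1."""
    x, y = mat_vec(m1, mat_vec(m2, v))
    return (x + t[0], y + t[1])


a = math.pi / 2                      # rotate by 90 degrees
rot = [[math.cos(a), -math.sin(a)],
       [math.sin(a), math.cos(a)]]
scale = [[2.0, 0.0],
         [0.0, 1.0]]                 # stretch x only
T = (10.0, 0.0)
p = (1.0, 0.0)

scale_then_rotate = transform(rot, scale, T, p)   # roughly (10, 2)
rotate_then_scale = transform(scale, rot, T, p)   # roughly (10, 1)
print(scale_then_rotate, rotate_then_scale)
```

The same point ends up in two different places depending on the order, so pick the order deliberately and keep it consistent when you later invert the transform.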