开发者

How to determine 3D position of an object from perspective?

开发者 https://www.devze.com 2022-12-11 17:39 出处:网络
I have a situation as shown in image, where I need to determine x,y of red and blue square in rectangular space but under perspective. Length and width of the rectangle are know.

I have a situation as shown in image, where I need to determine x,y of red and blue square in rectangular space but under perspective. Length and width of the rectangle are know. The rectangles are the same just viewed from different locations. Differ开发者_Python百科ence is that one is viewed with 90 degrees offset to the right of the rectangle.

Thanks...

How to determine 3D position of an object from perspective?


Note that the following is a general solution. I think I've worked the math out right; anybody who knows better should comment (or edit, if you're sure...)

First, you need to calibrate your cameras: where they are, where they're pointing, and what their field of view is. This will come as (or else needs to be reduced to) a projection matrix for each camera, which transforms homogeneous worldspace points into homogeneous view points. If you don't know this a priori, you might be able to figure it out from known common features (say, the projected gray rectangles in your diagram)

Then, you need to identify your objects and their position on each image. Say, find the center of the red and blue rectangles.

This will give you a linear system for each object or feature:

P=[x,y,z,w]^t = world point (homog. 4-vector)
p1=[x1,y1,w1]^t = point on screen 1;  M1= 3x4 projection matrix 1:  p1=M1*P
p2=[x2,y2,w2]^t = point on screen 2;  M2= 3x4 projection matrix 2:  p2=M2*P

Your known data are: p1/w1=(u1,v1,1) and p2/w2=(u2,v2,1); multiply these out by variables w1 and w2, to get:

(u1,v1) are constant  ->  p1=[u1*w1,v1*w1,w1]^t
(u2,v2) are constant  ->  p2=[u2*w2,v2*w2,w2]^t
assume that w=1       ->  P=[x,y,z,1]^t

Finally, you have a system of 6 equations in 5 variables: (x,y,z,w1,w2). Since this system is overdetermined, you can solve it using least squares methods.

One way to understand the overdetermined bit: given a pair of cameras (as described by their matrices), you expect them to show the scene consistently. If one of the cameras is misaligned (i.e., if the camera matrix does not reflect the actual camera perfectly), it may show objects in a location higher or lower than it ought to (say), so that the result of that camera is inconsistent with the other one.

Since you are likely using floating point (and possibly even real-world data), your values will never be perfectly accurate anyway, so you will always need to deal with this problem. Least squares allows you to solve this kind of overdetermined system; it also provides error values that may help in diagnosing and solving data problems.


Do you have basic linear algebra knowlege? If yes, than it's easy.

1) You need the projection matrix of both projections (call them P1 and P2 that are 3x2 matrices) 2) You have the equations : t(x1,y1) = P1 t(x,y,z) and t(x2,y2) = P2 t(x,y,z) (where t is the transposition of the vector) 3) You get a system of 3 unknows and 4 equations

Maybe you don't know the projection matrices, then you have first to find them out.

You can most likely make up something fancier (just one 3 by 4 matrix that should have an pseudo-inverse on its left).

If you don't know anything about linear algebra, well... just ask and I'll develop.

PS: sorry if the math stuff is in bad english


The Affine matrix in image processing should help you determine its exact location if this is a linear system. [x,y] =

[Sxcos0, sin0

-sin0, Sycos0]

http://www.sci.utah.edu/~acoste/uou/Image/project3/ArthurCOSTE_Project3.pdf https://www.cse.unr.edu/~bebis/CS485/Lectures/GeometricTransformations.ppt

0

精彩评论

暂无评论...
验证码 换一张
取 消