Why do we multiply the matrices the way we do? [closed]

Closed. This question is off-topic. It is not currently accepting answers.

Closed 11 years ago.

Why do we do it by multiplying the rows of the first matrix with the columns of the second? What's the practical use of it, and who invented it? Logically, 4x2 means four times two or two times four. So why can't matrix multiplication just be the product of corresponding elements?

It's one of the things that baffles me.

For numbers, 2x4 = 4x2 because multiplication of numbers is commutative. Matrices don't commute in general, so commutativity of the underlying numbers really has nothing to do with it.
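To make the non-commutativity concrete, here is a minimal sketch in plain Python. The helper `matmul` and the two example matrices are my own choices for illustration, not anything from the question.

```python
def matmul(A, B):
    """Row-by-column product of two matrices stored as lists of rows."""
    inner = len(B)
    assert all(len(row) == inner for row in A), "inner dimensions must match"
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Two arbitrary 2x2 examples (assumed for illustration).
A = [[1, 1],
     [0, 1]]   # a shear
B = [[0, 1],
     [1, 0]]   # swaps the two coordinates

AB = matmul(A, B)   # swap first, then shear
BA = matmul(B, A)   # shear first, then swap

print(AB)  # [[1, 1], [1, 0]]
print(BA)  # [[0, 1], [1, 1]]
```

The entries are ordinary integers, which commute with each other, yet the two products differ.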

The idea is that a vector (by which I'll mean a column vector with the entries written vertically) is an entity in a vector space. This vector space has addition and scalar multiplication defined on it. It also comes with a basis, {e_n}. e_i is just the vector with 1 in the i'th component and 0's elsewhere. Any vector can be written as a linear combination of the {e_n}. For example, in a 2-dimensional space,

|x_1|       |1|       |0|
|x_2| = x_1 |0| + x_2 |1|

A matrix acts on this vector as a linear transformation and yields a new vector. A linear transformation is just a function, T, with T(x + y) = T(x) + T(y) and c T(x) = T(c x) for any vectors, x and y and any real number c (though we can take it over other fields). So a matrix A acts on a vector x and yields another vector y. A x = y.

|a_11 a_12| |x_1|   |y_1|   |x_1 a_11 + x_2 a_12|
|a_21 a_22| |x_2| = |y_2| = |x_1 a_21 + x_2 a_22|

But we can view the matrix as a set of vectors made of its columns, so that's just the same as

x_1 |a_11| + x_2 |a_12|
    |a_21|       |a_22|

So we've re-expressed the definition for the action of a matrix on a vector (an m*n matrix times an n*1 matrix) as a linear combination of the columns of the matrix.
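As a quick check of that claim, the sketch below computes A x two ways: with the usual row-times-vector rule, and as a linear combination of the columns of A. The function names and the 3x2 example are assumptions made for illustration.

```python
def matvec_rows(A, x):
    """Usual rule: entry i is the dot product of row i of A with x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matvec_columns(A, x):
    """Same product, built up as x_1 * (column 1) + ... + x_n * (column n)."""
    y = [0] * len(A)
    for j, xj in enumerate(x):
        for i in range(len(A)):
            y[i] += xj * A[i][j]   # add x_j times the j-th column
    return y

# A 3x2 matrix and a 2-vector, chosen arbitrarily.
A = [[1, 2],
     [3, 4],
     [5, 6]]
x = [10, -1]

print(matvec_rows(A, x))     # [8, 26, 44]
print(matvec_columns(A, x))  # [8, 26, 44]
```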

This is what allows us to conflate a matrix with a linear transformation. To express a given linear transformation, T, as a matrix, we just put T(e_i) in the i'th column of the matrix. Call this matrix A_T. Then A_T x = x_1 T(e_1) + x_2 T(e_2) + ... + x_n T(e_n). But by linearity of T, if x = x_1 e_1 + x_2 e_2 + ... + x_n e_n, then T(x) = x_1 T(e_1) + x_2 T(e_2) + ... + x_n T(e_n). But this is exactly what we wrote before for A_T x. So the law for multiplying a vector by a matrix is required to allow us to represent linear transformations as matrices.
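The recipe "put T(e_i) in the i'th column" can be sketched in a few lines of Python. The names `matrix_of`, `apply` and `rotate_quarter_turn` are hypothetical, and the quarter-turn rotation is just one convenient linear map to try it on.

```python
def matrix_of(T, n):
    """Build A_T by placing T(e_i) in column i, for the standard basis of R^n."""
    columns = [T([1 if k == i else 0 for k in range(n)]) for i in range(n)]
    # `columns` holds the columns; transpose to get the usual list of rows.
    return [list(row) for row in zip(*columns)]

def apply(A, x):
    """Matrix times vector, row by row."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def rotate_quarter_turn(v):
    """An example linear map on R^2: rotate by 90 degrees counterclockwise."""
    return [-v[1], v[0]]

A_T = matrix_of(rotate_quarter_turn, 2)
print(A_T)                          # [[0, -1], [1, 0]]

x = [3, 4]
print(apply(A_T, x))                # [-4, 3]
print(rotate_quarter_turn(x))       # [-4, 3]
```

Applying the matrix and applying T directly agree, which is all that "A_T represents T" means here.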

Now let's consider multiplying general matrices. The idea here is composition of linear functions, that is, first do T_1 and then do T_2. That is T_2(T_1(x)) for some vector x. We know from above that we can view these as matrix multiplications. That is A_T2 (A_T1 x). Let's look at it in two dimensions because anything else is masochistic and that suffices to get all the ideas across. Let's relabel the matrices as A_T2 = A and A_T1 = B. Then we have

 A(B x) = |a_11 a_12| (|b_11 b_12| |x_1|)
          |a_21 a_22| (|b_21 b_22| |x_2|)

        = |a_11 a_12| |x_1 b_11 + x_2 b_12| 
          |a_21 a_22| |x_1 b_21 + x_2 b_22|

        = |(x_1 b_11 + x_2 b_12) a_11 + (x_1 b_21 + x_2 b_22) a_12|
          |(x_1 b_11 + x_2 b_12) a_21 + (x_1 b_21 + x_2 b_22) a_22|

        = |x_1 (a_11 b_11 + a_12 b_21) + x_2 (a_11 b_12 + a_12 b_22)|
          |x_1 (a_21 b_11 + a_22 b_21) + x_2 (a_21 b_12 + a_22 b_22)|

        = |(a_11 b_11 + a_12 b_21) (a_11 b_12 + a_12 b_22)| |x_1|
          |(a_21 b_11 + a_22 b_21) (a_21 b_12 + a_22 b_22)| |x_2|

Which is just matrix multiplication.
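A numeric spot check of the same identity, as a rough sketch: applying B and then A to a vector gives the same result as applying the single matrix AB, whereas the entry-by-entry product the question asks about does not. All names and numbers below are assumptions chosen for illustration.

```python
def apply(A, x):
    """Matrix times vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matmul(A, B):
    """Row-by-column matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def elementwise(A, B):
    """Entry-by-entry product, the alternative proposed in the question."""
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

A = [[2, 0],
     [1, 3]]
B = [[1, 4],
     [0, 1]]
x = [5, 7]

two_steps = apply(A, apply(B, x))        # first B, then A
one_step = apply(matmul(A, B), x)        # the single matrix AB
entrywise = apply(elementwise(A, B), x)  # not the composition

print(two_steps)  # [66, 54]
print(one_step)   # [66, 54]
print(entrywise)  # [10, 21]
```

One example is not a proof, of course; the algebra above is what shows the row-by-column rule works for every A, B and x.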

PS. Also probably belongs on Math.SO but I'm not voting to close because I answered. It might be too basic for there as well.

It produces cumulative multiplication results for vector planes. You can manipulate categorical data and arrive at generalized results of linear transformations. Concept example