Matrix, algorithm interview question_问答_开发者

This was one of my interview questions.

We have a matrix containing integers (no range provided). The matrix is randomly populated with 开发者_运维知识库integers. We need to devise an algorithm which finds those rows which match exactly with a column(s). We need to return the row number and the column number for the match. The order of of the matching elements is the same. For example, If, i'th row matches with j'th column, and i'th row contains the elements - [1,4,5,6,3]. Then jth column would also contain the elements - [1,4,5,6,3]. Size is n x n. My solution:

RCEQUAL(A,i1..12,j1..j2)// A is n*n matrix
if(i2-i1==2 && j2-j1==2 && b[n*i1+1..n*i2] has [j1..j2])
   use brute force to check if the rows and columns are same.
if (any rows and columns are same)
   store the row and column numbers in b[1..n^2].//b[1],b[n+2],b[2n+3].. store row no,
                                                 // b[2..n+1] stores columns that 
                                                 //match with row 1, b[n+3..2n+2] 
                                                 //those that match with row 2,etc..

else
   RCEQUAL(A,1..n/2,1..n/2);
   RCEQUAL(A,n/2..n,1..n/2);
   RCEQUAL(A,1..n/2,n/2..n);
   RCEQUAL(A,n/2..n,n/2..n);

Takes O(n^2). Is this correct? If correct, is there a faster algorithm?

you could build a trie from the data in the rows. then you can compare the columns with the trie.

this would allow to exit as soon as the beginning of a column do not match any row. also this would let you check a column against all rows in one pass.

of course the trie is most interesting when n is big (setting up a trie for a small n is not worth it) and when there are many rows and columns which are quite the same. but even in the worst case where all integers in the matrix are different, the structure allows for a clear algorithm...

You could speed up the average case by calculating the sum of each row/column and narrowing your brute-force comparison (which you have to do eventually) only on rows that match the sums of columns.

This doesn't increase the worst case (all having the same sum) but if your input is truly random that "won't happen" :-)

This might only work on non-singular matrices (not sure), but...

Let A be a square (and possibly non-singular) NxN matrix. Let A' be the transpose of A. If we create matrix B such that it is a horizontal concatenation of A and A' (in other words [A A']) and put it into RREF form, we will get a diagonal on all ones in the left half and some square matrix in the right half.

Example:

A = 1 2
    3 4

A'= 1 3
    2 4

B = 1 2 1 3
    3 4 2 4

rref(B) = 1  0 0   -2
          0  1 0.5 2.5

On the other hand, if a column of A were equal to a row of A then column of A would be equal to a column of A'. Then we would get another single 1 in of of the columns of the right half of rref(B).

Example

A=
 1     2     3     4     5
 2     6    -3     4     6
 3     8    -7     6     9
 4     1     7    -5     3
 5     2     4    -1    -1

A'=
 1     2     3     4     5
 2     6     8     1     2
 3    -3    -7     7     4
 4     4     6    -5    -1
 5     6     9     3    -1

B = 
 1     2     3     4     5     1     2     3     4     5
 2     6    -3     4     6     2     6     8     1     2
 3     8    -7     6     9     3    -3    -7     7     4
 4     1     7    -5     3     4     4     6    -5    -1
 5     2     4    -1    -1     5     6     9     3    -1

rref(B)=
 1     0     0     0     0    1.000  -3.689  -5.921   3.080   0.495
 0     1     0     0     0        0   6.054   9.394  -3.097  -1.024
 0     0     1     0     0        0   2.378   3.842  -0.961   0.009
 0     0     0     1     0        0  -0.565  -0.842   1.823   0.802
 0     0     0     0     1        0  -2.258  -3.605   0.540   0.662

1.000 in the top row of the right half means that the first column of A matches on of its rows. The fact that the 1.000 is in the left-most column of the right half means that it is the first row.

Without looking at your algorithm or any of the approaches in the previous answers, but since the matrix has n^2 elements to begin with, I do not think there is a method which does better than that :)

IFF the matrix is truely random...

You could create a list of pointers to the columns sorted by the first element. Then create a similar list of the rows sorted by their first element. This takes O(n*logn).

Next create an index into each sorted list initialized to 0. If the first elements match, you must compare the whole row. If they do not match, increment the index of the one with the lowest starting element (either move to the next row or to the next column). Since each index cycles from 0 to n-1 only once, you have at most 2*n comparisons unless all the rows and columns start with the same number, but we said a matrix of random numbers.

The time for a row/column comparison is n in the worst case, but is expected to be O(1) on average with random data.

So 2 sorts of O(nlogn), and a scan of 2*n*1 gives you an expected run time of O(nlogn). This is of course assuming random data. Worst case is still going to be n**3 for a large matrix with most elements the same value.