Member check and date range check in MATLAB

Developer https://www.devze.com 2023-02-18 20:15 Source: Internet
I have 2 matrices with the same IDs. I need to extract the rows of mat1 whose dates are within, say, ±5 days of the dates in mat2 for the same ID. I need the same operation for mat2 as well. Please see the data here: UNIQCols = [1 2] ; dateCol = [3] ; valueCol = [4] ; dayRange = ±15 days.

      % UniqCol  Date    Value
mat1 = [2001 2   733427  1001 ;
        2001 2   733793  2002 ;
        2001 2   734582  2003 ;
        3001 1   734220  30   ;
        3001 1   734588  20   ;];
mat2 = [2001 2   733790  7777 ;
        2001 2   734221  2222 ; 
        3001 1   734220  10   ; 
        3001 1   734588  40   ;] ;

% Expected results
ans1 = [2001 2 733793 2002 ; 3001 1 734220 30 ; 3001 1 734588 20 ] ;
ans2 = [2001 2 733790 7777 ; 3001 1 734220 10 ; 3001 1 734588 40 ] ;

This needs to be a vectorized operation! The rows for each ID are ordered by increasing date. Dates are spaced either quarterly or annually, so the range will always be much smaller than the gap between consecutive dates (range << date2 - date1). Please help, and thanks!


Here is a function based on a similar question I mentioned in my comments. Remember that your matrices have to be sorted by date.

function match_for_xn = match_by_distance(xn, xm, maxdist)
%# Returns a logical index marking the elements of vector xn that lie
%# within maxdist (strictly less than) of any element of vector xm.
%# Both xn and xm must be sorted in ascending order.

match_for_xn = false(length(xn), 1);
last_M = 1;
for N = 1:length(xn)
  %# search through M until we find a match.
  for M = last_M:length(xm)
    dist_to_curr = xm(M) - xn(N);
    if abs(dist_to_curr) < maxdist
        match_for_xn(N) = 1;
        last_M = M;
        break
    elseif dist_to_curr > 0
        last_M = M;
        break
    else
      continue
    end

  end %# M
end %# N

And the test script:

mat1 = sortrows([
        2001 2   733427  1001 ;
        2001 2   733793  2002 ;
        2001 2   734582  2003 ;
        3001 1   734220  30   ;
        3001 1   734588  20   ;
       ],3);
mat2 = sortrows([
        2001 2   733790  7777 ;
        2001 2   734221  2222 ; 
        3001 1   734220  10   ; 
        3001 1   734588  40   ;
       ],3);

mat1_index = match_by_distance(mat1(:,3),mat2(:,3),5);
ans1 = mat1(mat1_index,:);
mat2_index = match_by_distance(mat2(:,3),mat1(:,3),5);
ans2 = mat2(mat2_index,:);
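
Note that the script above sorts and matches on the date column alone, so a row can match a nearby date that belongs to a different ID. With this data, the mat2 row dated 734221 (ID 2001 2) matches 734220 from ID 3001 1 and ends up in ans2. Below is a minimal sketch of one way to respect the IDs. It assumes match_by_distance.m from above is on the path and that dates are already sorted within each ID, as stated in the question; the names ids, keep1 and keep2 are only illustrative.

```matlab
mat1 = [2001 2 733427 1001; 2001 2 733793 2002; 2001 2 734582 2003;
        3001 1 734220 30;   3001 1 734588 20];
mat2 = [2001 2 733790 7777; 2001 2 734221 2222;
        3001 1 734220 10;   3001 1 734588 40];

% Every distinct ID pair (columns 1 and 2) that occurs in either matrix
ids = unique([mat1(:,1:2); mat2(:,1:2)], 'rows');

keep1 = false(size(mat1,1), 1);
keep2 = false(size(mat2,1), 1);
for k = 1:size(ids,1)
    % Row numbers belonging to the current ID in each matrix
    r1 = find(ismember(mat1(:,1:2), ids(k,:), 'rows'));
    r2 = find(ismember(mat2(:,1:2), ids(k,:), 'rows'));
    % Compare dates only against rows with the same ID
    keep1(r1) = match_by_distance(mat1(r1,3), mat2(r2,3), 5);
    keep2(r2) = match_by_distance(mat2(r2,3), mat1(r1,3), 5);
end

ans1 = mat1(keep1, :);
ans2 = mat2(keep2, :);
```

This still loops, but only once per ID, and it keeps the rows in their original order instead of the date-sorted order.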

I haven't tried any vectorized solution for your problem. If you find one, test it against this solution and compare the timing and memory consumption (including the sorting step).
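
For comparison, here is a rough sketch of what a vectorized version might look like. It is not part of the answer above, just one possible approach: build a full pairwise comparison between the rows of mat1 and mat2, then keep the rows that have at least one partner with the same ID and a close enough date. It assumes implicit expansion (MATLAB R2016b or later, or Octave); on older releases the same comparisons can be written with bsxfun. The names dayRange, sameID, nearDate and hit are illustrative, and the tolerance is treated as inclusive here, which is an assumption.

```matlab
mat1 = [2001 2 733427 1001; 2001 2 733793 2002; 2001 2 734582 2003;
        3001 1 734220 30;   3001 1 734588 20];
mat2 = [2001 2 733790 7777; 2001 2 734221 2222;
        3001 1 734220 10;   3001 1 734588 40];
dayRange = 5;   % tolerance in days (inclusive in this sketch)

% sameID(i,j) is true when row i of mat1 and row j of mat2 agree in both ID columns
sameID = (mat1(:,1) == mat2(:,1).') & (mat1(:,2) == mat2(:,2).');

% nearDate(i,j) is true when the two dates differ by at most dayRange
nearDate = abs(mat1(:,3) - mat2(:,3).') <= dayRange;

hit = sameID & nearDate;      % size(mat1,1)-by-size(mat2,1) logical matrix

ans1 = mat1(any(hit, 2), :);  % mat1 rows with at least one partner in mat2
ans2 = mat2(any(hit, 1), :);  % mat2 rows with at least one partner in mat1
```

No sorting is needed, but the intermediate matrices have size(mat1,1) * size(mat2,1) elements, so memory grows quickly for large inputs. That is exactly the trade-off worth timing against the loop-based function.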
