Comparing dates and filling in gap times in MATLAB

开发者 (Developer) https://www.devze.com 2023-02-15 10:21 Source: web
I have a data file which contains time data. The list is quite long, 100,000+ points. There is data every 0.1 seconds, and the time stamps look like this:

'2010-10-10 12:34:56'

'2010-10-10 12:34:56.1'

'2010-10-10 12:34:56.2'

'2010-10-10 12:34:56.3'

etc.

Not every 0.1 second interval is necessarily present. I need to check whether a 0.1 second interval is missing, then insert this missing time into the date vector. Comparing strings seems unnecessarily complicated. I tried comparing seconds since midnight:

date_nums=datevec(time_stamps);
secs_since_midnight=date_nums(:,4)*3600+date_nums(:,5)*60+date_nums(:,6);
comparison_secs=linspace(0,86400,864000);
res=(ismember(comparison_secs,secs_since_midnight)~=1);

However, this approach doesn't work because of rounding errors. The seconds since midnight and the linspace values I compare them to never quite match (because of the tenth-of-a-second resolution?). The intent is to later run an FFT on the data associated with the time stamps, so I want the data to be as uniform as possible (the data associated with the missing intervals will be interpolated). I've considered splitting it into smaller chunks of time and checking the chunks one at a time, but I don't know if that's the best way to go about it. Thanks!


Multiply your numbers-of-seconds by 10 and round to the nearest integer before comparing against your range.
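As a minimal sketch of that idea, the snippet below converts seconds since midnight into integer counts of 0.1-second ticks and compares integers instead of floating-point values. The sample vector and the names ticks, expected_ticks and missing_ticks are made up for illustration. Note also that linspace(0,86400,864000) in the question has a step of 86400/863999 rather than exactly 0.1, which by itself prevents most values from matching.

```matlab
% Assumed sample input: seconds since midnight for 12:34:56.0, .1, .3, .4
% (the 12:34:56.2 record is missing).
secs_since_midnight = [45296.0; 45296.1; 45296.3; 45296.4];

% Integer number of 0.1 s ticks; rounding removes the floating-point
% noise that defeats an exact comparison.
ticks = round(secs_since_midnight * 10);

% Every tick that should exist between the first and last record.
expected_ticks = (ticks(1):ticks(end))';

% Ticks with no matching time stamp are the gaps.
missing_ticks = expected_ticks(~ismember(expected_ticks, ticks));
missing_secs  = missing_ticks / 10;   % back to seconds since midnight
```

Here expected_ticks only spans the recorded range, so time before the first record and after the last one is not reported as missing.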

There may be more efficient ways to do this than ismember. (I don't know offhand how clever the implementation of ismember is, but if it's The Simplest Thing That Could Possibly Work then you'll be taking O(N^2) time that way.) For instance, you could use the timestamps that are actually present (as integer numbers of 0.1-second intervals) as indices into an array.
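The indexing idea could look roughly like the sketch below: each integer tick becomes an index into a logical array, so marking and looking up records takes time linear in the number of ticks. It assumes all time stamps fall within a single day; the sample vector and variable names are hypothetical.

```matlab
% Assumed sample input: seconds since midnight, with 45296.2 missing.
secs_since_midnight = [45296.0; 45296.1; 45296.3; 45296.4];
ticks = round(secs_since_midnight * 10);   % integer 0.1 s ticks

% One flag per 0.1 s tick of the day: ticks 0..863999 map to
% indices 1..864000 (MATLAB indices start at 1).
present = false(864000, 1);
present(ticks + 1) = true;

% Only look inside the recorded span, so the empty time before the
% first record and after the last one is not counted as a gap.
span = (ticks(1):ticks(end))';
missing_ticks = span(~present(span + 1));
```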


Since you're concerned with missing data records and not other timing issues such as a drifting time channel, you could check for missing records by converting the time values to seconds, taking a DIFF and finding those first differences that are greater than some tolerance. This tells you the indices where the missing records should go. It's then up to you to do something about this. Remember, if you're going to use this list of indices to fill the gaps, process the list in descending index order: each insertion shifts the positions of all later records, which would leave the remaining indices out of sync with the data.

>> time_stamps = now:.1/86400:now+1;                    % Generate test data.
>> time_stamps(randi(length(time_stamps), 10, 1)) = []; % Remove 10 random records.
>> t = datenum(time_stamps);                            % Convert to date numbers.
>> t = 86400 * t;                                       % Convert to seconds.
>> index = find(diff(t) > 1.999 * 0.1)' + 1             % Find missing records.

index =

       30855
      147905
      338883
      566331
      566557
      586423
      642062
      654682
      733641
      806963
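One possible way to act on such an index list is sketched below. It fills the gaps from the last index to the first, so the earlier indices stay valid while records are inserted. The sample vector, the 1.5 * dt tolerance and the names dt, n_missing and filler are assumptions for illustration, not part of the session above.

```matlab
dt = 0.1;                               % nominal sample interval in seconds
t  = [0.0 0.1 0.2 0.5 0.6 0.8];         % assumed sample times with two gaps
index = find(diff(t) > 1.5 * dt) + 1;   % first record after each gap

% Walk the gap list backwards so earlier indices are not shifted.
for k = numel(index):-1:1
    i = index(k);
    n_missing = round((t(i) - t(i-1)) / dt) - 1;   % records to insert
    filler = t(i-1) + (1:n_missing) * dt;          % evenly spaced fill times
    t = [t(1:i-1), filler, t(i:end)];
end
```

After the loop t is uniformly spaced at dt. The data belonging to the original time stamps could then be resampled onto the filled vector, for example with interp1, before running the FFT.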