Faster function than datenum in MATLAB

Source: 开发者, https://www.devze.com, 2023-03-01 20:34 (originally from the web)
Does anybody know a faster way to convert a date string (2010-12-12 12:21:12.123) to a number?


It is often instructional to profile the built-in Matlab functions and extract just the internal functionality of interest.
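As a rough sketch of what such profiling could look like, the snippet below runs datenum under the MATLAB profiler and prints the functions that took the most time. The sample date strings and the number of rows are made up for illustration; the function names you see will depend on your MATLAB release.

```matlab
% Illustrative input: 1000 copies of one date string (assumed data).
dates = repmat({'2010-12-12 12:21:12'}, 1000, 1);

profile on
datenum(dates, 'yyyy-mm-dd HH:MM:SS');
profile off

% Sort the profiled functions by total time and print the top five.
stats = profile('info');
[~, order] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.4f s\n', stats.FunctionTable(k).FunctionName, ...
            stats.FunctionTable(k).TotalTime);
end
```

Calling profile viewer instead of the loop opens the same data in the graphical report.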

In your particular case,

dtstr2dtnummx({'2010-12-12 12:21:12.123'},'yyyy-MM-dd HH:mm:ss')

is about 3 times faster (takes roughly 30% of the time) than:

datenum({'2010-12-12 12:21:12.123'},'yyyy-mm-dd HH:MM:SS')

where dtstr2dtnummx is an internal function (C:\Program Files\Matlab\R2011a\toolbox\matlab\timefun\private\dtstr2dtnummx.mexw32 on my Windows machine).

To gain access to this internal function, simply add its folder to the Matlab path using the addpath function, or copy the dtstr2dtnummx.mexw32 file to another folder that is already on your Matlab path.
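A minimal sketch of the copy route, assuming the MEX-file still ships in the same private folder in your release (it may not) and using a hypothetical destination folder named datetools under the current directory:

```matlab
% Assumption: the MEX-file lives under timefun/private in this release.
% mexext returns the platform's MEX extension (e.g. mexw32, mexw64).
src = fullfile(matlabroot, 'toolbox', 'matlab', 'timefun', 'private', ...
               ['dtstr2dtnummx.' mexext]);
dst = fullfile(pwd, 'datetools');   % hypothetical destination folder

if exist(src, 'file')
    if ~exist(dst, 'dir')
        mkdir(dst);
    end
    copyfile(src, dst);   % copy the MEX-file out of the private folder
    addpath(dst);         % make the copy callable from anywhere
else
    warning('dtstr2dtnummx not found at %s', src);
end
```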

Note that the string format is different between dtstr2dtnummx and datenum, so be careful!
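To see the two format conventions side by side and measure the difference on your own machine, something like the following could be used. It is only a sketch: the sample strings omit the fractional seconds, the row count is arbitrary, and the comparison is skipped when dtstr2dtnummx is not reachable on the path.

```matlab
% Illustrative input (assumed data, no fractional seconds).
dates = repmat({'2010-12-12 12:21:12'}, 10000, 1);

tic;
a = datenum(dates, 'yyyy-mm-dd HH:MM:SS');        % datenum: mm = month, MM = minute
tDatenum = toc;
fprintf('datenum:       %.4f s\n', tDatenum);

% exist returns 3 when the name resolves to a MEX-file on the path.
if exist('dtstr2dtnummx', 'file') == 3
    tic;
    b = dtstr2dtnummx(dates, 'yyyy-MM-dd HH:mm:ss');  % here: MM = month, mm = minute
    tMex = toc;
    fprintf('dtstr2dtnummx: %.4f s\n', tMex);
    fprintf('max abs difference: %g days\n', max(abs(a(:) - b(:))));
end
```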

To those interested, the folder above contains other interesting date conversion functions, so explore and enjoy!

Note 5/5/2011: I have now posted an article that expands this answer on http://undocumentedmatlab.com/blog/datenum-performance/


Often you need to take a systems approach. I had a very similar problem when I was extracting thousands of dates from a DB. It turns out that many modern DBs (Postgres, SQL Server and Oracle are the ones I tried) can convert from their own date representation to the MATLAB date representation several orders of magnitude faster than a text-to-datenum conversion on the MATLAB side. If this data is coming from a DB, think DB-side conversion!
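One way to picture this: if the database hands back a plain number, such as seconds since the Unix epoch, the MATLAB side only needs arithmetic instead of string parsing. The sketch below assumes such a numeric column has already been fetched; the query in the comment, the table name and the column name are hypothetical, and time-zone handling is ignored.

```matlab
% Assumption: the query returns seconds since 1970-01-01 00:00:00 as a number,
% e.g. in PostgreSQL:  SELECT EXTRACT(EPOCH FROM ts) FROM samples
% (samples and ts are hypothetical names).
epochSeconds = [0; 1292156472.123];   % stand-in for the fetched column

unixEpoch = datenum(1970, 1, 1);      % serial date number of the Unix epoch
dn = epochSeconds / 86400 + unixEpoch;

disp(datestr(dn, 'yyyy-mm-dd HH:MM:SS.FFF'));
```

The same division and offset can be moved into the SQL expression itself, so that the query returns MATLAB serial date numbers directly.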


Presumably, if you care about the time spent converting dates, you're converting many of them. JIT optimizations in recent versions of MATLAB aside, you'll get much faster results by calling

datenum(cellarrayofdates, 'yyyy-mm-dd HH:MM:SS');

than

for i=1:length(cellarrayofdates); datenum(cellarrayofdates{i}, 'yyyy-mm-dd HH:MM:SS'); end

If you're not already doing that, start there, since it lets MATLAB avoid the overhead of working out your date format on every call to the function.
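A small timing sketch of the two variants, using made-up input (5000 copies of one date string), might look like this. The absolute numbers will vary by machine and release; the point is only to compare them on your own data.

```matlab
% Illustrative input (assumed data).
dates = repmat({'2010-12-12 12:21:12'}, 5000, 1);
fmt = 'yyyy-mm-dd HH:MM:SS';

% One call on the whole cell array.
tic;
vec = datenum(dates, fmt);
tVec = toc;

% One call per element.
tic;
loop = zeros(size(dates));
for i = 1:numel(dates)
    loop(i) = datenum(dates{i}, fmt);
end
tLoop = toc;

fprintf('vectorized: %.3f s, loop: %.3f s\n', tVec, tLoop);
```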


I realize that this question is old. However, I managed to write a function that is approximately 30-40 times faster than datenum. Note: there are minor flaws depending on the usage. If anyone wants me to perfect it, just let me know.

Run on 1,792,379 rows:

  • datenum - 11.463186 seconds
  • datenumjck - 0.300503 seconds

Just read your file with textscan, interpret the date and time fields as doubles, and pass them to my function along with the date format.

Example:

Assume the data is formatted as follows:

Data,2016-03-03,16:15:50;686,0.000000,-0.009500
Data,2016-03-03,16:15:50;696,0.000000,0.006500
Data,2016-03-03,16:15:50;706,0.000000,0.004500
Data,2016-03-03,16:15:50;716,0.000000,-0.006000

Read data:

fileID = fopen('myFile.csv','r');
formatSpec = '%*s %f %f %f %f %f %f %f %*[^\n]'; % Ignore first string, save
                                                 % date and time as doubles
                                                 % ignore all other data
data = textscan(fileID,formatSpec,'delimiter',',\t/:;-.\\ ');
fclose(fileID);

Specify date format and use datenumjck():

dateFormat = 'yyyy-mm-dd,HH:MM:SS;FFF';
numDate = datenumjck(data,dateFormat);

Code:

function num = datenumjck(data, dateFormat)

n = size(data{1});
dateFormat = textscan(dateFormat,'%s','delimiter',',/:;-.\\');
dateFormat = dateFormat{1};

k = find(strcmp('yyyy', dateFormat),1);
if ~isempty(k)
    y = data{k};
elseif ~isempty(find(strcmp('yy', dateFormat),1))
    y = data{find(strcmp('yy', dateFormat),1)};
else
    y = zeros(n);
end

k = find(strcmp('mm', dateFormat),1);
if ~isempty(k)
    m = data{k};
elseif ~isempty(find(strcmp('mmm', dateFormat),1))
    % Month names (e.g. 'Mar') must have been read as strings (%s)
    names = lower(data{find(strcmp('mmm', dateFormat),1)});
    month = cellfun(@(s) strfind('janfebmaraprmayjunjulaugsepoctnovdec', s), names);
    m = (month+2)/3;
else
    m = zeros(n);
end

k = find(strcmp('dd', dateFormat),1);
if ~isempty(k)
    d = data{k};
else
    d = zeros(n);
end

k = find(strcmp('HH', dateFormat),1);
if ~isempty(k)
    H = data{k};
else
    H = zeros(n);
end

k = find(strcmp('MM', dateFormat),1);
if ~isempty(k)
    M = data{k};
else
    M = zeros(n);
end


k = find(strcmp('SS', dateFormat),1);
if ~isempty(k)
    S = data{k};
else
    S = zeros(n);
end

k = find(strcmp('FFF', dateFormat),1);
if ~isempty(k)
    F = data{k};
else
    F = zeros(n);
end

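ms = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]; % days before each month

num = zeros(n);
for k = 1:numel(y)
    % ceil(y/4) - ceil(y/100) + ceil(y/400) counts the leap days before year y
    leapDays = ceil(y(k)/4) - ceil(y(k)/100) + ceil(y(k)/400);
    % the current year's own leap day counts only from March onward
    isLeap = (mod(y(k),4)==0 && mod(y(k),100)~=0) || mod(y(k),400)==0;
    num(k) = y(k)*365 + leapDays + ms(m(k)) + d(k) + (isLeap && m(k) > 2)...
        + (H(k)*3600 + M(k)*60 + S(k) + F(k)/1000)/86400;
end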
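A hand-rolled day count like this is easy to get subtly wrong around leap years, so it is worth checking against datenum on a few dates that fall before and after February in leap and non-leap years. The sketch below is not part of the original answer: cumDays, isLeap and dayNum are hypothetical helper names, and the formula is one vectorized way to compute the whole-day part of a serial date number.

```matlab
% Hypothetical helpers for checking a day-number formula against datenum.
cumDays = [0 31 59 90 120 151 181 212 243 273 304 334]';   % days before each month
isLeap  = @(y) (mod(y,4)==0 & mod(y,100)~=0) | mod(y,400)==0;
dayNum  = @(y,m,d) 365*y + ceil(y/4) - ceil(y/100) + ceil(y/400) ...
                 + cumDays(m) + d + (isLeap(y) & m > 2);

% Test dates: non-leap year, leap year before/after February, century years.
y = [2010; 2016; 2016; 2000; 1900];
m = [  12;    1;    3;   12;    3];
d = [  12;   31;    3;   31;    1];

mismatch = dayNum(y, m, d) - datenum(y, m, d);
disp(mismatch');   % nonzero entries mark dates where the formula disagrees
```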