Merging time series in Mathematica efficiently

开发者 https://www.devze.com 2023-03-13 13:01 Source: Internet
What I am trying to accomplish seems ordinary enough that there should be an efficient solution.

I am using Mathematica and I have a number of different time series of the form {{date1, value1}, {date2, value2}, ...}, the sort you could pass to DateListPlot.

However, the problem is that these datasets only partially overlap (some may have data from 1995 to 2004, some from 1999 to 2011, and so on).

What I would love to be able to do is merge these into one big list with a common timeline that is the Union[] of all the dates available. Then there would be one array of values per series, with zeros where that series has no data.

Is there an efficient way to accomplish this? I have hundreds of these time series, and writing something that loops over the whole thing is probably not very efficient (and quite tedious to write).

Any help is greatly appreciated!


For instance,

ClearAll[l1, l2];
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}}
l2 = {{date3, value5}, {date4, value5}, {date1, value6}}

then

DeleteDuplicates[Union[l1, l2], #1[[1]] == #2[[1]] &]

yields {{date1, value1}, {date2, value3}, {date3, value5}, {date4, value4}}. Only the first entry for each date survives, so if you have several different data points for the same date, all but the first will be lost. It's not clear (to me) if this is what you need or not, so perhaps you could add more detail.
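As far as I know, DeleteDuplicates with an explicit test has to compare pairs of elements, which may get slow on long lists. A minimal sketch of the same idea with DeleteDuplicatesBy, assuming Mathematica 10 or later (merged is just an illustrative name):

```mathematica
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}};
l2 = {{date3, value5}, {date4, value5}, {date1, value6}};

(* keep the first pair seen for each date; First extracts the date from {date, value} *)
merged = DeleteDuplicatesBy[Union[l1, l2], First]
```

The same caveat applies: for each date only the first pair in the sorted union is kept.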

On the other hand, this

Transpose[{DeleteDuplicates[Last@Last@Reap@Scan[Sow[#[[1]]] &, Union[l1, l2]]],
 Last@Reap[Scan[Sow[#[[2]], #[[1]]] &, Union[l1, l2]]]}]

eliminates duplicate dates and collects the values under each date, thus:

{{date1, {value1, value2, value6}}, 
 {date2, {value3}}, 
 {date3, {value5}}, 
 {date4, {value4, value5}}} 

(i.e., it collects all values for each date).
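The same grouping can probably be written more directly with GatherBy. This is only a sketch on the example lists above, and grouped is an illustrative name:

```mathematica
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}};
l2 = {{date3, value5}, {date4, value5}, {date1, value6}};

(* Union sorts the pooled pairs and drops exact repeats;
   GatherBy then groups them by their first element (the date) *)
groups = GatherBy[Union[l1, l2], First];

(* for each group keep the date once, followed by the list of all its values *)
grouped = {#[[1, 1]], #[[All, 2]]} & /@ groups
```

Because the pairs are sorted by Union before grouping, the dates come out in sorted order.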

Some examples of what you want would be nice.


If I understand your question correctly, you want

l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}}
l2 = {{date3, value5}, {date4, value5}, {date5, value6}}

To become

l1 = {{date1, value1}, {date1, value2}, 
     {date2, value3}, {date3, 0}, {date4, value4}, {date5,0}}
l2 = {{date1, 0}, {date2, 0}, {date3, value5}, {date4, value5}, {date5, value6}}

If so, something like this might work:

If[MemberQ[l1[[All,1]],#],Cases[l1,{#,_}],{#,0}]& /@ Union[l1[[All,1]],l2[[All,1]] ]

Depending on how you want multiple data points on the same date in a given series to be treated, you might need to precede the Cases[] function with Sequence @@ or First@, e.g.

If[MemberQ[l1[[All,1]],#],Sequence @@ Cases[l1,{#,_}],{#,0}]& /@   
  Union[l1[[All,1]],l2[[All,1]] ]

I'm home now, so this one has been checked for syntax errors :-)
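For more than two series, the expression can be wrapped in a function and mapped over the whole collection, so no explicit loop is needed. This is a rough sketch: padPairs, seriesList, allDates and padded are names made up for the illustration.

```mathematica
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}};
l2 = {{date3, value5}, {date4, value5}, {date5, value6}};
seriesList = {l1, l2};  (* stands in for the full collection of series *)

(* common timeline: sorted union of the dates of every series *)
allDates = Union @@ (#[[All, 1]] & /@ seriesList);

(* padPairs (hypothetical helper) applies the If/Cases expression to one series:
   existing pairs are kept, missing dates become {date, 0} *)
padPairs[series_, dates_] :=
  If[MemberQ[series[[All, 1]], #], Sequence @@ Cases[series, {#, _}], {#, 0}] & /@ dates;

padded = padPairs[#, allDates] & /@ seriesList
```

MemberQ and Cases scan the series once per date, so this is fine for moderate sizes but may not be the fastest choice for very long series.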


Thanks, guys. In the end I took the union of all the timelines myself. Saving that in, say, daterange, I then used MapThread in the following way:

daterange= Union[DatesOfFirstTimeseries,DatesOfSecondTimeseries];

NewVersionOfFirstTimeSeries = (daterange /. 
     MapThread[Rule, {DatesOfFirstTimeseries, ValuesOfFirstTimeseries}] /. 
    MapThread[
     Rule, {daterange, Table[Indeterminate, {Length[daterange]}]}]);

NewVersionOfSecondTimeSeries = (daterange /. 
     MapThread[Rule, {DatesOfSecondTimeseries, ValuesOfSecondTimeseries}] /. 
    MapThread[
     Rule, {daterange, Table[Indeterminate, {Length[daterange]}]}]);

This did what I needed, but it really does hurt my sense of aesthetics.
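The two chained replacements could perhaps be folded into a single rule list with a catch-all default at the end. A minimal sketch, assuming each series is a list of {date, value} pairs; alignSeries, seriesList, allDates and aligned are hypothetical names, and the sample data is borrowed from the answer above:

```mathematica
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}};
l2 = {{date3, value5}, {date4, value5}, {date5, value6}};
seriesList = {l1, l2};

(* common timeline: sorted union of all dates *)
allDates = Union @@ (#[[All, 1]] & /@ seriesList);

(* alignSeries (hypothetical helper): turn each {date, value} pair into date -> value,
   append a default rule for dates the series lacks, and apply the rules once
   to every element of the timeline (level 1 only, so values are never re-replaced) *)
alignSeries[series_, dates_, missing_: Indeterminate] :=
  Replace[dates, Dispatch[Append[Rule @@@ series, _ -> missing]], {1}];

aligned = alignSeries[#, allDates] & /@ seriesList
```

When a date occurs more than once in a series, the first matching rule wins, so only its first value is kept. Passing 0 as the third argument gives the zero padding asked for in the question, and Transpose[{allDates, aligned[[1]]}] turns a result back into {date, value} pairs for DateListPlot.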
