I am writing a program to try to derive value from a time series of stock-trade ideas. They are held in an array of quotedTrade objects, defined below (using JSONSerializer to get the data off disk), sorted by the startOn date field:
[<DataContract>]
type trade = {
[<field: DataMember(Name="tradeId") >]
tradeId : int ;
[<field: DataMember(Name="analystId") >]
analystId: int
[<field: DataMember(Name="startOn") >]
startOn : System.DateTime ;
[<field: DataMember(Name="endOn") >]
endOn : System.DateTime option ;
[<field: DataMember(Name="tradeType") >]
tradeType : string ;
[<field: DataMember(Name="securityId") >]
securityId : int ;
[<field: DataMember(Name="ricCode") >]
ricCode : string ;
[<field: DataMember(Name="yahooSymbol") >]
yahooSymbol : string ;
[<field: DataMember(Name="initialPrice") >]
initialPrice : float ;
}
[<DataContract>]
type quotedTrade = {
[<field: DataMember(Name="trade") >]
trade : trade ;
[<field: DataMember(Name="q7") >]
q7: float
[<field: DataMember(Name="q14") >]
q14: float
[<field: DataMember(Name="q21") >]
q21: float
}
I would like to view the data two ways:
- by analystId
- by ticker (either ric or yahoo symbols)
then iterate over the views with windows of days
perhaps introducing records:
type byAnalyst = {
analystId: int
trades: quotedTrade array
}
type byTicker = {
symbol: string
trades: quotedTrade array
}
and then filter them somehow (sliceByAnalyst and sliceByTicker are to be provided later, though suggestions on a clean solution would be appreciated; I am considering the Array.map and Array.filter functions):
let quotedTrades : quotedTrade array = getTradesFromDisk()
let tradesByAnalyst : byAnalyst array = sliceByAnalyst quotedTrades
let tradesByTicker : byTicker array = sliceByTicker quotedTrades
The main question is around applying a sliding window:
// iterate over each analyst
for tradeByAnalyst in tradesByAnalyst do
    // look at trades on a per-analyst basis
    let mySeries : quotedTrade array = tradeByAnalyst.trades
    // each window is an array of trades that occurred in a seven-day period
    let sevenDayWindowsByAnalyst : quotedTrade array array = sliceByDays 7 mySeries
    // I want to evaluate each seven-day window for this analyst independently
    for sevenDayWindowByAnalyst in sevenDayWindowsByAnalyst do
        let someResult = doSomethingWithTradesInWindow sevenDayWindowByAnalyst
        ignore someResult // placeholder: use the result here
The crux is that I have a dataset per analyst, where a single trade on day 0 is written T0 and a single trade on day 1 is written T1. My original set contains 3 trades on day 0 and individual trades on days 1, 3, 5, 8 and 10:
[ T0 T0 T0 T1 T3 T5 T8 T10 ]
should return
[
[ T0 T0 T0 T1 T3 T5 ] // includes T0 -> T6
[ T1 T3 T5 ] // includes T1 -> T7
[ T3 T5 T8 ] // includes T2 -> T8
[ T3 T5 T8 ] // includes T3 -> T9
[ T5 T8 T10 ] // includes T4 -> T10
[ T5 T8 T10 ] // includes T5 -> T11
[ T8 T10 ] // includes T6 -> T12
[ T8 T10 ] // includes T7 -> T13
[ T8 T10 ] // includes T8 -> T14
[ T10 ] // includes T9 -> T15
[ T10 ] // includes T10 -> T16
]
Any ideas on the best way to accomplish this would be highly appreciated.
First of all, regarding your first question (how to break up the data): you can also use functions from the Seq module (they work with any collection type, such as lists and arrays). To break the data into groups, Seq.groupBy works nicely:
trades
|> Seq.groupBy (fun qt -> qt.trade.analystId)
|> Seq.map (fun (key, values) ->
    // one byAnalyst record per distinct analystId
    { analystId = key; trades = values |> Array.ofSeq })
Further processing of the data can again be done with Seq functions (like filter and map). I think these are preferable to the Array functions if you want the code to be more general (also, some functions are not available in Array). However, the Array functions are a bit faster, which could matter for larger volumes of data.
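As a rough sketch of how the two slicing functions from the question might look with the Array module, here is one possible shape using Array.groupBy. The cut-down Trade and QuotedTrade records below are assumptions made to keep the example self-contained; the real records carry more fields. Field names shared by several records are qualified with the type name so that the compiler picks the intended record.

```fsharp
// Cut-down stand-ins for the question's records (assumption: only the
// fields needed for grouping are kept).
type Trade = { analystId: int; ricCode: string }
type QuotedTrade = { trade: Trade; q7: float }
type ByAnalyst = { analystId: int; trades: QuotedTrade array }
type ByTicker = { symbol: string; trades: QuotedTrade array }

// Group trades by analyst; groups appear in order of first occurrence.
let sliceByAnalyst (qts: QuotedTrade array) : ByAnalyst array =
    qts
    |> Array.groupBy (fun qt -> qt.trade.analystId)
    |> Array.map (fun (key, group) -> { ByAnalyst.analystId = key; trades = group })

// Group trades by ticker, here using the RIC code as the symbol.
let sliceByTicker (qts: QuotedTrade array) : ByTicker array =
    qts
    |> Array.groupBy (fun qt -> qt.trade.ricCode)
    |> Array.map (fun (key, group) -> { ByTicker.symbol = key; trades = group })

// Hypothetical helper for building sample data.
let mk analyst ric = { trade = { Trade.analystId = analyst; ricCode = ric }; q7 = 0.0 }

let sample = [| mk 1 "VOD.L"; mk 2 "VOD.L"; mk 1 "BP.L" |]
let byAnalyst = sliceByAnalyst sample
let byTicker = sliceByTicker sample
```

With this sample, analyst 1 ends up with two trades and analyst 2 with one, while VOD.L has two trades and BP.L has one.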
In the question about the sliding window, I do not fully understand what your data representation is. However, if you have (or could construct) a list of all trades for each analyst (e.g. of type list<quotedTrade>), then you could use Seq.windowed:
trades
|> Seq.windowed 7
|> Seq.map (fun win ->
    // all trades in the window are in 'win' as an array
    doSomethingWithTradesInWindow win)
The function windowed creates only windows of the specified length (shorter windows are dropped), so this doesn't do exactly what you wanted. However, I guess that you could pad the data with empty trades to work around this.
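To make the behaviour of Seq.windowed concrete, here is a small sketch that uses plain day offsets in place of trades (an assumption for brevity; the offsets match the sample in the question). It shows that the window size counts elements, not days, so a window can span any number of calendar days.

```fsharp
// Day offsets of the trades from the question: three on day 0, then days 1, 3, 5, 8, 10.
let tradeDays = [ 0; 0; 0; 1; 3; 5; 8; 10 ]

// Seq.windowed 3 yields every run of 3 consecutive elements,
// regardless of which days those elements fall on.
let windows = tradeDays |> Seq.windowed 3 |> Seq.toList

// 8 elements give 6 full windows; the first is [|0; 0; 0|] and the last is [|5; 8; 10|].
printfn "%A" windows
```

The first window covers a single day while the last one spans six, which is why a date-based filter (or one padded entry per day) is needed for true seven-day windows.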
I think you can probably do something like this to get the trades you care about:
let sliceByDays n (qts : quotedTrade seq) =
    let dates = qts |> Seq.map (fun qt -> qt.trade.startOn)
    let min = dates |> Seq.min
    let max = dates |> Seq.max
    // every day from the first trade date to the last, one day apart
    let days = min |> Seq.unfold (fun d -> if d > max then None else Some(d, d.AddDays(1.)))
    [for d in days do
        // trades starting within n days of d
        let tradesInRange =
            qts
            |> Seq.filter (fun qt -> qt.trade.startOn >= d && qt.trade.startOn < d.AddDays(float n))
        yield tradesInRange]
This gives you a list of sequences of trades, one sequence per day.
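As a quick check, the same approach can be run against the sample from the question. This is only a sketch: the two records are cut down to the startOn field, and the start date day0 is an arbitrary assumption.

```fsharp
open System

// Cut-down stand-ins for the question's records (assumption: only startOn matters here).
type trade = { startOn: DateTime }
type quotedTrade = { trade: trade }

// Same approach as sliceByDays above: one window per calendar day.
let sliceByDays n (qts: quotedTrade seq) =
    let dates = qts |> Seq.map (fun qt -> qt.trade.startOn)
    let first = dates |> Seq.min
    let last = dates |> Seq.max
    let days = first |> Seq.unfold (fun d -> if d > last then None else Some(d, d.AddDays(1.)))
    [ for d in days do
          yield qts |> Seq.filter (fun qt -> qt.trade.startOn >= d && qt.trade.startOn < d.AddDays(float n)) ]

// Arbitrary start date (assumption); trades fall on days 0, 0, 0, 1, 3, 5, 8, 10.
let day0 = DateTime(2020, 1, 1)
let sample =
    [ 0; 0; 0; 1; 3; 5; 8; 10 ]
    |> List.map (fun o -> { trade = { startOn = day0.AddDays(float o) } })

// Convert each window back to day offsets so the result is easy to read.
let offsets =
    sliceByDays 7 sample
    |> List.map (fun win ->
        win
        |> Seq.map (fun qt -> (qt.trade.startOn - day0).TotalDays |> int)
        |> Seq.toList)
```

Here offsets holds 11 windows, starting with [0; 0; 0; 1; 3; 5], [1; 3; 5] and [3; 5; 8] and ending with [10], which matches the output requested in the question. Note that the input sequence is filtered once per day, so for large datasets it may be worth converting it to an array first.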