I have a dataframe giving attendances at sports events
Crowd matchDate
2345 1993-01-26
4567 1993-08-01
8888 1994-03-02
1298 1994-11-07
9876 1995-09-01
...
1237 2011-09-09
The matchDate column is of class POSIXct.
I want to be able to create a season factor based on the date, such that each season runs from, say, 1 August to 31 July; e.g. factor level 1992/3 would include dates 1992-08-01 to 1993-07-31.
Ideally it would be a function that I could apply in several analyses, not necessarily with the same start and end dates in the year.
An example of my comment.
x <- as.Date(1:1000, origin = "2000-01-01")
x <- cut(x, breaks = "quarter")
And then relabel as you please, if necessary.
labs <- paste(substr(levels(x),1,4), "/", 1:4, sep="")
x <- factor(x, labels = labs)
?cut.POSIXct
breaks
a vector of cut points or number giving the number of intervals which x is to be cut into or an interval specification, one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year", optionally preceded by an integer and a space, or followed by "s". (For "Date" objects only interval specifications using "day", "week", "month", "quarter" and "year" are allowed.)
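To illustrate the interval specifications described above, here is a small sketch using made-up dates (the vector x and the variable names are just for illustration). It shows a plain "month" specification and one preceded by an integer and a space.

```r
# Four made-up dates, purely for illustration
x <- as.Date(c("2000-01-15", "2000-02-20", "2000-03-05", "2000-06-30"))

# One interval per calendar month; each level is named after the first day of its month
by_month <- cut(x, breaks = "month")

# An integer and a space before the unit widens each interval to two months
by_two_months <- cut(x, breaks = "2 months")

# Count how many dates fall in each two-month interval
table(by_two_months)
```

Note that the intervals start at the first day of the month containing the earliest date, so this form only gives August-to-July seasons if the data happen to begin in August.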
If your question is more about how to generate the breaks and labels automatically, maybe this will help:
# Sample data: 100 random match dates from 1993 onwards
DF <- data.frame(matchDate = as.POSIXct(as.Date(sample(5000,100,replace=TRUE), origin="1993-01-01")))
years <- 1992:2011
# One break per 1 August; tz="UTC" matches the UTC midnights produced by as.POSIXct(as.Date(...))
DF$season <- cut(DF$matchDate,
  breaks=as.POSIXct(paste(years,"-08-01",sep=""), tz="UTC"),
  labels=paste(years[-length(years)],years[-length(years)]+1,sep="/"))
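Since the question asks for a reusable function with a configurable season start, here is one possible sketch. The name season_factor and its arguments start_month and start_day are hypothetical, not part of the answer above. It computes the season's starting year directly from each date rather than using cut(), and it assumes the input is a Date, a POSIXct, or something as.POSIXlt() can convert.

```r
# Hypothetical helper (not from the original answer): assign each date to a
# season that begins on start_month/start_day and runs for one year.
season_factor <- function(dates, start_month = 8, start_day = 1) {
  lt <- as.POSIXlt(dates)          # accepts Date and POSIXct input
  yr <- lt$year + 1900             # POSIXlt stores years since 1900
  mon <- lt$mon + 1                # POSIXlt months run 0-11

  # TRUE when the date falls before this calendar year's season start,
  # i.e. it still belongs to the season that began the previous year
  before_start <- mon < start_month |
    (mon == start_month & lt$mday < start_day)
  start_yr <- yr - before_start

  # Use every season between the earliest and latest as a level, so that
  # seasons with no matches still appear in tables and plots
  all_yrs <- seq(min(start_yr, na.rm = TRUE), max(start_yr, na.rm = TRUE))
  factor(start_yr, levels = all_yrs,
         labels = paste(all_yrs, all_yrs + 1, sep = "/"))
}

# Usage with a few of the dates from the question
matchDate <- as.POSIXct(c("1993-01-26", "1993-08-01", "1994-03-02", "1994-11-07"),
                        tz = "UTC")
season <- season_factor(matchDate)

# The same dates with seasons starting on 1 July instead
season_july <- season_factor(matchDate, start_month = 7)
```

Because the season is worked out from the month and day of each date in its own time zone, this avoids having to build a vector of breaks in advance. The labels follow the "1992/1993" style used above; adjust the paste() call if you prefer "1992/3".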