So I'm trying to convert dates in the format "2000-01-01" into integers representing the number of days since some arbitrary origin (e.g. 1900/01/01) so I can treat them as integer indices. To do this I wrote a conversion function which works fine on MinGW under Windows XP but not under Vista. I've added some logging code:
int dateStrToInt(string date) {
int ymd[3];
tm tm1, tm0;
istringstream iss(date);
string s;
for (int i = 3; i; --i) {
getline(iss, s, '-');
ymd[3-i] = str2<int>(s);
}
cout << ymd[0] << ' ' << ymd[1] << ' ' << ymd[2] << ' ' << endl;
tm1.tm_year = ymd[0] - 1900;
tm1.tm_mon = ymd[1] - 1;
tm1.tm_mday = ymd[2];
time_t t1 = mktime(&tm1);
tm0.tm_year = 0;
tm0.tm_mon = 0;
tm0.tm_mday = 0;
time_t t0 = mktime(&tm0);
//cout << "times: " << mktime(&origin) << ' ' << mktime(&time) << endl;
cout << "times: " << t0 << ' ' << t1 << endl;
cout << "difftime: " << difftime(t1, t0) << endl;
return difftime(mktime(&tm1), mktime(&tm0)) / (60*60*24);
}
int i = dateStrToInt("2000-01-01");
and the output I get from that is
2000 1 1
times: -1 -1
difftime: 0
which seems clearly wrong. What can I do about this?
EDIT: as the answer below says, there seems to be a problem with years prior to 1970. To avoid this I've handrolled my own day-counting function:
int dateStrToInt(string date) {
int ymd[3];
istringstream iss(date);
string s;
for (int i = 0; i < 3; ++i) {
getline(iss, s, '-');
ymd[i] = str2<int>(s);
}
const static int cum_m_days[12] = {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};
int year = ymd[0]+10000, month = ymd[1], day = ymd[2];
int days = year*365 + cum_m_days[month-1] + day;
// handle leap years
if (month <= 2)
--year;
days = days + (year/4) - (year/100) + (year/400);
return days;
}
It's not necessarily a good idea leaving all of those other struct tm fields at their default (random in this case) values.
The standard is not overly explicit about what fields need to be set before calling mktime, but it does say that it sets tm_wday and tm_yday based on the other fields, and that those other fields are not restricted to being valid.
One thing the standard does show is example code which sets all fields except those two mentioned above, so that's what I'd be aiming for.
Try to change the segment that calculates the times from:
tm1.tm_year = ymd[0] - 1900;
tm1.tm_mon = ymd[1] - 1;
tm1.tm_mday = ymd[2];
time_t t1 = mktime(&tm1);
tm0.tm_year = 0;
tm0.tm_mon = 0;
tm0.tm_mday = 0;
time_t t0 = mktime(&tm0);
to something like:
// Quick and dirty way to get decent values for all fields.
time_t filled_in;
time (&filled_in);
memcpy (&tm1, localtime ( &filled_in ), sizeof (tm1));
memcpy (&tm0, &tm1, sizeof (tm0));
// Now do the modifications to relevant fields, and calculations.
tm1.tm_year = ymd[0] - 1900;
tm1.tm_mon = ymd[1] - 1;
tm1.tm_mday = ymd[2];
time_t t1 = mktime(&tm1);
tm0.tm_year = 0;
tm0.tm_mon = 0;
tm0.tm_mday = 0;
time_t t0 = mktime(&tm0);
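If copying the current local time feels too indirect, another option is to value-initialise the structure so every field starts at zero and then set only what you need. This is a minimal sketch rather than a tested fix for the Vista problem: makeLocalDate is a hypothetical helper name, and the choices of noon as the time of day and tm_isdst = -1 (let mktime decide about daylight saving) are my assumptions, not something from the question.

```cpp
#include <ctime>

// Hypothetical helper (not from the original post): returns a struct tm
// with every field in a known state for the given calendar date.
std::tm makeLocalDate(int year, int month, int day) {
    std::tm t = {};            // value-initialise: all fields start at zero
    t.tm_year = year - 1900;   // tm_year counts years since 1900
    t.tm_mon = month - 1;      // tm_mon is zero-based
    t.tm_mday = day;
    t.tm_hour = 12;            // assumption: noon avoids DST changes near midnight
    t.tm_isdst = -1;           // assumption: let mktime work out whether DST applies
    return t;
}
```

After passing the result to mktime, tm_wday and tm_yday should be filled in for you, which is an easy way to check that the call succeeded.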
In addition, some experimentation with CygWin under XP results in mktime always seeming to return -1 for struct tm structures where the tm_year is less than two. Whether that's an actual bug or not is questionable since I've often found that implementations don't always support dates before the epoch (Jan 1, 1970).
Some UNIXes did allow you to specify tm_year values less than 70, and they could often use the resulting "negative" values of time_t to represent dates before 1970.
But, since the standard doesn't really go into that, it's left to the implementation. The relevant bit of the C99 standard (and probably earlier iterations), which carries forward to C++, is found in 7.23.1/4:
The range and precision of times representable in clock_t and time_t are implementation-defined.
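Since the supported range is implementation-defined, one way to find out what a particular runtime does is simply to ask it. The following is only a rough probe under stated assumptions: mktimeAccepts is a hypothetical name, and it treats a return value of -1 as failure, which is how mktime reports an unrepresentable time (strictly speaking -1 is also the legitimate value for one second before the epoch, which a noon timestamp does not hit).

```cpp
#include <ctime>

// Hypothetical probe (not from the original post): reports whether this
// implementation's mktime can represent local noon on the given date.
bool mktimeAccepts(int year, int month, int day) {
    std::tm t = {};            // start from all-zero fields
    t.tm_year = year - 1900;
    t.tm_mon = month - 1;
    t.tm_mday = day;
    t.tm_hour = 12;            // assumption: noon, away from DST edge cases
    t.tm_isdst = -1;           // let mktime determine DST
    return std::mktime(&t) != static_cast<std::time_t>(-1);
}
```

Calling it with a date such as 1900-01-01 on each target platform should show whether a pre-epoch baseline is usable there. I haven't run it on the Vista setup from the question, so I can't say what it reports there.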
The safest bet would be to use a date after the start of the epoch as the baseline date. This is shown in the following code:
#include <iostream>
#include <sstream>
#include <string>
#include <ctime>
#include <cstring>
#include <cstdlib>
int dateStrToInt(std::string date) {
int ymd[3];
tm tm1, tm0;
std::istringstream iss(date);
std::string s;
// Test code.
ymd[0] = 2000; ymd[1] = 1; ymd[2] = 1;
std::cout << ymd[0] << ' ' << ymd[1] << ' ' << ymd[2] << ' ' << std::endl;
time_t filled_in;
time (&filled_in);
std::memcpy (&tm0, localtime ( &filled_in ), sizeof (tm0));
std::memcpy (&tm1, &tm0, sizeof (tm1));
tm1.tm_year = ymd[0] - 1900;
tm1.tm_mon = ymd[1] - 1;
tm1.tm_mday = ymd[2];
time_t t1 = mktime(&tm1);
tm0.tm_year = 1970 - 1900; // Use epoch as base date.
tm0.tm_mon = 0;
tm0.tm_mday = 1;
time_t t0 = mktime(&tm0);
std::cout << "times: " << t0 << ' ' << t1 << std::endl;
std::cout << "difftime: " << difftime(t1, t0) << std::endl;
return difftime(t1, t0) / (60*60*24);
}
int main (void) {
int i = dateStrToInt("2000-01-01");
double d = i; d /= 365.25;
std::cout << i << " days, about " << d << " years." << std::endl;
return 0;
}
This outputs the expected results:
2000 1 1
times: 31331 946716131
difftime: 9.46685e+08
10957 days, about 29.9986 years.
As an addendum, POSIX has this to say:
4.14 Seconds Since the Epoch
A value that approximates the number of seconds that have elapsed since the Epoch. A Coordinated Universal Time name (specified in terms of seconds (tm_sec), minutes (tm_min), hours (tm_hour), days since January 1 of the year (tm_yday), and calendar year minus 1900, (tm_year)) is related to a time represented as seconds since the Epoch, according to the expression below.
If the year is <1970 or the value is negative, the relationship is undefined. If the year is >=1970 and the value is non-negative, the value is related to a Coordinated Universal Time name according to the C-language expression, where tm_sec, tm_min, tm_hour, tm_yday, and tm_year are all integer types:
tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 +
(tm_year-70)*31536000 + ((tm_year-69)/4)*86400 -
((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400
The relationship between the actual time of day and the current value for seconds since the Epoch is unspecified.
How any changes to the value of seconds since the Epoch are made to align to a desired relationship with the current actual time is implementation-defined. As represented in seconds since the Epoch, each and every day shall be accounted for by exactly 86400 seconds.
Note: The last three terms of the expression add in a day for each year that follows a leap year starting with the first leap year since the Epoch. The first term adds a day every 4 years starting in 1973, the second subtracts a day back out every 100 years starting in 2001, and the third adds a day back in every 400 years starting in 2001. The divisions in the formula are integer divisions; that is, the remainder is discarded leaving only the integer quotient.
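To make the quoted expression concrete, here it is transcribed into a small function. This is just a sketch: posixSecondsSinceEpoch is a hypothetical name, and the use of long long is my choice to avoid overflow on platforms with 32-bit int. As the POSIX text says, the result only means something for years from 1970 onwards.

```cpp
#include <ctime>

// Hypothetical helper (not from the original post): evaluates the POSIX
// "Seconds Since the Epoch" expression for a UTC broken-down time.
// Only defined for tm_year >= 70 (year 1970 or later).
long long posixSecondsSinceEpoch(const std::tm& t) {
    const long long year = t.tm_year;   // calendar year minus 1900
    return t.tm_sec + t.tm_min * 60LL + t.tm_hour * 3600LL
         + t.tm_yday * 86400LL
         + (year - 70) * 31536000LL           // 365-day years since 1970
         + ((year - 69) / 4) * 86400LL        // add a day per 4 years
         - ((year - 1) / 100) * 86400LL       // remove a day per century
         + ((year + 299) / 400) * 86400LL;    // restore a day per 400 years
}
```

For 2000-01-01 00:00:00 UTC (tm_year = 100, tm_yday = 0) this works out to 30*31536000 + 7*86400 = 946684800, and dividing by 86400 gives the 10957 days shown in the output above.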
In other words (see "If the year is <1970 or the value is negative, the relationship is undefined"), use dates before 1970 at your own risk.
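If you really do need a baseline before 1970, you can sidestep mktime and time_t altogether with pure integer arithmetic, much like the hand-rolled function in the question's edit. The sketch below uses the well-known "days from civil" calculation for the proleptic Gregorian calendar. The names daysFromCivil and dateStrToDays are hypothetical, the 1900-01-01 origin is taken from the question, and the sscanf parsing assumes input strictly in "YYYY-MM-DD" form.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Days between 1970-01-01 and the given Gregorian date (negative before 1970).
long daysFromCivil(int y, unsigned m, unsigned d) {
    y -= (m <= 2);                                   // treat Jan/Feb as part of the previous year
    const int era = (y >= 0 ? y : y - 399) / 400;    // 400-year cycle number
    const unsigned yoe = static_cast<unsigned>(y - era * 400);            // year of era, 0..399
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1; // day of year, March-based
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           // day of era
    return era * 146097L + static_cast<long>(doe) - 719468L;
}

// Hypothetical replacement for the question's function: days since 1900-01-01.
long dateStrToDays(const std::string& date) {
    int y = 0, m = 0, d = 0;
    if (std::sscanf(date.c_str(), "%d-%d-%d", &y, &m, &d) != 3)
        throw std::invalid_argument("expected YYYY-MM-DD: " + date);
    return daysFromCivil(y, static_cast<unsigned>(m), static_cast<unsigned>(d))
         - daysFromCivil(1900, 1, 1);
}
```

Because nothing here depends on the time zone, daylight saving, or the range of time_t, the result should be the same on XP, Vista, and anything else. Note that it does no range checking on the month or day.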