I swear I'm using the correct date format but I keep getting a parse error when loading into WEKA.
"MonFeb2116:00:00+0000"
"EEEMMMddHH:mm:ssZ"
Here is an example dataset:
@RELATION example
@ATTRIBUTE tweetid STRING
@ATTRIBUTE timestamp DATE "EEEMMMddhh:mm:ssZ"
@ATTRIBUTE I NUMERIC
@ATTRIBUTE a NUMERIC
@ATTRIBUTE cool NUMERIC
@ATTRIBUTE foo NUMERIC
@ATTRIBUTE bar NUMERIC
@ATTRIBUTE temp NUMERIC
@ATTRIBUTE class {POS,NEG}
@DATA
39715973388828673,"MonFeb2116:00:00+0000",0,0,0,0,2,2,?
39716148329197568,"MonFeb2116:00:42+0000",0,1,0,0,0,1,?
39715973388828673,"MonFeb2116:00:51+0000",1,0,0,0,0,0,?
39723030380941312,"MonFeb2116:28:03+0000",0,0,0,0,0,0,?
39723030531944448,"MonFeb2116:28:03+0000",0,0,0,0,0,0,?
39723031433707520,"MonFeb2116:28:03+0000",0,0,0,0,0,0,?
WEKA Error:
unparseable date "MonFeb2116:00:00+0000, read Token[MonFeb2116:00:00+0000], line 21
I have double-checked the pattern against the SimpleDateFormat API documentation. Am I missing something?
http://download.oracle.com/javase/1.4.2/docs/api/java/text/SimpleDateFormat.html
EDIT -----------
@RELATION example
@ATTRIBUTE tweetid STRING
@ATTRIBUTE timestamp DATE "EEE MMM dd hh:mm:ss Z"
@ATTRIBUTE I NUMERIC
@ATTRIBUTE a NUMERIC
@ATTRIBUTE cool NUMERIC
@ATTRIBUTE foo NUMERIC
@ATTRIBUTE love NUMERIC
@ATTRIBUTE temp NUMERIC
@ATTRIBUTE class {POS,NEG}
@DATA
39715973388828673,"Mon Feb 21 16:00:00 +0000",0,0,0,0,2,2,?
39716148329197568,"Mon Feb 21 16:00:42 +0000",0,1,0,0,0,1,?
39715973388828673,"Mon Feb 21 16:00:51 +0000",1,0,0,0,0,0,?
39723030380941312,"Mon Feb 21 16:28:03 +0000",0,0,0,0,0,0,?
39723030531944448,"Mon Feb 21 16:28:03 +0000",0,0,0,0,0,0,?
39723031433707520,"Mon Feb 21 16:28:03 +0000",0,0,0,0,0,0,?
I reformatted the dates so the tokens are separated by spaces. Still not playing ball in WEKA...
Which default locale are you using? With an English locale, the String "MonFeb2116:00:00+0000" should be parseable with the pattern "EEEMMMddHH:mm:ssZ". Note, however, that the year will default to 1970 if it is not present in the pattern and the parsed string. That is probably not what you really want.
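To check the locale point outside WEKA, a minimal Java sketch along these lines may help. It assumes plain java.text.SimpleDateFormat with an explicit Locale.ENGLISH and the default lenient parsing. The class and method names are made up for illustration, and it says nothing about how WEKA configures its own parser.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of WEKA.
class LocaleParseSketch {

    // Parses text with the given pattern and locale (lenient by default).
    // Returns null instead of throwing when the text cannot be parsed.
    static Date parseOrNull(String pattern, String text, Locale locale) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, locale);
        return format.parse(text, new ParsePosition(0));
    }

    // Returns the calendar year of the date, evaluated in UTC.
    static int utcYear(Date date) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.ENGLISH);
        cal.setTime(date);
        return cal.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        Date parsed = parseOrNull("EEEMMMddHH:mm:ssZ", "MonFeb2116:00:00+0000", Locale.ENGLISH);
        System.out.println("parsed: " + parsed);
        if (parsed != null) {
            // No year in the pattern or the input, so the epoch year is used.
            System.out.println("year: " + utcYear(parsed));
        }
    }
}
```

If the same call fails once Locale.ENGLISH is replaced with your default locale, the locale is a likely culprit, since "Mon" and "Feb" are matched against locale-specific names.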
Well, I don't know whether it'll sort everything out or not, but try changing hh (12-hour format) to HH (24-hour format). Even then, I'm not sure it will be able to read a day of the week and month name that run together without any spaces. Do you have to get the value in that format? If you could put a space after the 3rd and 6th characters, it would help...
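The hh versus HH difference can be seen in isolation with a small Java sketch. It uses only the time portion of the pattern and assumes strict (non-lenient) parsing for the parse check, which may or may not match what WEKA does internally. HourPatternSketch and its helpers are hypothetical names.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of WEKA.
class HourPatternSketch {

    // Builds an English-locale formatter pinned to UTC.
    static SimpleDateFormat utcFormat(String pattern, boolean lenient) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(lenient);
        return format;
    }

    // True if the whole text parses under strict (non-lenient) rules.
    static boolean parsesStrictly(String pattern, String text) {
        ParsePosition pos = new ParsePosition(0);
        Date parsed = utcFormat(pattern, false).parse(text, pos);
        return parsed != null && pos.getIndex() == text.length();
    }

    // Formats an epoch timestamp with the given pattern in UTC.
    static String formatEpochMillis(String pattern, long millis) {
        return utcFormat(pattern, true).format(new Date(millis));
    }

    public static void main(String[] args) {
        long fourPmUtc = 16L * 60 * 60 * 1000; // 1970-01-01T16:00:00Z

        System.out.println(formatEpochMillis("hh:mm:ss", fourPmUtc)); // 04:00:00
        System.out.println(formatEpochMillis("HH:mm:ss", fourPmUtc)); // 16:00:00

        // An hour of 16 is outside the 1-12 range that hh accepts when strict.
        System.out.println(parsesStrictly("hh:mm:ss", "16:00:00"));
        System.out.println(parsesStrictly("HH:mm:ss", "16:00:00"));
    }
}
```

Under lenient parsing an out-of-range hh value is typically rolled over rather than rejected, so the strict setting is what makes the mismatch visible here.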