How to parse time stamps with Unicode characters in Java or Perl?

Developer https://www.devze.com 2022-12-13 08:42 Source: Internet
I'm trying to make my code as generic as possible. I need to parse the install time of a product installation. The product has two files: one contains the timestamp I need to parse, and the other gives the language of the installation.

This is how I'm parsing the timestamp:

import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ts {
    public static void main(String[] args) {
        String installTime = "2009/11/26 \u4e0b\u5348 04:40:54";
        // I got this timestamp from the first file. The Unicode characters are Chinese characters... AM/PM, I guess.
        // Locale locale = new Locale(...); // don't set the language yet
        SimpleDateFormat df = (SimpleDateFormat) DateFormat.getDateTimeInstance(DateFormat.DEFAULT, DateFormat.DEFAULT);
        Date instTime = null;
        try {
            instTime = df.parse(installTime);
        } catch (ParseException e) {
            e.printStackTrace();
        }
        System.out.println(instTime.toString());
    }
}

The output I get is

       Parsing Failed
    java.text.ParseException: Unparseable date: "2009/11/26 \u4e0b\u5348 04:40:54"
     at java.text.DateFormat.parse(Unknown Source)
     at ts.main(ts.java:39)
    Exception in thread "main" java.lang.NullPointerException
     at ts.main(ts.java:45)

It throws an exception, and when I print the value at the end it shows a proper-looking date, although a wrong one. I would really appreciate it if you could clarify these doubts:

  1. If this is not the proper way, how do I parse timestamps that contain Unicode characters?

  2. If parsing failed, how can instTime hold a date at all, even a wrong one? I know these are Chinese or Korean timestamps, so I set the locale to ko, ja, and zh as follows, but the same error occurs each time:

    Locale locale = new Locale("ko");

    Locale locale = new Locale("ja");

    Locale locale = new Locale("zh");

How can I do the same thing in Perl? I can't use the Date::Manip package. Is there any other way?


Your example timestamp does not conform to CLDR, so we have to define a pattern manually.

use utf8;
use DateTime::Format::CLDR ();

my $cldr = DateTime::Format::CLDR->new(
    locale   => 'zh_CN',
    pattern  => 'yyyy/MM/dd a hh:mm:ss', # 'a' is the AM/PM marker, so the hour is the 12-hour 'hh'
    on_error => 'croak',
);

$cldr->parse_datetime('2009/11/26 下午 04:40:54'); # returns a DateTime object
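
For the Java side of the question, the same idea (an explicit pattern plus locale-specific AM/PM markers) can be sketched with SimpleDateFormat. This is only a sketch under stated assumptions: the names InstallTimeParser, newFormat, parse, and toIso are made up for illustration, the AM/PM markers are set by hand instead of being taken from the JDK's locale data, and UTC is assumed because the timestamp carries no time zone.

```java
import java.text.DateFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; not part of the original question or answers.
final class InstallTimeParser {
    // Builds a formatter for timestamps such as "2009/11/26 下午 04:40:54".
    static SimpleDateFormat newFormat() {
        DateFormatSymbols symbols = new DateFormatSymbols(Locale.CHINESE);
        // Assumption: set the markers explicitly (上午 = AM, 下午 = PM) so the
        // sketch does not depend on which locale data the JDK ships.
        symbols.setAmPmStrings(new String[] {"\u4e0a\u5348", "\u4e0b\u5348"});
        // 'a' is the AM/PM marker, so the hour uses the 12-hour field 'hh'.
        SimpleDateFormat format = new SimpleDateFormat("yyyy/MM/dd a hh:mm:ss", symbols);
        // Assumption: the input has no zone, so interpret it as UTC.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // Parses the timestamp, turning the checked ParseException into an unchecked one.
    static Date parse(String text) {
        try {
            return newFormat().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable install time: " + text, e);
        }
    }

    // Renders a Date in 24-hour form (UTC) so the parsed hour is easy to check.
    static String toIso(Date date) {
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        return out.format(date);
    }

    public static void main(String[] args) {
        Date installTime = parse("2009/11/26 \u4e0b\u5348 04:40:54");
        System.out.println(toIso(installTime));
    }
}
```

Setting the markers by hand is a design choice for predictability. Passing a Locale to the SimpleDateFormat constructor instead would rely on the markers the runtime's locale data provides.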


Try this:

import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;

public class ts {
    public static void main(final String[] args) {
        String installTime = "2009/11/26 \u4e0b\u5348 04:40:54";
        Locale[] locales = DateFormat.getAvailableLocales();
        for (Locale locale : locales) {
            try {
                Date instTime = DateFormat.getDateInstance(DateFormat.LONG, locale).parse(
                        installTime);
                System.out.println("BINGO! Worked with " + locale);
                System.out.println(instTime);
            } catch (Exception ex) {
                // This locale's format does not match; try the next one.
            }
        }
    }
}

Output:

BINGO! Worked with ja_JP
Thu Nov 26 00:00:00 GMT 2009
BINGO! Worked with ja
Thu Nov 26 00:00:00 GMT 2009
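
Note that the output above shows 00:00:00: getDateInstance only reads the date part, so the time of day is lost. A possible variant, sketched below, probes candidate locales with a full date-time pattern and accepts a locale only if the whole string is consumed. The names LocaleProbe, detect, parseWith, and hourOfDay are hypothetical, the candidate list simply mirrors the ko, ja, and zh locales from the question, UTC is an assumption, and the result depends on the AM/PM markers the running JDK provides for each locale.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Calendar;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; not part of the original answer.
final class LocaleProbe {
    static final String PATTERN = "yyyy/MM/dd a hh:mm:ss";
    static final TimeZone UTC = TimeZone.getTimeZone("UTC"); // assumption: input has no zone
    static final List<Locale> CANDIDATES =
            Arrays.asList(Locale.KOREAN, Locale.JAPANESE, Locale.CHINESE);

    // Returns the first candidate locale that parses the whole text, or null if none does.
    static Locale detect(String text) {
        for (Locale locale : CANDIDATES) {
            if (parseWith(text, locale) != null) {
                return locale;
            }
        }
        return null;
    }

    // Parses with the given locale's AM/PM markers; returns null instead of throwing.
    static Date parseWith(String text, Locale locale) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, locale);
        format.setTimeZone(UTC);
        ParsePosition position = new ParsePosition(0);
        Date parsed = format.parse(text, position);
        // Reject partial matches: the pattern must consume the entire string.
        return (parsed != null && position.getIndex() == text.length()) ? parsed : null;
    }

    // Extracts the 24-hour hour of a parsed Date in UTC.
    static int hourOfDay(Date date) {
        Calendar calendar = Calendar.getInstance(UTC, Locale.ROOT);
        calendar.setTime(date);
        return calendar.get(Calendar.HOUR_OF_DAY);
    }

    public static void main(String[] args) {
        String installTime = "2009/11/26 \u4e0b\u5348 04:40:54";
        Locale locale = detect(installTime);
        if (locale == null) {
            System.out.println("No candidate locale matched");
        } else {
            System.out.println(locale + " -> hour " + hourOfDay(parseWith(installTime, locale)));
        }
    }
}
```

Since the question mentions a second file that names the installation language, that value could replace the probing loop by being passed straight to parseWith.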


No, U+FFFF is not a valid character in any plane, so your string "installTime" contains undefined garbage.

Edit: furthermore, the code you posted is not the code you're running, since the code you posted (properly) results in

java.text.ParseException: Unparseable date: "2009/11/26 ~K~M~H 04:40:54"
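
The NullPointerException in the question's stack trace is consistent with instTime still being null when it is printed after the failed parse. One way to make that failure explicit is sketched below, using the DateFormat.parse(String, ParsePosition) overload, which returns null on failure instead of throwing. The names SafeParse and tryParse are hypothetical, and the pattern in main is only an example.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; not part of the original answer.
final class SafeParse {
    // Returns the parsed date, or Optional.empty() if the text does not fully match the pattern.
    static Optional<Date> tryParse(String text, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        ParsePosition position = new ParsePosition(0);
        Date parsed = format.parse(text, position); // null on failure, no exception
        if (parsed == null || position.getIndex() != text.length()) {
            return Optional.empty();
        }
        return Optional.of(parsed);
    }

    public static void main(String[] args) {
        String pattern = "yyyy/MM/dd HH:mm:ss"; // example pattern without an AM/PM marker
        String installTime = "2009/11/26 \u4e0b\u5348 04:40:54";
        Optional<Date> result = tryParse(installTime, pattern);
        // The caller has to handle the empty case, so a null is never dereferenced.
        System.out.println(result.isPresent() ? result.get().toString() : "Parsing failed");
    }
}
```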