开发者

Parse a date embedded in other text in Java

开发者 https://www.devze.com 2023-03-13 22:05 Source: Internet
I need to parse a date embedded in some arbitrary text, as follows:

"hello world, good Day Thu Mar 03 07:13:56 GMT 2011"

I know the pattern of the date (below), but I'm not sure how to parse it out of the text string above. How do I do it?

String format = "E MMM dd HH:mm:ss z yyyy";
new SimpleDateFormat(format).parse(date);


You can use the DateFormat class!

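Assuming you know the index at which the date starts in the text:

String text = "hello world, good Day Thu Mar 03 07:13:56 GMT 2011";
String dateText = text.substring(22); // the date starts at index 22
// DateFormat.getDateInstance() expects the default locale's date style and
// cannot parse this text, so build the format from the known pattern instead.
DateFormat df = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
Date date = df.parse(dateText);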

The parse method should be able to construct a date object from the string if it is well formatted.

Here is the documentation

EDIT

If the date is ALWAYS at the end of the string and the date portion is always 28 characters long (true for this format as long as the time zone is a three-letter abbreviation), you can cut off the end of the string and parse it as a date.

String text = "hello world, good Day Thu Mar 03 07:13:56 GMT 2011";
String dateText = text.substring(text.length() - 28); // 28 is the length of the date portion
// Use the known pattern; DateFormat.getDateInstance() would not parse this text.
DateFormat df = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
Date date = df.parse(dateText);
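
For completeness, here is a minimal self-contained sketch of the "date is at the end" approach. The class name TrailingDateExample and the method parseTrailingDate are made up for illustration, and the sketch assumes the date always uses a three-letter zone abbreviation, so that it is exactly 28 characters long.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
final class TrailingDateExample {
    // Length of "Thu Mar 03 07:13:56 GMT 2011" (assumes a three-letter zone abbreviation).
    private static final int DATE_LENGTH = 28;

    /** Parses the last 28 characters of the text as a date. */
    static Date parseTrailingDate(String text) {
        if (text.length() < DATE_LENGTH) {
            throw new IllegalArgumentException("Text is too short to end with a date: " + text);
        }
        // Locale.ENGLISH so that "Thu" and "Mar" parse regardless of the default locale.
        DateFormat df = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
        try {
            return df.parse(text.substring(text.length() - DATE_LENGTH));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Text does not end with a date: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date date = parseTrailingDate("hello world, good Day Thu Mar 03 07:13:56 GMT 2011");
        System.out.println(date.getTime()); // milliseconds since the epoch
    }
}
```

Wrapping the checked ParseException in an IllegalArgumentException is just one possible choice; rethrowing the ParseException works equally well.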


Use regex to extract the date from the expression. In this case:

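((?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d\d \d\d:\d\d:\d\d [A-Z]{3} \d\d\d\d)

The outer parentheses () define a capturing group, which can be retrieved with Matcher.group(1) after a successful find(). Note that the alternatives go inside a group such as (?:Mon|Tue), not inside square brackets, which denote a character class.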
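
A rough sketch of how the capturing group could be used in Java. The class and method names (DateGroupExample, extractDateText) are invented for this example, and the pattern assumes English day and month abbreviations and a three-letter zone abbreviation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
final class DateGroupExample {
    // Group 1 wraps the whole date; the inner (?:...) groups are non-capturing.
    private static final Pattern DATE_PATTERN = Pattern.compile(
        "((?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
        + "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
        + "\\d\\d \\d\\d:\\d\\d:\\d\\d [A-Z]{3} \\d\\d\\d\\d)");

    /** Returns the first date-like substring, or null if there is none. */
    static String extractDateText(String input) {
        Matcher matcher = DATE_PATTERN.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "hello world, good Day Thu Mar 03 07:13:56 GMT 2011";
        System.out.println(extractDateText(text)); // prints "Thu Mar 03 07:13:56 GMT 2011"
    }
}
```

The extracted string can then be handed to SimpleDateFormat.parse with the pattern from the question.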


If you know the position in the input string where the date starts, you can do something like this:

String input = "hello world, good Day Thu Mar 03 07:13:56 GMT 2011";
String format = "E MMM dd HH:mm:ss z yyyy";
new SimpleDateFormat(format).parse(input, new ParsePosition("hello world, good Day ".length()));

If you don't know the position, you could use a regular expression to find the date in your format.
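
One thing worth knowing about the ParsePosition overload: it does not throw on failure, it returns null and records an error index instead. A small sketch of checking for that (ParsePositionExample and parseAt are hypothetical names; Locale.ENGLISH is added on the assumption that the text always uses English abbreviations):

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
final class ParsePositionExample {
    /** Parses a date starting at the given index; returns null if none starts there. */
    static Date parseAt(String input, int start) {
        SimpleDateFormat format = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
        ParsePosition position = new ParsePosition(start);
        // parse(String, ParsePosition) ignores text after the date and
        // returns null (instead of throwing) when parsing fails.
        return format.parse(input, position);
    }

    public static void main(String[] args) {
        String input = "hello world, good Day Thu Mar 03 07:13:56 GMT 2011 foo bar";
        Date date = parseAt(input, "hello world, good Day ".length());
        System.out.println(date == null ? "no date at that position" : String.valueOf(date.getTime()));
    }
}
```

After a successful parse, ParsePosition.getIndex() points just past the last character that was consumed, which is handy if more text has to be processed afterwards.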


Here is one workaround:

    String date = "hello world, good Day Thu Mar 03 07:13:56 GMT 2011";
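    // Strips everything before the last day-of-week abbreviation ("Sunday" was redundant: "Sun" always matches first)
    date = date.replaceAll("^.*(Mon|Tue|Wed|Thu|Fri|Sat|Sun)", "$1");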
    System.out.println(date);


This isn't bulletproof, but it should serve you well. It will match a date anywhere in any String that "looks like" a date:

    String input = "hello world, good Day Thu Mar 03 07:13:56 GMT 2011 foo bar";
    String regex = "(Mon|Tue|Wed|Thu|Fri|Sat|Sun) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \\d\\d \\d\\d:\\d\\d:\\d\\d [A-Z]{3} [12]\\d\\d\\d";
    Matcher matcher = Pattern.compile(regex).matcher(input);
    if (!matcher.find())
        throw new IllegalArgumentException("Couldn't find a date");
    String datestr = matcher.group();
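    // Locale.ENGLISH keeps "Thu"/"Mar" parseable when the default locale is not English
    Date date = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH).parse(datestr);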
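
If the text can contain more than one date, the same regex can be applied in a loop. This is only a sketch under the same assumptions as above (English abbreviations, three-letter zone); FindAllDatesExample and findDates are made-up names.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
final class FindAllDatesExample {
    private static final Pattern DATE_PATTERN = Pattern.compile(
        "(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
        + "(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
        + "\\d\\d \\d\\d:\\d\\d:\\d\\d [A-Z]{3} [12]\\d\\d\\d");

    /** Returns every date found in the input, in order of appearance. */
    static List<Date> findDates(String input) {
        SimpleDateFormat format = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
        List<Date> dates = new ArrayList<>();
        Matcher matcher = DATE_PATTERN.matcher(input);
        while (matcher.find()) {
            // The regex only checks the shape; skip matches the formatter
            // rejects (for example an unknown zone abbreviation).
            Date date = format.parse(matcher.group(), new ParsePosition(0));
            if (date != null) {
                dates.add(date);
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        String input = "start Thu Mar 03 07:13:56 GMT 2011 middle Fri Mar 04 07:13:56 GMT 2011 end";
        System.out.println(findDates(input).size()); // prints 2
    }
}
```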


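Here's a brute-force method. It assumes the date portion is always 28 characters long, i.e. the time zone is a three-letter abbreviation:

public static Date parseDate(String input)
{
    SimpleDateFormat format = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
    int dateLength = 28; // length of "Thu Mar 03 07:13:56 GMT 2011"
    // <= so that a date at the very end of the input is tried as well
    for (int i = 0; i <= input.length() - dateLength; i++)
    {
        try
        {
            return format.parse(input.substring(i, i + dateLength));
        }
        catch (ParseException ignore) {}
    }
    throw new IllegalArgumentException("No date found in: " + input);
}

It just scans along the string, trying every start position until it parses a date.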
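
A variant of the same scan that avoids hard-coding the length of the date, sketched with the ParsePosition overload of parse. ScanningDateParser is a made-up class name, and Locale.ENGLISH is an assumption about the input.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
final class ScanningDateParser {
    /** Tries every start position and returns the first date that parses. */
    static Date parseDate(String input) {
        SimpleDateFormat format = new SimpleDateFormat("E MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
        for (int i = 0; i < input.length(); i++) {
            // Returns null instead of throwing when no date starts at index i,
            // and ignores any text that follows the date.
            Date date = format.parse(input, new ParsePosition(i));
            if (date != null) {
                return date;
            }
        }
        throw new IllegalArgumentException("No date found in: " + input);
    }

    public static void main(String[] args) {
        System.out.println(parseDate("hello world, good Day Thu Mar 03 07:13:56 GMT 2011 foo bar").getTime());
    }
}
```

Because nothing is cut out with substring, this also copes with dates whose length differs from 28 characters, at the cost of one parse attempt per character of input.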
