SimpleDateFormat.parse() ignores the number of characters in the pattern

I am trying to parse a date String that can be in one of three different formats. Even though the string should not match the second pattern, it somehow does, and so an incorrect date is returned.

Here is my code:

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class Start {

        public static void main(String[] args) {
            SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy");
            try {
                System.out.println(sdf.format(parseDate("2013-01-31")));
            } catch (ParseException ex) {
                System.out.println("Unable to parse");
            }
        }

        public static Date parseDate(String dateString) throws ParseException {
            SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy");
            SimpleDateFormat sdf2 = new SimpleDateFormat("dd-MM-yyyy");
            SimpleDateFormat sdf3 = new SimpleDateFormat("yyyy-MM-dd");

            Date parsedDate;
            try {
                parsedDate = sdf.parse(dateString);
            } catch (ParseException ex) {
                try {
                    parsedDate = sdf2.parse(dateString);
                } catch (ParseException ex2) {
                    parsedDate = sdf3.parse(dateString);
                }
            }
            return parsedDate;
        }
    }

When I parse 2013-01-31, I get the output 05.07.0036.

If I try to parse 31-01-2013 or 31.01.2013, I get 31.01.2013 as expected.

I realized that the program gives me exactly the same results if I set the patterns as follows:

    SimpleDateFormat sdf = new SimpleDateFormat("dMy");
    SimpleDateFormat sdf2 = new SimpleDateFormat("dMy");
    SimpleDateFormat sdf3 = new SimpleDateFormat("yMd");

Why does it ignore the number of characters in my pattern?

+4
java date parsing simpledateformat
Apr 15 '13 at 11:53
5 answers

There are a couple of serious problems with SimpleDateFormat. The default lenient setting can produce garbage answers, and I cannot think of a case where lenient parsing is of any use. It should never have been the default. But disabling lenient is only part of the solution. You can still get garbage results that are hard to catch in testing. See the comments in the code below.

Here is an extension of SimpleDateFormat that forces strict pattern matching. This should have been the default behavior for this class.

    import java.text.DateFormatSymbols;
    import java.text.ParseException;
    import java.text.ParsePosition;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.Locale;

    /**
     * Extension of SimpleDateFormat that implements strict matching.
     * parse(text) will only return a Date if text exactly matches the
     * pattern.
     *
     * This is needed because SimpleDateFormat does not enforce strict
     * matching. First there is the lenient setting, which is true
     * by default. This allows text that does not match the pattern and
     * garbage to be interpreted as valid date/time information. For example,
     * parsing "2010-09-01" using the format "yyyyMMdd" yields the date
     * 2009/12/09! Is this bizarre interpretation the ninth day of the
     * zeroth month of 2010? If you are dealing with inputs that are not
     * strictly formatted, you WILL get bad results. You can override lenient
     * with setLenient(false), but this strangeness should not be the default.
     *
     * Second, setLenient(false) still does not strictly interpret the pattern.
     * For example "2010/01/5" will match "yyyy/MM/dd". And data disagreement like
     * "1999/2011" for the pattern "yyyy/yyyy" is tolerated (yielding 2011).
     *
     * Third, setLenient(false) still allows garbage after the pattern match.
     * For example: "20100901" and "20100901andGarbage" will both match "yyyyMMdd".
     *
     * This class restricts this undesirable behavior, and makes parse() and
     * format() functional inverses, which is what you would expect. Thus
     * text.equals(format(parse(text))) when parse returns a non-null result.
     *
     * @author zobell
     *
     */
    public class StrictSimpleDateFormat extends SimpleDateFormat {

        protected boolean strict = true;

        public StrictSimpleDateFormat() {
            super();
            setStrict(true);
        }

        public StrictSimpleDateFormat(String pattern) {
            super(pattern);
            setStrict(true);
        }

        public StrictSimpleDateFormat(String pattern, DateFormatSymbols formatSymbols) {
            super(pattern, formatSymbols);
            setStrict(true);
        }

        public StrictSimpleDateFormat(String pattern, Locale locale) {
            super(pattern, locale);
            setStrict(true);
        }

        /**
         * Set the strict setting. If strict == true (the default)
         * then parsing requires an exact match to the pattern. Setting
         * strict = false will tolerate text after the pattern match.
         * @param strict
         */
        public void setStrict(boolean strict) {
            this.strict = strict;
            // strict with lenient does not make sense. Really lenient does
            // not make sense in any case.
            if (strict)
                setLenient(false);
        }

        public boolean getStrict() {
            return strict;
        }

        /**
         * Parse text to a Date. Exact match of the pattern is required.
         * Parse and format are now inverse functions, so this is
         * required to be true for valid text date information:
         * text.equals(format(parse(text))
         * @param text
         * @param pos
         * @return
         */
        @Override
        public Date parse(String text, ParsePosition pos) {
            int posIndex = pos.getIndex();
            Date d = super.parse(text, pos);
            if (strict && d != null) {
                String format = this.format(d);
                if (posIndex + format.length() != text.length() ||
                        !text.endsWith(format)) {
                    d = null; // Not exact match
                }
            }
            return d;
        }
    }
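
The round-trip idea behind that class can also be tried without subclassing. The following is only a minimal sketch of the same check, not the answer's class: the names RoundTripCheck and parseExact are made up for illustration, and pinning the locale to Locale.ROOT and the time zone to UTC is an assumption added here to keep the behavior independent of the machine's settings.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class RoundTripCheck {

    // Hypothetical helper: returns the parsed Date only if formatting it
    // again reproduces the input text exactly; otherwise returns null.
    static Date parseExact(String text, String pattern) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ROOT);
        // Assumption: a fixed zone avoids surprises from daylight saving gaps.
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        sdf.setLenient(false);

        Date d = sdf.parse(text, new ParsePosition(0));
        if (d == null || !text.equals(sdf.format(d))) {
            return null; // parse failed, or text is not what format() would produce
        }
        return d;
    }

    public static void main(String[] args) {
        // Exact match of the pattern.
        System.out.println(parseExact("20100901", "yyyyMMdd") != null);
        // One-digit day: parses even when non-lenient, but does not round-trip.
        System.out.println(parseExact("2010/01/5", "yyyy/MM/dd") != null);
        // Trailing garbage: parses, but does not round-trip.
        System.out.println(parseExact("20100901andGarbage", "yyyyMMdd") != null);
    }
}
```

Only the first input is accepted. The other two are the cases from the class comment that setLenient(false) alone lets through.
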
+12
Oct 21 '13 at 19:26

This is described in the SimpleDateFormat javadoc:

For formatting, the number of pattern letters is the minimum number of digits, and shorter numbers are zero-padded to this amount. For parsing, the number of pattern letters is ignored unless it's needed to separate two adjacent fields.
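
To see both halves of that rule side by side, here is a small sketch. The class name PatternLetterCountDemo and the helper reformat are hypothetical, and Locale.ROOT is an assumption made only so the result does not depend on the default locale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class PatternLetterCountDemo {

    // Hypothetical helper: parses text with one pattern and formats the
    // result with another. Returns null if the text cannot be parsed.
    static String reformat(String text, String parsePattern, String formatPattern) {
        try {
            Date d = new SimpleDateFormat(parsePattern, Locale.ROOT).parse(text);
            return new SimpleDateFormat(formatPattern, Locale.ROOT).format(d);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Parsing: "dd" and "MM" accept one-digit values, the letter count is ignored.
        // Formatting: "dd" and "MM" pad to at least two digits.
        System.out.println(reformat("5.7.2013", "dd.MM.yyyy", "dd.MM.yyyy")); // 05.07.2013

        // Parsing: "d" and "M" accept two-digit values as well.
        // Formatting: "d" and "M" use as few digits as possible.
        System.out.println(reformat("05.07.2013", "d.M.yyyy", "d.M.yyyy"));   // 5.7.2013
    }
}
```
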

+2
Apr 15 '13 at 12:05

A workaround might be to check the yyyy-MM-dd format with a regular expression:

    public static Date parseDate(String dateString) throws ParseException {
        SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy");
        SimpleDateFormat sdf2 = new SimpleDateFormat("dd-MM-yyyy");
        SimpleDateFormat sdf3 = new SimpleDateFormat("yyyy-MM-dd");

        Date parsedDate;
        try {
            if (dateString.matches("\\d{4}-\\d{2}-\\d{2}")) {
                parsedDate = sdf3.parse(dateString);
            } else {
                throw new ParseException("", 0);
            }
        } catch (ParseException ex) {
            try {
                parsedDate = sdf2.parse(dateString);
            } catch (ParseException ex2) {
                parsedDate = sdf.parse(dateString);
            }
        }
        return parsedDate;
    }
+2
Apr 15 '13 at 12:31

Thanks @Teetoo. This helped me find a solution to my problem:

If I want the parse function to respect the pattern, I have to set the lenient flag (SimpleDateFormat.setLenient) of my SimpleDateFormat to false:

    SimpleDateFormat sdf = new SimpleDateFormat("dMy");
    sdf.setLenient(false);
    SimpleDateFormat sdf2 = new SimpleDateFormat("dMy");
    sdf2.setLenient(false);
    SimpleDateFormat sdf3 = new SimpleDateFormat("yMd");
    sdf3.setLenient(false);

This will still parse the date if I use only one pattern letter for each segment, but it recognizes that 2013 cannot be a day of the month, and therefore the string does not match the second pattern. In combination with a length check, I get exactly what I want.
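
One way the combination of setLenient(false) and a length check could look is sketched below. This is an illustration, not the code I actually used: the names NonLenientLengthCheck, parseStrict and normalize are hypothetical, and comparing the text length with the pattern length only works under the assumption that the pattern consists of numeric fields and unquoted separators, as the three patterns here do.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class NonLenientLengthCheck {

    private static final String[] PATTERNS = {"dd.MM.yyyy", "dd-MM-yyyy", "yyyy-MM-dd"};

    // Hypothetical helper: non-lenient parse plus a length check.
    // Assumption: the pattern is as long as the text it describes.
    static Date parseStrict(String text, String pattern) {
        if (text.length() != pattern.length()) {
            return null; // e.g. "1-1-2013" against "dd-MM-yyyy"
        }
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ROOT);
        sdf.setLenient(false); // rejects impossible values such as day 2013
        try {
            return sdf.parse(text);
        } catch (ParseException e) {
            return null;
        }
    }

    // Tries each pattern in turn and returns the date as dd.MM.yyyy,
    // or null if no pattern matches.
    static String normalize(String text) {
        for (String pattern : PATTERNS) {
            Date d = parseStrict(text, pattern);
            if (d != null) {
                return new SimpleDateFormat("dd.MM.yyyy", Locale.ROOT).format(d);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(normalize("2013-01-31")); // 31.01.2013
        System.out.println(normalize("31-01-2013")); // 31.01.2013
        System.out.println(normalize("1-1-2013"));   // null, fails the length check
    }
}
```
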

0
Apr 15 '13 at 12:23

java.time

java.time is the modern Java date and time API, and it behaves the way you expected. So it is simply a matter of translating your code:

    private static final DateTimeFormatter formatter1 = DateTimeFormatter.ofPattern("dd.MM.yyyy");
    private static final DateTimeFormatter formatter2 = DateTimeFormatter.ofPattern("dd-MM-yyyy");
    private static final DateTimeFormatter formatter3 = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    public static LocalDate parseDate(String dateString) {
        LocalDate parsedDate;
        try {
            parsedDate = LocalDate.parse(dateString, formatter1);
        } catch (DateTimeParseException dtpe1) {
            try {
                parsedDate = LocalDate.parse(dateString, formatter2);
            } catch (DateTimeParseException dtpe2) {
                parsedDate = LocalDate.parse(dateString, formatter3);
            }
        }
        return parsedDate;
    }

(I placed the formatters outside your method so that they are not created anew for each call. You can put them inside if you prefer.)

Let's try this:

    LocalDate date = parseDate("2013-01-31");
    System.out.println(date);

Output:

2013-01-31

For numbers, DateTimeFormatter.ofPattern takes the number of pattern letters as the minimum field width. It also assumes that the day of the month is never more than two digits. So when trying the format dd-MM-yyyy, it successfully parsed 20 as the day of the month and then threw a DateTimeParseException because there was no hyphen (dash) after 20. The method then went on to try the next formatter.
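
The following minimal sketch shows that behavior in isolation. The class FieldWidthDemo and the helper parses are hypothetical names, and passing Locale.ROOT is an assumption made here only to keep the example independent of the default locale.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

class FieldWidthDemo {

    // Hypothetical helper: reports whether text can be parsed as a
    // LocalDate using the given pattern.
    static boolean parses(String text, String pattern) {
        try {
            LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("31-01-2013", "dd-MM-yyyy")); // true
        // "dd" reads exactly two digits ("20"), then a hyphen is expected but "1" follows.
        System.out.println(parses("2013-01-31", "dd-MM-yyyy")); // false
        System.out.println(parses("2013-01-31", "yyyy-MM-dd")); // true
    }
}
```
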

What went wrong in your code

The SimpleDateFormat class you tried to use is notoriously troublesome and, fortunately, long outdated. You have run into one of the many problems with it. To repeat the important sentence from the documentation about how it handles numbers, quoted in Teetoo's answer:

For parsing, the number of pattern letters is ignored unless it's needed to separate two adjacent fields.

So new SimpleDateFormat("dd-MM-yyyy") happily parses 2013 as the day of the month, 01 as the month, and 31 as the year. Next we would have expected it to throw an exception, because January of year 31 did not have 2013 days. But a SimpleDateFormat with default settings does not do that. It just keeps counting days into the following months and years and ends up at July 5 of year 36, five and a half years later, which is the result you observed.
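
The day counting can be reproduced directly on a lenient calendar. This is a sketch of the arithmetic only, under the assumption that a plain GregorianCalendar behaves like the calendar SimpleDateFormat uses internally by default; the class name LenientRolloverDemo and the helper rollover are hypothetical.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

class LenientRolloverDemo {

    // Hypothetical helper: sets year, month and day of month on a lenient
    // calendar and returns the date it normalizes to, as dd.MM.yyyy.
    static String rollover(int year, int month, int dayOfMonth) {
        Calendar cal = new GregorianCalendar(); // lenient by default
        cal.clear();
        cal.set(year, month, dayOfMonth);
        return String.format(Locale.ROOT, "%02d.%02d.%04d",
                cal.get(Calendar.DAY_OF_MONTH),
                cal.get(Calendar.MONTH) + 1, // Calendar months are zero-based
                cal.get(Calendar.YEAR));
    }

    public static void main(String[] args) {
        // Day 2013 of January in year 31 rolls over into later months and years.
        System.out.println(rollover(31, Calendar.JANUARY, 2013)); // 05.07.0036
    }
}
```

Calling cal.setLenient(false) before reading the fields would make the calendar throw an IllegalArgumentException instead of rolling over.
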

Link

Oracle Tutorial: Date Time explaining how to use java.time.

0
May 26 '19 at 17:39


