Inverse date parsing

We run a REST web service that consumes various data. My current problem concerns a date that is received as a String and parsed with java.text.SimpleDateFormat (Java 8).

We received a lot (> 50k) of incorrectly formatted strings, which SimpleDateFormat parsed anyway.

SimpleDateFormat is configured with the pattern "yyyy-MM-dd". We received strings the other way around: "dd-MM-yyyy".

For example, the string "07-07-1950" was parsed to the date "0012-10-31" (starting from July of year 7, 1950 days were added).
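
The effect can be reproduced with a minimal sketch like the one below. It assumes the default lenient mode of SimpleDateFormat; the class name LenientParseDemo, the method reparse and the use of Locale.ROOT are illustrative choices, not part of our service.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

class LenientParseDemo {

    // Parses the input with "yyyy-MM-dd" in the default lenient mode and
    // formats the resulting Date with the same pattern.
    // Locale.ROOT is an assumption made here for reproducible output.
    static String reparse(String input) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        try {
            return format.format(format.parse(input));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable input: " + input, e);
        }
    }

    public static void main(String[] args) {
        // "07" is read as the year, "07" as the month and "1950" as the day of month;
        // lenient mode carries the overflowing days into later months and years.
        System.out.println(reparse("07-07-1950")); // prints 0012-10-31
        System.out.println(reparse("07-06-1980")); // prints 0012-10-31
    }
}
```

Both inputs end up as the same stored value, which is why the original input cannot be recovered from the stored date alone.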

We have fixed the implementation, so these strings are now parsed as expected. But the corrupt dates are still in the system. So my final question:

Is there a way to infer the possible source inputs (for example "07-07-1950", "07-06-1980", and possibly more) from the date "0012-10-31"?

Regards

+4
4 answers

Based on Martin Akerman's answer:

First of all, I simplified the code a bit.

public static Map<String, Set<LocalDate>> createDateMapping(LocalDate min, LocalDate max) throws ParseException {
    DateFormat targetFormat = new SimpleDateFormat("yyyy-MM-dd");
    DateTimeFormatter wrongFormat = DateTimeFormatter.ofPattern("dd-MM-yyyy");

    final Map<String, Set<LocalDate>> inputMappings = new LinkedHashMap<>();

    for (LocalDate date = min; !date.isAfter(max); date = date.plusDays(1)) {
        final String incorrectlyFormattedDate = date.format(wrongFormat);
        final String key = targetFormat.format(targetFormat.parse(incorrectlyFormattedDate));
        if (!inputMappings.containsKey(key)) {
            inputMappings.put(key, new TreeSet<>());
        }
        inputMappings.get(key).add(date);
    }

    return inputMappings;
}

How easily the invalid dates can be fixed depends on the range of valid dates.
For example, with max=2016-12-31, the following table shows how many unique invalid dates are fixable or ambiguous, depending on min:

min         fixable ambiguous
-----------------------------
1990-01-01  9862    0
1980-01-01  8827    2344
1970-01-01  5331    5918
1960-01-01  1832    9494
1950-01-01  408     10950
1940-01-01  314     11054
1930-01-01  218     11160
1920-01-01  165     11223
1910-01-01  135     11263
1900-01-01  105     11303

So if the valid dates are known to lie within the last 30 years, almost all of the invalid dates can be fixed:

    LocalDate max = LocalDate.of(2016, Month.DECEMBER, 31);
    LocalDate min = max.minusYears(30);
    Map<String, Set<LocalDate>> invalidDateMapping = createDateMapping(min, max);
    long reversibleCount = invalidDateMapping.entrySet().stream().filter(e -> e.getValue().size() == 1).count(); // 10859
    long ambiguousCount = invalidDateMapping.size() - reversibleCount; // 50
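
Once such a mapping exists, repairing a stored value is a lookup that succeeds only when exactly one candidate remains. The following is a minimal sketch of that step, not a definitive implementation: the class CorruptDateRepair and the methods buildMapping and repair are hypothetical names, and Locale.ROOT is an assumption made for reproducible output.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

class CorruptDateRepair {

    // Maps every corrupted "yyyy-MM-dd" value to all valid dates in [min, max]
    // whose "dd-MM-yyyy" string is leniently parsed to that value.
    static Map<String, Set<LocalDate>> buildMapping(LocalDate min, LocalDate max) {
        SimpleDateFormat target = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        DateTimeFormatter wrong = DateTimeFormatter.ofPattern("dd-MM-yyyy");
        Map<String, Set<LocalDate>> mapping = new HashMap<>();
        for (LocalDate date = min; !date.isAfter(max); date = date.plusDays(1)) {
            try {
                String key = target.format(target.parse(date.format(wrong)));
                mapping.computeIfAbsent(key, k -> new TreeSet<>()).add(date);
            } catch (ParseException e) {
                throw new IllegalStateException(e);
            }
        }
        return mapping;
    }

    // Returns the original date only if the corrupted value has exactly one candidate.
    static Optional<LocalDate> repair(Map<String, Set<LocalDate>> mapping, String corrupted) {
        Set<LocalDate> candidates = mapping.getOrDefault(corrupted, Collections.emptySet());
        return candidates.size() == 1
                ? Optional.of(candidates.iterator().next())
                : Optional.empty();
    }

    public static void main(String[] args) {
        LocalDate max = LocalDate.of(2016, 12, 31);
        Map<String, Set<LocalDate>> narrow = buildMapping(LocalDate.of(1990, 1, 1), max);
        Map<String, Set<LocalDate>> wide = buildMapping(LocalDate.of(1900, 1, 1), max);

        System.out.println(repair(narrow, "0012-11-10")); // prints Optional[1990-06-07]
        System.out.println(repair(wide, "0012-10-31"));   // prints Optional.empty
    }
}
```

Values for which repair returns an empty Optional need additional information (for example other fields of the same record) to pick among the candidates.
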
+1

I iterate backwards over the dates with a Calendar and build the "wrong" date string for each one:

public static Map<String, Collection<String>> createDateMapping() throws ParseException
{
    final DateFormat targetFormat = new SimpleDateFormat("yyyy-MM-dd");
    final DateFormat wrongFormat = new SimpleDateFormat("dd-MM-yyyy");

    //starting today
    final Calendar cal = Calendar.getInstance();

    final Map<String, Collection<String>> inputMappings = new HashMap<>();

    //rolling down to year zero is quite time consuming, back to year 1899 should be enough...
    while (cal.get(Calendar.YEAR) > 1899)
    {
        //creating the "wrong" date string
        final String formattedDate = wrongFormat.format(cal.getTime());
        final String key = targetFormat.format(targetFormat.parse(formattedDate));

        if (!inputMappings.containsKey(key))
        {
            inputMappings.put(key, new ArrayList<>());
        }

        inputMappings.get(key).add(targetFormat.format(cal.getTime()));

        //move the calendar to the previous day; unlike roll(), add() also
        //decrements the year when moving back past January 1st
        cal.add(Calendar.DAY_OF_YEAR, -1);
    }

    return inputMappings;
}

Usage:

final Map<String, Collection<String>> dateMapping = createDateMapping();

System.out.println(dateMapping.get("0012-10-31"));//[2011-05-07, 1980-06-07, 1950-07-07, 1919-08-07]


+2


Have you tried setting SimpleDateFormat's lenient flag to false?

    package test;

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class Test {

        public static void main(String[] args) throws ParseException {
            SimpleDateFormat dateFormat1 = new SimpleDateFormat("yyyy-MM-dd");
            SimpleDateFormat dateFormat2 = new SimpleDateFormat("dd-MM-yyyy");
            // strict parsing: out-of-range fields such as day 1980 raise a ParseException
            dateFormat1.setLenient(false);
            dateFormat2.setLenient(false);
            Date d = null;
            String invalidDate = "07-06-1980";
            try {
                d = dateFormat1.parse(invalidDate);
            } catch (ParseException e) {
                // not a valid yyyy-MM-dd date, so fall back to the reversed pattern
                System.out.println("reversed date " + invalidDate);
                d = dateFormat2.parse(invalidDate);
            }

            System.out.println("parsed date " + dateFormat1.format(d));
        }
    }

reversed date 07-06-1980

parsed date 1980-06-07

-1

Source: https://habr.com/ru/post/1648355/

