Select all rows that do not match the date format?

I am trying to migrate/update an old table that stores what should be a date in a nullable varchar field. I want to find all rows that do not match this format: %e-%b-%y. How can I do this?

EDIT: I should note that the field contains several "CANCELLED", null, or other string values instead of the more common %e-%b-%y format. I am looking for these rows so that I can update them to the required format (%e-%b-%y).

3 answers

Another approach is to try to recover as many dates as possible in different formats using STR_TO_DATE(), which returns NULL if the value cannot be parsed, and COALESCE() to chain the different date formats.

To display only the rows whose dates cannot be converted:

    SELECT *
      FROM table1
     WHERE COALESCE(STR_TO_DATE(NULLIF(dt, ''), '%e-%b-%Y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%e-%b-%y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%Y-%m-%d'),
                    STR_TO_DATE(NULLIF(dt, ''), '%m/%d/%Y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%m/%d/%y')) IS NULL;

To see what you get after converting the dates:

    SELECT *,
           COALESCE(STR_TO_DATE(NULLIF(dt, ''), '%e-%b-%Y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%e-%b-%y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%Y-%m-%d'),
                    STR_TO_DATE(NULLIF(dt, ''), '%m/%d/%Y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%m/%d/%y')) AS new_date
      FROM table1;

Note:

  • You can chain as many format strings as you need.
  • List the four-digit-year formats (%Y) before the two-digit-year formats (%y). Otherwise, you will get wrong dates.
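
The ordering rule in the second note can be checked on a single literal. The following is a minimal sketch (the column aliases are made up for illustration) that parses the same four-digit-year string with both formats:

```sql
-- Parse one four-digit-year value with %Y and with %y.
-- With %y, MySQL is expected to read only the first two year digits ("20")
-- and raise a truncation warning for the rest, giving a year of 2020
-- instead of 2012.
SELECT STR_TO_DATE('13-Aug-2012', '%e-%b-%Y') AS parsed_with_four_digit_year,
       STR_TO_DATE('13-Aug-2012', '%e-%b-%y') AS parsed_with_two_digit_year;
```

Because COALESCE() returns the first non-NULL argument, putting the %y format first would let this wrong result win.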

If you have the following sample data:

 | ID | DT          |
 | -- | ----------- |
 |  1 | CANCELLED   |
 |  2 | 02-Mar-12   |
 |  3 | (null)      |
 |  4 | 5-Aug-13    |
 |  5 |             |
 |  6 | 2013-09-12  |
 |  7 | 10/23/2013  |
 |  8 | 13-Aug-2012 |

Then the second query produces the following output:

 | ID | DT          | NEW_DATE                         |
 | -- | ----------- | -------------------------------- |
 |  1 | CANCELLED   | (null)                           |
 |  2 | 02-Mar-12   | March, 02 2012 00:00:00+0000     |
 |  3 | (null)      | (null)                           |
 |  4 | 5-Aug-13    | August, 05 2013 00:00:00+0000    |
 |  5 |             | (null)                           |
 |  6 | 2013-09-12  | September, 12 2013 00:00:00+0000 |
 |  7 | 10/23/2013  | October, 23 2013 00:00:00+0000   |
 |  8 | 13-Aug-2012 | August, 13 2012 00:00:00+0000    |

Here is the SQLFiddle demo
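
Since the goal in the question is to bring the values into the %e-%b-%y format, the recovered date can be formatted back into that layout with DATE_FORMAT(). This is only a sketch of how that might look: it assumes the table1/id/dt names from the sample data, and the normalized_dt alias is made up. It is written as a SELECT so that the result can be reviewed before anything is changed.

```sql
-- Preview the normalized string for every row.
-- Rows that no format can parse get NULL in normalized_dt.
SELECT id,
       dt,
       DATE_FORMAT(
           COALESCE(STR_TO_DATE(NULLIF(dt, ''), '%e-%b-%Y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%e-%b-%y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%Y-%m-%d'),
                    STR_TO_DATE(NULLIF(dt, ''), '%m/%d/%Y'),
                    STR_TO_DATE(NULLIF(dt, ''), '%m/%d/%y')),
           '%e-%b-%y') AS normalized_dt
  FROM table1;
```

Turning this into an UPDATE is possible, but in strict SQL mode a failed STR_TO_DATE() inside a data-changing statement may raise an error rather than a warning, so it is worth testing on a copy of the table first.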


You can use regular expressions in MySQL; see http://dev.mysql.com/doc/refman/5.1/en/regexp.html#operator_not-regexp

Here's an expression that returns rows where the date field (dt) is null or does not match 1-2 digits + dash + 3 alphabetic characters + dash + 2 digits (e.g. 06-Sep-13):

    -- ^ and $ anchor the pattern so that the whole value has to match
    SELECT *
      FROM table_name
     WHERE dt IS NULL
        OR dt NOT RLIKE '^[[:digit:]]{1,2}-[[:alpha:]]{3}-[[:digit:]]{2}$';
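
RLIKE looks for a match anywhere in the value unless the pattern is anchored with ^ and $. The sketch below compares both variants on three literals; the samples derived table and the column aliases are made up for illustration.

```sql
-- Compare an unanchored and an anchored version of the same pattern.
-- '13-Aug-2012' contains the substring '13-Aug-20', so only the anchored
-- pattern rejects it; '06-Sep-13' matches both and 'CANCELLED' matches neither.
SELECT v,
       v RLIKE '[[:digit:]]{1,2}-[[:alpha:]]{3}-[[:digit:]]{2}'   AS unanchored_match,
       v RLIKE '^[[:digit:]]{1,2}-[[:alpha:]]{3}-[[:digit:]]{2}$' AS anchored_match
  FROM (SELECT '06-Sep-13' AS v
        UNION ALL SELECT '13-Aug-2012'
        UNION ALL SELECT 'CANCELLED') AS samples;
```

Note that the pattern checks only the shape of the value: a string such as 99-Xyz-00 has the right shape but is not a valid date.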

This is based on a comment from Orbling. You can do the following:

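    SELECT *
      FROM my_table
     WHERE date_field IS NULL
        OR NOT (DATE_FORMAT(STR_TO_DATE(date_field, '%e-%b-%y'), '%e-%b-%y') <=> date_field);

What this does is take date_field, try to parse it as a date, and then format that date back into a string, which is compared with the original string. If the two do not match, the row is reported. Parsing uses STR_TO_DATE() because CAST(date_field AS DATE) does not understand the %e-%b-%y layout. The null-safe <=> comparison, together with the IS NULL test, makes sure that rows whose value is NULL or cannot be parsed are reported too; a plain <> comparison against NULL would silently drop them. If a conversion fails, your MySQL client may report a warning, but you can safely ignore it here.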
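
To see what the round trip does to individual values, here is a small sketch on literals. It assumes STR_TO_DATE() for the parsing step; the samples derived table and the round_trip alias are made up for illustration.

```sql
-- Parse each value as %e-%b-%y and format it back the same way.
-- '5-Aug-13' should come back unchanged, '02-Mar-12' should come back
-- as '2-Mar-12' (%e prints no leading zero), and 'CANCELLED' should
-- give NULL with a warning.
SELECT v,
       DATE_FORMAT(STR_TO_DATE(v, '%e-%b-%y'), '%e-%b-%y') AS round_trip
  FROM (SELECT '5-Aug-13' AS v
        UNION ALL SELECT '02-Mar-12'
        UNION ALL SELECT 'CANCELLED') AS samples;
```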

This is a very strict check: it will report any row where you cannot recreate the original string by formatting the date accordingly. In particular, it will complain about differences in leading zeros, trailing spaces, etc. If this is a problem, you can either find a less stringent check (possibly based on checking the validity of the date in combination with some regular expression), or do some simple pattern matching to identify and correct these strings so that they fit the format you want, such as finding all rows matching __-_-____ and inserting a 0 after the first dash.
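
The pattern-matching fix mentioned at the end could look roughly like this. It is a sketch only, reusing the my_table and date_field names from the query above, with a made-up padded alias. In a LIKE pattern each _ matches exactly one arbitrary character, so __-_-____ also matches non-digit values of that shape; preview the rows before updating.

```sql
-- Preview: rows shaped like 12-3-2013 and the value with a 0 inserted
-- after the first dash (12-03-2013).
SELECT date_field,
       CONCAT(LEFT(date_field, 3), '0', SUBSTRING(date_field, 4)) AS padded
  FROM my_table
 WHERE date_field LIKE '__-_-____';

-- Apply the same string change once the preview looks right.
UPDATE my_table
   SET date_field = CONCAT(LEFT(date_field, 3), '0', SUBSTRING(date_field, 4))
 WHERE date_field LIKE '__-_-____';
```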


Source: https://habr.com/ru/post/1500998/

