Python strptime Finnish

I have a Finnish date string (tiistaina, 27. lokakuuta 2015) that I need to convert to a datetime object. However, Python's datetime library does not recognize the day and month names.

I would expect something like the following to work:

    import locale
    from datetime import datetime

    locale.setlocale(locale.LC_TIME, 'fi_FI')
    the_date = datetime.strptime('tiistaina, 27. lokakuuta 2015', '%A, %d. %B %Y')

However, this leads to:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
        tt, fraction = _strptime(data_string, format)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/_strptime.py", line 337, in _strptime
        (data_string, format))
    ValueError: time data 'tiistaina, 27. lokakuuta 2015' does not match format '%A, %d. %B %Y'

I think this is because Python expects the day to be tiistai instead of tiistaina, and the month to be lokakuu instead of lokakuuta.

http://people.uta.fi/~km56049/finnish/timexp.html seems to suggest that, depending on the context, there are different ways to present the day or month in Finnish.

How can I convert the string tiistaina, 27. lokakuuta 2015 to a datetime object?

2 answers

The format '%A, %d. %B %Y' produces a different time string on my system:

    #!/usr/bin/env python
    import locale
    from datetime import datetime

    # NOTE: locale name is platform-dependent
    locale.setlocale(locale.LC_TIME, 'fi_FI.UTF-8')
    print(datetime(2015, 10, 27).strftime('%A, %d. %B %Y'))
    # -> tiistai, 27. lokakuu 2015

You can use PyICU to parse a localized date/time string in a given format:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from datetime import datetime

    import icu  # PyICU

    tz = icu.ICUtzinfo.getDefault()  # any ICU timezone will do here
    df = icu.SimpleDateFormat('EEEE, dd. MMMM yyyy', icu.Locale('fi_FI'))
    df.setTimeZone(tz.timezone)

    ts = df.parse(u'tiistaina, 27. lokakuuta 2015')
    print(datetime.fromtimestamp(ts, tz).date())
    # -> 2015-10-27

Related: Parsing a date in Python and finding the correct locale_setting

This works, but PyICU is a big dependency, and you have to read the C++ docs for most things.


There is a dateparser module that should work if you add Finnish data to a simple YAML config, similar to how it is done for other languages. Here is a working example for Dutch:

    #!/usr/bin/env python
    import dateparser  # $ pip install dateparser

    print(dateparser.parse(u'dinsdag, 27. oktober 2015',
                           date_formats=['%A, %d. %B %Y'],
                           languages=['nl']).date())
    # -> 2015-10-27

Related: Parsing a French date in python


The day-of-week and month names are substituted in the nominative case for %A and %B respectively; however, your date string has the day of the week in the essive case and the month in the partitive. Finnish declension is quite hard in the general case, but for this case you can suffix the day name with na to get the required essive, and the month name with ta to get the partitive.

So the strptime format '%Ana, %d. %Bta %Y' with the fi_FI locale should work for all your dates, assuming the locale supplies the nominative names for %A and %B:

    >>> import datetime, locale
    >>> locale.setlocale(locale.LC_TIME, 'fi_FI')
    'fi_FI'
    >>> datetime.datetime.strptime('tiistaina, 27. lokakuuta 2015', '%Ana, %d. %Bta %Y')
    datetime.datetime(2015, 10, 27, 0, 0)
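If the fi_FI locale is not installed, or its names differ between platforms, the same suffix idea can be applied without any locale at all. The following is only a sketch under that assumption: the name tables are hard-coded, and `parse_finnish_date`, `WEEKDAYS`, `MONTHS` and `PATTERN` are hypothetical names chosen for this example, not part of any library. It accepts both the nominative and the inflected (-na / -ta) forms.

```python
import re
from datetime import date

# Nominative Finnish names, hard-coded so that no locale is required.
WEEKDAYS = ['maanantai', 'tiistai', 'keskiviikko', 'torstai',
            'perjantai', 'lauantai', 'sunnuntai']
MONTHS = ['tammikuu', 'helmikuu', 'maaliskuu', 'huhtikuu', 'toukokuu',
          'kesäkuu', 'heinäkuu', 'elokuu', 'syyskuu', 'lokakuu',
          'marraskuu', 'joulukuu']

# weekday with optional essive "na", day number, month with optional
# partitive "ta", four-digit year.
PATTERN = re.compile(r'^(\w+?)(?:na)?,\s*(\d{1,2})\.\s*(\w+?)(?:ta)?\s+(\d{4})$')


def parse_finnish_date(text):
    """Parse e.g. 'tiistaina, 27. lokakuuta 2015' into a datetime.date."""
    match = PATTERN.match(text.strip().lower())
    if match is None:
        raise ValueError('unrecognized date string: %r' % text)
    weekday, day, month, year = match.groups()
    if weekday not in WEEKDAYS or month not in MONTHS:
        raise ValueError('unknown weekday or month in %r' % text)
    result = date(int(year), MONTHS.index(month) + 1, int(day))
    # date.weekday() counts Monday as 0, matching the order of WEEKDAYS.
    if result.weekday() != WEEKDAYS.index(weekday):
        raise ValueError('weekday does not match the date in %r' % text)
    return result


print(parse_finnish_date('tiistaina, 27. lokakuuta 2015'))  # -> 2015-10-27
```

The weekday check is optional; it is there only because strptime would otherwise silently accept a day name that contradicts the date.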

Source: https://habr.com/ru/post/957911/
