How to create a regular emoticon dictionary in python?

I have a list of emoticon codes inside a file UTF32.red.codesin text format. Simple file content

\U0001F600
\U0001F601
\U0001F602
\U0001F603 
\U0001F604
\U0001F605
\U0001F606
\U0001F609
\U0001F60A
\U0001F60B

Based on the question , my idea is to create a regular expression from the contents of the file to catch emoticons. This is my minimum working example.

import re

with open('UTF32.red.codes','r') as emof:
   codes = [emo.strip() for emo in emof]
   emojis = re.compile(u"(%s)" % "|".join(codes))

string = u'string to check \U0001F601'
found = emojis.findall(string)

print found

foundalways empty. Where am I mistaken? I am using python 2.7

+4
source share
2 answers

This code works with python 2.7

import re
with open('UTF32.red.codes','rb') as emof:
    codes = [emo.decode('unicode-escape').strip() for emo in emof]
    emojis = re.compile(u"(%s)" % "|".join(map(re.escape,codes)))

search = ur'string to check \U0001F601'
found = emojis.findall(search)

print found
0
source

python 3 ( print found print(found)). python 2.7 , re (. ).

python 2, regex, pip2 install regex. import regex, re. regex. (.. regex.compile regex.findall) . .

+1

Source: https://habr.com/ru/post/1623452/


All Articles