How to create a regular emoticon dictionary in python?

Question

I have a list of emoticon codes in a plain-text file, UTF32.red.codes. Sample file content:

\U0001F600
\U0001F601
\U0001F602
\U0001F603 
\U0001F604
\U0001F605
\U0001F606
\U0001F609
\U0001F60A
\U0001F60B

Based on the question, my idea is to build a regular expression from the contents of the file to match emoticons. This is my minimal working example.

import re

with open('UTF32.red.codes','r') as emof:
   codes = [emo.strip() for emo in emof]
   emojis = re.compile(u"(%s)" % "|".join(codes))

string = u'string to check \U0001F601'
found = emojis.findall(string)

print found

found is always empty. Where am I going wrong? I am using Python 2.7.

+4

python python-2.7 regex python-unicode

emanuele Jan 08 '16 at 16:11

2 answers

Your code works in Python 3 (just change print found to print(found)). In Python 2.7, however, it does not work with the re module.
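To illustrate the Python 3 remark, here is a minimal sketch of the question's approach, written for Python 3. It assumes the file holds one escape per line; the list named lines is only a stand-in for the contents of UTF32.red.codes and is not part of the original post. It works because the re module in Python 3.3 and later interprets \UXXXXXXXX escapes inside the pattern itself.

```python
import re

# Stand-in for the lines of UTF32.red.codes (assumption: one escape per line).
lines = ["\\U0001F600\n", "\\U0001F601\n", "\\U0001F602\n", "\\U0001F603 \n"]

# Each code is the literal 10-character text \U0001F600, not the emoji itself.
codes = [line.strip() for line in lines]

# Python 3.3+ re resolves the \U escapes while parsing the pattern.
emojis = re.compile("(%s)" % "|".join(codes))

found = emojis.findall("string to check \U0001F601")
print(found)
```

On Python 2.7 the same pattern is not interpreted this way, which is consistent with the empty result reported in the question.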

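If you have to stay on Python 2, use the regex module, which you can install with pip2 install regex. Import regex instead of re and call the regex functions in place of the re ones (e.g. regex.compile, regex.findall).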

+1

vrs 08 . '16 17:21

emanuele · Accepted Answer · 2016-01-11T09:17:31+0000

This code works with Python 2.7:

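import re

# Read the file as bytes so each line can be decoded explicitly.
with open('UTF32.red.codes','rb') as emof:
    # 'unicode-escape' turns the text \U0001F600 into the actual character;
    # strip() then removes the newline and any trailing spaces.
    codes = [emo.decode('unicode-escape').strip() for emo in emof]
    # Escape each character and join them into one alternation group.
    emojis = re.compile(u"(%s)" % "|".join(map(re.escape,codes)))

search = ur'string to check \U0001F601'
found = emojis.findall(search)

print found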
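For readers on Python 3, the same idea (decode each escape into the real character, then escape and join) can be sketched as follows. This is an adaptation under assumptions, not the answer's own code: build_emoji_pattern is a hypothetical helper name, and the sample list stands in for the lines read from the file in binary mode.

```python
import re

def build_emoji_pattern(raw_lines):
    """Hypothetical helper: compile one regex from byte lines of escapes."""
    chars = []
    for raw in raw_lines:
        raw = raw.strip()          # drop the newline and stray spaces
        if not raw:
            continue               # skip blank lines
        # 'unicode_escape' converts the escape text into the character itself.
        chars.append(raw.decode("unicode_escape"))
    # re.escape guards against any character that is special in a pattern.
    return re.compile("(%s)" % "|".join(map(re.escape, chars)))

# Stand-in for: open('UTF32.red.codes', 'rb')
sample = [b"\\U0001F600\n", b"\\U0001F601\n", b"\\U0001F603 \n", b"\n"]
pattern = build_emoji_pattern(sample)

found = pattern.findall("string to check \U0001F601 and \U0001F603")
print(found)
```

Decoding first means the pattern contains the characters themselves, so it does not depend on how the re module treats \U escapes.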
