C++: read Arabic text from a file

In C++, I have a text file that contains Arabic text, for example:

شكلك بتعرف تقرأ عربي يا ابن الذين

I want to process this file line by line, use string functions on each line (e.g. substr, length, at, etc.), and then print parts of it to an output file.

I tried to do this, but it prints garbage characters such as "\'c7\'e1\'de\'d1". Is there a library that supports Arabic characters?

Edit: here is the code:

    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;

    int main() {
        ifstream ip;
        ip.open("d.rtf");
        if (!ip.is_open()) {
            cout << "open failed" << endl;
            return 0;
        }
        string l;
        // Loop on getline itself so a failed read at end of file is not printed.
        while (getline(ip, l)) {
            cout << l << endl;
        }
        return 0;
    }

Note: I still need to add processing code, for example:

    if (l == "كلام بالعربي") {
        string s = l.substr(0, 4);
        cout << s << " is what you are looking for" << endl;
    }
+6
3 answers

You need to find out which encoding the file uses. For example, to read a UTF-8 file into wchar_t strings, you can do this (C++11):

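    std::wifstream fin("text.txt");
    // Decode the file's bytes as UTF-8; std::locale throws std::runtime_error
    // if this locale is not installed on the system.
    fin.imbue(std::locale("en_US.UTF-8"));
    std::wstring line;
    std::getline(fin, line);
    std::wcout << line << std::endl;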
+2

The best way to handle this, in my opinion, is to use a Unicode helper library. Strings in C, and even in C++, are just arrays of bytes. When you call, for example, strlen() [C] or somestring.length() [C++], you get the number of bytes in the string, not the number of characters.
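To make the byte/character difference concrete, here is a minimal sketch that counts and slices by Unicode code points instead of bytes. The function names (utf8_length, utf8_advance, utf8_substr) are made up for this illustration, and the code assumes the input is already valid UTF-8; it does no validation. Note that a code point is still not always one visible character (Arabic diacritics, for instance, are separate code points), so this is only an approximation of "characters".

```cpp
#include <cstddef>
#include <string>

// Counts code points in a UTF-8 string by counting every byte that is
// NOT a continuation byte (continuation bytes look like 10xxxxxx).
// Assumption: the input is valid UTF-8.
std::size_t utf8_length(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Returns the byte offset reached after moving n code points forward
// from byte offset `from`, or s.size() if the string ends first.
std::size_t utf8_advance(const std::string& s, std::size_t from, std::size_t n)
{
    std::size_t i = from;
    while (i < s.size() && n > 0) {
        ++i;  // step over the lead byte
        while (i < s.size() &&
               (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
            ++i;  // step over its continuation bytes
        }
        --n;
    }
    return i;
}

// Like std::string::substr, but pos and count are measured in code points.
// Returns an empty string when pos is past the end.
std::string utf8_substr(const std::string& s, std::size_t pos, std::size_t count)
{
    const std::size_t start = utf8_advance(s, 0, pos);
    const std::size_t end = utf8_advance(s, start, count);
    return s.substr(start, end - start);
}
```

For the four-letter word "عربي" stored as UTF-8, somestring.length() reports 8 bytes while utf8_length reports 4 code points, and utf8_substr(s, 1, 2) returns the two middle letters without cutting a letter in half.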

Some standard helper functions, such as mbstowcs(), can help. But in my opinion they are old and not easy to use.

Another way is to use C++11, which in theory supports many things related to UTF-8. But I have never seen it work perfectly, at least not if you need to be multi-platform.
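One part of C++11 that does not depend on the platform's locale setup is the std::u32string type, which stores one code point per element. The sketch below decodes UTF-8 bytes into it by hand, so that length(), at() and substr() then count code points. The function name utf8_to_utf32 is made up for this example, and the decoder is deliberately simplified: malformed input becomes U+FFFD, and overlong encodings and surrogates are not rejected.

```cpp
#include <cstddef>
#include <string>

// Decodes a UTF-8 byte string into UTF-32 code points.
// Simplified sketch: each malformed or truncated sequence is replaced by
// U+FFFD; overlong forms and surrogate code points are not checked.
std::u32string utf8_to_utf32(const std::string& in)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray byte

        if (i + extra >= in.size()) {  // sequence cut off at end of input
            out.push_back(0xFFFD);
            break;
        }

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(in[i + k]);
            if ((cc & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

A line read with std::getline from a plain std::ifstream can be passed to this function as-is, provided the file really is UTF-8. Comparisons then work against U"..." literals, for example decoded == U"\u0639\u0631\u0628\u064A".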

The best solution I have found is the ICU library. With it, I can work with UTF-8 strings as comfortably as with a regular std::string. It gives you a string class with methods for length, substrings, etc., and it is very portable: I use it on Windows, Mac and Linux.

+1

You can use Qt.

A simple example:

    #include <QDebug>
    #include <QTextStream>
    #include <QFile>

    int main() {
        QFile file("test.txt");
        file.open(QIODevice::ReadOnly | QIODevice::Text);
        QTextStream stream(&file);
        QString text = stream.readAll();
        if (text == "شكلك بتعرف تقرأ عربي يا ابن الذين")
            qDebug() << ",,,, ";
    }
0

Source: https://habr.com/ru/post/969865/

