I downloaded the file 'CRS 2013 data.txt' from the OECD site http://stats.oecd.org/Index.aspx?datasetcode=CRS1 by choosing Export -> Related files. I want to work with this file on Ubuntu (14.04 LTS).
When I run:
dos2unix CRS\ 2013\ data.txt
I see:
dos2unix: Binary symbol 0x0004 found at line 1703
dos2unix: Skipping binary file CRS 2013 data.txt
I check the file encoding:
file --mime-encoding CRS\ 2013\ data.txt
and get:
CRS 2013 data.txt: utf-16le
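For reference, one way to check whether a file starts with a byte order mark would be to dump its first two bytes. This is only a minimal sketch: the helper name `first_two_bytes` is hypothetical, and it assumes `head -c` and `od` are available, as they are on a stock Ubuntu install.

```shell
# Hypothetical helper: print the first two bytes of a file as hex digits.
# A result of "fffe" would indicate a UTF-16LE byte order mark.
first_two_bytes() {
    head -c 2 "$1" | od -An -tx1 | tr -d ' \n'
}
```

It could be called as `first_two_bytes "CRS 2013 data.txt"`.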
I run:
iconv -l | grep utf-16le
which returns nothing, so I run:
iconv -l | grep UTF-16LE
which returns:
UTF-16LE
Then I run:
iconv --verbose -f UTF-16LE -t UTF-8 CRS\ 2013\ data.txt -o crs_2013_data_temp.txt
and check:
file --mime-encoding crs_2013_data_temp.txt
and get:
crs_2013_data_temp.txt: utf-8
Then I try:
dos2unix crs_2013_data_temp.txt
and get:
dos2unix: Binary symbol 0x04 found at line 1703
dos2unix: Skipping binary file crs_2013_data_temp.txt
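One way to see what dos2unix is objecting to would be to dump the bytes of the reported line. A minimal sketch, assuming `sed` and `od` are available; the helper name `show_line_bytes` is hypothetical and is not part of dos2unix:

```shell
# Hypothetical helper: hex-dump a single line of a file.
# $1 = file name, $2 = line number (for example 1703)
show_line_bytes() {
    sed -n "${2}p" "$1" | od -An -tx1
}
```

For the converted file this could be called as `show_line_bytes crs_2013_data_temp.txt 1703`; a `04` in the dump would be the byte that dos2unix reports.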
Then I try to force it:
dos2unix -f crs_2013_data_temp.txt
This works, in the sense that dos2unix completes the conversion without complaining, but when I open the file I see entries like "FoÃ" Ťa and à Śajnià Ťe".
My question is: why? Is it because dos2unix cannot see the encoding specification? Because it is missing? Did I not do the conversion correctly? How do I convert this file properly so that I can read it?