Change file encoding to utf-8 via vim in script

I got stuck after our server was upgraded from Debian 4 to 5. We switched to a UTF-8 environment, and now text is not displayed correctly in the browser, because all the files are in encodings other than UTF-8, such as ISO-8859-1, ASCII, etc.

I tried many different scripts.

The first tool I tried was iconv. It does not work: it modifies the contents, but the files are still not UTF-8.

I had the same problem with enca, encamv, convmv and some other tools that I installed via apt-get.

Then I found Python code that uses the chardet universal detector module to detect the encoding of the file (which works fine), but saving the file as UTF-8 with the unicode class or the codecs module does not work, and it fails without any error.
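For reference, here is a minimal sketch of the kind of Python conversion described above. It is not the code the asker used: the function name convert_to_utf8 and its parameters are made up for illustration, and the source encoding is assumed to be known already (for example from the chardet detection step).

```python
import codecs
from pathlib import Path


def convert_to_utf8(path, source_encoding, add_bom=False):
    """Re-encode one file in place from source_encoding to UTF-8.

    Hypothetical helper: source_encoding must be the encoding the file
    is really in; no detection is attempted here.
    """
    p = Path(path)
    raw = p.read_bytes()
    # Decode using the original encoding, then encode the text as UTF-8.
    text = raw.decode(source_encoding)
    data = text.encode("utf-8")
    # Optionally prepend the UTF-8 BOM (EF BB BF), which is what
    # ":set bomb" does in vi. Skip it if the data already starts with one.
    if add_bom and not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data
    p.write_bytes(data)


# Example call (the file name is an assumption):
# convert_to_utf8("filename.php", "iso-8859-1", add_bom=True)
```

Note that a file containing only ASCII characters comes out byte-for-byte identical without the BOM, because ASCII is a subset of UTF-8. Detection tools will keep reporting such a file as ASCII even though it is also valid UTF-8.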

The only way I have found to get a file and its contents converted to UTF-8 is vi.

These are the steps I am doing for a single file:

 vi filename.php
 :set bomb
 :set fileencoding=utf-8
 :wq

That's it, and it works great. But how can I do this from a script? I would like to write a (Linux shell) script that walks through a directory, takes all the PHP files and converts them using vi with the commands above. Since I need to start the vi application, I do not know how to do something like this:

"vi --run-command=':set bomb, :set fileencoding=utf-8' filename.php"

Hope someone can help me.

+49
file encoding vi utf-8 character-encoding
Feb 22 '10 at 15:08
3 answers

This is the easiest way I know of to do it from the command line:

 vim +"argdo se bomb | se fileencoding=utf-8 | w" $(find . -type f -name "*.php")

Or even better, if you expect the number of files to be quite large:

 find . -type f -name "*.php" | xargs vim +"argdo se bomb | se fileencoding=utf-8 | w"
+23
Feb 22 '10 at 15:17

You can put your commands in a file named script.vim:

 set bomb
 set fileencoding=utf-8
 wq

Then you call Vim with the -S (source) option to run the script on the file you want to fix. To do this on a bunch of files, you could do

 find . -type f -name "*.php" -exec vim -S script.vim {} \; 

You can also put Vim commands on the command line using the + option, but I think it can be more readable like this.

Note: I have not tested this.

+16
Feb 22 '10 at 15:21

In fact, you may need set nobomb (BOM = byte order mark), especially in the [non-Windows] world.

For example, I had a script that did not work because there was a byte order mark at the beginning. Usually this is not displayed in editors (even with set list in vi) or on the console, so it is difficult to detect.

The file looked like

 #!/usr/bin/perl
 ...

But trying to run it, I get

 ./filename
 ./filename: line 1: #!/usr/bin/perl: No such file or directory

It is not displayed, but at the beginning of the file there is a 3-byte BOM. So, as far as Linux is concerned, the file does not start with #!

Solution
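Since the mark is invisible in most editors, one way to confirm it is to look at the raw bytes. A minimal Python sketch, assuming made-up file contents rather than the original script (the helper name starts_with_utf8_bom is hypothetical):

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes begin with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Made-up contents: the same script header with and without a BOM.
with_bom = codecs.BOM_UTF8 + b"#!/usr/bin/perl\n"
without_bom = b"#!/usr/bin/perl\n"

print(starts_with_utf8_bom(with_bom))     # True
print(starts_with_utf8_bom(without_bom))  # False
# With a BOM in front, the first two bytes are no longer "#!".
print(with_bom[:2] == b"#!")              # False
```

For a real file, the same check can be applied to the result of open(path, "rb").read(3).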

 vi filename
 :set nobomb
 :set fileencoding=utf-8
 :wq

This removes the BOM at the beginning of the file, making it correct UTF-8.
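If vi is not convenient, the same removal can be sketched in Python. This is only an illustration under the assumption that the file is already UTF-8 and only the leading mark has to go; strip_utf8_bom is a made-up helper name.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM in place; return True if one was removed."""
    p = Path(path)
    data = p.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, file is left untouched
    # Write everything after the 3 BOM bytes back to the same file.
    p.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


# Example call (the file name is an assumption):
# strip_utf8_bom("filename")
```

The rest of the file is written back unchanged, so running it twice is harmless.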

NB: Windows uses the BOM to identify a text file as UTF-8 rather than ANSI. Linux (and the official specification) does not.

+3
Oct 22 '14 at 8:15