Find Non-UTF8 File Names on Linux File System

OS = Fedora Core 9.

I have a number of files hiding in my file system (LANG=en_US.UTF-8) that were uploaded with unrecognizable characters in their file names.

I need to search the file system and return all file names that contain at least one character outside the standard range (a-zA-Z0-9 and .-_ etc.).

I have been trying the following, but with no luck:

find . | egrep [^a-zA-Z0-9_\.\/\-\s]

All suggestions are welcome.

Regards,

AP.

+3
4 answers

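convmv may be of interest here. It converts file names from one encoding to another, so it can be used to rename the affected files as well as to find them.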

+12
find . | perl -ane '{ if(m/[[:^ascii:]]/) { print } }'
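Note that this one-liner reports every name containing a non-ASCII byte, so it also flags names that are perfectly valid UTF-8. A minimal Python sketch of the difference, using two made-up file names (the helper names `is_ascii` and `is_utf8` are hypothetical too):

```python
def is_ascii(name: bytes) -> bool:
    # True when every byte is in the 7-bit ASCII range
    return all(b < 0x80 for b in name)

def is_utf8(name: bytes) -> bool:
    # True when the bytes decode cleanly as UTF-8
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

good = b"caf\xc3\xa9.txt"  # "café.txt" encoded as UTF-8
bad = b"caf\xe9.txt"       # "café.txt" encoded as Latin-1

print(is_ascii(good), is_utf8(good))  # False True
print(is_ascii(bad), is_utf8(bad))    # False False
```

Both names fail the ASCII test, but only the second one is actually mis-encoded.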
+5

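find . | egrep [^a-zA-Z0-9_\.\/\-\s]

Careful, shell escaping!

bash interprets that last argument itself and removes the backslash escapes before egrep ever sees them. Put quotes around the [^group] expression.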
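As a rough illustration of what the shell does to the unquoted pattern, Python's `shlex.split` applies POSIX-style quoting rules to a command line. This is only a sketch: `shlex` does not perform glob expansion, which bash may additionally apply to an unquoted `[...]`.

```python
import shlex

unquoted = r"egrep [^a-zA-Z0-9_\.\/\-\s]"
quoted = r"egrep '[^a-zA-Z0-9_\.\/\-\s]'"

# shlex.split follows POSIX shell quoting rules (no glob expansion)
print(shlex.split(unquoted)[1])  # [^a-zA-Z0-9_./-s]
print(shlex.split(quoted)[1])    # [^a-zA-Z0-9_\.\/\-\s]
```

Without quotes every backslash is stripped, so `\s` reaches egrep as a plain `s`; with single quotes the pattern arrives unchanged.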

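Also, this pattern checks for much more than UTF-8 validity: it rejects every character outside the listed set, including valid UTF-8. To find only the names that are not valid UTF-8, it is easier to try decoding them. If you have Python 2.x available, for example: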

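import os.path

def walk(dir):
    # Recursively yield every path below dir (a directory's contents come before the directory itself)
    for child in os.listdir(dir):
        child = os.path.join(dir, child)
        if os.path.isdir(child):
            for descendant in walk(child):
                yield descendant
        yield child

for path in walk('.'):
    try:
        # In Python 2 the path is a byte string; decoding raises if it is not valid UTF-8
        u = unicode(path, 'utf-8')
    except UnicodeError:
        # print the path, or attempt to rename the file here
        print repr(path)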
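The snippet above relies on Python 2's `unicode`. A rough Python 3 equivalent is sketched below, on the assumption that the file system stores names as raw bytes (as Linux file systems typically do). Passing a bytes path to `os.walk` makes it return names as bytes, which can then be test-decoded. The function name `find_non_utf8` is made up for illustration.

```python
import os

def find_non_utf8(root: bytes = b"."):
    """Yield paths (as bytes) whose final component is not valid UTF-8."""
    # A bytes argument makes os.walk return directory and file names as bytes
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                yield os.path.join(dirpath, name)
```

Something like `for p in find_non_utf8(b"."): print(repr(p))` would then list the offending paths; `repr()` keeps the output printable whatever bytes a name contains.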
+1

For the OP: a similar question on Superuser leads to the same advice as the "use convmv" answer here, namely convmv.

0

Source: https://habr.com/ru/post/1704407/

