Reading bytes from many files

I have this code to check the file type of each file in a directory. It only needs to read the first 4 bytes of each file and compare them against a pattern.

The code looks a bit confusing and is very slow, but I can't find a faster way to do this in Nim.

What am I doing wrong?

  import os

  var
    buf {.noinit.}: array[4, char]

  let out_pat = ['{', '\\', 'r', 't']
  var
    flag = true
    num_read = 0

  var dirname = "/some/path/*"

  for path in walkFiles(dirname):
      num_read = open(path).readChars(buf, 0, 4)
      for i in 0..num_read-1:
        if buf[i] != out_pat[i]:
          flag = false
      if flag:
        echo path
      flag = true

For comparison, here is the Python code, which is 2 times faster:

import glob

def find_rtf(dir_):
    for path in glob.glob(dir_):
        with open(path, 'rb') as f:
            # Compare the first 4 bytes against the RTF signature
            if f.read(4) == b'{\\rt':
                print(path)

find_rtf("/some/path/*")

And the plain command line, which is about 10 times faster than Python but gives some pipe errors when it encounters more than 10^6 files:

time find ./ -type f -print0 | LC_ALL=C xargs -0 -P 6 -n 100 head -c 5 -v| grep "{\\\rt" -B 1
1 answer

On my system (Linux), the Nim version is twice as fast as the Python one, though my test files may simply differ from yours. Which operating system are you on?

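That said, your code never closes the files it opens, and it reports files shorter than 4 bytes as matches. Here is a simpler version: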

import os

const
  out_pat = ['{', '\\', 'r', 't']
  dirname = "/some/path/*"

for path in walkFiles(dirname):
  var buf: array[4, char]
  let file = open(path)
  defer: close(file) # Always close file when it goes out of scope
  discard file.readChars(buf, 0, 4)
  if buf == out_pat:
    echo path
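
The short-file case is easy to overlook, so here is a minimal sketch of it in Python (chosen only because the question already uses Python for comparison). `matches_loose` mirrors the loop in the question, which compares only the bytes that were actually read. `matches_strict` requires the whole 4-byte prefix, which is what `buf == out_pat` does on a freshly zeroed buffer. Both function names and `RTF_MAGIC` are made up for illustration.

```python
RTF_MAGIC = b"{\\rt"  # the 4-byte prefix from the question (name is illustrative)


def matches_loose(data: bytes) -> bool:
    """Mirror the question's loop: compare only the bytes that were read."""
    data = data[:len(RTF_MAGIC)]
    # With a short or empty read there is nothing to mismatch,
    # so the result stays True.
    return all(data[i] == RTF_MAGIC[i] for i in range(len(data)))


def matches_strict(data: bytes) -> bool:
    """Require the full 4-byte prefix, so short reads never match."""
    return data[:len(RTF_MAGIC)] == RTF_MAGIC


if __name__ == "__main__":
    # Simulated reads: empty file, 1-byte file, real RTF header, other file
    for sample in (b"", b"{", b"{\\rtf1", b"abcd"):
        print(sample, matches_loose(sample), matches_strict(sample))
```

The two checks disagree only for reads shorter than 4 bytes, which is exactly the case the original loop gets wrong.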

Also, make sure you compile in release mode: nim -d:release c foobar.nim.

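Finally, note that your shell pipeline runs 6 processes in parallel. For a like-for-like comparison with the single-process Nim program, change -P 6 to -P 1.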
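
The `-P 6` in the shell pipeline means that up to six `head` processes run at once, while the Nim and Python programs check one file at a time. As a rough sketch of what a parallel version of the Python check could look like, the code below uses a thread pool. This swaps in threads where `xargs` uses processes, on the assumption that the work is mostly waiting on I/O. The names `is_rtf` and `find_rtf_parallel` and the default of 6 workers are illustrative, and no speed-up is implied.

```python
import glob
from concurrent.futures import ThreadPoolExecutor

RTF_MAGIC = b"{\\rt"  # the 4-byte prefix from the question


def is_rtf(path):
    """Return True if the file at `path` starts with the RTF prefix."""
    try:
        with open(path, "rb") as f:
            return f.read(len(RTF_MAGIC)) == RTF_MAGIC
    except OSError:
        # Directories and unreadable files are treated as non-matches.
        return False


def find_rtf_parallel(pattern, workers=6):
    """Check all paths matching `pattern` using a pool of worker threads."""
    paths = glob.glob(pattern)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(is_rtf, paths))  # results keep the order of `paths`
    return [p for p, ok in zip(paths, flags) if ok]


if __name__ == "__main__":
    for path in find_rtf_parallel("/some/path/*"):
        print(path)
```

Setting `workers=1` gives the sequential behaviour, which makes it easy to compare both modes on the same directory.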


Source: https://habr.com/ru/post/1621371/