Speed up a bash script that uses multiple find commands

I have a bash script that adds some project files to git and then syncs the branch. As the number of files has grown, I have noticed that the script has become much slower, so I want to find out whether I am doing this correctly.

This is the script section in which the files are added:

    echo "Adding files..."
    find . -name '*.js' -exec git add {} \;
    find . -name '*.html' -exec git add {} \;
    find . -name '*.css' -exec git add {} \;
    find . -name '*.py' -exec git add {} \;
    find . -name '*.txt' -exec git add {} \;
    find . -name '*.jpg' -exec git add {} \;
    find . -name '*.sh' -exec git add {} \;
    echo "Commit"
    git commit -m "'$1'"

I'm not sure whether a single find call would be faster than all these separate commands, but I wrote it this way so that it is easy to remove file types or add new ones.

I would greatly appreciate any suggestions for making this more efficient. Using these commands differently or using different commands altogether is a perfectly acceptable answer.

+4
5 answers
    find . \( -name '*.js' -o \
              -name '*.html' -o \
              -name '*.css' -o \
              -name '*.py' -o \
              -name '*.txt' -o \
              -name '*.jpg' -o \
              -name '*.sh' \) -exec git add {} +

This means that the directory tree is scanned only once, which is the main way to speed up "multiple finds": you replace "multiple" with "one". The + terminator is a POSIX 2008 addition to find that makes -exec behave more like xargs, passing many file names to each invocation. If + is not available, consider using -print and xargs (or, if file names may contain spaces and you have GNU find and xargs, -print0 and xargs -0; although if you have those, you probably also have the + notation).
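The -print0 and xargs -0 fallback mentioned above can be sketched as follows. This is only an illustration under assumptions: the scratch directory and file names are made up, and printf stands in for `git add --` so that the sketch runs without a repository.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree; one file name deliberately contains spaces.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
touch "$demo/a.js" "$demo/sub dir/my page.html" "$demo/notes.md"

# -print0 ends each name with a NUL byte and xargs -0 splits on NUL, so the
# spaces in "sub dir/my page.html" do not break the name into pieces.
# printf stands in for `git add --`; it prints one line per argument received.
count=$(find "$demo" \( -name '*.js' -o -name '*.html' \) -print0 |
    xargs -0 printf '%s\n' | wc -l)

echo "arguments received: $((count))"
rm -rf "$demo"
```

In the real script, the xargs stage would presumably read `xargs -0 git add --` instead.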

+4

Since git add accepts multiple files in a single command, the easiest fix is to use the + terminator for -exec:

 find . -name '*.js' -exec git add {} \+ 

This collects a large number of file names and passes them all to the command on a single command line.

So what gets executed is:

    git add a.js b.js c.js d.js

instead of

    git add a.js
    git add b.js
    git add c.js
    git add d.js

If you process hundreds or thousands of files, this will significantly affect the execution time.
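To make the difference in process launches visible, here is a small sketch that needs no repository. The four files are hypothetical, and `sh -c 'echo run'` is a stand-in for git add, so each line of output corresponds to one launched command.

```shell
#!/usr/bin/env bash
# Scratch directory with four hypothetical files.
demo=$(mktemp -d)
touch "$demo"/{a,b,c,d}.js

# With \; the command is launched once per matched file.
per_file=$(find "$demo" -name '*.js' -exec sh -c 'echo run' sh {} \; | wc -l)

# With + the matched names are batched into as few launches as possible.
batched=$(find "$demo" -name '*.js' -exec sh -c 'echo run' sh {} + | wc -l)

echo "launches with \\;: $((per_file)), launches with +: $((batched))"
rm -rf "$demo"
```

With only four short names, everything fits into one batch; with thousands of files, find splits the list into several batches as the command-line length limit requires.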

To combine all the file patterns into a single find, use find's OR operator, -o:

    find . \( -name '*.js' -o \
              -name '*.html' -o \
              -name '*.css' -o \
              -name '*.py' -o \
              -name '*.txt' -o \
              -name '*.jpg' -o \
              -name '*.sh' \) -exec git add {} +

The backslashes in \( and \) are needed to protect the parentheses from their special meaning in the shell. You can use quotes instead: '(' and ')'.
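Since the question asks for something that stays easy to edit, one possible way to keep a single find while still listing the file types separately is to build the -o expression from a bash array. This is a sketch only: the array names `exts` and `name_tests` and the scratch files are hypothetical, and printf stands in for `git add --`.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree so the sketch runs outside a repository.
demo=$(mktemp -d)
mkdir -p "$demo/src"
touch "$demo/index.html" "$demo/src/app.js" "$demo/src/notes.md"

# Edit this list to add or drop file types.
exts=(js html css py txt jpg sh)

# Build: -name '*.js' -o -name '*.html' -o ... as separate array elements.
name_tests=()
for ext in "${exts[@]}"; do
    if [ "${#name_tests[@]}" -gt 0 ]; then
        name_tests+=(-o)
    fi
    name_tests+=(-name "*.$ext")
done

# One traversal; in the real script the action would be: -exec git add -- {} +
found=$(find "$demo" \( "${name_tests[@]}" \) -exec printf '%s\n' {} + | sort)

printf '%s\n' "$found"
rm -rf "$demo"
```

Quoting `"${name_tests[@]}"` keeps each pattern as a single argument, so the shell never expands `*.js` before find sees it.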

find has some fairly complex options, and it takes a little effort to learn them, but over the years I have saved a lot of work by writing one elaborate find expression instead of struggling to filter file names through grep, awk, and the like.

One of my favorite patterns for searching through a Maven/Subversion Java project while ignoring uninteresting files:

 find . \( \( \( -iname .svn -o -iname target -o -iname classes \) -type d -prune -false \) -o \( <your filter expression> \) \) -exec grep -li xxx {} + 
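A reduced, runnable variant of this pruning pattern might look like the sketch below. The project layout and the `-name '*.java'` filter are hypothetical. It also uses -name rather than -iname and omits -false, which should not be needed here because the explicit -exec suppresses find's implicit -print.

```shell
#!/usr/bin/env bash
# Hypothetical layout: target/ holds build output that should be skipped.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/target"
echo 'needle xxx' > "$demo/src/Keep.java"
echo 'nothing here' > "$demo/src/Other.java"
echo 'needle xxx' > "$demo/target/Built.java"

# -prune stops find from descending into the matched directories; the
# right-hand side of -o is then evaluated only for everything else.
hits=$(find "$demo" \( -name .svn -o -name target -o -name classes \) -type d -prune \
    -o -name '*.java' -exec grep -li xxx {} +)

echo "$hits"
rm -rf "$demo"
```

Only the file under src/ is reported, because target/ is never entered.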
+6

If you

  • have bash 4
  • only search by name (not by other criteria)

You can also use this:

    shopt -s globstar
    git add **/*.{js,html,css,py,txt,jpg,sh}


Notes:
  • Brace expansion is performed before filename expansion (globbing), so this is equivalent to writing

     git add **/*.js **/*.html etc... 
  • globstar enables recursive filename expansion with the ** pattern.
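The two notes above can be checked with a short sketch. It assumes bash 4 or later, uses a hypothetical scratch tree, and collects the matches into an array rather than calling git add. The nullglob option is an addition not in the answer: without it, a pattern that matches nothing (here `**/*.css`) is passed on literally.

```shell
#!/usr/bin/env bash
# globstar: ** matches across directory levels.
# nullglob: patterns with no match expand to nothing.
shopt -s globstar nullglob

# Hypothetical scratch tree.
demo=$(mktemp -d)
mkdir -p "$demo/src/lib"
touch "$demo/top.js" "$demo/src/app.js" "$demo/src/lib/util.py" "$demo/readme.md"

cd "$demo" || exit 1
# Brace expansion happens first: **/*.js **/*.py **/*.css
files=( **/*.{js,py,css} )
cd - > /dev/null || exit 1

printf '%s\n' "${files[@]}"
rm -rf "$demo"
```

In the real script, the array would be passed on as `git add -- "${files[@]}"`, ideally after checking that it is not empty.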

+2

The git add command can do this by itself, without any other shell scripting.

 git add -- '*.js' '*.html' '*.css' ... 
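A minimal sketch of this approach in a throwaway repository (the directory and file names are hypothetical) shows that the quoted patterns are matched by git itself, including files in subdirectories:

```shell
#!/usr/bin/env bash
# Throwaway repository with hypothetical files at two depths.
demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/src"
touch "$demo/index.html" "$demo/src/app.js" "$demo/notes.md"

# The quotes stop the shell from expanding the patterns, so git receives
# them as pathspecs and matches them against paths at any depth.
git -C "$demo" add -- '*.js' '*.html'

staged=$(git -C "$demo" ls-files | sort)
printf '%s\n' "$staged"
rm -rf "$demo"
```

One caveat, as far as I know: if one of the patterns matches no file at all, git add stops with a "pathspec ... did not match any files" error, so the list should only contain types that actually occur in the project.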
+2

It could be faster:

    F='\.js$|\.html$|\.css$|\.py$|\.txt$|\.jpg$|\.sh$'
    find . | grep -E "$F" | xargs git add

or some variations of it if you expect spaces or other special characters in file names.

+1

Source: https://habr.com/ru/post/1445159/

