Match lines from a file based on line numbers in another file

I have two files: one contains addresses (line numbers) and the other contains data, for example:

address file:

 2
 4
 6
 7
 1
 3
 5

data file:

 1.000451451
 2.000589214
 3.117892278
 4.479511994
 5.484514874
 6.784499874
 7.021239396

I want to reorder the lines of the data file according to the line numbers in the address file, so that I get:

 2.000589214
 4.479511994
 6.784499874
 7.021239396
 1.000451451
 3.117892278
 5.484514874

I want to do this in either Python or Bash, but I haven't found a solution yet.

3 answers

With awk:

 awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt 
  • NR==FNR {a[NR]=$0; next} creates an associative array a whose keys are the record (line) numbers and whose values are the whole records. This applies only to the first file ( NR==FNR ), which is data.txt . next makes awk skip to the next line without processing the record any further.

  • {print a[$0]} prints the array value whose key is the content of the current line of the second file ( addr.txt ), that is, the requested line number.
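
For readers less familiar with awk, the two-pass logic described above can be sketched in Python. This is only an illustration under assumptions: the helper name `reorder_like_awk` and the in-memory lists are hypothetical and not part of the answer. Like awk, it yields an empty string for an address that has no matching record.

```python
def reorder_like_awk(data_lines, addr_lines):
    """Mirror awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' (hypothetical helper)."""
    # First pass (NR==FNR): key each record by its 1-based record number.
    # awk array subscripts are strings, so the keys are strings here too.
    a = {str(nr): record for nr, record in enumerate(data_lines, start=1)}
    # Second pass: each address line is used as a key into the array.
    # awk returns an empty string for a missing key, so .get() does the same.
    return [a.get(addr, "") for addr in addr_lines]


data = ["1.000451451", "2.000589214", "3.117892278", "4.479511994",
        "5.484514874", "6.784499874", "7.021239396"]
addr = ["2", "4", "6", "7", "1", "3", "5"]
print("\n".join(reorder_like_awk(data, addr)))
```

Because the lookup is driven by the address list, the output follows the order of the addresses rather than the order of the data.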

Example:

 % cat addr.txt
 2
 4
 6
 7
 1
 3
 5
 % cat data.txt
 1.000451451
 2.000589214
 3.117892278
 4.479511994
 5.484514874
 6.784499874
 7.021239396
 % awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt
 2.000589214
 4.479511994
 6.784499874
 7.021239396
 1.000451451
 3.117892278
 5.484514874

If you don't mind sed, we can use process substitution to achieve this easily:

 sed -nf <(sed 's/$/p/' addr.txt) data.txt 
  • -n suppresses default printing
  • -f makes sed read commands from process substitution <(...)
  • <(sed 's/$/p/' addr.txt) creates sed commands to print based on line numbers in addr.txt
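
To make the generated script concrete, here is a rough Python sketch of what the two sed invocations do with the example files. The helper names `build_sed_script` and `run_print_script` are made up for illustration, and the emulation only covers scripts that consist of `Np` commands. It visits the data lines in file order, as sed does, so matching lines come out in the order they appear in data.txt.

```python
def build_sed_script(addresses):
    """Mimic sed 's/$/p/': append 'p' to every address line (hypothetical helper)."""
    return [f"{addr}p" for addr in addresses]


def run_print_script(script, data_lines):
    """Mimic sed -n for a script made only of 'Np' commands (hypothetical helper).

    The data lines are visited once, in order; a line is emitted once for
    each command whose line number matches it.
    """
    wanted = [int(cmd[:-1]) for cmd in script]  # strip the trailing 'p'
    output = []
    for lineno, line in enumerate(data_lines, start=1):
        output.extend(line for n in wanted if n == lineno)
    return output


addresses = ["2", "4", "6", "7", "1", "3", "5"]
data = ["1.000451451", "2.000589214", "3.117892278", "4.479511994",
        "5.484514874", "6.784499874", "7.021239396"]

script = build_sed_script(addresses)
print(script)
print("\n".join(run_print_script(script, data)))
```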

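Note that sed reads data.txt sequentially and applies every command to each line, so the selected lines are printed in the order they occur in data.txt, not in the order listed in addr.txt. This approach therefore selects lines but does not reorder them; use the awk or Python solutions when the order matters. For the example files, where every line is selected, the output is:

 1.000451451
 2.000589214
 3.117892278
 4.479511994
 5.484514874
 6.784499874
 7.021239396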

You can also do this in Python, as in this example:

 with open("address_file", "r") as f1, open("data_file", "r") as f2:
     data1 = f1.read().splitlines()  # requested line numbers
     data2 = f2.read().splitlines()  # data lines
     for k in data1:
         # Skip addresses that are not integers or are out of range
         try:
             print(data2[int(k) - 1])
         except (ValueError, IndexError):
             pass

Edit: As suggested by @heemayl, here is another solution using only one list:

 with open("file1", "r") as f1, open("file2", "r") as f2:
     data = f2.read().splitlines()  # file2 holds the data lines
     for k in f1.read().splitlines():  # file1 holds the line numbers
         print(data[int(k) - 1])

Both output:

 2.000589214
 4.479511994
 6.784499874
 7.021239396
 1.000451451
 3.117892278
 5.484514874

Source: https://habr.com/ru/post/1268690/

