Match lines from a file based on line numbers in another file

Question

I have two files: one contains addresses (line numbers) and the other contains data, for example:

address file:

    2
    4
    6
    7
    1
    3
    5

data file:

    1.000451451
    2.000589214
    3.117892278
    4.479511994
    5.484514874
    6.784499874
    7.021239396

I want to reorder the data file according to the line numbers in the address file, so that I get:

    2.000589214
    4.479511994
    6.784499874
    7.021239396
    1.000451451
    3.117892278
    5.484514874

I want to do this in either Python or Bash, but I have not found a solution yet.

+5

python bash awk

hassan Jun 09 '17 at 4:05

3 answers

If you don't mind sed, process substitution (a Bash feature) makes it easy to select the lines:

 sed -nf <(sed 's/$/p/' addr.txt) data.txt

-n suppresses default printing
-f makes sed read commands from process substitution <(...)
<(sed 's/$/p/' addr.txt) creates sed commands to print based on line numbers in addr.txt

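Note that sed reads data.txt sequentially and applies every generated command to each line, so the selected lines come out in the order they appear in data.txt, not in the order given in addr.txt. For the sample files, where every line is selected, the output is:

    1.000451451
    2.000589214
    3.117892278
    4.479511994
    5.484514874
    6.784499874
    7.021239396

If the order from addr.txt must be preserved, use the awk or Python answers.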
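To see how a line-by-line filter treats a list of `Np` commands, here is a minimal Python sketch that mimics the behaviour. It is an illustration only, not how sed is implemented, and the helper name `sed_like_print` is made up for this example.

```python
def sed_like_print(data_lines, addresses):
    """Mimic `sed -n` run with one `Np` command per address (hypothetical helper)."""
    wanted = [int(a) for a in addresses]
    out = []
    # The data is visited once, top to bottom, like sed's input cycle.
    for lineno, line in enumerate(data_lines, start=1):
        # Every generated command is tested against the current line.
        for n in wanted:
            if n == lineno:
                out.append(line)
    return out


data = ["1.000451451", "2.000589214", "3.117892278"]
print(sed_like_print(data, ["2", "3", "1"]))
# → ['1.000451451', '2.000589214', '3.117892278']
```

The order of the addresses has no influence on the order of the result, because the outer loop follows the data.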

+3

codeforester Jun 09 '17 at 4:42

You can also do this in Python, as in this example:

    with open("address_file", "r") as f1, open("data_file", "r") as f2:
        data1 = f1.read().splitlines()  # line numbers, one per line
        data2 = f2.read().splitlines()  # data lines
        for k in data1:
            # Skip addresses that are not numbers or are past the end of the data
            try:
                print(data2[int(k) - 1])
            except (ValueError, IndexError):
                pass

Edit: As suggested by @heemayl, here is another solution using only one list:

    with open("file1", "r") as f1, open("file2", "r") as f2:
        data = f2.read().splitlines()
        for k in f1.read().splitlines():
            print(data[int(k) - 1])

Both output:

    2.000589214
    4.479511994
    6.784499874
    7.021239396
    1.000451451
    3.117892278
    5.484514874
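One detail of the `int(k) - 1` indexing is that an address of 0 becomes index -1, which Python treats as the last element instead of raising an error. The following sketch wraps the same idea in a function with an explicit range check. The function name `reorder_lines` and the choice to raise `IndexError` are assumptions made for this example, not part of the answer above.

```python
def reorder_lines(data_lines, addresses):
    """Return data_lines picked in the order given by 1-based addresses."""
    result = []
    for raw in addresses:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines in the address file
        n = int(raw)  # raises ValueError for non-numeric addresses
        if not 1 <= n <= len(data_lines):
            raise IndexError(f"line number {n} is outside 1..{len(data_lines)}")
        result.append(data_lines[n - 1])
    return result


data = ["1.000451451", "2.000589214", "3.117892278"]
print("\n".join(reorder_lines(data, ["2", "3", "1"])))
```

With real files, the two lists could come from `f.read().splitlines()` as in the snippets above.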

0

Chiheb Nexus Jun 09 '17 at 4:35

heemayl · Accepted Answer · 2017-06-09T04:13:06+0000

With awk:

 awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt

NR==FNR {a[NR]=$0; next} builds an associative array a whose keys are the record (line) numbers and whose values are the whole records. This applies only to the first file, data.txt, because NR==FNR holds only while the first file is being read. next makes awk skip to the next line without processing the record any further.
{print a[$0]} runs for the second file, addr.txt, and prints the array value whose key is the content of the current line, that is, the line number listed there.
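The same two-pass idea can be sketched in Python with a dictionary in place of the awk array. This is only an illustration of the logic described above; the function name `awk_style_lookup` is hypothetical.

```python
def awk_style_lookup(data_lines, addr_lines):
    """Mirror the awk one-liner: index the data by line number, then look up."""
    # First pass (the NR==FNR block): store each record under its 1-based
    # line number. awk array keys are strings, so string keys are used here.
    table = {str(n): line for n, line in enumerate(data_lines, start=1)}
    # Second pass: each address line is used as a key. awk prints an empty
    # line for a key that was never stored, so "" is the default here.
    return [table.get(addr, "") for addr in addr_lines]


data = ["1.000451451", "2.000589214", "3.117892278"]
for line in awk_style_lookup(data, ["2", "3", "1"]):
    print(line)
```

Like the awk version, this keeps the whole data file in memory and reads the address list in order, so the output follows the order of the addresses.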

Example:

    % cat addr.txt
    2
    4
    6
    7
    1
    3
    5

    % cat data.txt
    1.000451451
    2.000589214
    3.117892278
    4.479511994
    5.484514874
    6.784499874
    7.021239396

    % awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt
    2.000589214
    4.479511994
    6.784499874
    7.021239396
    1.000451451
    3.117892278
    5.484514874
