I have a large .txt file (300 GB, to be precise), and I would like to write every unique value from the first column that matches my pattern to another .txt file.
awk '{print $1}' file_name | grep -o '/ns/.*' | awk '!seen[$0]++' > test1.txt
This is what I tried, and as far as I can see, it works fine, but the problem is that after a while I get the following error:
awk: program limit exceeded: maximum number of fields size=32767 FILENAME="file_name" FNR=117897124 NR=117897124
Any suggestions?
The error message tells you:
line 117897124 has too many fields (> 32767).
You had better check that line:
sed -n '117897124{p;q}' file_name
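To see how wide such a line is without involving awk, one option is to count its words with wc. This is a minimal sketch on a tiny sample file; the helper name fields_on_line is hypothetical, and the sample data is invented for illustration.

```shell
# Tiny stand-in for the real file: line 2 has more fields than the others.
sample=$(mktemp)
printf 'a b c\n/ns/x 1 2 3 4 5 6 7\nd e\n' > "$sample"

# Hypothetical helper: count the whitespace-separated words on line $2 of file $1.
# sed quits right after printing that line, so the rest of the file is not read.
fields_on_line() {
    sed -n "${2}{p;q;}" "$1" | wc -w | tr -d ' '
}

count=$(fields_on_line "$sample" 2)
echo "line 2 has $count fields"
rm -f "$sample"
```

For the real file, the call would be fields_on_line file_name 117897124.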
Use cut to retrieve the 1st column:
cut -d ' ' -f 1 < file_name | ...
Replace ' ' with whatever your field separator is. The default delimiter of cut is a tab, $'\t'.
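As an illustration, the pipeline from the question with cut standing in for the first awk might look like the sketch below. The sample lines are invented, and a single space is assumed as the delimiter.

```shell
# Invented sample data: the first column is space-delimited.
sample=$(mktemp)
printf '%s\n' \
    'http://e.org/ns/a x y' \
    'http://e.org/other z' \
    'http://e.org/ns/b q' \
    'http://e.org/ns/a w' > "$sample"

# cut extracts only the first field, grep keeps the part starting at /ns/,
# and awk drops duplicates while preserving first-seen order.
result=$(cut -d ' ' -f 1 < "$sample" | grep -o '/ns/.*' | awk '!seen[$0]++')
printf '%s\n' "$result"
rm -f "$sample"
```

The remaining awk only ever sees the short strings that grep emits, so the field limit should not come into play there.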
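The "number of fields" is the number of "columns" in the input file, so if one of the lines is really long, it can trigger this error.
I suspect the awk and grep steps can be combined into one: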
sed -n 's/\(^pattern...\).*/\1/p' some_file | awk '!seen[$0]++' > test1.txt
This may sidestep the awk error (sed does not split lines into fields, so, as far as I know, it has no such limit).
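It seems that your awk implementation ran into a limit at record 117,897,124. Such limits vary between implementations.
A sensible approach may be to write a custom script that uses split to break the large file into smaller ones of no more than 100,000,000 lines each.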
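A rough sketch of that splitting approach follows. It uses a five-line sample and chunks of 2 lines so it can be run as-is; the chunk_ prefix and file names are assumptions, and the column is extracted with cut rather than awk because the reported limit concerns fields per line.

```shell
workdir=$(mktemp -d)
printf '%s\n' '/ns/a 1' '/ns/b 2' 'x 3' '/ns/a 4' '/ns/c 5' > "$workdir/big.txt"

# Split into chunks of 2 lines each (for the real file: -l 100000000).
split -l 2 "$workdir/big.txt" "$workdir/chunk_"

# Process each chunk separately, then de-duplicate across all chunks.
for chunk in "$workdir"/chunk_*; do
    cut -d ' ' -f 1 < "$chunk" | grep -o '/ns/.*'
done | awk '!seen[$0]++' > "$workdir/test1.txt"

chunks=$(ls "$workdir"/chunk_* | wc -l | tr -d ' ')
result=$(cat "$workdir/test1.txt")
printf '%s chunks\n%s\n' "$chunks" "$result"
rm -rf "$workdir"
```

Splitting a 300 GB file needs roughly the same amount of free disk space again, which is worth checking first.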
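Alternatively, if you do not want to split the file, you could look for the limits setting of your awk implementation. You may be able to set the value to unlimited, though that is probably not a good idea, since it could consume a lot of resources...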
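If you have enough free disk space (because it creates a temp .swp file), I suggest using Vim. Vim regex differs slightly from standard regex, but you can convert a regex to Vim syntax with this tool: http://thewebminer.com/regex-to-vim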
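The error message says that your input contains too many fields for your awk implementation. Just set the field separator to be the same as the record separator, so that every line has only 1 field and the problem goes away, then merge the rest of the commands into one: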
awk 'BEGIN{FS=RS} {sub(/[[:space:]].*/,"")} /\/ns\// && !seen[$0]++' file_name
If that runs out of memory, try:
awk 'BEGIN{FS=RS} {sub(/[[:space:]].*/,"")} /\/ns\//' file_name | sort -u
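To make the difference between the two variants concrete, here is a small sketch on invented sample data. It assumes an awk that supports POSIX character classes such as [[:space:]], and it forces LC_ALL=C on sort so the order is predictable.

```shell
sample=$(mktemp)
printf '%s\n' 'b/ns/2 x' 'a/ns/1 y' 'zzz q' 'b/ns/2 w' > "$sample"

# Variant 1: awk remembers every distinct value in memory and
# prints each one the first time it appears (input order is kept).
in_memory=$(awk 'BEGIN{FS=RS} {sub(/[[:space:]].*/,"")} /\/ns\// && !seen[$0]++' "$sample")

# Variant 2: awk only filters; sort -u removes the duplicates
# and emits the values in sorted order.
sorted=$(awk 'BEGIN{FS=RS} {sub(/[[:space:]].*/,"")} /\/ns\//' "$sample" | LC_ALL=C sort -u)

printf 'variant 1:\n%s\nvariant 2:\n%s\n' "$in_memory" "$sorted"
rm -f "$sample"
```

Both variants print the whole first column of each matching line, whereas grep -o '/ns/.*' in the question prints only the part starting at /ns/.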
Source: https://habr.com/ru/post/1542492/