Getting an output error with command head in shell script

Question

I'm getting unexpected output when using the head command. There are two files of size 48 bytes in the directory, but only one of them is printed along with the directory path, instead of both. Why is head -n 2 not picking up the first two files? Is there another way to do this?

My code:

find "$dir" -type f -printf '%s %p\n' | sort -n -r | head -n 2|
             {
              read -r file dir
              printf "size: %d\n\t%s\n" "$file" "$dir"
              }

My actual output:

size: 48
      testdir/file7.txt

The testdir directory contains two files with the same size (48 bytes), but only one of them is printed along with the directory path, instead of both.

My desired output:

size: 48
      testdir/file7.txt
      testdir/file1.txt

PM 2Ring · Accepted Answer · 2015-05-21 12:45:04Z

There's no error with head -n 2; you can check that by removing the | and the subsequent code.

The problem is that the code between the braces is only executed once - it's not a loop. And read only reads data from a single line of input. So you need to make some sort of loop to print data for multiple files.
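As a rough sketch of what such a loop could look like in plain shell (the function name print_by_size is made up for illustration, and the canned printf input just stands in for the output of your find | sort | head pipeline):

```shell
# Hypothetical helper: reads "size path" lines from stdin and prints each
# distinct size once, followed by every path that has that size.
print_by_size() {
    prev=
    while read -r size path; do
        # Print the size header only when the size changes.
        if [ "$size" != "$prev" ]; then
            printf 'size: %d\n' "$size"
            prev=$size
        fi
        printf '\t%s\n' "$path"
    done
}

# Canned input standing in for: find "$dir" -type f -printf '%s %p\n' | sort -n -r | head -n 2
printf '%s\n' '48 testdir/file7.txt' '48 testdir/file1.txt' | print_by_size
```

Because path is the last variable given to read, it receives the rest of the line, so a pathname containing spaces stays in one piece.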

You could use a while loop, or you could take advantage of the built-in loop of awk to read & print the data. E.g., the awk command below only prints the size info if the size of the current file is different to the size of the previous file.

awk 'BEGIN{size=-1}; {if($1!=size){size=$1; printf "size: %d\n", size}; printf "\t%s\n", $2}'

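We don't strictly need to initialize size explicitly, since awk treats an unset variable as the empty string (or as 0 in a numeric context), but it's nice to be explicit about these things, IMHO. Starting from -1 also guarantees that the size line still gets printed if the very first file happens to be empty, i.e. has size 0.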

That awk command replaces the

{
    read -r file dir
    printf "size: %d\n\t%s\n" "$file" "$dir"
}

section of your code. In other words, you can use

find "$dir" -type f -printf '%s %p\n' |
sort -n -r | head -n 2 |
awk 'BEGIN{size=-1}; 
{if($1!=size){size=$1; printf "size: %d\n", size}; 
printf "\t%s\n", $2}'

You can put it all on one line, or split it over multiple lines. It's also possible to put the awk program into its own file, but there's no need to do that for such a tiny program.

Note that you can make the -n option to head as large as you like, and the awk program will behave as expected. Also note that awk is very fast - it's much more efficient than using read and printf.
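To see that grouping behaviour on more than two lines, here's a small sketch that feeds the same awk program some canned input instead of real find output (the group_by_size wrapper and the file names are made up for illustration):

```shell
# Hypothetical wrapper around the awk program from this answer.
group_by_size() {
    awk 'BEGIN{size=-1}
    {if($1!=size){size=$1; printf "size: %d\n", size}
     printf "\t%s\n", $2}'
}

# Four lines, already sorted by size in descending order, as sort -n -r would leave them.
printf '%s\n' '48 a.txt' '48 b.txt' '20 c.txt' '7 d.txt' | group_by_size
```

Each distinct size gets one "size:" line, with the matching pathnames listed underneath it.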

FWIW, awk code for simple text processing is often significantly faster than equivalent Python code, so even though many consider awk to be antiquated it's still quite popular.

To print the data for only the largest file(s) in a directory, you can do this:

find . -type f -printf '%s %p\n' | 
sort -nr | 
awk 'NR==1{size=$1;printf "size: %d\n", size};
$1!=size{exit};
{printf "\t%s\n", $2}'

The NR==1 says to execute the following block (the stuff inside the {}) only when the Number of the Record equals 1 - a record is just a line. So we get the size of the first file, which is the largest file (thanks to the preceding sort command), save it in the size variable, and then print the size.

$1!=size{exit} says to exit the program as soon as we read a line where the data in the 1st field doesn't match what we've saved in the size variable.

The last block {printf "\t%s\n", $2} prints the pathname for each file.
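Here's a quick way to watch those three rules in action without touching the filesystem. It's only a sketch: the largest_only wrapper and the canned input lines are made up for illustration.

```shell
# Hypothetical wrapper around the three-rule awk program described above.
largest_only() {
    awk 'NR==1{size=$1; printf "size: %d\n", size}
    $1!=size{exit}
    {printf "\t%s\n", $2}'
}

# Input is already in descending size order; the 20-byte line triggers the exit rule.
printf '%s\n' '48 testdir/file7.txt' '48 testdir/file1.txt' '20 testdir/file3.txt' | largest_only
```

Only the two 48-byte files are listed; awk stops reading as soon as it reaches the line with a different size.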

There are various ways to print both the largest and smallest files found by the find command. One way would be to read all the data into awk, storing it in an array, sort the array, and then print the data for the files of maximum & minimum size. But I'm going to adopt a simpler strategy here, and recycle my existing code. To make it easier to reuse, I'll put the awk program into a file. Save this file to a directory in your command PATH and give it execute permission.

field1match.awk

#!/usr/bin/awk -f

# print only the records whose 1st field matches that of the 1st record
# Written by PM 2Ring 2015.05.21

NR==1{size=$1; printf "size: %d\n", size}
$1!=size{exit}
{printf "\t%s\n", $2}

And here's the command line which uses tee to duplicate the output from find and then sort it and print it using process substitution:

find "$dir" -type f -printf '%s %p\n' | 
tee > >(sort -n | field1match.awk) >(sort -rn | field1match.awk)
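Note that process substitution is a feature of bash (and ksh and zsh), and the two substituted commands run concurrently, so the order in which the two blocks appear isn't guaranteed. If you need a fixed order, one alternative (a different technique from the tee command above, not a drop-in equivalent) is to save the listing to a temporary file and process it twice. This is only a sketch: it assumes mktemp is available, and the function names are made up for illustration.

```shell
# Hypothetical stand-in for field1match.awk, so the sketch is self-contained.
first_field_match() {
    awk 'NR==1{size=$1; printf "size: %d\n", size}
    $1!=size{exit}
    {printf "\t%s\n", $2}'
}

# Hypothetical helper: reads "size path" lines from stdin, then prints the
# largest file(s) first and the smallest file(s) second.
largest_and_smallest() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                               # save the listing so it can be read twice
    sort -rn "$tmp" | first_field_match        # largest first
    sort -n "$tmp" | first_field_match         # smallest first
    rm -f "$tmp"
}

# Canned input standing in for: find "$dir" -type f -printf '%s %p\n'
printf '%s\n' '20 c.txt' '48 a.txt' '7 d.txt' | largest_and_smallest
```

The trade-off is a temporary file and two sequential sorts in exchange for predictable output order.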

so you mean to say instead of the whole code of mine i have to you use yours. — buddha sreekanth, May 20 at 12:01
@buddhasreekanth: My awk code just replaces your read... printf block. I've updated my answer. — PM 2Ring, May 20 at 12:14
Your code working perfectly. could you please explain your code briefly.I dont want to use head command. If i do so, how come it will gives automatically file size. — buddha sreekanth, May 20 at 12:28
@buddhasreekanth: If you don't want to use the head command why is it in the command line you originally posted??? head simply prints the first n lines of a file or stdin (with n being 10 by default, unless you change that using the -n option). The file size and pathname data are being generated by the find command, specifically the %s and %p formatting commands in find's -printf option. sort -n -r sorts that data line by line, in reverse numeric order. And then head controls how many lines of that data are printed. (to be continued). — PM 2Ring, May 20 at 12:42
@buddhasreekanth: To understand how awk works, please see the awk man page. But briefly, awk reads input & performs specified actions on it. By default, it operates line by line. awk can automatically split the words in the input line into variables with numeric names. $0 is the whole input line, $1 is the 1st word, $2 is the 2nd word, etc. It can also use named variables. So my awk code tests the 1st word on each input line (which contains the file size) & if it's different to the 1st word from the previous line, it gets printed. And then the pathname gets printed. — PM 2Ring, May 20 at 12:48

