Using the xargs Command
Ed Schaefer
Many UNIX professionals think the xargs command, which constructs
and executes argument lists, is useful only for processing long lists
of files generated by the find command. While xargs dutifully
serves this purpose, it has other uses. In this article, I describe
xargs and the historical "Too many arguments" problem,
and present eight xargs "one-liners":
- Find the unique owners of all the files in a directory.
- Echo each file name to standard output as the file is deleted.
- Duplicate the current directory structure to another directory.
- Group the output of multiple UNIX commands on one line.
- Display to standard output the contents of a file one word
per line.
- Prompt the user whether to remove each file individually.
- Concatenate the contents of the files whose names are listed
in a file into another file.
- Move all files from one directory to another directory, echoing
each move as it happens.
Examining the "Too Many Arguments" Problem
In the early days of UNIX/Xenix, it was easy to overflow the command-line
buffer, causing a "Too many arguments" failure. Finding
a large number of files and passing them to another command was enough
to cause the failure. The following command, from Unix
Power Tools, first edition (O'Reilly & Associates):
pr -n `find . -type f -mtime -1 -print`|lpr
will potentially overflow the command line given enough files. This
command passes a list of all the files modified in the last day to pr,
and pipes pr's output to the printer. We can solve this problem with xargs:
find . -type f -mtime -1 -print|xargs pr -n |lp
With no options, xargs reads names from standard input and appends
them as arguments to the command it runs, passing only as many per
invocation as fit in the command-line buffer. Thus, if needed, xargs
runs pr -n several times, and the combined output is piped to lp.
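The batching behavior can be observed with a small sketch. The ten sample arguments and the -n 4 limit are illustrative assumptions, chosen only to make the batches visible; left to itself, xargs sizes each batch from the system's argument-length limit.

```shell
# Feed ten sample arguments to echo, at most four per invocation.
# Each line of output corresponds to one separate run of echo.
batches=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | xargs -n 4 echo)
printf '%s\n' "$batches"
```

Three lines appear because xargs had to start echo three times to consume all ten arguments.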
While xargs keeps the command line from overflowing, the command
that xargs runs may still fail. I've witnessed the following
command fail with an "argument list too long" error that came from
mv itself, not from the command-line buffer:
find ./ -type f -print | xargs -i mv -f {} ./newdir
Limit the number of files sent to mv at a time by using the xargs
-l option. (The xargs -i {} syntax is explained later in
the article.) The following command sets a limit of 56 files at a
time for mv:
find ./ -type f -print | xargs -l56 -i mv -f {} ./newdir
Modern UNIX systems seem to have solved the problem of the find
command overflowing the command-line buffer. However, using find's
-exec option is still troublesome. It's better to do this:
# remove all files with a txt extension
find . -type f -name "*.txt" -print|xargs rm
than this:
find . -type f -name "*.txt" -exec rm {} \; -print
Controlling the call to rm with xargs is more efficient
than having the find command execute rm for each object
found.
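The efficiency difference can be made visible by substituting echo for rm and counting how many times it runs. This is only a sketch: the scratch directory and the three file names are hypothetical, and it assumes the temporary path contains no white space.

```shell
# Scratch directory with three sample files (names are illustrative).
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt"

# -exec ... \; starts one process per file, so echo prints one line per file.
per_file=$(find "$workdir" -type f -name "*.txt" -exec echo run: {} \; | wc -l | tr -d ' ')

# xargs packs the names into as few invocations as possible.
batched=$(find "$workdir" -type f -name "*.txt" -print | xargs echo run: | wc -l | tr -d ' ')

echo "find -exec invocations: $per_file, xargs invocations: $batched"
rm -rf "$workdir"
```

With three files, find -exec runs the command three times while xargs runs it once.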
xargs One-Liners
The find-xargs command combination is a powerful tool.
The following example finds the unique owners of all the files in
the /bin directory:
find /bin -type f -follow | xargs ls -al | awk ' NF==9 { print $3 }'|sort -u
If /bin is a soft link, as it is with Solaris, the -follow
option forces find to follow the link. The xargs command
feeds the ls -al command, which pipes to awk. For each line of
ls -al output that has 9 fields, awk prints field 3, the file
owner. Piping the awk output to sort -u sorts it and removes
duplicates, leaving the unique owners.
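The same pipeline can be tried safely on a scratch directory. In this sketch the directory and file names are hypothetical, and LC_ALL=C is an added assumption so that ls prints the date in three fields and each line has the nine fields awk expects.

```shell
# Three sample files, all created by (and therefore owned by) the current user.
workdir=$(mktemp -d)
touch "$workdir/a" "$workdir/b" "$workdir/c"

# Field 3 of a nine-field "ls -al" line is the owner; sort -u removes duplicates.
owners=$(find "$workdir" -type f -print | LC_ALL=C xargs ls -al |
  awk 'NF==9 { print $3 }' | sort -u)

printf '%s\n' "$owners"
rm -rf "$workdir"
```

File names containing white space would change the field count and be skipped by the NF==9 test.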
You can use xargs options to build extremely powerful commands.
Expanding the xargs/rm example, let's assume the requirement
is to echo each file name to standard output as the file is deleted:
find . -type f -name "*.txt" | xargs -i ksh -c "echo deleting {}; rm {}"
The xargs -i option replaces each instance of {} in the command
with the current input line. Here the command is ksh -c, whose
script runs echo and rm.
Alternatively, instead of using the -i option with {},
use the xargs -I option, which takes the replacement string as its
argument. The above command can be written as:
find . -type f -name "*.txt" | xargs -I {} ksh -c "echo deleting {}; rm {}"
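A variation on this one-liner, sketched under a few assumptions: sh stands in for ksh, the scratch directory and file names are hypothetical, and the file name is handed to the script as $1 instead of being spliced into the script text, so that shell metacharacters in a name are not interpreted.

```shell
workdir=$(mktemp -d)
touch "$workdir/one.txt" "$workdir/two.txt" "$workdir/keep.log"

# xargs replaces the standalone {} argument with each name; inside the
# script that name is available as $1 (the word "sh" before it fills $0).
find "$workdir" -type f -name "*.txt" -print |
  xargs -I {} sh -c 'echo "deleting $1"; rm "$1"' sh {}

remaining=$(ls "$workdir")
echo "left behind: $remaining"
rm -rf "$workdir"
```

Only the two .txt files are removed; keep.log does not match the find pattern and is left alone.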
The third edition of Unix Power Tools by Powers et al.
provides an xargs "one-liner" that duplicates a directory
tree. The following command creates, in the /usr/project directory,
a copy of the current working directory structure:
find . -type d -print|sed 's@^@/usr/project/@'|xargs mkdir
The /usr/project directory must exist. When the command runs, it
reports the error:
mkdir: Failed to make directory "/usr/project/"; File exists
This error doesn't prevent the directory structure from being
created, so ignore it. To learn how the above command works, you can
read more in Unix Power Tools, third edition, Chapter 9.17
(O'Reilly & Associates).
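The technique can be tried without touching /usr/project. In this sketch, src and dest are hypothetical scratch directories, and mkdir -p is an addition to the book's one-liner: it accepts directories that already exist, so the "File exists" message does not appear. The sed expression assumes the destination path contains no @ or & characters.

```shell
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/a/b" "$src/c"

# Run find inside the source tree so it prints relative paths
# (., ./a, ./a/b, ./c), then prefix each with the destination.
(cd "$src" && find . -type d -print | sed "s@^@$dest/@" | xargs mkdir -p)

# List the directories that now exist under the destination.
copied=$(cd "$dest" && find . -type d -print | sort)
printf '%s\n' "$copied"
rm -rf "$src" "$dest"
```

Only the directory structure is copied; any files in the source tree are ignored because find is restricted to -type d.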
In addition to serving the find command, xargs can serve
other commands. Suppose the requirement is to group the
output of UNIX commands on one line. Executing:
logname; date
displays the logname and date on two separate lines. Placing commands
in parentheses and piping to xargs places the output of both commands
on one line:
(logname; date)|xargs
Executing the following command places all the file names in the current
directory on one line, and redirects to file "file.ls":
ls |xargs echo > file.ls
Use the xargs number of arguments option, -n, to display
the contents of "file.ls" to standard output, one name per
line:
cat file.ls|xargs -n1 # from Unix in a Nutshell
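Joining and splitting are two sides of the same behavior, as this sketch shows. The two printf commands are stand-ins for logname and date so that the output is predictable; the sample strings are assumptions.

```shell
# Two commands produce two lines; xargs (running its default command,
# echo) joins the words onto a single line.
joined=$( (printf 'alice\n'; printf 'Mon Jan 1\n') | xargs )
echo "$joined"

# -n1 reverses the effect: echo runs once per word, one word per line.
split=$(echo "$joined" | xargs -n1)
printf '%s\n' "$split"
```

Note that -n1 splits on white space, so it yields one word per line, not one original input line per line.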
In the current directory, use the xargs -p option to prompt
the user to remove each file individually:
ls|xargs -p -n1 rm
Without the -n option, the user is prompted only once, to delete all
the files in the current directory with a single rm command.
Concatenate the contents of all the files whose names are contained
in file into file.contents:
xargs cat < file > file.contents
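A self-contained sketch of this one-liner follows. The scratch directory and the part1 and part2 files are hypothetical; absolute paths are stored in the list so the example works from any directory.

```shell
workdir=$(mktemp -d)
printf 'first\n'  > "$workdir/part1"
printf 'second\n' > "$workdir/part2"

# "file" holds the names of the files to concatenate, one per line.
printf '%s\n' "$workdir/part1" "$workdir/part2" > "$workdir/file"

# xargs turns the names read from "file" into arguments for cat.
xargs cat < "$workdir/file" > "$workdir/file.contents"

result=$(cat "$workdir/file.contents")
printf '%s\n' "$result"
rm -rf "$workdir"
```

The files are concatenated in the order their names appear in the list.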
Move all files from directory $1 to directory $2, and use the
xargs -t option to echo each mv command to standard error as it runs:
ls $1 | xargs -I {} -t mv $1/{} $2/{}
The xargs -I option replaces each {} in the command
with each name piped to xargs.
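The move can be rehearsed on scratch directories. In this sketch, src and dst are hypothetical stand-ins for $1 and $2, and the 2>&1 redirection is added because xargs -t writes its trace to standard error.

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/x" "$src/y"

# Capture the trace that -t prints: one "mv ..." line per file moved.
trace=$(ls "$src" | xargs -I {} -t mv "$src/{}" "$dst/{}" 2>&1)
printf '%s\n' "$trace"

moved=$(ls "$dst")
rm -rf "$src" "$dst"
```

Because -I runs the command once per input line, each file gets its own mv and its own trace line.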
Conclusion
When should you use xargs? When the output of a command supplies the
command-line arguments of another command, use xargs in conjunction
with pipes. When the output of a command is the standard input of
another command, use pipes alone.
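The distinction can be seen by sending the same file name to wc both ways. This is a sketch; the scratch directory and the greeting file are hypothetical.

```shell
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/greeting"

# Plain pipe: the name is wc's input, so wc counts the characters of the name.
as_input=$(echo "$workdir/greeting" | wc -c | tr -d ' ')

# xargs: the name becomes wc's argument, so wc counts the file's contents.
as_argument=$(echo "$workdir/greeting" | xargs wc -c | awk '{ print $1 }')

echo "piped as input: $as_input, passed as argument: $as_argument"
rm -rf "$workdir"
```

The second count is 6, the size of "hello" plus its newline; the first depends on the length of the temporary path.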
References
Powers, Shelley, Peek, Jerry, et al. 2003. Unix Power Tools.
Sebastopol, CA: O'Reilly & Associates.
Robbins, Arnold. 1999. Unix in a Nutshell. Sebastopol,
CA: O'Reilly & Associates.
Ed Schaefer is a frequent contributor to Sys Admin. He is a
software developer and DBA for Intel's Factory Integrated Information
Systems, FIIS, in Aloha, Oregon. Ed also hosts the monthly Shell
Corner column on UnixReview.com. He can be reached at: olded@ix.netcom.com.