Last week I mentioned the | (pipeline) operator and gave a quick example of
how to use it: ls -lS | less.
That takes the output of the ls command and sends it to the less command as
input. This is what pipelines, or “pipes”, are for: they let you take the
output of one program and send it to another program for more processing. You
can do this over and over again, for example:
$ ls -la | tr -s " " | cut -d " " -f 3,9 | sort
If we take this piece by piece, we can see how a pipeline works.
The first step, ls -la, produces a listing of the current working directory:
$ ls -la
total 260
drwxr-xr-x  4 gabe  gabe     512 May  4 12:14 .
drwxr-xr-x  3 root  wheel    512 Oct 25  2015 ..
-rw-r--r--  1 gabe  gabe      87 Aug 16  2015 .Xdefaults
-rw-r--r--  1 gabe  gabe     773 Aug 16  2015 .cshrc
-rw-r--r--  1 gabe  gabe     103 Aug 16  2015 .cvsrc
-rw-r--r--  1 gabe  gabe     398 Aug 16  2015 .login
-rw-r--r--  1 gabe  gabe     175 Aug 16  2015 .mailrc
-rw-r--r--  1 gabe  gabe     218 Aug 16  2015 .profile
drwx------  2 gabe  gabe     512 Oct 25  2015 .ssh
-rw-r--r--  1 gabe  gabe   10014 May  4 12:14 dmesg.txt
drwxr-xr-x  2 gabe  gabe     512 Dec  5  2015 etc
-rw-r--r--  1 gabe  gabe   11914 Dec 16  2015 index.html
-rw-------  1 gabe  gabe    5267 Oct 25  2015 mbox
-rw-r--r--  1 gabe  gabe   82300 May  3 13:27 pxeboot
Normally the output of a command is displayed in the terminal for me to see.
If I want to do something with that output other than just look at it, I can
“pipe” it to another program using the | operator. In this case, I pipe it to
the tr -s " " command.
The tr command “translates” its input and then outputs the translated result.
In this case, the -s flag tells tr to “squeeze” repeated instances of the
following character down to a single one. I’m telling tr to collapse each run
of spaces in the output of the ls command into one space, so that now my output
looks like this:
$ ls -la | tr -s " "
total 260
drwxr-xr-x 4 gabe gabe 512 May 4 12:14 .
drwxr-xr-x 3 root wheel 512 Oct 25 2015 ..
-rw-r--r-- 1 gabe gabe 87 Aug 16 2015 .Xdefaults
-rw-r--r-- 1 gabe gabe 773 Aug 16 2015 .cshrc
-rw-r--r-- 1 gabe gabe 103 Aug 16 2015 .cvsrc
-rw-r--r-- 1 gabe gabe 398 Aug 16 2015 .login
-rw-r--r-- 1 gabe gabe 175 Aug 16 2015 .mailrc
-rw-r--r-- 1 gabe gabe 218 Aug 16 2015 .profile
drwx------ 2 gabe gabe 512 Oct 25 2015 .ssh
-rw-r--r-- 1 gabe gabe 10014 May 4 12:14 dmesg.txt
drwxr-xr-x 2 gabe gabe 512 Dec 5 2015 etc
-rw-r--r-- 1 gabe gabe 11914 Dec 16 2015 index.html
-rw------- 1 gabe gabe 5267 Oct 25 2015 mbox
-rw-r--r-- 1 gabe gabe 82300 May 3 13:27 pxeboot
Notice that every run of spaces in the original output is now a single space.
This is useful because the next command, cut, lets me split my input into
columns and pick only the columns I care about. cut -d " " -f 3,9 says: take
the input I am passing you, cut it into columns at every space (-d " "), then
select fields 3 and 9 and discard the rest (-f 3,9). So, at this point what
I’m left with is:
$ ls -la | tr -s " " | cut -d " " -f 3,9
gabe .
root ..
gabe .Xdefaults
gabe .cshrc
gabe .cvsrc
gabe .login
gabe .mailrc
gabe .profile
gabe .ssh
gabe dmesg.txt
gabe etc
gabe index.html
gabe mbox
gabe pxeboot
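To see why the tr -s " " step matters before cut, here is a minimal sketch. The row below is a made-up stand-in for one line of ls -la output; it is built with printf so the column padding is explicit. The variable names are only for illustration.

```shell
# Build one padded row the way ls pads its columns (hypothetical sample data).
line=$(printf '%-10s %2s %-5s %-5s %6s %s %2s %5s %s' \
    '-rw-r--r--' 1 gabe gabe 87 Aug 16 2015 .Xdefaults)

# Without squeezing, cut treats every single space as a delimiter, so each
# run of spaces produces empty fields and the field numbers shift.
unsqueezed=$(printf '%s\n' "$line" | cut -d " " -f 3,9)

# After squeezing, fields 3 and 9 are the owner and the file name.
squeezed=$(printf '%s\n' "$line" | tr -s " " | cut -d " " -f 3,9)

printf 'without tr -s: [%s]\n' "$unsqueezed"
printf 'with tr -s:    [%s]\n' "$squeezed"
```

The brackets in the output make any empty or shifted fields easy to spot.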
Finally, I’d like to sort that list, so I pass it along to the sort program.
By default sort compares whole lines, so the rows end up ordered by the first
column (the owner), with ties broken by the file name that follows. You can
tell it to do many other types of sorting by passing different options. So, now
our output looks like this:
$ ls -la | tr -s " " | cut -d " " -f 3,9 | sort
gabe .
gabe .Xdefaults
gabe .cshrc
gabe .cvsrc
gabe .login
gabe .mailrc
gabe .profile
gabe .ssh
gabe dmesg.txt
gabe etc
gabe index.html
gabe mbox
gabe pxeboot
root ..
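As a rough illustration of those other sorting options, the sketch below sorts a few made-up “owner name size” rows by different columns. The rows are hypothetical sample data, not output from the listing above, and LC_ALL=C is set only so the ordering does not depend on the locale.

```shell
# Hypothetical rows: owner, file name, size in bytes.
rows='gabe mbox 5267
root etc 512
gabe pxeboot 82300
gabe dmesg.txt 10014'

# -k 3,3n: sort on the third field only, comparing it as a number.
by_size=$(printf '%s\n' "$rows" | LC_ALL=C sort -k 3,3n)

# Adding r to the key reverses the order, so the largest file comes first.
by_size_desc=$(printf '%s\n' "$rows" | LC_ALL=C sort -k 3,3nr)

# -k 2,2: sort on the second field (the file name) as text.
by_name=$(printf '%s\n' "$rows" | LC_ALL=C sort -k 2,2)

printf 'smallest first:\n%s\n\n' "$by_size"
printf 'largest first:\n%s\n\n' "$by_size_desc"
printf 'by name:\n%s\n' "$by_name"
```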
Building a pipeline is often a matter of trial and error: you have some input and you would like some output, so you start using the various commands you know to massage the input until it looks like the output you want. The nice thing is that you can see the output at every step along the way, so you know whether you are on track. Pipelines are often used to transform data and are especially useful for analyzing log files, dealing with CSV (comma-separated values) data, or converting data from one format to another.
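Here is a small sketch of that step-by-step approach applied to CSV data. The CSV rows and column meanings (user, action, bytes) are invented for the example. Each variable holds the output of the pipeline one stage further along, so you can inspect every stage on its own.

```shell
# Hypothetical CSV input: user,action,bytes
csv='gabe,login,120
root,login,80
gabe,upload,4096
root,logout,75'

# Stage 1: keep only columns 1 and 3, splitting on commas.
step1=$(printf '%s\n' "$csv" | cut -d "," -f 1,3)

# Stage 2: turn the remaining commas into spaces.
step2=$(printf '%s\n' "$csv" | cut -d "," -f 1,3 | tr "," " ")

# Stage 3: sort numerically on the second column (the byte count).
step3=$(printf '%s\n' "$csv" | cut -d "," -f 1,3 | tr "," " " | LC_ALL=C sort -k 2,2n)

printf 'after cut:\n%s\n\n' "$step1"
printf 'after tr:\n%s\n\n' "$step2"
printf 'after sort:\n%s\n' "$step3"
```

Checking the intermediate results this way is the same thing as running the pipeline in the terminal and adding one command at a time.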
New Terms
- tr - A program to “translate” characters in the given input. For example,
to translate every lowercase a to an uppercase A, you would
simply write:
ls -al | tr a A
- cut - Used to “cut” out selected columns from a given input. If I only want
column 1 and column 3 of space-separated input, I could say:
ls -la | tr -s " " | cut -d " " -f 1,3
- sort - The sort command is used to sort a given input by line.
- pipe / pipeline - Indicated by the | (vertical bar) operator, it allows you to pass the output of one command as the input to another command.
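The terms above can be tried out on a small fixed input instead of a live directory listing, which makes the results repeatable. The sample rows below are made up for the example, and LC_ALL=C is there only to keep the sort order independent of the locale.

```shell
# Hypothetical input: owner, group, file name, separated by single spaces.
sample='gabe staff mbox
root wheel etc
gabe staff index.html'

# tr: replace every lowercase a with an uppercase A.
upper_a=$(printf '%s\n' "$sample" | tr a A)

# cut: keep only columns 1 and 3.
columns=$(printf '%s\n' "$sample" | cut -d " " -f 1,3)

# sort: order the lines, comparing each whole line as text.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort)

printf 'tr a A:\n%s\n\n' "$upper_a"
printf 'cut -f 1,3:\n%s\n\n' "$columns"
printf 'sort:\n%s\n' "$sorted"
```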