Wednesday, September 28, 2016

find but exclude directory and exclude user

find . ! -path "./path/folder/*" ! -user kenmsipe -ls

Finds all files that are not located under ./path/folder/ and are not owned by the user kenmsipe, listing each match with -ls. (A -name "*" test matches everything, so it adds nothing; also, because find . builds names beginning with ./, the -path pattern must begin with ./ to match.)
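A sketch of the same exclusion pattern using a throwaway directory tree, so it can be tried safely (the /tmp/finddemo paths and file names are made up for illustration; the original command also adds ! -user kenmsipe to filter on ownership):

```shell
# Build a tiny tree with one directory to exclude
mkdir -p /tmp/finddemo/keep /tmp/finddemo/skipme
touch /tmp/finddemo/keep/a.txt /tmp/finddemo/skipme/b.txt

# ! -path excludes everything under skipme/; -type f limits output to files
find /tmp/finddemo ! -path '/tmp/finddemo/skipme/*' -type f
# prints: /tmp/finddemo/keep/a.txt
```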

Thursday, September 22, 2016

fsck options






-y     For some filesystem-specific checkers, the -y option will cause the fs-specific fsck to always attempt to fix any detected filesystem corruption automatically. Sometimes an expert may be able to do better driving the fsck manually. Note that not all filesystem-specific checkers implement this option. In particular, fsck.minix(8) and fsck.cramfs(8) do not support the -y option as of this writing.

-a     Automatically repair the filesystem without any questions (use this option with caution). Note that e2fsck(8) supports -a for backward compatibility only. This option is mapped to e2fsck's -p option, which is safe to use, unlike the -a option that some filesystem checkers support.

-r     Interactively repair the filesystem (ask for confirmations). Note: It is generally a bad idea to use this option if multiple fsck's are being run in parallel. Also note that this is e2fsck's default behavior; it supports this option for backward compatibility reasons only.

-n     For some filesystem-specific checkers, the -n option will cause the fs-specific fsck to avoid attempting to repair any problems, but simply report such problems to stdout. This is however not true for all filesystem-specific checkers. In particular, fsck.reiserfs(8) will not report any corruption if given this option. fsck.minix(8) does not support the -n option at all.





Monday, September 19, 2016

System V–type system: init process

On a System V–type system, the init process does the following as it gets the system up and running:

Runs the /etc/rc.d/rc.sysinit script to prepare the system

Processes /etc/inittab to determine the appropriate runlevel and scripts

Runs the scripts in the appropriate runlevel directory in /etc/rc.d/

Runs the /etc/rc.d/rc.local script

BSD-type system: init process

On a BSD-type system, the init process completes the following tasks as it initializes the system:


Runs the /etc/init.d/boot script to prepare the system

Processes /etc/inittab to determine the appropriate runlevel and scripts

Runs the scripts in the appropriate runlevel directory in /etc/init.d

Runs the /etc/init.d/boot.local script

Shell commands you can use to manage kernel modules

You can use the following shell commands to manage kernel modules:

lsmod  Views loaded kernel modules.

modinfo  Views module information.

depmod  Builds a module dependency list.

insmod  Installs a kernel module but doesn’t factor in module dependencies.

modprobe  Installs or removes a kernel module while taking module dependencies into account.

rmmod  Removes a kernel module but doesn’t factor in module dependencies.

Command-line tools to view information about the hardware

You can also use the following command-line tools to view information about the hardware in your system:

hdparm /dev/device

sg_scan

sginfo -l

hwinfo

lshw

lsusb

lspci

/sys/ directory

In addition to /proc, the /sys/ directory also provides information about the hardware installed in the system. The top level of the /sys directory contains many subdirectories, including:

/sys/block

/sys/bus

/sys/class

/sys/devices

/sys/module

The /proc directory

Within the /proc directory, you can view hardware information in the following files and directories:


cpuinfo

devices

dma

interrupts

iomem

modules

version

/scsi/

/bus/devices

/ide/
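A couple of quick, read-only peeks at /proc to tie the list above together (this assumes a Linux system; the files shown are standard /proc entries):

```shell
# The running kernel's version string
cat /proc/version

# Count logical CPUs by counting "processor" stanzas in cpuinfo
grep -c '^processor' /proc/cpuinfo
```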

Sunday, September 11, 2016

Systemctl

https://www.freedesktop.org/wiki/Software/systemd/FrequentlyAskedQuestions/


Frequently Asked Questions

Q: How do I change the current runlevel?
A: In systemd runlevels are exposed via "target units". You can change them like this:
# systemctl isolate runlevel5.target

Note however, that the concept of runlevels is a bit out of date, and it is usually nicer to use modern names for this. e.g.:
# systemctl isolate graphical.target

This will only change the current runlevel, and has no effect on the next boot.

Q: How do I change the default runlevel to boot into?


A: The symlink /etc/systemd/system/default.target controls where we boot into by default. Link it to the target unit of your choice. For example, like this:
# ln -sf /usr/lib/systemd/system/multi-user.target /etc/systemd/system/default.target

or
# ln -sf /usr/lib/systemd/system/graphical.target /etc/systemd/system/default.target

Q: How do I figure out the current runlevel?


A: Note that there might be more than one target active at the same time. So the question regarding the runlevel might not always make sense. Here's how you would figure out all targets that are currently active:
$ systemctl list-units --type=target

If you are just interested in a single number, you can use the venerable runlevel command, but again, its output might be misleading.

Q: I want to change a service file, but rpm keeps overwriting it in /usr/lib/systemd/system all the time, how should I handle this?
A: The recommended way is to copy the service file from /usr/lib/systemd/system to /etc/systemd/system and edit it there. The latter directory takes precedence over the former, and rpm will never overwrite it. If you want to use the distributed service file again you can simply delete (or rename) the service file in /etc/systemd/system again.

Q: My service foobar.service as distributed by my operating system vendor is only started when (a connection comes in or some hardware is plugged in). I want to have it started always on boot, too. What should I do?


A: Simply place a symlink from that service file in the multi-user.target.wants/ directory (which is where you should symlink everything you want to run in the old runlevel 3, i.e. the normal boot-up without graphical UI. It is pulled in by graphical.target too, so will be started for graphical boot-ups, too):
# ln -sf /usr/lib/systemd/system/foobar.service /etc/systemd/system/multi-user.target.wants/foobar.service
# systemctl daemon-reload

Q: I want to enable another getty, how would I do that?


A: Simply instantiate a new getty service for the port of your choice (internally, this places another symlink for instantiating another serial getty in the getty.target.wants/ directory).
# systemctl enable serial-getty@ttyS2.service
# systemctl start serial-getty@ttyS2.service

Note that gettys on the virtual console are started on demand. You can control how many you get via the NAutoVTs= setting in logind.conf(7). Also see this blog story.

Q: How do I figure out which service a process belongs to?
A: You may either use ps for that:
$ alias psc='ps xawf -eo pid,user,cgroup,args'
$ psc
...

Or you can even check /proc/$PID/cgroup directly. Also see this blog story.

Q: Why don't you use inotify to reload the unit files automatically on change?

A: Unfortunately that would be a racy operation. For an explanation why and how we tried to improve the situation, see the bugzilla report about this.

Q: I have a native systemd service file and a SysV init script installed which share the same basename, e.g. /usr/lib/systemd/system/foobar.service vs. /etc/init.d/foobar -- which one wins?

A: If both files are available the native unit file always takes precedence and the SysV init script is ignored, regardless whether either is enabled or disabled. Note that a SysV service that is enabled but overridden by a native service does not have the effect that the native service would be enabled, too. Enabling of native and SysV services is completely independent. Or in other words: you cannot enable a native service by enabling a SysV service by the same name, and if a SysV service is enabled but the respective native service is not, this will not have the effect that the SysV script is executed.

Q: How can I use journalctl to display full (= not truncated) messages even if less is not used?
A: Use:
# journalctl --full

Q: Whenever my service tries to acquire RT scheduling for one of its threads this is refused with EPERM even though my service is running with full privileges. This works fine on my non-systemd system!

A: By default, systemd places all systemd daemons in their own cgroup in the "cpu" hierarchy. Unfortunately, due to a kernel limitation, this has the effect of disallowing RT entirely for the service. See My Service Can't Get Realtime! for a longer discussion and what to do about this.

Q: My service is ordered after network.target but at boot it is still called before the network is up. What's going on?

A: That's a long story, and that's why we have a wiki page of its own about this: Running Services After the Network is up

Q: My systemd system always comes up with /tmp as a tiny tmpfs. How do I get rid of this?
A: That's also a long story, please have a look at API File Systems.

Thursday, September 8, 2016

Processing Text Streams

When you’re processing text streams within a script or when piping output at the shell prompt, there may be times when you need to filter the output of one command so that only certain portions of the text stream are actually passed along to the stdin of the next command. You can use a variety of tools to do this. In the last part of this chapter, we’ll look at using the following commands:

cut
expand and unexpand
fmt
join and paste
nl
od
pr
sed and awk
sort
split
tr
uniq
wc

cut
The cut command is used to print columns or fields that you specify from a file to the standard output. By default, the tab character is used as a delimiter. The following options can be used with cut:

-b list  Select only these bytes.
-c list  Select only these characters.
-d delim  Use the specified character instead of tab for the field delimiter.
-f list  Select only the specified fields. Print any line that contains no delimiter character, unless the -s option is specified.
-s  Do not print lines that do not contain delimiters.

For example, you could use the cut command to display all group names from the /etc/group file. Remember, the name of each group is contained in the first field of each line of the file.
However, the group file uses colons as the delimiter between fields, so you must specify a colon instead of a tab as the delimiter. The command to do this is cut -d: -f1 /etc/group.
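The /etc/group example can be tried safely with a small stand-in file (the /tmp/groupdemo name and its contents are made up for illustration):

```shell
# Two colon-delimited records, like abbreviated /etc/group lines
printf 'root:x:0:\nusers:x:100:\n' > /tmp/groupdemo

# -d: sets the delimiter, -f1 keeps only the first field
cut -d: -f1 /tmp/groupdemo
# prints:
# root
# users
```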

expand and unexpand
The expand command is used to process a text stream, removing all instances of the tab character and replacing them with the specified number of spaces (the default is eight). You can use the -t number option to specify a different number of spaces. The syntax is expand -t number filename.
In Figure 14-3, the tab characters in the tabfile file are replaced with five spaces.

You can also use the unexpand command. The unexpand command works in the opposite manner as the expand command. It converts spaces in a text stream into tab characters. By default, eight contiguous spaces are converted into tabs. However, you can use the -t option to specify a different number of spaces.

It’s important to note that, by default, unexpand will only convert leading spaces at the beginning of each line. To force it to convert all qualifying runs of spaces to tabs, you must include the -a option with the unexpand command.
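A minimal sketch of expand’s tab handling (the sample stream is made up, and -t 4 is chosen arbitrarily):

```shell
# 'a' sits in column 1, so a tab with 4-column tab stops
# advances to column 5, i.e. becomes three spaces
printf 'a\tb\n' | expand -t 4
# prints: a   b
```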


fmt
You can use the fmt command to reformat a text file. It is commonly used to change the wrapping of long lines within the file to a more manageable width. The syntax for using fmt is fmt options filename.

For example, you could use the -w option with the fmt command to narrow the text of a file to 80 columns by entering fmt -w 80 filename.

join and paste
The join command prints a line from each of two specified input files that have identical join fields. The first field is the default join field, delimited by white space. You can specify a different join field using the -j field option.

For example, suppose you have two files. The first file (named firstnames) contains the following content:
1 Mike
2 Jenny
3 Joe
The second file (named lastnames) contains the following content:
1 Johnson
2 Doe
3 Jones
You can use the join command to join the corresponding lines from each file by entering join -j 1 firstnames lastnames. This is shown here:
rtracy@openSUSE:~> join -j 1 firstnames lastnames
1 Mike Johnson
2 Jenny Doe
3 Joe Jones


The paste command works in much the same manner as the join command. It pastes together corresponding lines from one or more files into columns. By default, the tab character is used to separate columns. You can use the -d option to specify a different delimiter character. You can also use the -s option to put the contents of each file into a single line.

For example, you could use the paste command to join the corresponding lines from the firstnames and lastnames files by entering paste firstnames lastnames. An example is shown here:
rtracy@openSUSE:~> paste firstnames lastnames
1 Mike   1 Johnson
2 Jenny  2 Doe
3 Joe    3 Jones


nl
The nl command numbers the lines in a file. When you run the command, the output is written with a line number added to the beginning of each line in the file. The syntax is nl filename.

For example, in the example shown here, the nl command is used to add a number to the beginning of each line in the tabfile.txt file:

rtracy@openSUSE:~> nl tabfile.txt
     1 This file uses tabs.
     2         This line used a tab.
     3         This line used a tab.
     4 After using expand, the tabs will be replaced with spaces.


od
The od (octal dump) command is used to dump a file, including binary files. This utility can
dump a file in several different formats, including octal, decimal, floating point, hex, and character format. The output from od is simple text, so you can use the other stream-processing tools we’ve been looking at to further filter it.

The od command can be very useful. For example, you can perform a dump of a file to locate stray characters in a file. Suppose you created a script file using an editor on a different operating system (such as Windows) and then tried to run it on Linux. Depending on which editor you used, there may be hidden formatting characters within the script text that aren’t displayed by your text editor. However, they will be read by the bash shell when you try to run the script, thus causing errors. When you look at the script in an editor, everything seems fine.

You could use the od command to view a dump of the script to isolate where the problem-causing characters are located in the file. The syntax for using od is od options filename. Some of the more commonly used options include the following:

-b  Octal dump
-d  Decimal dump
-x  Hex dump
-c  Character dump

For example, suppose a “Hello World” script has been created in the LibreOffice word processor and saved as an .odt file. As such, it has a myriad of hidden characters embedded in the text. These characters obviously cannot be viewed from within LibreOffice. However, they can be viewed using the od -c helloworld.odt command.
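A small sketch of the character dump on a stream instead of a file, showing how control characters become visible (the sample text is made up):

```shell
# od -c prints each byte as a character; the newline shows up as \n,
# which is how stray control characters are spotted in a script
printf 'Hi\n' | od -c
```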


pr
The pr command is used to format text files for printing. It formats the file with pagination, headers, and columns. The header contains the date and time, filename, and page number. You can use the following options with pr:

-d  Double-space the output.
-l page_length  Set the page length to the specified number of lines. The default is 66.
-o margin  Offset each line with the specified number of spaces. The default margin is 0.


sed and awk
The sed command is a stream text editor. Unlike the interactive text editors that you’ve already learned how to use in this book, such as vi, a stream editor takes a stream of text as its stdin and then performs operations on it that you specify. Then, sed sends the results to stdout. You can use the following commands with sed:

s  Replaces instances of a specified text string with another text string. The syntax for using the s command is sed s/term1/term2/ (only the first match on each line is replaced; append g, as in sed s/term1/term2/g, to replace every match).

For example, suppose the tux user’s home directory contains a file named lipsum.txt. I can use cat to read lipsum.txt, pipe its stdout to the stdin of the sed command, and specify that the term “ipsum” be replaced with “IPSUM.”

d  Deletes the specified text. For example, to delete every line of text from the stdin that contains the term “eos,” you would enter sed /eos/d.

Remember, sed doesn’t actually modify the source of the information (in this case, the lipsum.txt file). It takes its stdin, makes the changes, and sends the results to stdout. If you want to save the changes made by sed, you need to redirect its stdout to a file using >.
For example, I could redirect the output from the command in Figure 14-8 to a file named lipsum_out.txt by entering cat lipsum.txt | sed s/ipsum/IPSUM/ > lipsum_out.txt at the shell prompt.
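The substitution can be sketched with echo instead of the lipsum.txt file, so it runs anywhere (the sample text is made up):

```shell
# Replace the first occurrence of "ipsum" on each input line
echo 'lorem ipsum dolor' | sed s/ipsum/IPSUM/
# prints: lorem IPSUM dolor
```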


In addition to sed, you can also use awk to manipulate output. Like sed, awk can be used to
receive output from another command as its stdin and manipulate it in a manner you specify.

However, the way awk does this is a little bit different. The awk command treats each line of text it receives as a record. Each word in the line, separated by a space or tab character, is treated as a separate field within the record.

For example, consider the following text file:

Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum

According to awk, this file has seven records because it has seven separate lines of text. Each line of text ends with a newline character, which is the character awk uses by default to define the end of a record. The first record has eight fields, the second record has 11 fields, and so on.

Notice that white space, not punctuation, delimits the fields. Each field is referenced as $field_number. For example, the first field of any record is referenced as $1, the second as $2, and so on.
Using awk, we can specify a field in a specific record and manipulate it in some manner. The syntax for using awk is awk 'pattern {manipulation}'.

For example, we could enter cat lipsum2.txt | awk '{print $1,$2,$3}' to print out the first three words (“fields”) of each line (“records”). Because we didn’t specify a pattern to match on, awk simply prints out the first three words of every line.

You can also include a pattern to specify exactly which records to search on. For example,
suppose we only wanted to display the first three fields of any record that includes the text “do” somewhere in the line. To do this, you add a pattern of /do/ to the command.
You can also add your own text to the output. Just add it to the manipulation part of the command within quotes. In fact, you can also add control characters to output as well. Use the following:

\t  Inserts a tab character
\n  Adds a newline character
\f  Adds a form feed character
\r  Adds a carriage return character

For example, in Figure 14-11, I’ve entered cat lipsum.txt | awk '/do/ {print "Field 1: "$1"\t", "Field 2: "$2"\t", "Field 3: "$3"\t"}' which causes each field to be labeled Field 1, Field 2, and Field 3. It also inserts a tab character between each field. As with sed, awk doesn’t modify the original file. It sends its output to stdout (the screen). If you want to send it to a file, you can redirect it using >.
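A compact sketch of pattern plus field selection (the sample lines are made up; note that the second line matches /do/ because “dolor” contains “do”):

```shell
# Print the first two fields of every record that contains "do"
printf 'sed do eiusmod\nLorem ipsum dolor\n' | awk '/do/ {print $1, $2}'
# prints:
# sed do
# Lorem ipsum
```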


sort
The sort command sorts the lines of a text file alphabetically. The output is written to the standard output. Some commonly used options for the sort command include the following:

-f  Fold lowercase characters to uppercase characters.
-M  Sort by month.
-n  Sort numerically.
-r  Reverse the sort order.

For example, the sort -n -r firstnames command sorts the lines in the firstnames file numerically in reverse order. This is shown here:

rtracy@openSUSE:~> sort -n -r firstnames
3 Joe
2 Jenny
1 Mike

The sort command can be used to sort the output of other commands (such as ps) by piping
the standard output of the first command to the standard input of the sort command.
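A sketch of why -n matters once values reach two digits (the sample lines are made up): a plain alphabetic sort would put 10 before 2, while -n compares numerically.

```shell
printf '10 ten\n2 two\n1 one\n' | sort -n
# prints:
# 1 one
# 2 two
# 10 ten
```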


split
The split command splits an input file into a series of files (without altering the original input file). The default is to split the input file into 1,000-line segments. You can use the -l option to specify a different number of lines per output file.

For example, the split -l 1 firstnames outputfile_ command can be used to split the firstnames file into three separate files, each containing a single line.
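The same idea with a throwaway file (the /tmp/splitdemo and /tmp/part_ names are made up): split appends the suffixes aa, ab, ac, and so on to the prefix you give it.

```shell
printf 'one\ntwo\nthree\n' > /tmp/splitdemo
split -l 1 /tmp/splitdemo /tmp/part_    # one line per output file
cat /tmp/part_ab                        # the second piece
# prints: two
```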


tr
The tr command is used to translate or delete characters. However, be aware that this command does not work with files. To use it with files, you must first use a command such as cat to send the text stream to the standard input of tr. The syntax is

tr options X Y

Some commonly used options for the tr command include the following:

-c  Use all characters not in X.
-d  Delete characters in X; do not translate.
-s  Replace each input sequence of a repeated character that is listed in X with a single occurrence of that character.
-t  First truncate X to the length of Y.

For example, to translate all lowercase characters in the lastnames file to uppercase characters, you could enter cat lastnames | tr a-z A-Z, as shown in this example:

rtracy@openSUSE:~> cat lastnames | tr a-z A-Z
1 JOHNSON
2 DOE
3 JONES

uniq
The uniq command reports or omits repeated lines. The syntax is uniq options input output. You can use the following options with the uniq command:

-d  Only print duplicate lines.
-u  Only print unique lines.

For example, suppose our lastnames file contained duplicate entries:
1 Johnson
1 Johnson
2 Doe
3 Jones

You could use the uniq lastnames command to remove the duplicate lines. This is shown in
the following example:

rtracy@openSUSE:~> uniq lastnames
1 Johnson
2 Doe
3 Jones

Be aware that the uniq command only works if the duplicate lines are adjacent to each other.

If the text stream you need to work with contains duplicate lines that are not adjacent, you can use the sort command to first make them adjacent and then pipe the output to the standard input of uniq.
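The sort-then-uniq pipeline for non-adjacent duplicates can be sketched like this (the sample stream is made up):

```shell
# The duplicate a's and b's are not adjacent, so uniq alone would miss them;
# sort groups them first
printf 'b\na\nb\na\n' | sort | uniq
# prints:
# a
# b
```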


wc
The wc command prints the number of newlines, words, and bytes in a file. The syntax is wc options files. You can use the following options with the wc command:

-c  Print the byte counts.
-m  Print the character counts.
-l  Print the newline counts.
-L  Print the length of the longest line.
-w  Print the word counts.


For example, to print all counts and totals for the firstnames file, you would use the wc firstnames command, as shown in this example:

rtracy@openSUSE:~> wc firstnames
  3 6 21 firstnames





Processing Text Streams

Processing Text Streams
When you’re processing text streams within a script or when piping output at the shell prompt, there may be times when you need to filter the output of one command so that only certain portions of the text stream are actually passed along to the stdin of the next command. You can use a variety of tools to do this. In the last part of this chapter, we’ll look at using the following commands:

cut
expand and unexpand
fmt
join and paste
nl
od
•pr    
sed and awk
sort
split
tr
uniq
wc

cut
The cut command is used to print columns or fields that you specify from a file to the standard output. By default, the tab character is used as a delimiter. The following options can be used with cut:

blist Select only these bytes.
clist Select only these characters.
ddelim Use the specified character instead of tab for the field delimiter.
flist Select only the specified fields. Print any line that contains no delimiter character,
unless the –s option is specified.
s Do not print lines that do not contain delimiters.

For example, you could use the cut command to display all group names from the /etc/group file. Remember, the name of each group is contained in the first field of each line of the file.
However, the group file uses colons as the delimiter between fields, so you must specify a colon instead of a tab as the delimiter. The command to do this is cut –d: –f1 /etc/group
.

expand and unexpand
The expand command is used to process a text stream and remove all instances of the tab character and replace them with the specified number of spaces (the default is eight). You can use the –t number option to specify a different number of spaces. The syntax is
expand –t number filename
.
In Figure 14-3, the tab characters in the tabfile file are replaced with five spaces.

You can also use the unexpand command. The unexpand command works in the opposite manner as the expand command. It converts spaces in a text stream into tab characters. By default, eight contiguous spaces are converted into tabs. However, you can use the –t option to specify a different number of spaces.

It’s important to note that, by default, unexpand will only convert leading spaces at the beginning of each line. To force it to convert all spaces of the correct number to tabs, you must include the –a option with the unexpand command.


fmt
You can use the fmt command to reformat a text file. It is commonly used to change the wrapping of long lines within the file to a more manageable width. The syntax for using fmt is fmt option filename

For example, you could use the –w option with the fmt command to narrow the text of a file
to 80 columns by entering fmt –w 80 filename

join and paste
The join command prints a line from each of two specified input files that have identical join fields. The first field is the default join field, delimited by white space. You can specify a different join field using the –j field option.

For example, suppose you have two files. The first file (named firstnames) contains the following content:
1 Mike
2 Jenny
3 Joe
The second file (named lastnames) contains the following content:
1 Johnson
2 Doe
3 Jones
You can use the join command to join the corresponding lines from each file by entering
join –j 1 firstnames lastnames

. This is shown here:
rtracy@openSUSE:~> join -j 1 firstnames lastnames
1 Mike Johnson
2 Jenny Doe
3 Joe Jones


The paste command works in much the same manner as the join command. It pastes togethe
corresponding lines from one or more files into columns. By default, the tab character is used to
separate columns. You can use the –dn option to specify a different delimiter character. You can also use the –s option to put the contents of each file into a single line.

For example, you could use the paste command to join the corresponding lines from the firstnames and lastnames files by entering

paste firstnames lastnames

. An example is shown here:
rtracy@openSUSE:~> paste firstnames lastnames
1 Mike   1 Johnson
2 Jenny  2 Doe
3 Joe    3 Jones


nl
The nl command determines the number of lines in a file. When you run the command, the
output is written with a line number added to the beginning of each line in the file. The syntax is nl filename

For example, in the example shown here, the nl command is used to add a number to the beginning of each line in the tabfile.txt file:

rtracy@openSUSE:~> nl tabfile.txt
     1 This file uses tabs.
     2         This line used a tab.
     3         This line used a tab.
     4 After using expand, the tabs will be replaced with spaces.


od
The od (octal dump) command is used to dump a file, including binary files. This utility can
dump a file in several different formats, including octal, decimal, floating point, hex, and character format. The output from od is simple text, so you can use the other stream-processing tools we’ve been looking at to further filter it.

The od command can be very useful. For example, you can perform a dump of a file to locate stray characters in a file. Suppose you created a script file using an editor on a different operating system (such as Windows) and then tried to run it on Linux. Depending on which editor you used, there may be hidden formatting characters within the script text that aren’t displayed by your text editor. However, they will be read by the bash shell when you try to run the script, thus causing errors. When you look at the script in an editor, everything seems fine.

You could use the od command to view a dump of the script to isolate where the problem-
causing characters are located in the file. The syntax for using od is od options filename
. Some of the more commonly used options include the following:

–b  Octal dump
–d  Decimal dump
–x  Hex dump
–c  Character dump

For example, “Hello World” script has been created in the LibreOffice word processor and saved as an .odt file. As such, it has a myriad of hidden characters embedded in the text.

These characters obviously cannot be viewed from within LibreOffice. However, they can be viewed using the od –c helloworld.odt command.


pr
The pr command is used to format text files for printing. It formats the file with pagination, headers, and columns. The header contains the date and time, filename, and page number. You can use the following options with pr:

–d  Double-space the output.
–l  page_length  Set the page length to the specified number of lines. The default is 66.
–o margin  Offset each line with the specified number of spaces. The default margin is 0.


sed and awk
The sed command is a stream text editor. Unlike the interactive text editors that you’ve already learned how to use in this book, such as vi, a stream editor takes a stream of text as its stdin and then performs operations on it that you specify. Then, sed sends the results to stdout. You can use the following commands with sed:

s  Replaces instances of a specified text string with another text string. The syntax for
using the s command is sed s/term1/term2/

For example,  I’ve used the cat command to display a file in the tux user’s home directory named lipsum.txt. I then use cat to read lipsum.txt and then pipe the stdout to the stdin of the sed command and specify that the term “ipsum” be replaced with “IPSUM.”

d  Deletes the specified text. For example, to delete every line of text from the stdin that
contains the term “eos,” you would enter sed /eos/d
.
Remember, sed doesn’t actually modify the source of the information—in this case, the lipsum.txt file. It takes its stdin, makes the changes, and sends it to the stdout. If you want to save the changes made by sed, you need to redirect its stdout to a file using >
For example, I could redirect the output from the command in Figure 14-8 to a file named lipsum_out.txt by entering cat lipsum.txt | sed s/ipsum/IPSUM/ > lipsum_out.txt at the shell prompt.


In addition to sed, you can also use awk to manipulate output. Like sed, awk can be used to
receive output from another command as its stdin and manipulate it in a manner you specify.

However, the way awk does this is a little bit different. The awk command treats each line of text it receives as a record. Each word in the line, separated by a space or tab character, is treated as a separate field within the record.

For example, consider the following text file:

Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum

According to awk, this file has seven records because it has seven separate lines of text. Each line of text has a carriage return/linefeed character at the end that creates a new line. This is the character awk uses to define the end of a record. The first record has eight fields, the second record has 11 fields, and so on.

Notice that white space, not punctuation, delimits the fields. Each field is referenced as $field_number.

For example, the first field of any record is referenced as $1, the second as $2, and so on.
Using awk, we can specify a field in a specific record and manipulate it in some manner. The syntax for using awk is awk 'pattern{manipulation}'.

For example, we could enter
cat lipsum2.txt | awk '{print $1,$2,$3}'

to print out the first three words (“fields”) of each line (“records”).
Because we didn’t specify a pattern to match on, awk simply prints out the first three words of every line.

You can also include a pattern to specify exactly which records to search on. For example,
suppose we only wanted to display the first three fields of any record that includes the text “do” somewhere in the line. To do this, you add a pattern of /do/ to the command.
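A quick sketch of both forms, with and without a pattern. The two-line file created here is an assumed sample standing in for lipsum2.txt:

```shell
# A two-line stand-in for lipsum2.txt (sample content assumed)
printf 'Lorem ipsum dolor sit amet\nUt enim ad minim veniam\n' > lipsum2.txt

# No pattern: print the first three fields of every record
awk '{print $1,$2,$3}' lipsum2.txt

# Pattern /do/: only records containing "do" (it matches "dolor" here)
awk '/do/ {print $1,$2,$3}' lipsum2.txt
```

Note that the pattern is an ordinary substring/regex match anywhere in the record, so "do" matches inside "dolor" as well as the standalone word "do".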
You can also add your own text to the output. Just add it to the manipulation part of the command within quotes. In fact, you can also add control characters to output as well. Use the following:

\t  Inserts a tab character
\n  Adds a newline character
\f  Adds a form feed character
\r  Adds a carriage return character

For example, in Figure 14-11, I’ve entered

cat lipsum.txt | awk '/do/ {print "Field 1: "$1"\t", "Field 2: "$2"\t", "Field 3: "$3"\t"}'

which causes each field to be labeled Field 1, Field 2, and Field 3. It also inserts a tab character between each field. As with sed, awk doesn’t modify the original file. It sends its output to stdout (the screen). If you want to send it to a file, you can redirect it using >.
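A self-contained sketch of the same labeling technique, piping an assumed one-line sample directly into awk rather than reading the book’s lipsum.txt:

```shell
# Label each of the first three fields and separate them with tabs;
# the input line is an assumed sample, not the file from the figure
printf 'sed do eiusmod tempor\n' | \
  awk '/do/ {print "Field 1: "$1"\t", "Field 2: "$2"\t", "Field 3: "$3"\t"}'
```

Because print’s arguments are separated by commas, awk also inserts its output field separator (a space by default) between them, so each tab is followed by a space in the output.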


sort
The sort command sorts the lines of a text file alphabetically. The output is written to the standard output. Some commonly used options for the sort command include the following:

-f  Fold lowercase characters to uppercase characters.
-M  Sort by month.
-n  Sort numerically.
-r  Reverse the sort order.

For example, the sort -n -r firstnames command sorts the lines in the firstnames file numerically in reverse order. This is shown here:

rtracy@openSUSE:~> sort -n -r firstnames
3 Joe
2 Jenny
1 Mike

The sort command can be used to sort the output of other commands (such as ps) by piping
the standard output of the first command to the standard input of the sort command.
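Both uses can be sketched as follows. The firstnames file is recreated here with the same three lines shown in the text:

```shell
# Recreate the firstnames sample file from the text
printf '1 Mike\n2 Jenny\n3 Joe\n' > firstnames

# Sort the file numerically in reverse order
sort -n -r firstnames

# Piping works the same way: sort the stdout of another
# command (ps, du, and so on) through sort's stdin
printf '10 beta\n2 alpha\n' | sort -n
```

Note the difference -n makes in the piped example: a plain alphabetic sort would put "10 beta" before "2 alpha", while the numeric sort orders them 2, 10.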


split
The split command splits an input file into a series of files (without altering the original input file). The default is to split the input file into 1,000-line segments. You can use the -l option to specify a different number of lines per segment.

For example, the split -l 1 firstnames outputfile_ command can be used to split the firstnames file into three separate files, each containing a single line.
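The split example above can be sketched as follows; the firstnames file is recreated with the same three lines used in the sort section:

```shell
# Recreate the three-line firstnames sample file
printf '1 Mike\n2 Jenny\n3 Joe\n' > firstnames

# Split into one-line segments named outputfile_aa, outputfile_ab, ...
split -l 1 firstnames outputfile_

ls outputfile_*       # lists the three generated files
cat outputfile_aa     # first one-line segment
```

By default GNU split appends two-letter suffixes (aa, ab, ac, …) to the prefix you supply, so three input lines yield outputfile_aa through outputfile_ac.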


tr
The tr command is used to translate or delete characters. Be aware, however, that tr does not accept filenames as arguments; it reads only from the standard input. To use it with files, you must first use a command such as cat to send the text stream to the standard input of tr. The syntax is

tr options X Y

Some commonly used options for the tr command include the following:

-c  Use all characters not in X.
-d  Delete characters in X; do not translate.
-s  Replace each input sequence of a repeated character that is listed in X with a single occurrence of that character.
-t  First truncate X to the length of Y.

For example, to translate all lowercase characters in the lastnames file to uppercase characters, you could enter cat lastnames | tr a-z A-Z, as shown in this example:

rtracy@openSUSE:~> cat lastnames | tr a-z A-Z
1 JOHNSON
2 DOE
3 JONES
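The -d and -s options listed above can be sketched the same way; the input lines here are assumed samples piped straight into tr:

```shell
# -d deletes every character in the set (here, the digits)
printf '1 Johnson\n' | tr -d '0-9'

# -s squeezes each run of a repeated character down to one occurrence
printf 'aa   bb  cc\n' | tr -s ' '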

uniq
The uniq command reports or omits repeated lines. The syntax is

uniq options input output

You can use the following options with the uniq command:

-d  Only print duplicate lines.
-u  Only print unique lines.

For example, suppose our lastnames file contained duplicate entries:
1 Johnson
1 Johnson
2 Doe
3 Jones

You could use the uniq lastnames command to remove the duplicate lines. This is shown in
the following example:

rtracy@openSUSE:~> uniq lastnames
1 Johnson
2 Doe
3 Jones

Be aware that the uniq command only works if the duplicate lines are adjacent to each other.

If the text stream you need to work with contains duplicate lines that are not adjacent, you can use the sort command to first make them adjacent and then pipe the output to the standard input of uniq.
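The sort-then-uniq technique can be sketched as follows, using an assumed lastnames file whose duplicate lines are deliberately not adjacent:

```shell
# Duplicates that are NOT adjacent: plain uniq would miss them
printf '1 Johnson\n2 Doe\n1 Johnson\n3 Jones\n' > lastnames

uniq lastnames               # still prints four lines; duplicates not adjacent
sort lastnames | uniq        # sorting first makes uniq remove the duplicate
sort lastnames | uniq -d     # show only the line that was duplicated
```

This ordering requirement is why sort | uniq (or, equivalently, sort -u) is the idiomatic way to deduplicate an arbitrary text stream.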


wc
The wc command prints the number of newlines, words, and bytes in a file. The syntax is

wc options files

You can use the following options with the wc command:

-c  Print the byte counts.
-m  Print the character counts.
-l  Print the newline counts.
-L  Print the length of the longest line.
-w  Print the word counts.


For example, to print all counts and totals for the firstnames file, you would use the wc firstnames command, as shown in this example:

rtracy@openSUSE:~> wc firstnames
  3 6 21 firstnames
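The three numbers in that output are the newline, word, and byte counts, which you can also request individually. A sketch, recreating the same three-line firstnames file:

```shell
# Recreate the firstnames sample file
printf '1 Mike\n2 Jenny\n3 Joe\n' > firstnames

wc -l firstnames   # newline count only
wc -w firstnames   # word count only
wc -c firstnames   # byte count only
```

Redirecting the file into wc (wc -l < firstnames) prints the bare number without the filename, which is handy in scripts.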