
grep

Chapter 10
grep (global regular expression print)

• grep is a family of utilities that search an
input file for all lines that match a specified
regular expression and write them to standard
output.
• To write scripts that operate correctly, it is
important that you understand how the grep
utilities work.
• In this set of slides, RE or regexp will stand for
regular expression.

Overview of grep

grep Operation
Repeat on each line of input until the end of input:
• Copy the next input line into a buffer called the
pattern space.
• Apply the regular expression to the pattern space.
• If there is a match, copy the line from the pattern
space to the standard output.
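The three steps above can be sketched as a small shell loop. This is only an illustration, not how grep is actually implemented: mini_grep is a hypothetical name, the shell variable line plays the role of the pattern space, and expr stands in for the regular expression engine (basic REs only).

```shell
# mini_grep RE: copy every stdin line that contains a match for RE to stdout.
mini_grep() {
    re=$1
    # Step 1: copy the next input line into the buffer ("pattern space").
    while IFS= read -r line || [ -n "$line" ]; do
        # Step 2: apply the RE to the buffer. expr anchors its match at the
        # start, so ".*" lets the RE match anywhere; the leading X protects
        # lines that could be mistaken for expr operators.
        if expr "X$line" : "X.*$re" >/dev/null; then
            # Step 3: on a match, copy the buffer to standard output.
            printf '%s\n' "$line"
        fi
    done
}

result=$(printf 'one\nseven\neleven\n' | mini_grep 'ven')
printf '%s\n' "$result"
```

With this sample input, the lines seven and eleven are printed and one is ignored.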

grep Flowchart

grep Example

Example (cont)

Analyzing grep
Note the following about grep:
• grep is a search facility. It searches only for the
existence of a line that contains a match on the
supplied regular expression.
• grep can take only two actions on a line: send it to
standard output if a match occurs, or ignore it if
no match is found.
• Only the regular expression match can be used
to select lines.
• grep is a filter that can be used on the left or the
right of a pipe.
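As a minimal sketch of the last point, the following uses grep on each side of a pipe. The sample file contents and the variable names are made up for illustration.

```shell
tmp=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$tmp"

# grep on the right of a pipe: it filters the output of sort.
right=$(sort -r "$tmp" | grep 'et')

# grep on the left of a pipe: its output feeds tr.
left=$(grep 'ph' "$tmp" | tr 'a-z' 'A-Z')

printf 'right: %s\nleft: %s\n' "$right" "$left"
rm -f "$tmp"
```

In both positions grep does the same job: it passes matching lines along and drops the rest.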

grep Limitations

Although these limitations can be overcome by
using other utilities, grep CANNOT:
• Be used to add, delete, or change a line in the
input file.
• Be used to print only part of a line (some versions,
such as GNU grep, add a -o option that prints only
the matched text).
• Read only part of a file.
• Select a line based on the contents of a previous
or later line. The buffer holds only the current
line.
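One way to picture "overcome by using other utilities" is to combine grep with cut, head, and sed in a pipe. The sample data and variable names below are assumptions chosen for illustration, and other tools would work equally well.

```shell
tmp=$(mktemp)
printf 'id1:apple\nid2:banana\nid3:cherry\n' > "$tmp"

# Print only part of a matching line: cut trims grep's output.
part=$(grep 'banana' "$tmp" | cut -d: -f1)

# Read only part of a file: head limits what grep gets to see.
first_two=$(head -n 2 "$tmp" | grep -c 'id')

# Change a line: sed writes an edited copy, then grep selects from it.
changed=$(sed 's/cherry/cherries/' "$tmp" | grep 'id3')

printf '%s\n%s\n%s\n' "$part" "$first_two" "$changed"
rm -f "$tmp"
```

In each case grep still only selects whole lines. The neighbouring command does the part that grep cannot.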

The grep Family

There are three utilities in the grep family: grep, fgrep, and egrep.

fgrep (fast grep)
Uses ONLY sequence operators in a pattern: plain
strings with no special characters such as wildcards,
character classes, anchors, or groupings.
However, as its name suggests, this utility is the
fastest to execute, because the matching algorithms are
optimized for this restricted set of REs.
Example:
$ fgrep "seven" file1
fgrep won't handle complex patterns such as
$ fgrep "seven | one" file1
Every character is taken literally, so this searches for the
exact string "seven | one" rather than for either word.
It is best with all of the grep commands to enclose the RE
in either double or single quotes to avoid problems.
grep -F is the same as fgrep.
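A short sketch of what "no special characters" means in practice. It uses grep -F, which the slide says is the same as fgrep, and the sample lines are invented for the demonstration.

```shell
tmp=$(mktemp)
printf 'a.c\nabc\nseven | one\nseven\n' > "$tmp"

# Fixed-string matching: the dot is an ordinary character.
literal=$(grep -F 'a.c' "$tmp")

# Regular grep: the dot matches any single character.
regex=$(grep 'a.c' "$tmp")

# The bar and spaces are literal too, so only the exact string matches.
pipe_literal=$(grep -F 'seven | one' "$tmp")

printf '%s\n---\n%s\n---\n%s\n' "$literal" "$regex" "$pipe_literal"
rm -f "$tmp"
```

The first search selects only a.c, the second selects both a.c and abc, and the third selects only the line that contains seven | one verbatim.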
grep – The Original

Handles most regular expression operators.
It is generally slower than egrep, which is
the most powerful of the three.
So why ever use grep?
It is the only utility in the family that
allows saving the result of a match for
later use (a back-reference such as \1).
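The "save" facility can be sketched with basic-RE back-references: \( \) saves what matched and \1 reuses it later in the same pattern. The sample input is made up for illustration.

```shell
# Lines containing any character immediately followed by itself.
doubled=$(printf 'book\ncat\nhello\n' | grep '\(.\)\1')

# Lines in which the saved word UNIX appears again later on.
repeated=$(printf 'UNIX and UNIX\nUNIX and Linux\n' | grep '\(UNIX\).*\1')

printf '%s\n---\n%s\n' "$doubled" "$repeated"
```

Here book and hello are selected by the first pattern, and only UNIX and UNIX by the second.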

grep Family Options
-c prints only a count of the lines matching the pattern
-i ignores upper/lowercase when matching text
-l prints a list of the files that contain at least one line matching the pattern
-n shows the line number before each matching line
-s silent mode: suppresses error messages about missing or unreadable files
-v inverts the output: prints the lines that do not match the pattern
-x prints only lines that match the pattern in their entirety
-f file reads the list of patterns to be matched from file
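The options above can be tried on a small throwaway directory. This is a sketch: the file names a.txt, b.txt, and pats and their contents are assumptions made for the demonstration.

```shell
dir=$(mktemp -d)
printf 'UNIX\nunix rocks\nLinux\n\n' > "$dir/a.txt"
printf 'nothing here\n' > "$dir/b.txt"
printf 'Linux\nrocks\n' > "$dir/pats"

count=$(grep -c 'UNIX' "$dir/a.txt")        # count of matching lines
count_i=$(grep -ci 'unix' "$dir/a.txt")     # same count, ignoring case
numbered=$(grep -n 'Linux' "$dir/a.txt")    # line number before the line
whole=$(grep -x 'UNIX' "$dir/a.txt")        # whole line must match
inverted=$(grep -vc 'UNIX' "$dir/a.txt")    # count of non-matching lines
files=$(cd "$dir" && grep -l 'UNIX' *.txt)  # names of files with a match
from_file=$(grep -f "$dir/pats" "$dir/a.txt")  # patterns read from a file

printf '%s %s %s %s %s %s\n' "$count" "$count_i" "$numbered" "$whole" "$inverted" "$files"
printf '%s\n' "$from_file"
rm -rf "$dir"
```

Options can be combined, as in -ci and -vc above.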
egrep – (extended grep) The Most Powerful

It allows more complex patterns than grep,
although it doesn't have the save option.
grep -E is the same as egrep.
Unfortunately, there are various versions of the
grep family commands, and the differences
between them can be quite confusing (see the
output of man grep).
Many people use egrep by default and switch to
grep only if they need the save option.
fgrep should be used if the RE is very simple, as
it is generally faster than the others.
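Two extended operators that illustrate the "more complex patterns" are alternation (|) and one-or-more (+). The sketch below uses grep -E, which the slide equates with egrep, on made-up input.

```shell
# Alternation: select lines containing either word.
either=$(printf 'one\nseven\nten\n' | grep -E 'seven|one')

# One or more: a, then at least one b, then c.
plus=$(printf 'ac\nabc\nabbc\n' | grep -E 'ab+c')

printf '%s\n---\n%s\n' "$either" "$plus"
```

The first search selects one and seven. The second selects abc and abbc but not ac.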
Examples – testfile is a text file
1. Select the lines that have exactly three
characters
egrep '^...$' testfile
The ... matches 3 characters between the start of
the line, ^, and the end of the line, $.

2. Select the lines from the file that have at least
three characters
egrep '...' testfile
The ... matches any 3 consecutive characters in the line.
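Examples 1 and 2 can be checked against a small sample file. The contents of testfile below are an assumption made for this sketch, and grep -E is used in place of egrep.

```shell
testfile=$(mktemp)
printf 'ab\nabc\nabcd\n\n' > "$testfile"

# Example 1: exactly three characters between ^ and $.
exact=$(grep -E '^...$' "$testfile")

# Example 2: any three characters somewhere in the line.
at_least=$(grep -E '...' "$testfile")

printf '%s\n---\n%s\n' "$exact" "$at_least"
rm -f "$testfile"
```

Only abc has exactly three characters, while both abc and abcd have at least three.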

Examples (cont)
3. Select the lines from the file that have three or fewer
characters and show their line numbers.
egrep -vn '....' testfile
-n shows the line number; -v inverts the match on 4
characters, leaving the lines with 3 or fewer.
4. Count the number of blank lines in the file
egrep -c '^$' testfile
5. Count the number of nonblank lines in the file
egrep -c '.' testfile
6. Select the lines from the file that have the string UNIX
fgrep 'UNIX' testfile
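Examples 3 to 6 can be tried on a sample file like the one below. Its contents are invented for this sketch, and grep -E and grep -F stand in for egrep and fgrep.

```shell
testfile=$(mktemp)
printf 'ab\nabcd\n\nUNIX rules\nxyz\n' > "$testfile"

# Example 3: lines with three or fewer characters, with line numbers.
short=$(grep -E -vn '....' "$testfile")

# Example 4: number of blank (empty) lines.
blank=$(grep -E -c '^$' "$testfile")

# Example 5: number of lines with at least one character.
nonblank=$(grep -E -c '.' "$testfile")

# Example 6: lines containing the fixed string UNIX.
unix=$(grep -F 'UNIX' "$testfile")

printf '%s\n%s %s\n%s\n' "$short" "$blank" "$nonblank" "$unix"
rm -f "$testfile"
```

The empty line counts as having three or fewer characters, so it appears in the output of Example 3 as a bare line number.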

Examples (cont)
7. Select the lines from the file that have only the
string UNIX
egrep '^UNIX$' testfile
8. Select the lines from the file that have the pattern
UNIX at least two times
egrep 'UNIX.*UNIX' testfile
9. Copy the file to the monitor, but delete the blank
lines.
egrep -v '^$' testfile
-v inverts the pattern, i.e. it selects any
line that is NOT blank.
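A quick check of Examples 7 to 9, again on an assumed sample file and with grep -E in place of egrep:

```shell
testfile=$(mktemp)
printf 'UNIX\nUNIX and UNIX\n\nI use UNIX\n' > "$testfile"

# Example 7: the line consists of UNIX and nothing else.
only=$(grep -E '^UNIX$' "$testfile")

# Example 8: UNIX, anything (or nothing), then UNIX again.
twice=$(grep -E 'UNIX.*UNIX' "$testfile")

# Example 9: everything except the empty lines.
no_blank=$(grep -E -v '^$' "$testfile")

printf '%s\n---\n%s\n---\n%s\n' "$only" "$twice" "$no_blank"
rm -f "$testfile"
```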

Examples (cont)
10. Select the lines from the file that have at least two
digits with no character in between
egrep '[0-9][0-9]' testfile
11. Select the lines from the file whose first nonblank
character is A
egrep '^ *A' testfile
12. Select the lines from the file that do not start with
A to G and show their line numbers
egrep -n '^[^A-G]' testfile
13. Find out if John is currently logged into the system
who | grep 'John'
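Examples 10 to 13 can be sketched as follows. The sample file is an assumption, LC_ALL=C is set on the range example so that A-G means exactly those seven uppercase letters, and fake_who is a hypothetical stand-in for who so that the result does not depend on who is really logged in.

```shell
testfile=$(mktemp)
printf 'Room 42\n  Apple\nBanana\nzebra 7\nHat\n' > "$testfile"

# Example 10: two adjacent digits.
digits=$(grep -E '[0-9][0-9]' "$testfile")

# Example 11: optional leading spaces, then A.
first_a=$(grep -E '^ *A' "$testfile")

# Example 12: first character is anything other than A to G.
not_a_to_g=$(LC_ALL=C grep -E -n '^[^A-G]' "$testfile")

# Example 13: hypothetical replacement for `who` with made-up sample output.
fake_who() { printf 'John tty1\nmary pts/0\n'; }
logged_in=$(fake_who | grep 'John')

printf '%s\n%s\n%s\n%s\n' "$digits" "$first_a" "$not_a_to_g" "$logged_in"
rm -f "$testfile"
```

Note that the pattern in Example 12 needs a first character to test, so an empty line would not be selected.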

Examples (cont)

14. Search for the string 'include' in all the files in
the working directory that end with .c (i.e. C
source code files) and display each match with its
line number.
egrep -n 'include' *.c
15. Search for the string "one night" and display the
matched lines with the 3 lines above and the 3 lines
below each one, i.e. display each match in context.
(The -3 shorthand comes from GNU grep, so treat it
as a Linux-only command.)
egrep -3 "one night" testfile
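Both examples can be tried in a scratch directory. The file names and contents are assumptions, and the context search is written with -C 3, which GNU grep documents as the long form of -3. Like -3, it is an extension and may not exist in every grep.

```shell
dir=$(mktemp -d)
printf '#include <stdio.h>\nint main(void) { return 0; }\n' > "$dir/main.c"
printf 'int add(int a, int b) { return a + b; }\n' > "$dir/util.c"
printf 'l1\nl2\nl3\nl4\none night\nl6\nl7\nl8\nl9\n' > "$dir/story.txt"

# Example 14: with several files, each match is prefixed by file name and line number.
includes=$(cd "$dir" && grep -E -n 'include' *.c)

# Example 15: the match plus three lines of context on each side.
context=$(grep -E -C 3 'one night' "$dir/story.txt")

printf '%s\n---\n%s\n' "$includes" "$context"
rm -rf "$dir"
```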

Searching File for Content

When we know the directory that contains the file, we
can use grep by itself.
The option -l prints the name of every file that
has at least one line matching the grep expression.
grep -l 'UNIX' *
When we do not know where the file is located, we use
the find command with the execute criterion.
find ~ -type f -exec grep -l "UNIX" {} \;
The {} \; are needed by -exec: {} stands for each file
that find locates, and \; ends the command. Here -exec asks
that grep be executed with the option -l and the
argument "UNIX" on each of those files.
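The two approaches can be compared on a throwaway directory tree. This sketch searches a temporary directory rather than ~, and the directory layout and file names are made up for illustration.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/sub/deep"
printf 'I use UNIX\n' > "$dir/notes.txt"
printf 'no match here\n' > "$dir/sub/todo.txt"
printf 'UNIX history\n' > "$dir/sub/deep/history.txt"

# Known directory: grep -l by itself lists the matching files there.
here=$(cd "$dir" && grep -l 'UNIX' *.txt)

# Unknown location: find walks the tree and runs grep -l on each regular file.
anywhere=$(find "$dir" -type f -exec grep -l 'UNIX' {} \; | LC_ALL=C sort)

printf '%s\n---\n%s\n' "$here" "$anywhere"
rm -rf "$dir"
```

grep alone reports only notes.txt, while find also reaches history.txt two levels down.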

