
String Matching

Globbing is the process of expanding a non-specific file name containing a wildcard character into a set of specific file names that exist in storage on a computer, server, or network.
Globbing arbitrary strings
#include <fnmatch.h>
int fnmatch(const char *pattern, const char *string, int flags);

fnmatch() returns 0 if string matches pattern and FNM_NOMATCH if it does not.

The pattern is a standard glob expression with four special characters, whose handling is modified by the flags argument:
1. * Matches any string, including an empty one.
2. ? Matches exactly one character, any character.
3. [ Starts a list of characters to match or, if the next character is ! (or ^, which many implementations such as glibc also accept), a list of characters not to match. The whole list matches, or does not match, a single character. The list is terminated by a ] .
4. \ Causes the next character to be interpreted literally instead of as a special character.
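The four special characters above can be tried out with a small wrapper. This is a minimal sketch: glob_match() is a hypothetical helper name, not part of <fnmatch.h>, and it passes 0 for flags so the default behavior applies.

```c
#include <fnmatch.h>

/* Hypothetical helper: returns 1 if string matches pattern, 0 otherwise.
   fnmatch() itself returns 0 on a match and FNM_NOMATCH on a mismatch. */
int glob_match(const char *pattern, const char *string)
{
    return fnmatch(pattern, string, 0) == 0;
}
```

With this helper, glob_match("*.c", "main.c") succeeds, glob_match("?.c", "main.c") fails because ? stands for exactly one character, and glob_match("\\*", "*") succeeds because the backslash makes the * literal (the backslash is doubled only because of C string syntax).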
Flags
FNM_NOESCAPE Treat \ as an ordinary character, not a special character.
FNM_PATHNAME Do not match / characters in string with a *, ?, or even a [/] sequence in pattern; match them only with a literal, nonspecial / .
FNM_PERIOD A leading . character in string is matched only by a literal . in pattern, never by *, ?, or a list. A . is leading if it is the first character in string, or if FNM_PATHNAME is also set and it directly follows a / .
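The effect of each flag is easiest to see by matching the same pattern with and without it. The sketch below assumes a hypothetical helper, glob_match_flags(), that simply forwards its flags argument to fnmatch().

```c
#include <fnmatch.h>

/* Hypothetical helper: returns 1 if string matches pattern under the
   given FNM_* flags, 0 otherwise. */
int glob_match_flags(const char *pattern, const char *string, int flags)
{
    return fnmatch(pattern, string, flags) == 0;
}
```

For example, "*.c" matches "src/main.c" with flags 0 but not with FNM_PATHNAME, where "*/*.c" is needed instead. Likewise "*rc" matches ".bashrc" with flags 0 but not with FNM_PERIOD, where the pattern must start with a literal dot, as in ".*rc".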
Regular expressions
Regular expressions come in two flavors:
basic regular expressions (BREs), the syntax used by grep
extended regular expressions (EREs), the syntax used by egrep
Regular expression matching

#include <regex.h>
• int regcomp(regex_t *preg, const char *regex, int cflags);
• int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags);
• void regfree(regex_t *preg);
• size_t regerror(int errcode, const regex_t *preg, char *errbuf, size_t errbuf_size);
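Before comparing a string to a regular expression, you need to compile the expression with the regcomp() function. regcomp() returns 0 on success and a nonzero error code on failure; regerror() turns that code into a readable message, and regfree() releases a compiled expression.

The regex_t *preg holds all the state for the compiled regular expression.

The regex_t structure has only one member that programs should rely on, re_nsub, which specifies the number of parenthesized subexpressions in the regular expression.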
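The compile step, error reporting, and cleanup can be sketched together as follows. The helper name count_subexpressions() is hypothetical, and compiling with REG_EXTENDED is an assumption made so that plain parentheses count as subexpressions.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: compile pattern as an ERE and return the number of
   parenthesized subexpressions (re_nsub), or -1 if compilation fails. */
int count_subexpressions(const char *pattern)
{
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED);

    if (rc != 0) {
        char msg[128];
        /* Translate the regcomp() error code into a readable message. */
        regerror(rc, &re, msg, sizeof msg);
        fprintf(stderr, "regcomp failed: %s\n", msg);
        return -1;
    }

    int nsub = (int)re.re_nsub;
    regfree(&re); /* release the state held by the compiled expression */
    return nsub;
}
```

Called with "(a)(b(c))" this returns 3, and with a malformed pattern such as "[a" it prints the regerror() message and returns -1.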
CFlags

REG_EXTENDED If set, use ERE syntax instead of BRE syntax.

REG_ICASE If set, do not differentiate between upper- and lowercase.

REG_NOSUB If set, do not keep track of substrings. The regexec() function then ignores the nmatch and pmatch arguments.

REG_NEWLINE If REG_NEWLINE is not set, the newline character is treated essentially the same as any other character, and the ^ and $ characters match only the beginning and end of the entire string, not adjacent newline characters. If it is set, ^ and $ also match immediately after and immediately before a newline, and . and non-matching lists do not match a newline.
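A short sketch of how the compile-time flags change the result. regex_matches() is a hypothetical helper; it always adds REG_NOSUB because it only needs a yes/no answer, which is why it can pass 0 and NULL for nmatch and pmatch.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if string matches pattern compiled with
   the given cflags, 0 if it does not, and -1 if the pattern is invalid. */
int regex_matches(const char *pattern, const char *string, int cflags)
{
    regex_t re;

    /* REG_NOSUB: substring positions are not needed for a yes/no test. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, string, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For instance, "hello" matches "HELLO" only with REG_ICASE, and "^b$" matches the three-line string "a\nb\nc" only with REG_NEWLINE, because otherwise ^ and $ anchor to the whole string.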
#include <regex.h>

int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags);

The regexec() function tests a string against a compiled regular expression. It returns 0 if the string matches and REG_NOMATCH if it does not.
EFLAGS

REG_NOTBOL If set, the beginning of the string is not treated as the beginning of a line, so ^ does not match there.

REG_NOTEOL If set, the end of the string is not treated as the end of a line, so $ does not match there.
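These flags matter when the string handed to regexec() is only a piece of a longer text, so its first or last character is not really a line boundary. A minimal sketch, using a hypothetical helper matches_with_eflags() and BRE syntax (cflags 0 plus REG_NOSUB) as assumptions:

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile pattern as a BRE and test string against it
   with the given eflags. Returns 1 on a match, 0 on no match, -1 if the
   pattern does not compile. */
int matches_with_eflags(const char *pattern, const char *string, int eflags)
{
    regex_t re;

    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, string, 0, NULL, eflags);
    regfree(&re);
    return rc == 0;
}
```

Here "^abc" matches "abc" with eflags 0 but not with REG_NOTBOL, and "abc$" matches with eflags 0 but not with REG_NOTEOL. An unanchored pattern such as "abc" is unaffected by either flag.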
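An array of regmatch_t structures is used to report where the match and its subexpressions were found in the string. Element 0 describes the whole match; element i describes the i-th parenthesized subexpression. A subexpression that did not take part in the match has rm_so set to -1.

typedef struct {
    regoff_t rm_so; /* byte index within string of start of match */
    regoff_t rm_eo; /* byte index within string of first byte after the match */
} regmatch_t;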
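Because rm_eo points one byte past the match, the length of a matched substring is rm_eo - rm_so. The sketch below copies one subexpression out of a string. extract_group(), the fixed limit of 10 pmatch entries, and the use of REG_EXTENDED are all assumptions chosen for illustration.

```c
#include <regex.h>
#include <string.h>

#define MAX_GROUPS 10 /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: match string against pattern (ERE) and copy the text
   of subexpression idx (0 = whole match) into out. Returns 0 on success and
   -1 on any failure: bad pattern, no match, unused group, or buffer too small. */
int extract_group(const char *pattern, const char *string, size_t idx,
                  char *out, size_t outsz)
{
    regex_t re;
    regmatch_t pm[MAX_GROUPS];
    int result = -1;

    if (idx >= MAX_GROUPS || outsz == 0)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (idx <= re.re_nsub &&
        regexec(&re, string, MAX_GROUPS, pm, 0) == 0 &&
        pm[idx].rm_so != -1) {
        /* rm_eo is one past the last matched byte, so this is the length. */
        size_t len = (size_t)(pm[idx].rm_eo - pm[idx].rm_so);
        if (len < outsz) {
            memcpy(out, string + pm[idx].rm_so, len);
            out[len] = '\0';
            result = 0;
        }
    }

    regfree(&re);
    return result;
}
```

With the pattern "([a-z]+)@([a-z.]+)" and the string "mail bob@example.org now", index 0 yields "bob@example.org", index 1 yields "bob", and index 2 yields "example.org".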
