
20.6 AN EXAMPLE: A SIMPLE TEXT EDITOR


logical concept that you can represent in your program as a (linked) list or as a
vector. The closest STL analog to our everyday concept of a list (e.g., a to-do
list, a list of groceries, or a schedule) is a sequence, and most sequences are
best represented as vectors.
20.6.1 Lines
How do we decide what's a "line" in our document? There are three obvious
choices:
1. Rely on newline indicators (e.g., '\n') in user input.
2. Somehow parse the document and use some "natural" punctuation (e.g., '.').
3. Split any line that grows beyond a given length (e.g., 50 characters) into
two.
There are undoubtedly also some less obvious choices. For simplicity, we use
alternative 1 here.
We will represent a document in our editor as an object of class Document.
Stripped of all refinements, our document type looks like this:
    typedef vector<char> Line;    // a line is a vector of characters

    struct Document {
        list<Line> line;          // a document is a list of lines
                                  // line[i] is the ith line
        Document() { line.push_back(Line()); }
    };
Every Document starts out with a single empty line: Document's constructor
makes an empty line and pushes it into the list of lines.
Reading and splitting into lines can be done like this:
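    istream& operator>>(istream& is, Document& d)
    {
        char ch;
        while (is.get(ch)) {                 // get() does not skip whitespace, so '\n' is seen
            d.line.back().push_back(ch);     // add the character
            if (ch=='\n')
                d.line.push_back(Line());    // add another line
        }
        if (d.line.back().size()) d.line.push_back(Line());    // add final empty line
        return is;
    }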
CHAPTER 20 • CONTAINERS AND ITERATORS
Both vector and list have a member function back() that returns a reference to
the last element. To use it, you have to be sure that there really is a last
element for back() to refer to - don't use it on an empty container. That's why
we defined an empty Document to have one empty Line. Note that we store every
character from input, even the newline characters ('\n'). Storing those newline
characters greatly simplifies output, but you have to be careful how you define a
character count (just counting characters will give a number that includes space
and newline characters).
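The difference between a raw and a "visible" character count can be made concrete with a small sketch. It assumes the Document layout described above and reads from an in-memory string; the helper names read_document(), raw_char_count(), and visible_char_count() are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <list>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<char> Line;        // a line is a vector of characters

struct Document {
    std::list<Line> line;              // a document is a list of lines
    Document() { line.push_back(Line()); }
};

// Read characters, starting a new Line after every '\n'.
std::istream& operator>>(std::istream& is, Document& d)
{
    char ch;
    while (is.get(ch)) {               // get() keeps spaces and newlines
        assert(!d.line.empty());       // back() needs a last element
        d.line.back().push_back(ch);
        if (ch == '\n') d.line.push_back(Line());
    }
    if (!d.line.back().empty()) d.line.push_back(Line());   // final empty line
    return is;
}

// Hypothetical helper: build a Document from a string.
Document read_document(const std::string& text)
{
    std::istringstream is(text);
    Document d;
    is >> d;
    return d;
}

// Hypothetical helper: count every stored character, including ' ' and '\n'.
std::size_t raw_char_count(const Document& d)
{
    std::size_t n = 0;
    for (std::list<Line>::const_iterator p = d.line.begin(); p != d.line.end(); ++p)
        n += p->size();
    return n;
}

// Hypothetical helper: count only non-whitespace characters.
std::size_t visible_char_count(const Document& d)
{
    std::size_t n = 0;
    for (std::list<Line>::const_iterator p = d.line.begin(); p != d.line.end(); ++p)
        for (Line::const_iterator q = p->begin(); q != p->end(); ++q)
            if (!std::isspace(static_cast<unsigned char>(*q))) ++n;
    return n;
}
```

For the input "ab c\nde\n" this layout stores three lines (the last one empty); the raw count includes the space and both newlines, whereas the visible count does not.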
20.6.2 Iteration
If the document was just a vector<char>, it would be simple to iterate over it.
How do we iterate over a list of lines? Obviously, we can iterate over the list
using list<Line>::iterator. However, what if we wanted to visit the characters
one after another without any fuss about line breaks? We could provide an iterator
specifically designed for our Document:
    class Text_iterator {    // keep track of line and character position within a line
        list<Line>::iterator ln;
        Line::iterator pos;
    public:
        // start the iterator at line ll's character position pp:
        Text_iterator(list<Line>::iterator ll, Line::iterator pp)
            :ln(ll), pos(pp) { }

        char& operator*() { return *pos; }
        Text_iterator& operator++();

        bool operator==(const Text_iterator& other) const
            { return ln==other.ln && pos==other.pos; }
        bool operator!=(const Text_iterator& other) const
            { return !(*this==other); }
    };
    Text_iterator& Text_iterator::operator++()
    {
        ++pos;                      // proceed to next character
        if (pos==(*ln).end()) {
            ++ln;                   // proceed to next line
            pos = (*ln).begin();    // bad if ln==line.end(); so make sure it isn't
        }
        return *this;
    }
To make Text_iterator useful, we need to equip class Document with conventional
begin() and end() functions:
    struct Document {
        list<Line> line;

        Text_iterator begin()    // first character of first line
            { return Text_iterator(line.begin(), (*line.begin()).begin()); }
        Text_iterator end()      // one beyond the last character of the last line
        {
            list<Line>::iterator last = line.end();
            --last;              // safe: a Document always has at least one line
            return Text_iterator(last, (*last).end());
        }
    };
We need the curious (*line.begin()).begin() notation because we want the beginning
of what line.begin() points to; we could alternatively have used
line.begin()->begin() because the standard library iterators support ->.
We can now iterate over the characters of a document like this:
    void print(Document& d)
    {
        for (Text_iterator p = d.begin(); p!=d.end(); ++p) cout << *p;
    }

    print(my_doc);
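To see the pieces work together, here is a self-contained sketch of character-by-character traversal. It assumes that the last line of a Document is always kept empty, so that stepping past the end of a line always finds a next line to land on. The helpers make_document() and text_of() are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

typedef std::vector<char> Line;

class Text_iterator {
    std::list<Line>::iterator ln;      // current line
    Line::iterator pos;                // current character within that line
public:
    Text_iterator(std::list<Line>::iterator ll, Line::iterator pp) : ln(ll), pos(pp) {}
    char& operator*() { return *pos; }
    Text_iterator& operator++()
    {
        ++pos;
        if (pos == ln->end()) {        // ran off this line:
            ++ln;                      // move to the next line, which exists because
            pos = ln->begin();         // the last line is kept empty (see make_document)
        }
        return *this;
    }
    bool operator==(const Text_iterator& o) const { return ln == o.ln && pos == o.pos; }
    bool operator!=(const Text_iterator& o) const { return !(*this == o); }
};

struct Document {
    std::list<Line> line;
    Document() { line.push_back(Line()); }
    Text_iterator begin() { return Text_iterator(line.begin(), line.begin()->begin()); }
    Text_iterator end()                // one beyond the last character of the last line
    {
        assert(!line.empty());         // guaranteed by the constructor
        std::list<Line>::iterator last = line.end();
        --last;
        return Text_iterator(last, last->end());
    }
};

// Hypothetical helper: split text into lines after each '\n'.
Document make_document(const std::string& text)
{
    Document d;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        d.line.back().push_back(text[i]);
        if (text[i] == '\n') d.line.push_back(Line());
    }
    if (!d.line.back().empty()) d.line.push_back(Line());   // keep the last line empty
    return d;
}

// Hypothetical helper: collect all characters by walking the Text_iterator range.
std::string text_of(Document& d)
{
    std::string s;
    for (Text_iterator p = d.begin(); p != d.end(); ++p) s += *p;
    return s;
}
```

Walking from begin() to end() reproduces the original text, line breaks included, without the caller ever looking at the list of lines.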
Presenting the document as a sequence of characters is useful for many things,
but usually we traverse a document looking for something more specific than a
character. For example, here is a piece of code to delete line n:
    void erase_line(Document& d, int n)
    {
        if (n<0 || d.line.size()<=n) return;    // ignore out-of-range lines
        list<Line>::iterator p = d.line.begin();
        advance(p,n);                            // move p forward n lines
        d.line.erase(p);
    }
The call advance(p,n) moves the iterator p n elements forward; advance() is a
standard library function, but we could have implemented it ourselves like this:
    template<class Iter> void advance(Iter& p, int n)
    {
        while (n>0) { ++p; --n; }    // go forward
    }
Note that advance() can be used to simulate subscripting. In fact, for a vector
called v, advancing an iterator n positions from v.begin() and then dereferencing
it is roughly equivalent to v[n]. Note that "roughly" means that our advance()
laboriously moves past the first n-1 elements one by one, whereas the subscript
goes straight to the nth element. For a list, we have to use the laborious method.
It's a price we have to pay for the more flexible layout of the elements of a list.
For an iterator that can move both forward and backward, such as the iterator for
list, a negative argument to the standard library advance() will move the
iterator backward. For an iterator that can handle subscripting, such as the
iterator for a vector, the standard library advance() will go directly to the
right element rather than slowly moving along using ++. Clearly, the standard
library advance() is a bit smarter than ours. That's worth noticing: typically,
the standard library facilities have had more care and time spent on them than we
could afford, so prefer the standard facilities to "home brew."
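The behavior of the standard library advance() described above can be observed with a short sketch. The function names list_at(), list_from_end(), and vector_at() are hypothetical, chosen for this illustration; only std::advance itself is a standard facility.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Element n positions from the start of a list (n must be in range).
int list_at(const std::list<int>& lst, int n)
{
    assert(0 <= n && static_cast<std::size_t>(n) < lst.size());
    std::list<int>::const_iterator p = lst.begin();
    std::advance(p, n);        // list iterator: walks n steps, one ++ at a time
    return *p;
}

// Element n positions before the end of a list (n==1 is the last element).
int list_from_end(const std::list<int>& lst, int n)
{
    assert(1 <= n && static_cast<std::size_t>(n) <= lst.size());
    std::list<int>::const_iterator p = lst.end();
    std::advance(p, -n);       // bidirectional iterator: negative n walks backward
    return *p;
}

// Element n of a vector, reached through advance() instead of v[n].
int vector_at(const std::vector<int>& v, int n)
{
    assert(0 <= n && static_cast<std::size_t>(n) < v.size());
    std::vector<int>::const_iterator p = v.begin();
    std::advance(p, n);        // random-access iterator: jumps directly, like p += n
    return *p;
}
```

Note that std::advance modifies its iterator argument in place and returns nothing, so the iterator must be a named variable rather than a temporary such as lst.begin().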
TRY THIS
Rewrite advance() so that it will "go backward" when you give it a negative
argument.
Probably, a search is the kind of iteration that is most obvious to a user. We
search for individual words (such as milkshake or Gavin), for sequences of letters
that can't easily be considered words (such as secret\nhomestead - i.e., a line
ending with secret followed by a line starting with homestead), for regular
expressions (e.g., [bB]\w*ne - i.e., an upper- or lowercase B followed by 0 or more
letters followed by ne; see Chapter 23), etc. Let's show how to handle the second
case, finding a string, using our Document layout. We use a simple - non-optimal -
algorithm:
1. Find the first character of our search string in the document.
2. See if that character and the following characters match our search string.
3. If so, we are finished; if not, we look for the next occurrence of that first
character.
For generality, we adopt the STL convention of defining the text in which to search
as a sequence defined by a pair of iterators. That way we can use our search
function for any part of a document as well as a complete document. If we find an
occurrence of our string in the document, we return an iterator to its first
character; if we don't find an occurrence, we return an iterator to the end of the
sequence:
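The original listing is not part of this excerpt. What follows is only a minimal sketch of the three steps described above, not the author's code. The names find_txt() and match() are assumptions made for illustration. A hand-written loop is used so that the sketch needs nothing from the iterator beyond ==, !=, * and ++, which is all that Text_iterator provides.

```cpp
#include <cassert>
#include <string>

// Does the sequence starting at first (and ending before last) begin with s?
template<class Iter>
bool match(Iter first, Iter last, const std::string& s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (first == last || *first != s[i]) return false;   // ran out of text or mismatch
        ++first;
    }
    return true;
}

// Return an iterator to the first character of the first occurrence of s
// in [first,last), or last if s does not occur.
template<class Iter>
Iter find_txt(Iter first, Iter last, const std::string& s)
{
    assert(!s.empty());                    // precondition assumed by this sketch
    for (Iter p = first; p != last; ++p) {
        if (*p == s[0]                     // step 1: found the first character
            && match(p, last, s))          // step 2: do the following characters match?
            return p;                      // step 3: finished
        // otherwise keep looking for the next occurrence of s[0]
    }
    return last;                           // not found
}
```

Because the search range is just a pair of iterators, the same function could be called with d.begin() and d.end() of a Document, or with the iterators of an ordinary std::string, as long as the range is valid.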
