[8]One
of the biggest time-wasters in C is character arrays: keeping track of the
difference between static quoted strings and arrays created on the stack and the
heap, and the fact that sometimes you’re passing around a char* and
sometimes you must copy the whole
array.
(This is the general problem of
shallow copy vs. deep copy.) Especially because string
manipulation is so common, character arrays are a great source of
misunderstandings and bugs.
Despite this, creating string classes
remained a common exercise for beginning C++ programmers for many years. The
Standard C++ library string class solves the problem of character array
manipulation once and for all, keeping track of memory even during
assignments and copy-constructions. You simply don’t need to think about
it.
This chapter examines the Standard C++
string class, beginning with a look at what constitutes a C++ string and
how the C++ version differs from a traditional C character array. You’ll
learn about operations and manipulations using string objects, and see
how C++ strings accommodate variation in character sets and string data
conversion.
Handling text is perhaps one of the
oldest of all programming applications, so it’s not surprising that the
C++ string draws heavily on the ideas and terminology that have long been
used for this purpose in C and other languages. As you begin to acquaint
yourself with C++ strings this fact should be reassuring, in the respect
that no matter what programming idiom you choose, there are really only about
three things you can do with a string: create or modify the sequence of
characters stored in the string, detect the presence or absence of
elements within the string, and translate between various schemes for
representing string characters.
In C, a string is simply an array of
characters that always includes a binary zero (often called the null
terminator) as its final array element. There are two significant
differences between C++ strings and their C progenitors. First, C++
string objects associate the array of characters which constitute the
string with methods useful for managing and operating on it. A
string also contains certain “housekeeping” information about
the size and storage location of its data. Specifically, a C++ string
object knows its starting location in memory, its content, its length in
characters, and the length in characters to which it can grow before the
string object must resize its internal data buffer. This gives rise to
the second big difference between C char arrays and C++ strings.
C++ strings do not include a null terminator, nor do the C++
string handling member functions rely on the existence of a null
terminator to perform their jobs. C++ strings greatly reduce the
likelihood of making three of the most common and destructive C programming
errors: overwriting array bounds, trying to access arrays through uninitialized
or incorrectly valued pointers, and leaving pointers “dangling”
after an array ceases to occupy the storage that was once allocated to
it.
The exact implementation of memory layout
for the string class is not defined by the C++ Standard. This architecture is
intended to be flexible enough to allow differing implementations by compiler
vendors, yet guarantee predictable behavior for users. In particular, the exact
conditions under which storage is allocated to hold data for a string object are
not defined. String allocation rules were formulated to allow but not require a
reference-counted implementation, but whether or not the implementation uses
reference counting, the semantics must be the same. To put this a bit
differently, in C, every char array occupies a unique physical region of
memory. In C++, individual string objects may or may not occupy unique
physical regions of memory, but if reference counting is used to avoid storing
duplicate copies of data, the individual objects must look and act as though
they do exclusively own unique regions of storage. For example:
//: C04:StringStorage.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> #include "../TestSuite/Test.h" using namespace std; class StringStorageTest : public Test { public: void run() { string s1("12345"); // Set the iterator indicate the first element string::iterator it = s1.begin(); // This may copy the first to the second or // use reference counting to simulate a copy string s2 = s1; test_(s1 == s2); // Either way, this statement may ONLY modify first *it = '0'; cout << "s1 = " << s1 << endl; cout << "s2 = " << s2 << endl; test_(s1 != s2); } }; int main() { StringStorageTest sst; sst.setStream(&cout); sst.run(); return sst.report(); } ///:~
Reference counting may serve to make an
implementation more memory efficient, but it is transparent to users of the
string
class.
Creating and initializing strings
is a straightforward proposition, and fairly flexible as well. In the example
shown below, the first string, imBlank, is declared but contains
no initial value. Unlike a C char array, which would contain a random and
meaningless bit pattern until initialization, imBlank does contain
meaningful information. This string object has been initialized to hold
“no characters,” and can properly report its 0 length and absence of
data elements through the use of class member functions.
The next string, heyMom, is
initialized by the literal argument "Where are my socks?". This form of
initialization uses a quoted character array as a parameter to the string
constructor. By contrast, standardReply is simply initialized with an
assignment. The last string of the group, useThisOneAgain, is
initialized using an existing C++ string object. Put another way, this
example illustrates that string objects let you:
//: C04:SmallString.cpp //{L} ../TestSuite/Test //{-msc} Execution error #include <string> using namespace std; int main() { string imBlank; string heyMom("Where are my socks?"); string standardReply = "Beamed into deep " "space on wide angle dispersion?"; string useThisOneAgain(standardReply); } ///:~
These are the simplest forms of
string initialization, but there are other variations which offer more
flexibility and control. You can :
//: C04:SmallString2.cpp //{L} ../TestSuite/Test //{-msc} Execution error #include <string> #include <iostream> using namespace std; int main() { string s1 ("What is the sound of one clam napping?"); string s2 ("Anything worth doing is worth overdoing."); string s3("I saw Elvis in a UFO."); // Copy the first 8 chars string s4(s1, 0, 8); // Copy 6 chars from the middle of the source string s5(s2, 15, 6); // Copy from middle to end string s6(s3, 6, 15); // Copy all sorts of stuff string quoteMe = s4 + "that" + // substr() copies 10 chars at element 20 s1.substr(20, 10) + s5 + // substr() copies up to either 100 char // or eos starting at element 5 "with" + s3.substr(5, 100) + // OK to copy a single char this way s1.substr(37, 1); cout << quoteMe << endl; } ///:~
The string member function
substr( ) takes a starting position as its first argument and the
number of characters to select as the second argument. Both of these arguments
have default values and if you say substr( ) with an empty argument
list you produce a copy of the entire string, so this is a convenient way
to duplicate a string.
Here’s what the string
quoteMe contains after the initialization shown above :
"What is that one clam doing with Elvis in a UFO.?"
Notice the final line of example above.
C++ allows string initialization techniques to be mixed in a single
statement, a flexible and convenient feature. Also note that the last
initializer copies just one character from the source
string.
Another slightly more subtle
initialization technique involves the use of the string iterators
string.begin( ) and string.end( ). This treats a
string like a container object (which you’ve seen primarily
in the form of vector so far in this book – you’ll see many
more containers soon) which has iterators indicating the start and end of
the “container.” This way you can hand a string constructor
two iterators and it will copy from one to the other into the new
string:
//: C04:StringIterators.cpp //{L} ../TestSuite/Test //{-msc} Execution error #include <string> #include <iostream> using namespace std; int main() { string source("xxx"); string s(source.begin(), source.end()); cout << s << endl; } ///:~
The iterators are not restricted to
begin( ) and end( ), so you can choose a subset of
characters from the source string.
C++ strings may not be
initialized with single characters or with ASCII or other integer values.
//: C04:UhOh.cpp //{L} ../TestSuite/Test //{-msc} Execution error #include <string> using namespace std; int main() { // Error: no single char inits //! string nothingDoing1('a'); // Error: no integer inits //! string nothingDoing2(0x37); } ///:~
If you’ve programmed in C, you are
accustomed to the convenience of a large family of functions for writing,
searching, rearranging, and copying char arrays. However, there are two
unfortunate aspects of the Standard C library functions for handling char
arrays. First, there are three loosely organized families of them: the
“plain” group, the group that manipulates the characters
without respect to case, and the ones which require you to supply a count
of the number of characters to be considered in the operation at hand. The
roster of function names in the C char array handling library literally
runs to several pages, and though the kind and number of arguments to the
functions are somewhat consistent within each of the three groups, to use them
properly you must be very attentive to details of function naming and parameter
passing.
The second inherent trap of the standard
C char array tools is that they all rely explicitly on the assumption
that the character array includes a null terminator. If by oversight or error
the null is omitted or overwritten, there’s very little to keep the C
char array handling functions from manipulating the memory beyond the
limits of the allocated space, sometimes with disastrous results.
C++ provides a vast improvement in the
convenience and safety of string objects. For purposes of actual string
handling operations, there are a modest two or three dozen member function
names. It’s worth your while to become acquainted with these. Each
function is overloaded, so you don’t have to learn a new string
member function name simply because of small differences in their
parameters.
One of the most valuable and convenient
aspects of C++ strings is that they grow as needed, without intervention on the
part of the programmer. Not only does this make string handling code inherently
more trustworthy, it also almost entirely eliminates a tedious
“housekeeping” chore – keeping track of the bounds of the
storage in which your strings live. For example, if you create a string object
and initialize it with a string of 50 copies of ‘X’, and later store
in it 50 copies of “Zowie”, the object itself will reallocate
sufficient storage to accommodate the growth of the data. Perhaps nowhere is
this property more appreciated than when the strings manipulated in your code
change in size, and you don’t know how big the change is. Appending,
concatenating, and inserting strings often give rise to this circumstance, but
the string member functions append( ) and insert( )
transparently reallocate storage when a string grows.
//: C04:StrSize.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { string bigNews("I saw Elvis in a UFO. "); cout << bigNews << endl; // How much data have we actually got? cout << "Size = " << bigNews.size() << endl; // How much can we store without reallocating cout << "Capacity = " << bigNews.capacity() << endl; // Insert this string in bigNews immediately // before bigNews[1] bigNews.insert(1, " thought I "); cout << bigNews << endl; cout << "Size = " << bigNews.size() << endl; cout << "Capacity = " << bigNews.capacity() << endl; // Make sure that there will be this much space bigNews.reserve(500); // Add this to the end of the string bigNews.append("I've been working too hard."); cout << bigNews << endl; cout << "Size = " << bigNews.size() << endl; cout << "Capacity = " << bigNews.capacity() << endl; } ///:~
Here is the output:
I saw Elvis in a UFO. Size = 21 Capacity = 31 I thought I saw Elvis in a UFO. Size = 32 Capacity = 63 I thought I saw Elvis in a UFO. I've been working too hard. Size = 66 Capacity = 511
This example demonstrates that even
though you can safely relinquish much of the responsibility for allocating and
managing the memory your strings occupy, C++ strings provide you
with several tools to monitor and manage their size. The size( ),
resize( ), capacity( ), and reserve( )
member functions can be very useful when its necessary to work back and forth
between data contained in C++ style strings and traditional null terminated C
char arrays. Note the ease with which we changed the size of the storage
allocated to the string.
The exact fashion in which the
string member functions will allocate space for your data is dependent on
the implementation of the library. When one implementation was tested with the
example above, it appeared that reallocations occurred on even word boundaries,
with one byte held back. The architects of the string class have
endeavored to make it possible to mix the use of C char arrays and C++
string objects, so it is likely that figures reported by StrSize.cpp for
capacity reflect that in this particular implementation, a byte is set aside to
easily accommodate the insertion of a null terminator.
insert( ) is particularly
nice because it absolves you of making sure the insertion of characters in a
string won’t overrun the storage space or overwrite the characters
immediately following the insertion point. Space grows and existing characters
politely move over to accommodate the new elements. Sometimes, however, this
might not be what you want to happen. If the data in string needs to retain the
ordering of the original characters relative to one another or must be a
specific constant size, use the replace( ) function to overwrite a
particular sequence of characters with another group of characters. There are
quite a number of overloaded versions of replace( ), but the
simplest one takes three arguments: an integer telling where to start in the
string, an integer telling how many characters to eliminate from the original
string, and the replacement string (which can be a different number of
characters than the eliminated quantity). Here’s a very simple
example:
//: C04:StringReplace.cpp // Simple find-and-replace in strings //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { string s("A piece of text"); string tag("$tag$"); s.insert(8, tag + ' '); cout << s << endl; int start = s.find(tag); cout << "start = " << start << endl; cout << "size = " << tag.size() << endl; s.replace(start, tag.size(), "hello there"); cout << s << endl; } ///:~
The tag is first inserted into
s (notice that the insert happens before the value indicating the
insert point, and that an extra space was added after tag), then it is
found and replaced.
You should actually check to see if
you’ve found anything before you perform a replace( ).
The above example replaces with a char*, but there’s an
overloaded version that replaces with a string. Here’s a
more complete demonstration replace( )
//: C04:Replace.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; void replaceChars(string& modifyMe, string findMe, string newChars){ // Look in modifyMe for the "find string" // starting at position 0 int i = modifyMe.find(findMe, 0); // Did we find the string to replace? if(i != string::npos) // Replace the find string with newChars modifyMe.replace(i,newChars.size(),newChars); } int main() { string bigNews = "I thought I saw Elvis in a UFO. " "I have been working too hard."; string replacement("wig"); string findMe("UFO"); // Find "UFO" in bigNews and overwrite it: replaceChars(bigNews, findMe, replacement); cout << bigNews << endl; } ///:~
Now the last line of output from
replace.cpp looks like this:
I thought I saw Elvis in a wig. I have been working too hard.
If replace doesn’t find the search
string, it returns npos. npos is a static constant member of the
basic_string class.
Unlike insert( ),
replace( ) won’t grow the string’s storage space
if you copy new characters into the middle of an existing series of array
elements. However, it will grow the storage space if you make a
“replacement” that writes beyond the end of an existing array.
Here’s an example:
//: C04:ReplaceAndGrow.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { string bigNews("I saw Elvis in a UFO. " "I have Been working too hard."); string replacement("wig"); // The first arg says "replace chars // beyond the end of the existing string": bigNews.replace(bigNews.size(), replacement.size(), replacement); cout << bigNews << endl; } ///:~
The call to replace( ) begins
“replacing” beyond the end of the existing array. The output looks
like this:
I saw Elvis in a UFO. I have been working too hard.wig
Notice that replace( )
expands the array to accommodate the growth of the string due to
“replacement” beyond the bounds of the existing
array.
You may have been hunting through this
chapter trying to do something relatively simple like replace all the instances
of one character with a different character. Upon finding the above section on
replacing, you thought you found the answer but then you started seeing groups
of characters and counts and other things that looked a bit too complex.
Doesn’t string have a way to just replace one character with
another everywhere?
The string class by itself
doesn’t solve all possible problems. The remainder are relegated to the
STL algorithms, because the string class can look just like an STL
container (the STL algorithms work with anything that looks like an STL
container). All the STL algorithms work on a “range” of elements
within a container. Usually that range is just “from the beginning of the
container to the end.” A string object looks like a container of
characters: to get the beginning of the range you use
string::begin( ) and to get the end of the range you use
string::end( ). The following example shows the use of the STL
replace( ) algorithm to replace all the instances of ‘X’
with ‘Y’:
//: C04:StringCharReplace.cpp //{L} ../TestSuite/Test #include <string> #include <algorithm> #include <iostream> using namespace std; int main() { string s("aaaXaaaXXaaXXXaXXXXaaa"); cout << s << endl; replace(s.begin(), s.end(), 'X', 'Y'); cout << s << endl; } ///:~
Notice that this replace( )
is not called as a member function of string. Also, unlike the
string::replace( ) functions which only perform one replacement, the
STL replace is replacing all instances of one character with
another.
The STL replace( ) algorithm
only works with single objects (in this case, char objects), and will not
perform replacements of quoted char arrays or of string
objects.
Since a string looks like an STL
container, there are a number of other STL algorithms that can be applied to it,
which may solve other problems you have that are not directly addressed by the
string member functions. See Chapter XX for more information on the STL
algorithms.
One of the most delightful discoveries
awaiting a C programmer learning about C++ string handling is how simply
strings can be combined and appended using operator+ and
operator+=. These operators make combining strings
syntactically equivalent to adding numeric data.
//: C04:AddStrings.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { string s1("This "); string s2("That "); string s3("The other "); // operator+ concatenates strings s1 = s1 + s2; cout << s1 << endl; // Another way to concatenates strings s1 += s3; cout << s1 << endl; // You can index the string on the right s1 += s3 + s3[4] + "oh lala"; cout << s1 << endl; } ///:~
The output looks like
this:
This This That This That The other This That The other ooh lala
operator+ and operator+=
are a very flexible and convenient means of combining string
data. On the right hand side of the statement, you can use almost any type that
evaluates to a group of one or more
characters.
The find family of string
member functions allows you to locate a character or group of characters within
a given string. Here are the members of the find family and their general
usage:
string find member
function |
What/how it finds
|
find( ) |
Searches a string for a specified
character or group of characters and returns the starting position of the first
occurrence found or npos if no match is found. (npos is a const of
–1 and indicates that a search failed.) |
find_first_of( ) |
Searches a target string and returns the
position of the first match of any character in a specified group. If no
match is found, it returns npos. |
find_last_of( ) |
Searches a target string and returns the
position of the last match of any character in a specified group. If no
match is found, it returns npos. |
find_first_not_of( ) |
Searches a target string and returns the
position of the first element that doesn’t match any
character in a specified group. If no such element is found, it returns
npos. |
find_last_not_of( ) |
Searches a target string and returns the
position of the element with the largest subscript that doesn’t
match of any character in a specified group. If no such element is found,
it returns npos. |
rfind( ) |
Searches a string from end to beginning
for a specified character or group of characters and returns the starting
position of the match if one is found. If no match is found, it returns
npos. |
String searching member functions and
their general uses
The simplest use of find( )
searches for one or more characters in a string. This overloaded
version of find( ) takes a parameter that specifies the character(s)
for which to search, and optionally one that tells it where in the string to
begin searching for the occurrence of a substring. (The default position at
which to begin searching is 0.) By setting the call to find inside a
loop, you can easily move through a string, repeating a search in order to find
all of the occurrences of a given character or group of characters within the
string.
Notice that we define the string object
sieveChars using a constructor idiom which sets the initial size of the
character array and writes the value ‘P’ to each of its
member.
//: C04:Sieve.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { // Create a 50 char string and set each // element to 'P' for Prime string sieveChars(50, 'P'); // By definition neither 0 nor 1 is prime. // Change these elements to "N" for Not Prime sieveChars.replace(0, 2, "NN"); // Walk through the array: for(int i = 2; i <= (sieveChars.size() / 2) - 1; i++) // Find all the factors: for(int factor = 2; factor * i < sieveChars.size();factor++) sieveChars[factor * i] = 'N'; cout << "Prime:" << endl; // Return the index of the first 'P' element: int j = sieveChars.find('P'); // While not at the end of the string: while(j != sieveChars.npos) { // If the element is P, the index is a prime cout << j << " "; // Move past the last prime j++; // Find the next prime j = sieveChars.find('P', j); } cout << "\n Not prime:" << endl; // Find the first element value not equal P: j = sieveChars.find_first_not_of('P'); while(j != sieveChars.npos) { cout << j << " "; j++; j = sieveChars.find_first_not_of('P', j); } } ///:~
The output from Sieve.cpp looks
like this:
Prime: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 Not prime: 0 1 4 6 8 9 10 12 14 15 16 18 20 21 22 24 25 26 27 28 30 32 33 34 35 36 38 39 40 42 44 45 46 48 49
find( ) allows you to walk
forward through a string, detecting multiple occurrences of a character
or group of characters, while find_first_not_of( ) allows you to
test for the absence of a character or group.
The find member is also useful for
detecting the occurrence of a sequence of characters in a
string:
//: C04:Find.cpp // Find a group of characters in a string //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { string chooseOne("Eenie, meenie, miney, mo"); int i = chooseOne.find("een"); while(i != string::npos) { cout << i << endl; i++; i = chooseOne.find("een", i); } } ///:~
Find.cpp produces a single line of
output :
8
This tells us that the first
‘e’ of the search group “een” was found in the word
“meenie,” and is the eighth element in the string. Notice that
find passed over the “Een” group of characters in the word
“Eenie”. The find member function performs a case
sensitive search.
There are no functions in the string
class to change the case of a string, but these functions can be easily
created using the Standard C library functions toupper( ) and
tolower( ), which change the case of one character at a time. A few
small changes will make Find.cpp perform a case insensitive
search:
//: C04:NewFind.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; // Make an uppercase copy of s: string upperCase(string& s) { char* buf = new char[s.length()]; s.copy(buf, s.length()); for(int i = 0; i < s.length(); i++) buf[i] = toupper(buf[i]); string r(buf, s.length()); delete buf; return r; } // Make a lowercase copy of s: string lowerCase(string& s) { char* buf = new char[s.length()]; s.copy(buf, s.length()); for(int i = 0; i < s.length(); i++) buf[i] = tolower(buf[i]); string r(buf, s.length()); delete buf; return r; } int main() { string chooseOne("Eenie, meenie, miney, mo"); cout << chooseOne << endl; cout << upperCase(chooseOne) << endl; cout << lowerCase(chooseOne) << endl; // Case sensitive search int i = chooseOne.find("een"); while(i != string::npos) { cout << i << endl; i++; i = chooseOne.find("een", i); } // Search lowercase: string lcase = lowerCase(chooseOne); cout << lcase << endl; i = lcase.find("een"); while(i != lcase.npos) { cout << i << endl; i++; i = lcase.find("een", i); } // Search uppercase: string ucase = upperCase(chooseOne); cout << ucase << endl; i = ucase.find("EEN"); while(i != ucase.npos) { cout << i << endl; i++; i = ucase.find("EEN", i); } } ///:~
Both the upperCase( ) and
lowerCase( ) functions follow the same form: they allocate storage
to hold the data in the argument string, copy the data and change the
case. Then they create a new string with the new data, release the buffer
and return the result string. The c_str( ) function cannot be
used to produce a pointer to directly manipulate the data in the string
because c_str( ) returns a pointer to const. That is,
you’re not allowed to manipulate string data with a pointer, only
with member functions. If you need to use the more primitive char array
manipulation, you should use the technique shown above.
The output looks like
this:
Eenie, meenie, miney, mo EENIE, MEENIE, MINEY, MO eenie, meenie, miney, mo 8 eenie, meenie, miney, mo 0 8 EENIE, MEENIE, MINEY, MO 0 8
The case insensitive searches found both
occurrences on the “een” group.
NewFind.cpp isn’t the best
solution to the case sensitivity problem, so we’ll revisit it when we
examine string comparisons.
Sometimes it’s necessary to search
through a string from end to beginning, if you need to find the data in
“last in / first out “ order. The string member function
rfind( ) handles this job.
//: C04:Rparse.cpp // Reverse the order of words in a string //{L} ../TestSuite/Test #include <string> #include <iostream> #include <vector> using namespace std; int main() { // The ';' characters will be delimiters string s("now.;sense;make;to;going;is;This"); cout << s << endl; // To store the words: vector<string> strings; // The last element of the string: int last = s.size(); // The beginning of the current word: int current = s.rfind(';'); // Walk backward through the string: while(current != string::npos){ // Push each word into the vector. // Current is incremented before copying to // avoid copying the delimiter: ++current; strings.push_back( s.substr(current, last - current)); // Back over the delimiter we just found, // and set last to the end of the next word: current -= 2; last = current; // Find the next delimiter current = s.rfind(';', current); } // Pick up the first word - it's not // preceded by a delimiter strings.push_back(s.substr(0, last - current)); // Print them in the new order: for(int j = 0; j < strings.size(); j++) cout << strings[j] << " "; } ///:~
Here’s how the output from
Rparse.cpp looks:
now.;sense;make;to;going;is;This This is going to make sense now.
rfind( ) backs through the
string looking for tokens, reporting the array index of matching characters or
string::npos if it is unsuccessful.
The find_first_of( ) and
find_last_of( ) member functions can be conveniently put to work to
create a little utility that will strip whitespace characters off of both ends
of a string. Notice it doesn’t touch the original string, but instead
returns a new string:
//: C04:trim.h #ifndef TRIM_H #define TRIM_H #include <string> // General tool to strip spaces from both ends: inline std::string trim(const std::string& s) { if(s.length() == 0) return s; int b = s.find_first_not_of(" \t"); int e = s.find_last_not_of(" \t"); if(b == -1) // No non-spaces return ""; return std::string(s, b, e - b + 1); } #endif // TRIM_H ///:~
The first test checks for an empty
string; in that case no tests are made and a copy is returned. Notice
that once the end points are found, the string constructor is used to
build a new string from the old one, giving the starting count and the
length. This form also utilizes the “return value optimization” (see
the index for more details).
Testing such a general-purpose tool needs
to be thorough:
//: C04:TrimTest.cpp //{L} ../TestSuite/Test #include "trim.h" #include <iostream> using namespace std; string s[] = { " \t abcdefghijklmnop \t ", "abcdefghijklmnop \t ", " \t abcdefghijklmnop", "a", "ab", "abc", "a b c", " \t a b c \t ", " \t a \t b \t c \t ", "", // Must also test the empty string }; void test(string s) { cout << "[" << trim(s) << "]" << endl; } int main() { for(int i = 0; i < sizeof s / sizeof *s; i++) test(s[i]); } ///:~
In the array of string s,
you can see that the character arrays are automatically converted to
string objects. This array provides cases to check the removal of spaces
and tabs from both ends, as well as ensuring that spaces and tabs do not get
removed from the middle of a
string.
My word processor/page layout program
(Microsoft Word) will save a document in HTML, but it doesn’t recognize
that the code listings in this book should be tagged with the HTML
“preformatted” tag (<PRE>), and it puts paragraph marks
(<P> and </P>) around every listing line. This means that all the
indentation in the code listings is lost. In addition, Word saves HTML with
reduced font sizes for body text, which makes it hard to read.
To convert the book to HTML
form[9], then, the
original output must be reprocessed, watching for the tags that mark the start
and end of code listings, inserting the <PRE> and </PRE> tags at the
appropriate places, removing all the <P> and </P> tags within the
listings, and adjusting the font sizes. Removal is accomplished with the
erase( ) member function, but you must correctly determine the
starting and ending points of the substring you wish to erase. Here’s the
program that reprocesses the generated HTML file:
//: C04:ReprocessHTML.cpp // Take Word's html output and fix up // the code listings and html tags #include "../require.h" #include <iostream> #include <fstream> #include <string> using namespace std; // Produce a new string which is the original // string with the html paragraph break marks // stripped off: string stripPBreaks(string s) { int br; while((br = s.find("<P>")) != string::npos) s.erase(br, strlen("<P>")); while((br = s.find("</P>")) != string::npos) s.erase(br, strlen("</P>")); return s; } // After the beginning of a code listing is // detected, this function cleans up the listing // until the end marker is found. The first line // of the listing is passed in by the caller, // which detects the start marker in the line. void fixupCodeListing(istream& in, ostream& out, string& line, int tag) { out << line.substr(0, tag) << "<PRE>" // Means "preformatted" in html << stripPBreaks(line.substr(tag)) << endl; string s; while(getline(in, s)) { int endtag = s.find("/""/""/"":~"); if(endtag != string::npos) { endtag += strlen("/""/""/"":~"); string before = s.substr(0, endtag); string after = s.substr(endtag); out << stripPBreaks(before) << "</PRE>" << after << endl; return; } out << stripPBreaks(s) << endl; } } string removals[] = { "<FONT SIZE=2>", "<FONT SIZE=1>", "<FONT FACE=\"Times\" SIZE=1>", "<FONT FACE=\"Times\" SIZE=2>", "<FONT FACE=\"Courier\" SIZE=1>", "SIZE=1", // Eliminate all other '1' & '2' size "SIZE=2", }; const int rmsz = sizeof(removals)/sizeof(*removals); int main(int argc, char* argv[]) { requireArgs(argc, 2); ifstream in(argv[1]); assure(in, argv[1]); ofstream out(argv[2]); string line; while(getline(in, line)) { // The "Body" tag only appears once: if(line.find("<BODY") != string::npos) { out << "<BODY BGCOLOR=\"#FFFFFF\" " "TEXT=\"#000000\">" << endl; continue; // Get next line } // Eliminate each of the removals strings: for(int i = 0; i < rmsz; i++) { int find = line.find(removals[i]); if(find != string::npos) line.erase(find, removals[i].size()); } int tag1 = line.find("/""/"":"); int tag2 = line.find("/""*"":"); if(tag1 != string::npos) fixupCodeListing(in, out, line, tag1); else if(tag2 != string::npos) fixupCodeListing(in, out, line, tag2); else out << line << endl; } } ///:~
Notice the lines that detect the start
and end listing tags by indicating them with each character in quotes. These
tags are treated in a special way by the logic in the Extractcode.cpp
tool for extracting code listings. To present the code for the tool in the
text of the book, the tag sequence itself must not occur in the listing. This
was accomplished by taking advantage of a C++ preprocessor feature that causes
text strings delimited by adjacent pairs of double quotes to be merged into a
single string during the preprocessor pass of the build.
int tag1 = line.find("/""/"":");
The effect of the sequence of char
arrays is to produce the starting tag for code listings.
Sometimes it’s useful to take an
HTML file and strip its tags so you have something approximating the text that
would be displayed in the Web browser, only as an ASCII text file. The
string class once again comes in handy. The following has some variation
on the theme of the previous example:
//: C04:HTMLStripper.cpp // Filter to remove html tags and markers #include "../require.h" #include <fstream> #include <iostream> #include <string> using namespace std; string replaceAll(string s, string f, string r) { unsigned int found = s.find(f); while(found != string::npos) { s.replace(found, f.length(), r); found = s.find(f); } return s; } string stripHTMLTags(string s) { while(true) { unsigned int left = s.find('<'); unsigned int right = s.find('>'); if(left==string::npos || right==string::npos) break; s = s.erase(left, right - left + 1); } s = replaceAll(s, "<", "<"); s = replaceAll(s, ">", ">"); s = replaceAll(s, "&", "&"); s = replaceAll(s, " ", " "); // Etc... return s; } int main(int argc, char* argv[]) { requireArgs(argc, 1, "usage: HTMLStripper InputFile"); ifstream in(argv[1]); assure(in, argv[1]); const int sz = 4096; char buf[sz]; while(in.getline(buf, sz)) { string s(buf); cout << stripHTMLTags(s) << endl; } } ///:~
The string class can replace one
string with another but there’s no facility for replacing all the strings
of one type with another, so the replaceAll( ) function does this
for you, inside a while loop that keeps finding the next instance of the
find string f. That function is used inside stripHTMLTags after it
uses erase( ) to remove everything that appears inside angle braces
(‘<‘ and ‘>‘). Note that I probably
haven’t gotten all the necessary replacement values, but you can see what
to do (you might even put all the find-replace pairs in a table...). In
main( ) the arguments are checked, and the file is read and
converted. It is sent to standard output so you must redirect it with
‘>‘ if you want to write it to a
file.
Comparing strings is inherently different
than comparing numbers. Numbers have constant, universally meaningful values. To
evaluate the relationship between the magnitude of two strings, you must make a
lexical comparison. Lexical comparison means that when you test a
character to see if it is “greater than” or “less than”
another character, you are actually comparing the numeric representation of
those characters as specified in the collating sequence of the character set
being used. Most often, this will be the ASCII collating sequence, which assigns
the printable characters for the English language numbers in the range from 32
to 127 decimal. In the ASCII collating sequence, the first
“character” in the list is the space, followed by several common
punctuation marks, and then uppercase and lowercase letters. With respect to the
alphabet, this means that the letters nearer the front have lower ASCII values
than those nearer the end. With these details in mind, it becomes easier to
remember that when a lexical comparison that reports s1 is “greater
than” s2, it simply means that when the two were compared, the first
differing character in s1 came later in the alphabet than the character in that
same position in s2.
C++ provides several ways to compare
strings, and each has their advantages. The simplest to use are the non member
overloaded operator functions operator ==, operator != operator >,
operator <, operator >=, and operator <=.
//: C04:CompStr.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { // Strings to compare string s1("This "); string s2("That "); for(int i = 0; i< s1.size() && i < s2.size(); i++) // See if the string elements are the same: if(s1[i] == s2[i]) cout << s1[i] << " " << i << endl; // Use the string inequality operators if(s1 != s2) { cout << "Strings aren't the same:" << " "; if(s1 > s2) cout << "s1 is > s2" << endl; else cout << "s2 is > s1" << endl; } } ///:~
Here’s the output from
CompStr.cpp:
T 0 h 1 4 Strings aren’t the same: s1 is > s2
The overloaded comparison operators are
useful for comparing both full strings and individual string
elements.
Notice in the code fragment below the
flexibility of argument types on both the left and right hand side of the
comparison operators. The overloaded operator set allows the direct comparison
of string objects, quoted literals, and pointers to C style strings.
// The lvalue is a quoted literal and // the rvalue is a string if("That " == s2) cout << "A match" << endl; // The lvalue is a string and the rvalue is a // pointer to a c style null terminated string if(s1 != s2.c_str()) cout << "No match" << endl;
You won’t find the logical not (!)
or the logical comparison operators (&& and ||) among operators for
string. (Neither will you find overloaded versions of the bitwise C operators
&, |, ^, or ~.) The overloaded non member comparison operators for the
string class are limited to the subset which has clear, unambiguous application
to single characters or groups of characters.
The compare( ) member
function offers you a great deal more sophisticated and precise comparison than
the non member operator set, because it returns a lexical comparison value, and
provides for comparisons that consider subsets of the string data. It provides
overloaded versions that allow you to compare two complete strings, part of
either string to a complete string, and subsets of two strings. This example
compares complete strings:
//: C04:Compare.cpp // Demonstrates compare(), swap() //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { string first("This"); string second("That"); // Which is lexically greater? switch(first.compare(second)) { case 0: // The same cout << first << " and " << second << " are lexically equal" << endl; break; case -1: // Less than first.swap(second); // Fall through this case... case 1: // Greater than cout << first << " is lexically greater than " << second << endl; } } ///:~
The output from Compare.cpp looks
like this:
This is lexically greater than That
To compare a subset of the characters in
one or both strings, you add arguments that define where to start the comparison
and how many characters to consider. For example, we can use the overloaded
version of compare( ):
s1.compare(s1StartPos, s1NumberChars,
s2, s2StartPos, s2NumberChars);
If we substitute the above version of
compare( ) in the previous program so that it only looks at the
first two characters of each string, the program becomes:
//: C04:Compare2.cpp // Overloaded compare() //{L} ../TestSuite/Test //{-g++295} #include <string> #include <iostream> using namespace std; int main() { string first("This"); string second("That"); // Compare first two characters of each string: switch(first.compare(0, 2, second, 0, 2)) { case 0: // The same cout << first << " and " << second << " are lexically equal" << endl; break; case -1: // Less than first.swap(second); // Fall through this case... case 1: // Greater than cout << first << " is lexically greater than " << second << endl; } } ///:~
The output is:
This and That are lexically equal
which is true, for the first two
characters of “This” and “That.”
In the examples so far, we have used C
style array indexing syntax to refer to an individual character in a string. C++
strings provide an alternative to the s[n] notation: the
at( ) member. These two idioms produce the same result in C++ if all
goes well:
//: C04:StringIndexing.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main(){ string s("1234"); cout << s[1] << " "; cout << s.at(1) << endl; } ///:~
The output from this code looks like
this:
2 2
However, there is one important
difference between [ ] and at( ). When you try to reference
an array element that is out of bounds, at( ) will do you the
kindness of throwing an exception, while ordinary [ ] subscripting syntax
will leave you to your own devices:
//: C04:BadStringIndexing.cpp //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main(){ string s("1234"); // Runtime problem: goes beyond array bounds: //! s[5]; // Saves you by throwing an exception: try { s.at(5); } catch(...) { cerr << "Went beyond array bounds!" << endl; } } ///:~
Using at( ) in place of [
] will give you a chance to gracefully recover from references to array
elements that don’t exist. at( ) throws an object of class
out_of_range. By catching this object in an exception handler, you can
take appropriate remedial actions such as recalculating the offending subscript
or growing the array. (You can read more about Exception Handling in Chapter
XX)
In the example program
NewFind.cpp, we used a lot of messy and rather tedious C char
array handling code to change the case of the characters in a string and then
search for the occurrence of matches to a substring. Sometimes the “quick
and dirty” method is justifiable, but in general, you won’t want to
sacrifice the advantages of having your string data safely and securely
encapsulated in the C++ object where it lives.
Here is a better, safer way to handle
case insensitive comparison of two C++ string objects. Because no data is copied
out of the objects and into C style strings, you don’t have to use
pointers and you don’t have to risk overwriting the bounds of an ordinary
character array. In this example, we use the string iterator. Iterators
are themselves objects which move through a collection or container of other
objects, selecting them one at a time, but never providing direct access to the
implementation of the container. Iterators are not pointers, but they are
useful for many of the same jobs.
//: C04:CmpIter.cpp // Find a group of characters in a string //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; // Case insensitive compare function: int stringCmpi(const string& s1, const string& s2) { // Select the first element of each string: string::const_iterator p1 = s1.begin(), p2 = s2.begin(); // Don’t run past the end: while(p1 != s1.end() && p2 != s2.end()) { // Compare upper-cased chars: if(toupper(*p1) != toupper(*p2)) // Report which was lexically greater: return (toupper(*p1)<toupper(*p2))? -1 : 1; p1++; p2++; } // If they match up to the detected eos, say // which was longer. Return 0 if the same. return(s2.size() - s1.size()); } int main() { string s1("Mozart"); string s2("Modigliani"); cout << stringCmpi(s1, s2) << endl; } ///:~
Notice that the iterators p1 and
p2 use the same syntax as C pointers – the ‘*’
operator makes the value of element at the location given by the
iterators available to the toupper( ) function. toupper( )
doesn’t actually change the content of the element in the string. In
fact, it can’t. This definition of p1 tells us that we can only use
the elements p1 points to as constants.
string::const_iterator p1 = s1.begin();
The way toupper( ) and the
iterators are used in this example is called a case preserving case
insensitive comparison. This means that the string didn’t have to be
copied or rewritten to accommodate case insensitive comparison. Both of the
strings retain their original data, unmodified.
Just as the standard C pointer gives us
the increment (++) and decrement (--) operators to make pointer arithmetic a bit
more convenient, C++ string iterators come in two basic varieties. You’ve
seen end( ) and begin( ), which are the tools for moving
forward through a string one element at a time. The reverse iterators
rend( ) and rbegin( ) allow you to step backwards
through a string. Here’s how they work:
//: C04:RevStr.cpp // Print a string in reverse //{L} ../TestSuite/Test #include <string> #include <iostream> using namespace std; int main() { string s("987654321"); // Use this iterator to walk backwards: string::reverse_iterator rev; // "Incrementing" the reverse iterator moves // it to successively lower string elements: for(rev = s.rbegin(); rev != s.rend(); rev++) cout << *rev << " "; } ///:~
The output from RevStr.cpp looks
like this:
1 2 3 4 5 6 7 8 9
Reverse iterators act like pointers to
elements of the string’s character array, except that when you apply
the increment operator to them, they move backward rather than forward.
rbegin( ) and rend( ) supply string locations that are
consistent with this behavior, to wit, rbegin( ) locates the
position just beyond the end of the string, and rend( ) locates the
beginning. Aside from this, the main thing to remember about reverse iterators
is that they aren’t type equivalent to ordinary iterators. For
example, if a member function parameter list includes an iterator as an
argument, you can’t substitute a reverse iterator to get the function to
perform it’s job walking backward through the string. Here’s an
illustration:
// The compiler won’t accept this string sBackwards(s.rbegin(), s.rend());
The string constructor won’t accept
reverse iterators in place of forward iterators in its parameter list. This is
also true of string members such as copy( ), insert( ),
and assign( ).
We seem to have worked our way around the
margins of case insensitive string comparisons using C++ string objects, so
maybe it’s time to ask the obvious question: “Why isn’t
case-insensitive comparison part of the standard string class ?”
The answer provides interesting background on the true nature of C++ string
objects.
Consider what it means for a character to
have “case.” Written Hebrew, Farsi, and Kanji don’t use the
concept of upper and lower case, so for those languages this idea has no meaning
at all. This the first impediment to built-in C++ support for case-insensitive
character search and comparison: the idea of case sensitivity is not universal,
and therefore not portable.
It would seem that if there were a way of
designating that some languages were “all uppercase” or “all
lowercase” we could design a generalized solution. However, some languages
which employ the concept of “case” also change the meaning of
particular characters with diacritical marks: the cedilla in Spanish, the
circumflex in French, and the umlaut in German. For this reason, any
case-sensitive collating scheme that attempts to be comprehensive will be
nightmarishly complex to use.
Although we usually treat the C++
string as a class, this is really not the case. string is a
typedef of a more general constituent, the
basic_string< > template. Observe how string is
declared in the standard C++ header file:
typedef basic_string<char> string;
To really understand the nature of
strings, it’s helpful to delve a bit deeper and look at the template on
which it is based. Here’s the declaration of the
basic_string< > template:
template<class charT, class traits = char_traits<charT>, class allocator = allocator<charT> > class basic_string;
Earlier in this book, templates were
examined in a great deal of detail. The main thing to notice about the two
declarations above are that the string type is created when the
basic_string template is instantiated with char. Inside the
basic_string< > template declaration, the line
class traits = char_traits<charT>,
tells us that the behavior of the class
made from the basic_string< > template is specified by
a class based on the template char_traits< >. Thus, the
basic_string< > template provides for cases where you
need string oriented classes that manipulate types other than char (wide
characters or unicode, for example). To do this, the
char_traits< > template controls the content and
collating behaviors of a variety of character sets using the character
comparison functions eq( ) (equal), ne( ) (not equal),
and lt( ) (less than) upon which the
basic_string< > string comparison functions
rely.
This is why the string class
doesn’t include case insensitive member functions: That’s not in its
job description. To change the way the string class treats character comparison,
you must supply a different char_traits< > template,
because that defines the behavior of the individual character comparison member
functions.
This information can be used to make a
new type of string class that ignores case. First, we’ll define a
new case insensitive char_traits< > template that
inherits the existing one. Next, we’ll override only the members we need
to change in order to make character-by-character comparison case insensitive.
(In addition to the three lexical character comparison members mentioned above,
we’ll also have to supply new implementation of find( ) and
compare( ).) Finally, we’ll typedef a new class based
on basic_string, but using the case insensitive ichar_traits
template for its second argument.
//: C04:ichar_traits.h // Creating your own character traits #ifndef ICHAR_TRAITS_H #define ICHAR_TRAITS_H #include <string> #include <cctype> struct ichar_traits : std::char_traits<char> { // We'll only change character by // character comparison functions static bool eq(char c1st, char c2nd) { return std::toupper(c1st) == std::toupper(c2nd); } static bool ne(char c1st, char c2nd) { return std::toupper(c1st) != std::toupper(c2nd); } static bool lt(char c1st, char c2nd) { return std::toupper(c1st) < std::toupper(c2nd); } static int compare(const char* str1, const char* str2, size_t n) { for(int i = 0; i < n; i++) { if(std::tolower(*str1)>std::tolower(*str2)) return 1; if(std::tolower(*str1)<std::tolower(*str2)) return -1; if(*str1 == 0 || *str2 == 0) return 0; str1++; str2++; // Compare the other chars } return 0; } static const char* find(const char* s1, int n, char c) { while(n-- > 0 && std::toupper(*s1) != std::toupper(c)) s1++; return s1; } }; #endif // ICHAR_TRAITS_H ///:~
If we typedef an istring
class like this:
typedef basic_string<char, ichar_traits, allocator<char> > istring;
Then this istring will act like an
ordinary string in every way, except that it will make all comparisons
without respect to case. Here’s an example:
[ Something’s wrong with this, so
it’s currently left out of the compile by removing the
‘:’]
// C04:ICompare.cpp //{L} ../TestSuite/Test #include "ichar_traits.h" #include <string> #include <iostream> using namespace std; typedef basic_string<char, ichar_traits, allocator<char> > istring; int main() { // The same letters except for case: istring first = "tHis"; istring second = "ThIS"; cout << first.compare(second) << endl; } ///:~
The output from the program is
“0”, indicating that the strings compare as equal. This is just a
simple example – in order to make istring fully equivalent to
string, we’d have to create the other functions necessary to
support the new istring
type.
My friend Daniel (who designed the cover
and page layout for this book) does a lot of work with Web pages. One tool he
uses creates a “site map” consisting of a Java applet to display the
map and an HTML tag that invoked the applet and provided it with the necessary
data to create the map. Daniel wanted to use this data to create an ordinary
HTML page (sans applet) that would contain regular links as the site map. The
resulting program turns out to be a nice practical application of the
string class, so it is presented here.
The input is an HTML file that contains
the usual stuff along with an applet tag with a parameter that begins like
this:
<param name="source_file" value="
The rest of the line contains encoded
information about the site map, all combined into a single line (it’s
rather long, but fortunately string objects don’t care). Each entry
may or may not begin with a number of ‘#’ signs; each of
these indicates one level of depth. If no ‘#’ sign is present
the entry will be considered to be at level one. After the
‘#’ is the text to be displayed on the page, followed by a
‘%’ and the URL to use as the link. Each entry is terminated
by a ‘*’. Thus, a single entry in the line might look like
this:
###|Useful Art%./Build/useful_art.html*
The ‘|’ at the
beginning is an artifact that needs to be removed.
My solution was to create an Item
class whose constructor would take input text and create an object that contains
the text to be displayed, the URL and the level. The objects essentially parse
themselves, and at that point you can read any value you want. In
main( ), the input file is opened and read until the line contains
the parameter that we’re interested in. Everything but the site map codes
are stripped away from this string, and then it is parsed into
Item objects:
//: C04:SiteMapConvert.cpp // Using strings to create a custom conversion // program that generates HTML output #include "../require.h" #include <iostream> #include <fstream> #include <string> #include <cstdlib> using namespace std; class Item { string id, url; int depth; string removeBar(string s) { if(s[0] == '|') return s.substr(1); else return s; } public: Item(string in, int& index) : depth(0) { while(in[index] == '#' && index < in.size()){ depth++; index++; } // 0 means no '#' marks were found: if(depth == 0) depth = 1; while(in[index] != '%' && index < in.size()) id += in[index++]; id = removeBar(id); index++; // Move past '%' while(in[index] != '*' && index < in.size()) url += in[index++]; url = removeBar(url); index++; // To move past '*' } string identifier() { return id; } string path() { return url; } int level() { return depth; } }; int main(int argc, char* argv[]) { requireArgs(argc, 1, "usage: SiteMapConvert inputfilename"); ifstream in(argv[1]); assure(in, argv[1]); ofstream out("plainmap.html"); string line; while(getline(in, line)) { if(line.find("<param name=\"source_file\"") != string::npos) { // Extract data from start of sequence // until the terminating quote mark: line = line.substr(line.find("value=\"") + string("value=\"").size()); line = line.substr(0, line.find_last_of("\"")); int index = 0; while(index < line.size()) { Item item(line, index); string startLevel, endLevel; if(item.level() == 1) { startLevel = "<h1>"; endLevel = "</h1>"; } else for(int i = 0; i < item.level(); i++) for(int j = 0; j < 5; j++) out << " "; string htmlLine = "<a href=\"" + item.path() + "\">" + item.identifier() + "</a><br>"; out << startLevel << htmlLine << endLevel << endl; } break; // Out of while loop } } } ///:~
Item contains a private member
function removeBar( ) that is used internally to strip off the
leading bars, if they appear.
The constructor for Item
initializes depth to 0 to indicate that no ‘#’ signs
were found yet; if none are found then it is assumed the Item should be
displayed at level one. Each character in the string is examined using
operator[ ] to find the depth, id and url
values. The other member functions simply return these values.
After opening the files,
main( ) uses string::find( ) to locate the line
containing the site map data. At this point, substr( ) is used in
order to strip off the information before and after the site map data. The
subsequent while loop performs the parsing, but notice that the value
index is passed by reference into the Item constructor, and
that constructor increments index as it parses each new Item, thus
moving forward in the sequence.
If an Item is at level one, then
an HTML h1 tag is used, otherwise the elements are indented using HTML
non-breaking spaces. Note in the initialization of htmlLine how easy it
is to construct a string – you can just combine quoted character
arrays and other string objects using operator+.
When the output is written to the
destination file, startLevel and endLevel will only produce
results if they have been given any value other than their default
initialization values.
C++ string objects provide developers
with a number of great advantages over their C counterparts. For the most part,
the string class makes referring to strings through the use of character
pointers unnecessary. This eliminates an entire class of software defects that
arise from the use of uninitialized and incorrectly valued pointers. C++ strings
dynamically and transparently grow their internal data storage space to
accommodate increases in the size of the string data. This means that when the
data in a string grows beyond the limits of the memory initially allocated to
it, the string object will make the memory management calls that take space from
and return space to the heap. Consistent allocation schemes prevent memory leaks
and have the potential to be much more efficient than “roll your
own” memory management.
The string class member functions
provide a fairly comprehensive set of tools for creating, modifying, and
searching in strings. string comparisons are always case sensitive, but
you can work around this by copying string data to C style null terminated
strings and using case insensitive string comparison functions, temporarily
converting the data held in sting objects to a single case, or by creating a
case insensitive string class which overrides the character traits used to
create the basic_string
object.
[8]
Much of the material in this chapter was originally created by Nancy
Nicolaisen
[9]
I subsequently found better tools to accomplish this task, but the program is
still interesting.