The other half of the STL is the
algorithms, which are templatized functions designed to work with the containers
(or, as you will see, anything that can behave like a container, including
arrays and string objects).
The STL was originally designed around
the algorithms. The goal was that you use algorithms for almost every piece of
code that you write. In this sense it was a bit of an experiment, and only time
will tell how well it works. The real test will be in how easy or difficult it
is for the average programmer to adapt. At the end of this chapter you’ll
be able to decide for yourself whether you find the algorithms addictive or too
confusing to remember. If you’re like me, you’ll resist them at
first but then tend to use them more and more.
Before you make your judgment, however,
there’s one other thing to consider. The STL algorithms provide a
vocabulary with which to describe solutions. That is, once you become
familiar with the algorithms you’ll have a new set of words with which to
discuss what you’re doing, and these words are at a higher level than what
you’ve had before. You don’t have to say “this loop moves
through and assigns from here to there ... oh, I see, it’s copying!”
Instead, you say copy( ). This is the kind of thing we’ve been
doing in computer programming from the beginning – creating more dense
ways to express what we’re doing and spending less time saying
how we’re doing it. Whether the STL algorithms and generic
programming are a great success in accomplishing this remains to be seen,
but that is certainly the
objective.
A concept that is used heavily in the STL
algorithms is the function object, which was introduced in the previous
chapter. A function object has an overloaded operator( ), and the
result is that a template function can’t tell whether you’ve handed
it a pointer to a function or an object that has an operator( ); all
the template function knows is that it can attach an argument list to the object
as if it were a pointer to a function:
//: C08:FuncObject.cpp // Simple function objects //{L} ../TestSuite/Test #include <iostream> using namespace std; template<class UnaryFunc, class T> void callFunc(T& x, UnaryFunc f) { f(x); } void g(int& x) { x = 47; } struct UFunc { void operator()(int& x) { x = 48; } }; int main() { int y = 0; callFunc(y, g); cout << y << endl; y = 0; callFunc(y, UFunc()); cout << y << endl; } ///:~
The template callFunc( ) says
“give me an f and an x, and I’ll write the code
f(x).” In main( ), you can see that it doesn’t
matter if f is a pointer to a function (as in the case of
g( )), or if it’s a function object (which is created as a
temporary object by the expression UFunc( )). Notice you can only
accomplish this genericity with a template function; a non-template function is
too particular about its argument types to allow such a thing. The STL
algorithms use this flexibility to take either a function pointer or a function
object, but you’ll usually find that creating a function object is more
powerful and flexible.
The function object is actually a
variation on the theme of a callback, which is described in the design
patterns chapter. A callback allows you to vary the behavior of a function or
object by passing, as an argument, a way to execute some other piece of code.
Here, we are handing callFunc( ) a pointer to a function or a
function object.
The following descriptions of function
objects should not only make that topic clear, but also give you an introduction
to the way the STL algorithms work.
Just as the STL classifies iterators
(based on their capabilities), it also classifies function objects based on the
number of arguments that their operator( ) takes and the kind of
value returned by that operator (of course, this is also true for function
pointers when you treat them as function objects). The classification of
function objects in the STL is based on whether the operator( )
takes zero, one or two arguments, and if it returns a bool or
non-bool value.
Generator: Takes no arguments, and
returns a value of the desired type. A RandomNumberGenerator is a special
case.
UnaryFunction: Takes a single
argument of any type and returns a value which may be of a different
type.
BinaryFunction: Takes two
arguments of any two types and returns a value of any type.
A special case of the unary and binary
functions is the predicate, which simply means a function that returns a
bool. A predicate is a function you use to make a
true/false decision.
Predicate: This can also be called
a UnaryPredicate. It takes a single argument of any type and returns a
bool.
BinaryPredicate: Takes two
arguments of any two types and returns a bool.
StrictWeakOrdering: A binary
predicate that says that if you have two objects and neither one is less than
the other, they can be regarded as equivalent to each other.
In addition, there are sometimes
qualifications on object types that are passed to an algorithm. These
qualifications are given in the template argument type identifier
name:
LessThanComparable: A class that
has a less-than operator<.
Assignable: A class that has an
assignment operator= for its own type.
The STL has, in the header file
<functional>, a set of templates that will automatically create
function objects for you. These generated function objects are admittedly
simple, but the goal is to provide very basic functionality that will allow you
to compose more complicated function objects, and in many situations this is all
you’ll need. Also, you’ll see that there are some function object
adapters that allow you to take the simple function objects and make them
slightly more complicated.
Here are the templates that generate
function objects, along with the expressions that they effect.
Name |
Type |
Result produced by generated function
object |
---|---|---|
plus |
BinaryFunction |
arg1 + arg2 |
minus |
BinaryFunction |
arg1 - arg2 |
multiplies |
BinaryFunction |
arg1 * arg2 |
divides |
BinaryFunction |
arg1 / arg2 |
modulus |
BinaryFunction |
arg1 % arg2 |
negate |
UnaryFunction |
- arg1 |
equal_to |
BinaryPredicate |
arg1 == arg2 |
not_equal_to |
BinaryPredicate |
arg1 != arg2 |
greater |
BinaryPredicate |
arg1 > arg2 |
less |
BinaryPredicate |
arg1 < arg2 |
greater_equal |
BinaryPredicate |
arg1 >= arg2 |
less_equal |
BinaryPredicate |
arg1 <= arg2 |
logical_and |
BinaryPredicate |
arg1 && arg2 |
logical_or |
BinaryPredicate |
arg1 || arg2 |
logical_not |
UnaryPredicate |
!arg1 |
not1( ) |
Unary Logical |
!(UnaryPredicate(arg1)) |
not2( ) |
Binary Logical |
!(BinaryPredicate(arg1,
arg2)) |
The following example provides simple
tests for each of the built-in basic function object templates. This way, you
can see how to use each one, along with their resulting
behavior.
//: C08:FunctionObjects.cpp // Using the predefined function object templates // in the Standard C++ library //{L} ../TestSuite/Test // This will be defined shortly: #include "Generators.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; template<typename T> void print(vector<T>& v, char* msg = "") { if(*msg != 0) cout << msg << ":" << endl; copy(v.begin(), v.end(), ostream_iterator<T>(cout, " ")); cout << endl; } template<typename Contain, typename UnaryFunc> void testUnary(Contain& source, Contain& dest, UnaryFunc f) { transform(source.begin(), source.end(), dest.begin(), f); } template<typename Contain1, typename Contain2, typename BinaryFunc> void testBinary(Contain1& src1, Contain1& src2, Contain2& dest, BinaryFunc f) { transform(src1.begin(), src1.end(), src2.begin(), dest.begin(), f); } // Executes the expression, then stringizes the // expression into the print statement: #define T(EXPR) EXPR; print(r, "After " #EXPR); // For Boolean tests: #define B(EXPR) EXPR; print(br,"After " #EXPR); // Boolean random generator: struct BRand { BRand() { srand(time(0)); } bool operator()() { return rand() > RAND_MAX / 2; } }; int main() { const int sz = 10; const int max = 50; vector<int> x(sz), y(sz), r(sz); // An integer random number generator: URandGen urg(max); generate_n(x.begin(), sz, urg); generate_n(y.begin(), sz, urg); // Add one to each to guarantee nonzero divide: transform(y.begin(), y.end(), y.begin(), bind2nd(plus<int>(), 1)); // Guarantee one pair of elements is ==: x[0] = y[0]; print(x, "x"); print(y, "y"); // Operate on each element pair of x & y, // putting the result into r: T(testBinary(x, y, r, plus<int>())); T(testBinary(x, y, r, minus<int>())); T(testBinary(x, y, r, multiplies<int>())); T(testBinary(x, y, r, divides<int>())); T(testBinary(x, y, r, modulus<int>())); T(testUnary(x, r, negate<int>())); vector<bool> br(sz); // For Boolean results B(testBinary(x, y, br, equal_to<int>())); B(testBinary(x, y, br, not_equal_to<int>())); B(testBinary(x, y, br, greater<int>())); B(testBinary(x, y, br, less<int>())); B(testBinary(x, y, br, greater_equal<int>())); B(testBinary(x, y, br, less_equal<int>())); B(testBinary(x, y, br, not2(greater_equal<int>()))); B(testBinary(x,y,br,not2(less_equal<int>()))); vector<bool> b1(sz), b2(sz); generate_n(b1.begin(), sz, BRand()); generate_n(b2.begin(), sz, BRand()); print(b1, "b1"); print(b2, "b2"); B(testBinary(b1, b2, br, logical_and<int>())); B(testBinary(b1, b2, br, logical_or<int>())); B(testUnary(b1, br, logical_not<int>())); B(testUnary(b1, br, not1(logical_not<int>()))); } ///:~
To keep this example small, some tools
are created. The print( ) template is designed to print any
vector<T>, along with an optional message. Since
print( ) uses the STL copy( ) algorithm to send objects
to cout via an ostream_iterator, the ostream_iterator must
know the type of object it is printing, and therefore the print( )
template must know this type also. However, you’ll see in
main( ) that the compiler can deduce the type of T when you
hand it a vector<T>, so you don’t have to hand it the
template argument explicitly; you just say print(x) to print the
vector<T> x.
The next two template functions automate
the process of testing the various function object templates. There are two
since the function objects are either unary or binary. In
testUnary( ), you pass a source and destination vector, and a unary
function object to apply to the source vector to produce the destination vector.
In testBinary( ), there are two source vectors which are fed to a
binary function to produce the destination vector. In both cases, the template
functions simply turn around and call the transform( ) algorithm,
although the tests could certainly be more complex.
For each test, you want to see a string
describing what the test is, followed by the results of the test. To automate
this, the preprocessor comes in handy; the T( ) and B( )
macros each take the expression you want to execute. They call that expression,
then call print( ), passing it the result vector (they assume the
expression changes a vector named r and br, respectively), and to
produce the message the expression is “string-ized” using the
preprocessor. So that way you see the code of the expression that is executed
followed by the result vector.
The last little tool is a generator
object that creates random bool values. To do this, it gets a random
number from rand( ) and tests to see if it’s greater than
RAND_MAX/2. If the random numbers are evenly distributed, this should
happen half the time.
In main( ), three
vector<int> are created: x and y for source values,
and r for results. To initialize x and y with random values
no greater than 50, a generator of type URandGen is used; this will be
defined shortly. Since there is one operation where elements of x are
divided by elements of y, we must ensure that there are no zero values of
y. This is accomplished using the transform( ) algorithm,
taking the source values from y and putting the results back into
y. The function object for this is created with the
expression:
bind2nd(plus<int>(), 1)
This uses the plus function object
that adds two objects together. It is thus a binary function which requires two
arguments; we only want to pass it one argument (the element from y) and
have the other argument be the value 1. A “binder” does the trick
(we will look at these next). The binder in this case says “make a new
function object which is the plus function object with the second
argument fixed at 1.”
Another of the tests in the program
compares the elements in the two vectors for equality, so it is interesting to
guarantee that at least one pair of elements is equivalent; in this case element
zero is chosen.
Once the two vectors are printed,
T( ) is used to test each of the function objects that produces a
numerical value, and then B( ) is used to test each function object
that produces a Boolean result. The result is placed into a
vector<bool>, and when this vector is printed it produces a
‘1’ for a true value and a ‘0’ for a false
value.
It’s common to want to take a
binary function object and to “bind” one of its arguments to a
constant value. After binding, you get a unary function object.
For example, suppose you want to find
integers that are less than a particular value, say 20. Sensibly enough, the STL
algorithms have a function called find_if( ) that will search
through a sequence; however, find_if( ) requires a unary predicate
to tell it if this is what you’re looking for. This unary predicate can of
course be some function object that you have written by hand, but it can also be
created using the built-in function object templates. In this case, the
less template will work, but that produces a binary predicate, so we need
some way of forming a unary predicate. The binder templates (which work with any
binary function object, not just binary predicates) give you two
choices:
bind1st(const BinaryFunction& op, const T&
t);
bind2nd(const BinaryFunction& op, const T&
t);
Both bind t to one of the
arguments of op, but bind1st( ) binds t to the first
argument, and bind2nd( ) binds t to the second argument. With
less, the function object that provides the solution to our exercise
is:
bind2nd(less<int>(), 20);
This produces a new function object that
returns true if its argument is less than 20. Here it is, used with
find_if( ):
//: C08:Binder1.cpp // Using STL "binders" //{L} ../TestSuite/Test //{-g++3} #include "Generators.h" #include "copy_if.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; int main() { const int sz = 10; const int max = 40; vector<int> a(sz), r; URandGen urg(max); ostream_iterator<int> out(cout, " "); generate_n(a.begin(), sz, urg); copy(a.begin(), a.end(), out); int* d = find_if(a.begin(), a.end(), bind2nd(less<int>(), 20)); cout << "\n *d = " << *d << endl; // copy_if() is not in the Standard C++ library // but is defined later in the chapter: copy_if(a.begin(), a.end(), back_inserter(r), bind2nd(less<int>(), 20)); copy(r.begin(), r.end(), out); cout << endl; } ///:~
The vector<int> a is
filled with random numbers between 0 and max. find_if( )
finds the first element in a that satisfies the predicate (that is, which
is less than 20) and returns an iterator to it (here, the type of the iterator
is actually just int* although I could have been more precise and said
vector<int>::iterator instead).
A more interesting algorithm to use is
copy_if( ), which isn’t part of the STL but is defined at the
end of this chapter. This algorithm only copies an element from the source to
the destination if that element satisfies a predicate. So the resulting vector
will only contain elements that are less than 20.
Here’s a second example, using a
vector<string> and replacing strings that satisfy particular
conditions:
//: C08:Binder2.cpp // More binders //{L} ../TestSuite/Test #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> using namespace std; int main() { ostream_iterator<string> out(cout, " "); vector<string> v, r; v.push_back("Hi"); v.push_back("Hi"); v.push_back("Hey"); v.push_back("Hee"); v.push_back("Hi"); copy(v.begin(), v.end(), out); cout << endl; // Replace each "Hi" with "Ho": replace_copy_if(v.begin(), v.end(), back_inserter(r), bind2nd(equal_to<string>(), "Hi"), "Ho"); copy(r.begin(), r.end(), out); cout << endl; // Replace anything that's not "Hi" with "Ho": replace_if(v.begin(), v.end(), not1(bind2nd(equal_to<string>(),"Hi")),"Ho"); copy(v.begin(), v.end(), out); cout << endl; } ///:~
This uses another pair of STL algorithms.
The first, replace_copy_if( ), copies each element from a source
range to a destination range, performing replacements on those that satisfy a
particular unary predicate. The second, replace_if( ), doesn’t
do any copying but instead performs the replacements directly into the original
range.
A binder doesn’t have to produce a
unary predicate; it can also create a unary function (that is, a function that
returns something other than bool). For example, suppose you’d like
to multiply every element in a vector by 10. Using a binder with the
transform( ) algorithm does the trick:
//: C08:Binder3.cpp // Binders aren't limited to producing predicates //{L} ../TestSuite/Test #include "Generators.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; int main() { ostream_iterator<int> out(cout, " "); vector<int> v(15); generate(v.begin(), v.end(), URandGen(20)); copy(v.begin(), v.end(), out); cout << endl; transform(v.begin(), v.end(), v.begin(), bind2nd(multiplies<int>(), 10)); copy(v.begin(), v.end(), out); cout << endl; } ///:~
Since the third argument to
transform( ) is the same as the first, the resulting elements are
copied back into the source vector. The function object created by
bind2nd( ) in this case produces an int
result.
The “bound” argument to a
binder cannot be a function object, but it does not have to be a compile-time
constant. For example:
//: C08:Binder4.cpp // The bound argument does not have // to be a compile-time constant //{L} ../TestSuite/Test //{-g++295} #include "copy_if.h" #include "PrintSequence.h" #include "../require.h" #include <iostream> #include <algorithm> #include <functional> #include <cstdlib> using namespace std; int boundedRand() { return rand() % 100; } int main() { const int sz = 20; int a[20], b[20] = {0}; generate(a, a + sz, boundedRand); int val = boundedRand(); int* end = copy_if(a, a + sz, b, bind2nd(greater<int>(), val)); // Sort for easier viewing: sort(a, a + sz); sort(b, end); print(a, a + sz, "array a", " "); print(b, end, "values greater than yours"," "); } ///:~
Here, an array is filled with random
numbers between 0 and 100, and the user provides a value on the command line. In
the copy_if( ) call, you can see that the bound argument to
bind2nd( ) is the result of the function call atoi( )
(from <cstdlib>).
Any place in an STL algorithm where a
function object is required, it’s very conceivable that you’d like
to use a function pointer instead. Actually, you can use an ordinary
function pointer – that’s how the STL was designed, so that a
“function object” can actually be anything that can be dereferenced
using an argument list. For example, the rand( ) random number
generator can be passed to generate( ) or generate_n( )
as a function pointer, like this:
//: C08:RandGenTest.cpp // A little test of the random number generator //{L} ../TestSuite/Test //{-msc} #include <algorithm> #include <vector> #include <iostream> #include <functional> #include <cstdlib> #include <ctime> using namespace std; int main() { const int sz = 10000; int v[sz]; srand(time(0)); // Seed the random generator for(int i = 0; i < 300; i++) { // Using a naked pointer to function: generate(v, v + sz, std::rand); int count = count_if(v, v + sz, bind2nd(greater<int>(), RAND_MAX/2)); cout << (((double)count)/((double)sz)) * 100 << ' '; } } ///:~
The “iterators” in this case
are just the starting and past-the-end pointers for the array v, and the
generator is just a pointer to the standard library rand( )
function. The program repeatedly generates a group of random numbers, then it
uses the STL algorithm count_if( ) and a predicate that tells
whether a particular element is greater than RAND_MAX/2. The result is
the number of elements that match this criterion; this is divided by the total
number of elements and multiplied by 100 to produce the percentage of elements
greater than the midpoint. If the random number generator is reasonable, this
value should hover at around 50% (of course, there are many other tests to
determine if the random number generator is reasonable).
The ptr_fun( ) adapters take
a pointer to a function and turn it into a function object. They are not
designed for a function that takes no arguments, like the one above (that is, a
generator). Instead, they are for unary functions and binary functions. However,
these could also be simply passed as if they were function objects, so the
ptr_fun( ) adapters might at first appear to be redundant.
Here’s an example where using ptr_fun( ) and simply passing
the address of the function both produce the same results:
//: C08:PtrFun1.cpp // Using ptr_fun() for single-argument functions //{L} ../TestSuite/Test //{-bor} #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; char* n[] = { "01.23", "91.370", "56.661", "023.230", "19.959", "1.0", "3.14159" }; const int nsz = sizeof n / sizeof *n; template<typename InputIter> void print(InputIter first, InputIter last) { while(first != last) cout << *first++ << "\t"; cout << endl; } int main() { print(n, n + nsz); vector<double> vd; transform(n, n + nsz, back_inserter(vd), atof); print(vd.begin(), vd.end()); transform(n,n + nsz,vd.begin(), ptr_fun(atof)); print(vd.begin(), vd.end()); } ///:~
The goal of this program is to convert an
array of char* which are ASCII representations of floating-point numbers
into a vector<double>. After defining this array and the
print( ) template (which encapsulates the act of printing a range of
elements), you can see transform( ) used with atof( ) as
a “naked” pointer to a function, and then a second time with
atof passed to ptr_fun( ). The results are the same. So why
bother with ptr_fun( )? Well, the actual effect of
ptr_fun( ) is to create a function object with an
operator( ). This function object can then be passed to other
template adapters, such as binders, to create new function objects. As
you’ll see a bit later, the SGI extensions to the STL contain a number of
other function templates to enable this, but in the Standard C++ STL there are
only the bind1st( ) and bind2nd( ) function templates,
and these expect binary function objects as their first arguments. In the above
example, only the ptr_fun( ) for a unary function is used, and that
doesn’t work with the binders. So ptr_fun( ) used with a unary
function in Standard C++ really is redundant (note that Gnu g++ uses the SGI
STL).
With a binary function and a binder,
things can be a little more interesting. This program produces the squares of
the input vector d:
//: C08:PtrFun2.cpp // Using ptr_fun() for two-argument functions //{L} ../TestSuite/Test //{-bor} //{-g++3} #include <algorithm> #include <vector> #include <iostream> #include <functional> #include <cmath> using namespace std; double d[] = { 01.23, 91.370, 56.661, 023.230, 19.959, 1.0, 3.14159 }; const int dsz = sizeof d / sizeof *d; int main() { vector<double> vd; transform(d, d + dsz, back_inserter(vd), bind2nd(ptr_fun(pow), 2.0)); copy(vd.begin(), vd.end(), ostream_iterator<double>(cout, " ")); cout << endl; } ///:~
Here, ptr_fun( ) is
indispensable; bind2nd( ) must have a function object as its
first argument and a pointer to function won’t cut it.
A trickier problem is that of converting
a member function into a function object suitable for using in the STL
algorithms. As a simple example, suppose we have the “shape” problem
and would like to apply the draw( ) member function to each pointer
in a container of Shape:
//: C08:MemFun1.cpp // Applying pointers to member functions //{L} ../TestSuite/Test //{-msc} #include "../purge.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; class Shape { public: virtual void draw() = 0; virtual ~Shape() {} }; class Circle : public Shape { public: virtual void draw() { cout << "Circle::Draw()" << endl; } ~Circle() { cout << "Circle::~Circle()" << endl; } }; class Square : public Shape { public: virtual void draw() { cout << "Square::Draw()" << endl; } ~Square() { cout << "Square::~Square()" << endl; } }; int main() { vector<Shape*> vs; vs.push_back(new Circle); vs.push_back(new Square); for_each(vs.begin(), vs.end(), mem_fun(&Shape::draw)); purge(vs); } ///:~
The for_each( ) function does
just what it sounds like it does: passes each element in the range determined by
the first two (iterator) arguments to the function object which is its third
argument. In this case we want the function object to be created from one of the
member functions of the class itself, and so the function object’s
“argument” becomes the pointer to the object that the member
function is called for. To produce such a function object, the
mem_fun( ) template takes a pointer to member as its
argument.
The mem_fun( ) functions are
for producing function objects that are called using a pointer to the object
that the member function is called for, while mem_fun_ref( ) is used
for calling the member function directly for an object. One set of overloads of
both mem_fun( ) and mem_fun_ref( ) are for member
functions that take zero arguments and one argument, and this is multiplied by
two to handle const vs. non-const member functions. However,
templates and overloading takes care of sorting all of that out; all you need to
remember is when to use mem_fun( ) vs.
mem_fun_ref( ).
Suppose you have a container of objects
(not pointers) and you want to call a member function that takes an argument.
The argument you pass should come from a second container of objects. To
accomplish this, the second overloaded form of the transform( )
algorithm is used:
//: C08:MemFun2.cpp // Applying pointers to member functions //{L} ../TestSuite/Test //{-msc} #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; class Angle { int degrees; public: Angle(int deg) : degrees(deg) {} int mul(int times) { return degrees *= times; } }; int main() { vector<Angle> va; for(int i = 0; i < 50; i += 10) va.push_back(Angle(i)); int x[] = { 1, 2, 3, 4, 5 }; transform(va.begin(), va.end(), x, ostream_iterator<int>(cout, " "), mem_fun_ref(&Angle::mul)); cout << endl; } ///:~
Because the container is holding objects,
mem_fun_ref( ) must be used with the pointer-to-member function.
This version of transform( ) takes the start and end point of the
first range (where the objects live), the starting point of second range which
holds the arguments to the member function, the destination iterator which in
this case is standard output, and the function object to call for each object;
this function object is created with mem_fun_ref( ) and the desired
pointer to member. Notice the transform( ) and
for_each( ) template functions are incomplete;
transform( ) requires that the function it calls return a value and
there is no for_each( ) that passes two arguments to the function it
calls. Thus, you cannot call a member function that returns void and
takes an argument using transform( ) or
for_each( ).
Any member function works, including
those in the Standard libraries. For example, suppose you’d like to read a
file and search for blank lines; you can use the string::empty( )
member function like this:
//: C08:FindBlanks.cpp // Demonstrate mem_fun_ref() with string::empty() //{L} ../TestSuite/Test // Probably a bug in this program: //{-msc} //{-bor} dumps core //{-g++295} dumps core //{-g++3} dumps core #include "../require.h" #include <algorithm> #include <list> #include <string> #include <fstream> #include <functional> using namespace std; typedef list<string>::iterator LSI; LSI blank(LSI begin, LSI end) { return find_if(begin, end, mem_fun_ref(&string::empty)); } int main(int argc, char* argv[]) { char* fname = "FindBlanks.cpp"; if(argc > 1) fname = argv[1]; ifstream in(fname); assure(in, fname); list<string> ls; string s; while(getline(in, s)) ls.push_back(s); LSI lsi = blank(ls.begin(), ls.end()); while(lsi != ls.end()) { *lsi = "A BLANK LINE"; lsi = blank(lsi, ls.end()); } string f(argv[1]); f += ".out"; ofstream out(f.c_str()); copy(ls.begin(), ls.end(), ostream_iterator<string>(out, "\n")); } ///:~
The blank( ) function uses
find_if( ) to locate the first blank line in the given range using
mem_fun_ref( ) with string::empty( ). After the file is
opened and read into the list, blank( ) is called repeated
times to find every blank line in the file. Notice that subsequent calls to
blank( ) use the current version of the iterator so it moves forward
to the next one. Each time a blank line is found, it is replaced with the
characters “A BLANK LINE.” All you have to do to accomplish this is
dereference the iterator, and you select the current
string.
The SGI STL (mentioned at the end of the
previous chapter) also includes additional function object templates, which
allow you to write expressions that create even more complicated function
objects. Consider a more involved program which converts strings of digits into
floating point numbers, like PtrFun2.cpp but more general. First,
here’s a generator that creates strings of integers that represent
floating-point values (including an embedded decimal point):
//: C08:NumStringGen.h // A random number generator that produces // strings representing floating-point numbers #ifndef NUMSTRINGGEN_H #define NUMSTRINGGEN_H #include <string> #include <cstdlib> #include <ctime> class NumStringGen { const int sz; // Number of digits to make public: NumStringGen(int ssz = 5) : sz(ssz) { std::srand(std::time(0)); } std::string operator()() { static char n[] = "0123456789"; const int nsz = 10; std::string r(sz, ' '); for(int i = 0; i < sz; i++) if(i == sz/2) r[i] = '.'; // Insert a decimal point else r[i] = n[std::rand() % nsz]; return r; } }; #endif // NUMSTRINGGEN_H ///:~
You tell it how big the strings
should be when you create the NumStringGen object. The random number
generator is used to select digits, and a decimal point is placed in the
middle.
The following program (which works with
the Standard C++ STL without the SGI extensions) uses NumStringGen to
fill a vector<string>. However, to use the Standard C library
function atof( ) to convert the strings to floating-point numbers,
the string objects must first be turned into char pointers, since
there is no automatic type conversion from string to char*. The
transform( ) algorithm can be used with mem_fun_ref( )
and string::c_str( ) to convert all the strings to
char*, and then these can be transformed using
atof:
//: C08:MemFun3.cpp // Using mem_fun() //{L} ../TestSuite/Test //{-msc} #include "NumStringGen.h" #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> using namespace std; int main() { const int sz = 9; vector<string> vs(sz); // Fill it with random number strings: generate(vs.begin(), vs.end(), NumStringGen()); copy(vs.begin(), vs.end(), ostream_iterator<string>(cout, "\t")); cout << endl; const char* vcp[sz]; transform(vs.begin(), vs.end(), vcp, mem_fun_ref(&string::c_str)); vector<double> vd; transform(vcp,vcp + sz,back_inserter(vd), std::atof); copy(vd.begin(), vd.end(), ostream_iterator<double>(cout, "\t")); cout << endl; } ///:~
The SGI extensions to the STL contain a
number of additional function object templates that accomplish more detailed
activities than the Standard C++ function object templates, including
identity (returns its argument unchanged), project1st and
project2nd (to take two arguments and return the first or second one,
respectively), select1st and select2nd (to take a pair
object and return the first or second element, respectively), and the
“compose” function templates.
If you’re using the SGI extensions,
you can make the above program denser using one of the two “compose”
function templates. The first, compose1(f1, f2), takes the two
function objects f1 and f2 as its arguments. It produces a
function object that takes a single argument, passes it to f2, then takes
the result of the call to f2 and passes it to f1. The result of
f1 is returned. By using compose1( ), the process of
converting the string objects to char*, then converting the
char* to a floating-point number can be combined into a single operation,
like this:
//: C08:MemFun4.cpp // Using the SGI STL compose1 function //{L} ../TestSuite/Test //{-bor} Can add the header by hand //{-msc} Can add the header by hand #include "NumStringGen.h" #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> using namespace std; int main() { const int sz = 9; vector<string> vs(sz); // Fill it with random number strings: generate(vs.begin(), vs.end(), NumStringGen()); copy(vs.begin(), vs.end(), ostream_iterator<string>(cout, "\t")); cout << endl; vector<double> vd; transform(vs.begin(), vs.end(), back_inserter(vd), compose1(ptr_fun(atof), mem_fun_ref(&string::c_str))); copy(vd.begin(), vd.end(), ostream_iterator<double>(cout, "\t")); cout << endl; } ///:~
You can see there’s only a single
call to transform( ) now, and no intermediate holder for the
char pointers.
The second “compose” function
is compose2( ), which takes three function objects as its arguments.
The first function object is binary (it takes two arguments), and its arguments
are the results of the second and third function objects, respectively. The
function object that results from compose2( ) expects one argument,
and it feeds that argument to the second and third function objects. Here is an
example:
//: C08:Compose2.cpp // Using the SGI STL compose2() function //{L} ../TestSuite/Test //{-bor} Can add the header by hand //{-msc} Can add the header by hand #include "copy_if.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> #include <cstdlib> #include <ctime> using namespace std; int main() { srand(time(0)); vector<int> v(100); generate(v.begin(), v.end(), rand); transform(v.begin(), v.end(), v.begin(), bind2nd(divides<int>(), RAND_MAX/100)); vector<int> r; copy_if(v.begin(), v.end(), back_inserter(r), compose2(logical_and<bool>(), bind2nd(greater_equal<int>(), 30), bind2nd(less_equal<int>(), 40))); sort(r.begin(), r.end()); copy(r.begin(), r.end(), ostream_iterator<int>(cout, " ")); cout << endl; } ///:~
The vector<int> v is first
filled with random numbers. To cut these down to size, the
transform( ) algorithm is used to divide each value by
RAND_MAX/100, which will force the values to be between 0 and 100 (making
them more readable). The copy_if( ) algorithm defined later in this
chapter is then used, along with a composed function object, to copy all the
elements that are greater than or equal to 30 and less than or equal to 40 into
the destination vector<int> r. Just to show how easy it is,
r is sorted, and then displayed.
The arguments of compose2( )
say, in effect:
(x >= 30) && (x <= 40)
You could also take the function object
that comes from a compose1( ) or compose2( ) call and
pass it into another “compose” expression ... but this could rapidly
get very difficult to decipher.
Instead of all this composing and
transforming, you can write your own function objects (without using the
SGI extensions) as follows:
//: C08:NoCompose.cpp // Writing out the function objects explicitly //{L} ../TestSuite/Test #include "copy_if.h" #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> #include <cstdlib> #include <ctime> using namespace std; class Rgen { const int max; public: Rgen(int mx = 100) : max(RAND_MAX/mx) { srand(time(0)); } int operator()() { return rand() / max; } }; class BoundTest { int top, bottom; public: BoundTest(int b, int t) : bottom(b), top(t) {} bool operator()(int arg) { return (arg >= bottom) && (arg <= top); } }; int main() { vector<int> v(100); generate(v.begin(), v.end(), Rgen()); vector<int> r; copy_if(v.begin(), v.end(), back_inserter(r), BoundTest(30, 40)); sort(r.begin(), r.end()); copy(r.begin(), r.end(), ostream_iterator<int>(cout, " ")); cout << endl; } ///:~
There are a few more lines of code, but
you can’t deny that it’s much clearer and easier to understand, and
therefore to maintain.
We can thus observe two drawbacks to the
SGI extensions to the STL. The first is simply that it’s an extension;
yes, you can download and use them for free so the barriers to entry are low,
but your company may be conservative and decide that if it’s not in
Standard C++, they don’t want to use it. The second drawback is
complexity. Once you get familiar and comfortable with the idea of composing
complicated functions from simple ones you can visually parse complicated
expressions and figure out what they mean. However, my guess is that most people
will find anything more than what you can do with the Standard, non-extended STL
function object notation to be overwhelming. At some point on the complexity
curve you have to bite the bullet and write a regular class to produce your
function object, and that point might as well be the point where you can’t
use the Standard C++ STL. A stand-alone class for a function object is going to
be much more readable and maintainable than a complicated function-composition
expression (although my sense of adventure does lure me into wanting to
experiment more with the SGI extensions...).
As a final note, you can’t compose
generators; you can only create function objects whose operator( )
requires one or two arguments.
This section provides a quick reference
for when you’re searching for the appropriate algorithm. I leave the full
exploration of all the STL algorithms to other references (see the end of this
chapter, and Appendix XX), along with the more intimate details of complexity,
performance, etc. My goal here is for you to become rapidly comfortable and
facile with the algorithms, and I will assume you will look into the more
specialized references if you need more depth of detail.
Although you will often see the
algorithms described using their full template declaration syntax, I am not
doing that here because you already know they are templates, and it’s
quite easy to see what the template arguments are from the function
declarations. The type names for the arguments provide descriptions for the
types of iterators required. I think you’ll find this form is easier to
read, while you can quickly find the full declaration in the template header
file if for some reason you feel the need.
The names of the iterator classes
describe the iterator type they must conform to. The iterator types were
described in the previous chapter, but here is a summary:
InputIterator. You (or
rather, the STL algorithm and any algorithms you write that use
InputIterators) can increment this with operator++ and
dereference it with operator* to read the value (and only
read the value), but you can only read each value once. InputIterators
can be tested with operator== and operator!=. That’s
all. Because an InputIterator is so limited, it can be used with
istreams (via istream_iterator).
OutputIterator. This can be
incremented with operator++, and dereferenced with operator* to
write the value (and only write the value), but you can only
dereference/write each value once. OutputIterators cannot be tested with
operator== and operator!=, however, because you assume that you
can just keep sending elements to the destination and that you don’t have
to see if the destination’s end marker has been reached. That is, the
container that an OutputIterator references can take an infinite number
of objects, so no end-checking is necessary. This requirement is important so
that an OutputIterator can be used with ostreams (via
ostream_iterator), but you’ll also commonly use the
“insert” iterators insert_iterator,
front_insert_iterator and back_insert_iterator (generated by the
helper templates inserter( ), front_inserter( ) and
back_inserter( )).
With both InputIterator and
OutputIterator, you cannot have multiple iterators pointing to different
parts of the same range. Just think in terms of iterators to support
istreams and ostreams, and InputIterator and
OutputIterator will make perfect sense. Also note that
InputIterator and OutputIterator put the weakest restrictions on
the types of iterators they will accept, which means that you can use any
“more sophisticated” type of iterator when you see
InputIterator or OutputIterator used as STL algorithm template
arguments.
ForwardIterator.
InputIterator and OutputIterator are the most restricted, which
means they’ll work with the largest number of actual iterators. However,
there are some operations for which they are too restricted; you can only read
from an InputIterator and write to an OutputIterator, so you
can’t use them to read and modify a range, for example, and you
can’t have more than one active iterator on a particular range, or
dereference such an iterator more than once. With a ForwardIterator these
restrictions are relaxed; you can still only move forward using
operator++, but you can both write and read and you can write/read
multiple times in each location. A ForwardIterator is much more like a
regular pointer, whereas InputIterator and OutputIterator are a
bit strange by comparison.
BidirectionalIterator.
Effectively, this is a ForwardIterator that can also go backward.
That is, a BidirectionalIterator supports all the operations that a
ForwardIterator does, but in addition it has an operator--.
RandomAccessIterator. An iterator
that is random access supports all the same operations that a regular pointer
does: you can add and subtract integral values to move it forward and backward
by jumps (rather than just one element at a time), you can subscript it with
operator[ ], you can subtract one iterator from another, and
iterators can be compared to see which is greater using operator<,
operator>, etc. If you’re implementing a sorting routine or
something similar, random access iterators are necessary to be able to create an
efficient algorithm.
The names used for the template parameter
types consist of the above iterator types (sometimes with a ‘1’ or
‘2’ appended to distinguish different template arguments), and may
also include other arguments, often function objects.
When describing the group of elements
that an operation is performed on, mathematical “range” notation
will often be used. In this, the square bracket means “includes the end
point” while the parenthesis means “does not include the end
point.” When using iterators, a range is determined by the iterator
pointing to the initial element, and the “past-the-end” iterator,
pointing past the last element. Since the past-the-end element is never used,
the range determined by a pair of iterators can thus be expressed as [first,
last), where first is the iterator pointing to the initial element
and last is the past-the-end iterator.
Most books and discussions of the STL
algorithms arrange them according to side effects: non-mutating algorithms
don’t change the elements in the range, mutating algorithms do change the
elements, etc. These descriptions are based more on the underlying behavior or
implementation of the algorithm – that is, the designer’s
perspective. In practice, I don’t find this a very useful categorization
so I shall instead organize them according to the problem you want to solve: are
you searching for an element or set of elements, performing an operation on each
element, counting elements, replacing elements, etc. This should help you find
the one you want more easily.
Note that all the algorithms are in the
namespace std. If you do not see a different header such as
<utility> or <numerics> above the function
declarations, that means it appears in
<algorithm>.
It’s useful to create some basic
tools with which to test the algorithms.
Displaying a range is something that will
be done constantly, so here is a templatized function that allows you to print
any sequence, regardless of the type that’s in that
sequence:
//: C08:PrintSequence.h // Prints the contents of any sequence #ifndef PRINTSEQUENCE_H #define PRINTSEQUENCE_H #include <iostream> template<typename InputIter> void print(InputIter first, InputIter last, char* nm = "", char* sep = "\n", std::ostream& os = std::cout) { if(*nm != '\0') // Only if you provide a string os << nm << ": " << sep; // is this printed while(first != last) os << *first++ << sep; os << std::endl; } #ifndef _MSC_VER // Use template-templates to allow type deduction // of the typename T: template<typename T, template<typename> class C> void print(C<T>& c, char* nm = "", char* sep = "\n", std::ostream& os = std::cout) { if(*nm != '\0') // Only if you provide a string os << nm << ": " << sep; // is this printed std::copy(c.begin(), c.end(), std::ostream_iterator<T>(os, " ")); cout << endl; } #endif #endif // PRINTSEQUENCE_H ///:~
There are two forms here, one that
requires you to give an explicit range (this allows you to print an array or a
sub-sequence) and one that prints any of the STL containers, which provides
notational convenience when printing the entire contents of that container. The
second form performs template type deduction to determine the type of T
so it can be used in the copy( ) algorithm. That trick
wouldn’t work with the first form, so the copy( ) algorithm is
avoided and the copying is just done by hand (this could have been done with the
second form as well, but it’s instructive to see a template-template in
use). Because of this, you never need to specify the type that you’re
printing when you call either template function.
The default is to print to cout
with newlines as separators, but you can change that. You may also provide a
message to print at the head of the output.
Next, it’s useful to have some
generators (classes with an operator( ) that returns values of the
appropriate type) which allow a sequence to be rapidly filled with different
values.
//: C08:Generators.h // Different ways to fill sequences #ifndef GENERATORS_H #define GENERATORS_H #include <set> #include <cstdlib> #include <cstring> #include <ctime> // MS std namespace work-around #ifndef _MSC_VER using std::srand; using std::rand; using std::time; using std::strlen; #endif // A generator that can skip over numbers: class SkipGen { int i; int skp; public: SkipGen(int start = 0, int skip = 1) : i(start), skp(skip) {} int operator()() { int r = i; i += skp; return r; } }; // Generate unique random numbers from 0 to mod: class URandGen { std::set<int> used; int modulus; public: URandGen(int mod) : modulus(mod) { srand(time(0)); } int operator()() { while(true) { int i = (int)rand() % modulus; if(used.find(i) == used.end()) { used.insert(i); return i; } } } }; // Produces random characters: class CharGen { static const char* source; static const int len; public: CharGen() { srand(time(0)); } char operator()() { return source[rand() % len]; } }; // Statics created here for convenience, but // will cause problems if multiply included: const char* CharGen::source = "ABCDEFGHIJK" "LMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; const int CharGen::len = strlen(source); #endif // GENERATORS_H ///:~
To create some interesting values, the
SkipGen generator skips by the value skp each time its
operator( ) is called. You can initialize both the start value and
the skip value in the constructor.
URandGen (‘U’ for
“unique”) is a generator for random ints between 0 and
mod, with the additional constraint that each value can only be produced
once (thus you must be careful not to use up all the values). This is easily
accomplished with a set.
CharGen generates chars and
can be used to fill up a string (when treating a string as a
sequence container). You’ll note that the one member function that any
generator implements is operator( ) (with no arguments). This is
what is called by the “generate” functions.
The use of the generators and the
print( ) functions is shown in the following
section.
Finally, a number of the STL algorithms
that move elements of a sequence around distinguish between “stable”
and “unstable” reordering of a sequence. This refers to preserving
the original order of the elements for those elements that are equivalent but
not identical. For example, consider a sequence { c(1), b(1), c(2), a(1),
b(2), a(2) }. These elements are tested for equivalence based on their
letters, but their numbers indicate how they first appeared in the sequence. If
you sort (for example) this sequence using an unstable sort, there’s no
guarantee of any particular order among equivalent letters, so you could end up
with { a(2), a(1), b(1), b(2), c(2), c(1) }. However, if you used a
stable sort, it guarantees you will get { a(1), a(2), b(1), b(2), c(1), c(2)
}.
To demonstrate the stability versus
instability of algorithms that reorder a sequence, we need some way to keep
track of how the elements originally appeared. The following is a kind of
string object that keeps track of the order in which that particular
object originally appeared, using a static map that maps NStrings
to Counters. Each NString then contains an occurrence field
that indicates the order in which this NString was
discovered:
//: C08:NString.h // A "numbered string" that indicates which // occurrence this is of a particular word #ifndef NSTRING_H #define NSTRING_H #include <string> #include <map> #include <iostream> class NString { std::string s; int occurrence; struct Counter { int i; Counter() : i(0) {} Counter& operator++(int) { i++; return *this; } // Post-incr operator int() { return i; } }; // Keep track of the number of occurrences: typedef std::map<std::string, Counter> csmap; static csmap occurMap; public: NString() : occurrence(0) {} NString(const std::string& x) : s(x), occurrence(occurMap[s]++) {} NString(const char* x) : s(x), occurrence(occurMap[s]++) {} // The synthesized operator= and // copy-constructor are OK here friend std::ostream& operator<<( std::ostream& os, const NString& ns) { return os << ns.s << " [" << ns.occurrence << "]"; } // Need this for sorting. Notice it only // compares strings, not occurrences: friend bool operator<(const NString& l, const NString& r) { return l.s < r.s; } // For sorting with greater<NString>: friend bool operator>(const NString& l, const NString& r) { return l.s > r.s; } // To get at the string directly: operator const std::string&() const {return s;} }; // Allocate static member object. Done here for // brevity, but should actually be done in a // separate cpp file: NString::csmap NString::occurMap; #endif // NSTRING_H ///:~
In the constructors (one that takes a
string, one that takes a char*), the simple-looking initialization
occurrence(occurMap[s]++) performs all the work of maintaining and
assigning the occurrence counts (see the demonstration of the map class
in the previous chapter for more details).
To do an ordinary ascending sort, the
only operator that’s necessary is NString::operator<( ),
however to sort in reverse order the operator>( ) is also
provided so that the greater template can be used.
As this is just a demonstration class I
am getting away with the convenience of putting the definition of the static
member occurMap in the header file, but this will break down if the
header file is included in more than one place, so you should normally relegate
all static definitions to cpp
files.
These algorithms allow you to
automatically fill a range with a particular value, or to generate a set of
values for a particular range (these were introduced in the previous chapter).
The “fill” functions insert a single value multiple times into the
container, while the “generate” functions use an object called a
generator (described earlier) to create the values to insert into the
container.
void fill(ForwardIterator first,
ForwardIterator last, const T&
value);
void fill_n(OutputIterator first,
Size n, const T& value);
fill( ) assigns value
to every element in the range [first, last). fill_n( )
assigns value to n elements starting at
first.
void generate(ForwardIterator first,
ForwardIterator last, Generator gen);
void
generate_n(OutputIterator first, Size n, Generator gen);
generate( ) makes a call to
gen( ) for each element in the range [first, last),
presumably to produce a different value for each element.
generate_n( ) calls gen( ) n times and assigns
each result to n elements starting at first.
The following example fills and generates
into vectors. It also shows the use of
print( ):
//: C08:FillGenerateTest.cpp // Demonstrates "fill" and "generate" //{L} ../TestSuite/Test //{-msc} #include "Generators.h" #include "PrintSequence.h" #include <vector> #include <algorithm> #include <string> using namespace std; int main() { vector<string> v1(5); fill(v1.begin(), v1.end(), "howdy"); print(v1, "v1", " "); vector<string> v2; fill_n(back_inserter(v2), 7, "bye"); print(v2.begin(), v2.end(), "v2"); vector<int> v3(10); generate(v3.begin(), v3.end(), SkipGen(4,5)); print(v3, "v3", " "); vector<int> v4; generate_n(back_inserter(v4),15, URandGen(30)); print(v4, "v4", " "); } ///:~
A vector<string> is created
with a pre-defined size. Since storage has already been created for all the
string objects in the vector, fill( ) can use its
assignment operator to assign a copy of “howdy” to each space in the
vector. To print the result, the second form of print( ) is
used which simply needs a container (you don’t have to give the first and
last iterators). Also, the default newline separator is replaced with a
space.
The second vector<string> v2
is not given an initial size so back_inserter must be used to force new
elements in instead of trying to assign to existing locations. Just as an
example, the other print( ) is used which requires a
range.
The generate( ) and
generate_n( ) functions have the same form as the “fill”
functions except that they use a generator instead of a constant value; here,
both generators are demonstrated.
All containers have a method
size( ) that will tell you how many elements they hold. The
following two algorithms count objects only if they satisfy certain
criteria.
IntegralValue count(InputIterator
first, InputIterator last,
const
EqualityComparable& value);
Produces the number of elements in
[first, last) that are equivalent to value (when tested using
operator==).
IntegralValue count_if(InputIterator
first, InputIterator last, Predicate pred);
Produces the number of elements in
[first, last) which each cause pred to return
true.
Here, a vector<char> v is
filled with random characters (including some duplicates). A
set<char> is initialized from v, so it holds only one of
each letter represented in v. This set is used to count all the
instances of all the different characters, which are then
displayed:
//: C08:Counting.cpp // The counting algorithms //{L} ../TestSuite/Test //{-g++295} //{-msc} #include "PrintSequence.h" #include "Generators.h" #include <vector> #include <algorithm> using namespace std; int main() { vector<char> v; generate_n(back_inserter(v), 50, CharGen()); print(v, "v", ""); // Create a set of the characters in v: set<char> cs(v.begin(), v.end()); set<char>::iterator it = cs.begin(); while(it != cs.end()) { int n = count(v.begin(), v.end(), *it); cout << *it << ": " << n << ", "; it++; } int lc = count_if(v.begin(), v.end(), bind2nd(greater<char>(), 'a')); cout << "\nLowercase letters: " << lc << endl; sort(v.begin(), v.end()); print(v, "sorted", ""); } ///:~
The count_if( ) algorithm is
demonstrated by counting all the lowercase letters; the predicate is created
using the bind2nd( ) and greater function object
templates.
These algorithms allow you to move
sequences around.
OutputIterator copy(InputIterator,
first InputIterator last, OutputIterator destination);
Using assignment, copies from [first,
last) to destination, incrementing destination after each
assignment. Works with almost any type of source range and almost any kind of
destination. Because assignment is used, you cannot directly insert elements
into an empty container or at the end of a container, but instead you must wrap
the destination iterator in an insert_iterator (typically by using
back_inserter( ), or inserter( ) in the case of an
associative container).
The copy algorithm is used in many
examples in this book.
BidirectionalIterator2
copy_backward(BidirectionalIterator1 first,
BidirectionalIterator1 last,
BidirectionalIterator2 destinationEnd);
Like copy( ), but performs
the actual copying of the elements in reverse order. That is, the resulting
sequence is the same, it’s just that the copy happens in a different way.
The source range [first, last) is copied to the destination, but the
first destination element is destinationEnd - 1. This iterator is then
decremented after each assignment. The space in the destination range must
already exist (to allow assignment), and the destination range cannot be within
the source range.
void reverse(BidirectionalIterator
first, BidirectionalIterator
last);
OutputIterator
reverse_copy(BidirectionalIterator first, BidirectionalIterator
last,
OutputIterator
destination);
Both forms of this function reverse the
range [first, last). reverse( ) reverses the range in place,
while reverse_copy( ) leaves the original range alone and copies the
reversed elements into destination, returning the past-the-end iterator
of the resulting range.
ForwardIterator2
swap_ranges(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2
first2);
Exchanges the contents of two ranges of
equal size, by moving from the beginning to the end of each range and swapping
each set of elements.
void rotate(ForwardIterator first, ForwardIterator middle,
ForwardIterator last);
OutputIterator rotate_copy(ForwardIterator
first, ForwardIterator middle,
ForwardIterator last,
OutputIterator destination);
Swaps the two ranges [first,
middle) and [middle, last). With rotate( ), the swap is
performed in place, and with rotate_copy( ) the original range is
untouched and the rotated version is copied into destination, returning
the past-the-end iterator of the resulting range. Note that while
swap_ranges( ) requires that the two ranges be exactly the same
size, the “rotate” functions do not.
bool
next_permutation(BidirectionalIterator first, BidirectionalIterator
last);
bool
next_permutation(BidirectionalIterator first, BidirectionalIterator
last,
StrictWeakOrdering
binary_pred);
bool
prev_permutation(BidirectionalIterator first, BidirectionalIterator
last);
bool
prev_permutation(BidirectionalIterator first, BidirectionalIterator
last,
StrictWeakOrdering
binary_pred);
A permutation is one unique
ordering of a set of elements. If you have n unique elements, then there
are n! (n factorial) distinct possible combinations of those
elements. All these combinations can be conceptually sorted into a sequence
using a lexicographical ordering, and thus produce a concept of a
“next” and “previous” permutation. Therefore, whatever
the current ordering of elements in the range, there is a distinct
“next” and “previous” permutation in the sequence of
permutations.
The next_permutation( ) and
prev_permutation( ) functions re-arrange the elements into their
next or previous permutation, and if successful return true. If there are
no more “next” permutations, it means that the elements are in
sorted order so next_permutation( ) returns false. If there
are no more “previous” permutations, it means that the elements are
in descending sorted order so previous_permutation( ) returns
false.
The versions of the functions which have
a StrictWeakOrdering argument perform the comparisons using
binary_pred instead of operator<.
void
random_shuffle(RandomAccessIterator first, RandomAccessIterator
last);
void
random_shuffle(RandomAccessIterator first, RandomAccessIterator
last
RandomNumberGenerator&
rand);
This function randomly rearranges the
elements in the range. It yields uniformly distributed results. The first form
uses an internal random number generator and the second uses a user-supplied
random-number generator.
BidirectionalIterator
partition(BidirectionalIterator first, BidirectionalIterator last,
Predicate
pred);
BidirectionalIterator
stable_partition(BidirectionalIterator first,
BidirectionalIterator last, Predicate
pred);
The “partition” functions use
pred to organize the elements in the range [first, last) so they
are before or after the partition (a point in the range). The partition point is
given by the returned iterator. If pred(*i) is true (where
i is the iterator pointing to a particular element), then that element
will be placed before the partition point, otherwise it will be placed after the
partition point.
With partition( ), the order
of the elements is after the function call is not specified, but with
stable_parition( ) the relative order of the elements before and
after the partition point will be the same as before the partitioning
process.
This gives a basic demonstration of
sequence manipulation:
//: C08:Manipulations.cpp // Shows basic manipulations //{L} ../TestSuite/Test //{-g++295} //{-msc} #include "PrintSequence.h" #include "NString.h" #include "Generators.h" #include <vector> #include <string> #include <algorithm> using namespace std; int main() { vector<int> v1(10); // Simple counting: generate(v1.begin(), v1.end(), SkipGen()); print(v1, "v1", " "); vector<int> v2(v1.size()); copy_backward(v1.begin(), v1.end(), v2.end()); print(v2, "copy_backward", " "); reverse_copy(v1.begin(), v1.end(), v2.begin()); print(v2, "reverse_copy", " "); reverse(v1.begin(), v1.end()); print(v1, "reverse", " "); int half = v1.size() / 2; // Ranges must be exactly the same size: swap_ranges(v1.begin(), v1.begin() + half, v1.begin() + half); print(v1, "swap_ranges", " "); // Start with fresh sequence: generate(v1.begin(), v1.end(), SkipGen()); print(v1, "v1", " "); int third = v1.size() / 3; for(int i = 0; i < 10; i++) { rotate(v1.begin(), v1.begin() + third, v1.end()); print(v1, "rotate", " "); } cout << "Second rotate example:" << endl; char c[] = "aabbccddeeffgghhiijj"; const char csz = strlen(c); for(int i = 0; i < 10; i++) { rotate(c, c + 2, c + csz); print(c, c + csz, "", ""); } cout << "All n! permutations of abcd:" << endl; int nf = 4 * 3 * 2 * 1; char p[] = "abcd"; for(int i = 0; i < nf; i++) { next_permutation(p, p + 4); print(p, p + 4, "", ""); } cout << "Using prev_permutation:" << endl; for(int i = 0; i < nf; i++) { prev_permutation(p, p + 4); print(p, p + 4, "", ""); } cout << "random_shuffling a word:" << endl; string s("hello"); cout << s << endl; for(int i = 0; i < 5; i++) { random_shuffle(s.begin(), s.end()); cout << s << endl; } NString sa[] = { "a", "b", "c", "d", "a", "b", "c", "d", "a", "b", "c", "d", "a", "b", "c"}; const int sasz = sizeof sa / sizeof *sa; vector<NString> ns(sa, sa + sasz); print(ns, "ns", " "); vector<NString>::iterator it = partition(ns.begin(), ns.end(), bind2nd(greater<NString>(), "b")); cout << "Partition point: " << *it << endl; print(ns, "", " "); // Reload vector: copy (sa, sa + sasz, ns.begin()); it = stable_partition(ns.begin(), ns.end(), bind2nd(greater<NString>(), "b")); cout << "Stable partition" << endl; cout << "Partition point: " << *it << endl; print(ns, "", " "); } ///:~
The best way to see the results of the
above program is to run it (you’ll probably want to redirect the output to
a file).
The vector<int> v1 is
initially loaded with a simple ascending sequence and printed. You’ll see
that the effect of copy_backward( ) (which copies into v2,
which is the same size as v1) is the same as an ordinary copy. Again,
copy_backward( ) does the same thing as copy( ), it just
performs the operations in backward order.
reverse_copy( ), however,
actually does created a reversed copy, while reverse( ) performs the
reversal in place. Next, swap_ranges( ) swaps the upper half of the
reversed sequence with the lower half. Of course, the ranges could be smaller
subsets of the entire vector, as long as they are of equivalent
size.
After re-creating the ascending sequence,
rotate( ) is demonstrated by rotating one third of v1
multiple times. A second rotate( ) example uses characters and just
rotates two characters at a time. This also demonstrates the flexibility of both
the STL algorithms and the print( ) template, since they can both be
used with arrays of char as easily as with anything
else.
To demonstrate
next_permutation( ) and prev_permutation( ), a set of
four characters “abcd” is permuted through all n! (n
factorial) possible combinations. You’ll see from the output that the
permutations move through a strictly-defined order (that is, permuting is a
deterministic process).
A quick-and-dirty demonstration of
random_shuffle( ) is to apply it to a string and see what
words result. Because a string object has begin( ) and
end( ) member functions that return the appropriate iterators, it
too may be easily used with many of the STL algorithms. Of course, an array of
char could also have been used.
Finally, the partition( ) and
stable_partition( ) are demonstrated, using an array of
NString. You’ll note that the aggregate initialization expression
uses char arrays, but NString has a char* constructor which
is automatically used.
When partitioning a sequence, you need a
predicate which will determine whether the object belongs above or below the
partition point. This takes a single argument and returns true (the
object is above the partition point) or false (it isn’t). I could
have written a separate function or function object to do this, but for
something simple, like “the object is greater than ‘b’”,
why not use the built-in function object templates? The expression
is:
bind2nd(greater<NString>(), "b")
And to understand it, you need to pick it
apart from the middle outward. First,
greater<NString>()
produces a binary function object which
compares its first and second arguments:
return first > second;
and returns a bool. But we
don’t want a binary predicate, and we want to compare against the constant
value “b.” So bind2nd( ) says: create a new
function object which only takes one argument, by taking this
greater<NString>( ) function and forcing the second argument
to always be “b.” The first argument (the only argument) will
be the one from the vector ns.
You’ll see from the output that
with the unstable partition, the objects are correctly above and below the
partition point, but in no particular order, whereas with the stable partition
their original order is maintained.
All of these algorithms are used for
searching for one or more objects within a range defined by the first two
iterator arguments.
InputIterator find(InputIterator first, InputIterator
last,
const EqualityComparable& value);
Searches for value within a range
of elements. Returns an iterator in the range [first, last) that points
to the first occurrence of value. If value isn’t in the
range, then find( ) returns last. This is a linear
search, that is, it starts at the beginning and looks at each sequential
element without making any assumptions about the way the elements are ordered.
In contrast, a binary_search( ) (defined later) works on a sorted
sequence and can thus be much faster.
InputIterator find_if(InputIterator first, InputIterator
last, Predicate pred);
Just like find( ),
find_if( ) performs a linear search through the range. However,
instead of searching for value, find_if( ) looks for an
element such that the Predicate pred returns true when applied to
that element. Returns last if no such element can be
found.
ForwardIterator
adjacent_find(ForwardIterator first, ForwardIterator
last);
ForwardIterator
adjacent_find(ForwardIterator first, ForwardIterator last,
BinaryPredicate
binary_pred);
Like find( ), performs a
linear search through the range, but instead of looking for only one element it
searches for two elements that are right next to each other. The first form of
the function looks for two elements that are equivalent (via operator==).
The second form looks for two adjacent elements that, when passed together to
binary_pred, produce a true result. If two adjacent elements
cannot be found, last is returned.
ForwardIterator1
find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2,
ForwardIterator2 last2);
ForwardIterator1
find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2,
ForwardIterator2 last2, BinaryPredicate binary_pred);
Like find( ), performs a
linear search through the range. The first form finds the first element in the
first range that is equivalent to any of the elements in the second range. The
second form finds the first element in the first range that produces true
when passed to binary_pred along with any of the elements in the second
range. When a BinaryPredicate is used with two ranges in the algorithms,
the element from the first range becomes the first argument to
binary_pred, and the element from the second range becomes the second
argument.
ForwardIterator1
search(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2,
ForwardIterator2 last2);
ForwardIterator1
search(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2,
ForwardIterator2 last2 BinaryPredicate binary_pred);
Attempts to find the entire range
[first2, last2) within the range [first1, last1). That is, it
checks to see if the second range occurs (in the exact order of the second
range) within the first range, and if so returns an iterator pointing to the
place in the first range where the second range begins. Returns last1 if
no subset can be found. The first form performs its test using
operator==, while the second checks to see if each pair of objects being
compared causes binary_pred to return true.
ForwardIterator1
find_end(ForwardIterator1 first1, ForwardIterator1
last1,
ForwardIterator2 first2,
ForwardIterator2 last2);
ForwardIterator1
find_end(ForwardIterator1 first1, ForwardIterator1
last1,
ForwardIterator2 first2,
ForwardIterator2 last2, BinaryPredicate binary_pred);
The forms and arguments are just like
search( ) in that it looks for the second range within the first
range, but while search( ) looks for the first occurrence of the
second range, find_end( ) looks for the last occurrence of
the second range within the first.
ForwardIterator
search_n(ForwardIterator first, ForwardIterator last,
Size count, const T&
value);
ForwardIterator
search_n(ForwardIterator first, ForwardIterator last,
Size count, const T& value,
BinaryPredicate binary_pred);
Looks for a group of count
consecutive values in [first, last) that are all equal to value
(in the first form) or that all cause a return value of true when passed
into binary_pred along with value (in the second form). Returns
last if such a group cannot be found.
ForwardIterator
min_element(ForwardIterator first, ForwardIterator
last);
ForwardIterator
min_element(ForwardIterator first, ForwardIterator last,
BinaryPredicate
binary_pred);
Returns an iterator pointing to the first
occurrence of the smallest value in the range (there may be multiple occurrences
of the smallest value). Returns last if the range is empty. The first
version performs comparisons with operator< and the value r
returned is such that
*e < *r
is false for every element e in the
range. The second version compares using binary_pred and the value
r returned is such that binary_pred (*e, *r) is false for every
element e in the range.
ForwardIterator
max_element(ForwardIterator first, ForwardIterator
last);
ForwardIterator
max_element(ForwardIterator first, ForwardIterator last,
BinaryPredicate
binary_pred);
Returns an iterator pointing to the first
occurrence of the largest value in the range (there may be multiple occurrences
of the largest value). Returns last if the range is empty. The first
version performs comparisons with operator< and the value r
returned is such that
*r < *e
is false for every element e in the
range. The second version compares using binary_pred and the value
r returned is such that binary_pred (*r, *e) is false for every
element e in the range.
void replace(ForwardIterator first,
ForwardIterator last,
const T&
old_value, const T& new_value);
void
replace_if(ForwardIterator first, ForwardIterator last,
Predicate pred, const T&
new_value);
OutputIterator
replace_copy(InputIterator first, InputIterator last,
OutputIterator result, const T&
old_value, const T&
new_value);
OutputIterator
replace_copy_if(InputIterator first, InputIterator last,
OutputIterator result, Predicate
pred, const T& new_value);
Each of the “replace” forms
moves through the range [first, last), finding values that match a
criterion and replacing them with new_value. Both replace( )
and replace_copy( ) simply look for old_value to replace,
while replace_if( ) and replace_copy_if( ) look for
values that satisfy the predicate pred. The “copy” versions
of the functions do not modify the original range but instead make a copy with
the replacements into result (incrementing result after each
assignment).
To provide easy viewing of the results,
this example will manipulate vectors of int. Again, not every
possible version of each algorithm will be shown (some that should be obvious
have been omitted).
//: C08:SearchReplace.cpp // The STL search and replace algorithms //{L} ../TestSuite/Test //{-g++295} //{-msc} #include "PrintSequence.h" #include <vector> #include <algorithm> #include <functional> using namespace std; struct PlusOne { bool operator()(int i, int j) { return j == i + 1; } }; class MulMoreThan { int value; public: MulMoreThan(int val) : value(val) {} bool operator()(int v, int m) { return v * m > value; } }; int main() { int a[] = { 1, 2, 3, 4, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 11, 11, 11, 11, 11 }; const int asz = sizeof a / sizeof *a; vector<int> v(a, a + asz); print(v, "v", " "); vector<int>::iterator it = find(v.begin(), v.end(), 4); cout << "find: " << *it << endl; it = find_if(v.begin(), v.end(), bind2nd(greater<int>(), 8)); cout << "find_if: " << *it << endl; it = adjacent_find(v.begin(), v.end()); while(it != v.end()) { cout << "adjacent_find: " << *it << ", " << *(it + 1) << endl; it = adjacent_find(it + 2, v.end()); } it = adjacent_find(v.begin(), v.end(), PlusOne()); while(it != v.end()) { cout << "adjacent_find PlusOne: " << *it << ", " << *(it + 1) << endl; it = adjacent_find(it + 1, v.end(), PlusOne()); } int b[] = { 8, 11 }; const int bsz = sizeof b / sizeof *b; print(b, b + bsz, "b", " "); it = find_first_of(v.begin(), v.end(), b, b + bsz); print(it, it + bsz, "find_first_of", " "); it = find_first_of(v.begin(), v.end(), b, b + bsz, PlusOne()); print(it,it + bsz,"find_first_of PlusOne"," "); it = search(v.begin(), v.end(), b, b + bsz); print(it, it + bsz, "search", " "); int c[] = { 5, 6, 7 }; const int csz = sizeof c / sizeof *c; print(c, c + csz, "c", " "); it = search(v.begin(), v.end(), c, c + csz, PlusOne()); print(it, it + csz,"search PlusOne", " "); int d[] = { 11, 11, 11 }; const int dsz = sizeof d / sizeof *d; print(d, d + dsz, "d", " "); it = find_end(v.begin(), v.end(), d, d + dsz); print(it, v.end(),"find_end", " "); int e[] = { 9, 9 }; print(e, e + 2, "e", " "); it = find_end(v.begin(), v.end(), e, e + 2, PlusOne()); print(it, v.end(),"find_end PlusOne"," "); it = search_n(v.begin(), v.end(), 3, 7); print(it, it + 3, "search_n 3, 7", " "); it = search_n(v.begin(), v.end(), 6, 15, MulMoreThan(100)); print(it, it + 6, "search_n 6, 15, MulMoreThan(100)", " "); cout << "min_element: " << *min_element(v.begin(), v.end()) << endl; cout << "max_element: " << *max_element(v.begin(), v.end()) << endl; vector<int> v2; replace_copy(v.begin(), v.end(), back_inserter(v2), 8, 47); print(v2, "replace_copy 8 -> 47", " "); replace_if(v.begin(), v.end(), bind2nd(greater_equal<int>(), 7), -1); print(v, "replace_if >= 7 -> -1", " "); } ///:~
The example begins with two predicates:
PlusOne which is a binary predicate that returns true if the
second argument is equivalent to one plus the first argument, and
MulMoreThan which returns true if the first argument times the
second argument is greater than a value stored in the object. These binary
predicates are used as tests in the example.
In main( ), an array a
is created and fed to the constructor for vector<int> v. This
vector will be used as the target for the search and replace activities, and
you’ll note that there are duplicate elements – these will be
discovered by some of the search/replace routines.
The first test demonstrates
find( ), discovering the value 4 in v. The return value is
the iterator pointing to the first instance of 4, or the end of the input range
(v.end( )) if the search value is not found.
find_if( ) uses a predicate
to determine if it has discovered the correct element. In the above example,
this predicate is created on the fly using greater<int> (that is,
“see if the first int argument is greater than the second”)
and bind2nd( ) to fix the second argument to 8. Thus, it returns
true if the value in v is greater than 8.
Since there are a number of cases in
v where two identical objects appear next to each other, the test of
adjacent_find( ) is designed to find them all. It starts looking
from the beginning and then drops into a while loop, making sure that the
iterator it has not reached the end of the input sequence (which would
mean that no more matches can be found). For each match it finds, the loop
prints out the matches and then performs the next adjacent_find( ),
this time using it + 2 as the first argument (this way, it moves past the
two elements that it already found).
You might look at the while loop
and think that you can do it a bit more cleverly, to wit:
while(it != v.end()) { cout << "adjacent_find: " << *it++ << ", " << *it++ << endl; it = adjacent_find(it, v.end()); }
Of course, this is exactly what I tried
at first. However, I did not get the output I expected, on any compiler. This is
because there is no guarantee about when the increments occur in the above
expression. A bit of a disturbing discovery, I know, but the situation is best
avoided now that you’re aware of it.
The next test uses
adjacent_find( ) with the PlusOne predicate, which discovers
all the places where the next number in the sequence v changes from the
previous by one. The same while approach is used to find all the
cases.
find_first_of( ) requires a
second range of objects for which to hunt; this is provided in the array
b. Notice that, because the first range and the second range in
find_first_of( ) are controlled by separate template arguments,
those ranges can refer to two different types of containers, as seen here. The
second form of find_first_of( ) is also tested, using
PlusOne.
search( ) finds exactly the
second range inside the first one, with the elements in the same order. The
second form of search( ) uses a predicate, which is typically just
something that defines equivalence, but it also opens some interesting
possibilities – here, the PlusOne predicate causes the range {
4, 5, 6 } to be found.
The find_end( ) test
discovers the last occurrence of the entire sequence { 11, 11, 11
}. To show that it has in fact found the last occurrence, the rest of
v starting from it is printed.
The first search_n( ) test
looks for 3 copies of the value 7, which it finds and prints. When using the
second version of search_n( ), the predicate is ordinarily meant to
be used to determine equivalence between two elements, but I’ve taken some
liberties and used a function object that multiplies the value in the sequence
by (in this case) 15 and checks to see if it’s greater than 100. That is,
the search_n( ) test above says “find me 6 consecutive values
which, when multiplied by 15, each produce a number greater than 100.” Not
exactly what you normally expect to do, but it might give you some ideas the
next time you have an odd searching problem.
min_element( ) and
max_element( ) are straightforward; the only thing that’s a
bit odd is that it looks like the function is being dereferenced with a
‘*’. Actually, the returned iterator is being dereferenced to
produce the value for printing.
To test replacements,
replace_copy( ) is used first (so it doesn’t modify the
original vector) to replace all values of 8 with the value 47. Notice the use of
back_inserter( ) with the empty vector v2. To demonstrate
replace_if( ), a function object is created using the standard
template greater_equal along with bind2nd to replace all the
values that are greater than or equal to 7 with the value
-1.
These algorithms provide ways to compare
two ranges. At first glance, the operations they perform seem very close to the
search( ) function above. However, search( ) tells you
where the second sequence appears within the first, while equal( )
and lexicographical_compare( ) simply tell you whether or not two
sequences are exactly identical (using different comparison algorithms). On the
other hand, mismatch( ) does tell you where the two sequences go out
of sync, but those sequences must be exactly the same length.
bool equal(InputIterator first1,
InputIterator last1, InputIterator
first2);
bool equal(InputIterator first1,
InputIterator last1, InputIterator first2
BinaryPredicate binary_pred);
In both of these functions, the first
range is the typical one, [first1, last1). The second range starts at
first2, but there is no “last2” because its length is
determined by the length of the first range. The equal( ) function
returns true if both ranges are exactly the same (the same elements in the same
order); in the first case, the operator== is used to perform the
comparison and in the second case binary_pred is used to decide if two
elements are the same.
bool
lexicographical_compare(InputIterator1 first1, InputIterator1
last1
InputIterator2 first2,
InputIterator2 last2);
bool
lexicographical_compare(InputIterator1 first1, InputIterator1
last1
InputIterator2 first2,
InputIterator2 last2, BinaryPredicate binary_pred);
These two functions determine if the
first range is “lexicographically less” than the second (they return
true if range 1 is less than range 2, and false otherwise.
Lexicographical equality, or “dictionary” comparison, means that the
comparison is done the same way we establish the order of strings in a
dictionary, one element at a time. The first elements determine the result if
these elements are different, but if they’re equal the algorithm moves on
to the next elements and looks at those, and so on. until it finds a mismatch.
At that point it looks at the elements, and if the element from range 1 is less
than the element from range two, then lexicographical_compare( )
returns true, otherwise it returns false. If it gets all the way
through one range or the other (the ranges may be different lengths for this
algorithm) without finding an inequality, then range 1 is not less than
range 2 so the function returns false.
If the two ranges are different lengths,
a missing element in one range acts as one that “precedes” an
element that exists in the other range. So {‘a’, ‘b’}
lexicographically precedes {‘a’, ‘b’, ‘a’
}.
In the first version of the function,
operator< is used to perform the comparisons, and in the second
version binary_pred is used.
pair<InputIterator1,
InputIterator2> mismatch(InputIterator1 first1,
InputIterator1 last1, InputIterator2
first2);
pair<InputIterator1,
InputIterator2> mismatch(InputIterator1 first1,
InputIterator1 last1, InputIterator2
first2, BinaryPredicate binary_pred);
As in equal( ), the length of
both ranges is exactly the same, so only the first iterator in the second range
is necessary, and the length of the first range is used as the length of the
second range. Whereas equal( ) just tells you whether or not the two
ranges are the same, mismatch( ) tells you where they begin to
differ. To accomplish this, you must be told (1) the element in the first range
where the mismatch occurred and (2) the element in the second range where the
mismatch occurred. These two iterators are packaged together into a pair
object and returned. If no mismatch occurs, the return value is last1
combined with the past-the-end iterator of the second range.
As in equal( ), the first
function tests for equality using operator== while the second one uses
binary_pred.
Because the standard C++ string
class is built like a container (it has begin( ) and
end( ) member functions which produce objects of type
string::iterator), it can be used to conveniently create ranges of
characters to test with the STL comparison algorithms. However, you should note
that string has a fairly complete set of native operations, so you should
look at the string class before using the STL algorithms to perform
operations.
//: C08:Comparison.cpp // The STL range comparison algorithms //{L} ../TestSuite/Test //{-g++295} #include "PrintSequence.h" #include <vector> #include <algorithm> #include <functional> #include <string> using namespace std; int main() { // strings provide a convenient way to create // ranges of characters, but you should // normally look for native string operations: string s1("This is a test"); string s2("This is a Test"); cout << "s1: " << s1 << endl << "s2: " << s2 << endl; cout << "compare s1 & s1: " << equal(s1.begin(), s1.end(), s1.begin()) << endl; cout << "compare s1 & s2: " << equal(s1.begin(), s1.end(), s2.begin()) << endl; cout << "lexicographical_compare s1 & s1: " << lexicographical_compare(s1.begin(), s1.end(), s1.begin(), s1.end()) << endl; cout << "lexicographical_compare s1 & s2: " << lexicographical_compare(s1.begin(), s1.end(), s2.begin(), s2.end()) << endl; cout << "lexicographical_compare s2 & s1: " << lexicographical_compare(s2.begin(), s2.end(), s1.begin(), s1.end()) << endl; cout << "lexicographical_compare shortened " "s1 & full-length s2: " << endl; string s3(s1); while(s3.length() != 0) { bool result = lexicographical_compare( s3.begin(), s3.end(), s2.begin(),s2.end()); cout << s3 << endl << s2 << ", result = " << result << endl; if(result == true) break; s3 = s3.substr(0, s3.length() - 1); } pair<string::iterator, string::iterator> p = mismatch(s1.begin(), s1.end(), s2.begin()); print(p.first, s1.end(), "p.first", ""); print(p.second, s2.end(), "p.second",""); } ///:~
Note that the only difference between
s1 and s2 is the capital ‘T’ in s2’s
“Test.” Comparing s1 and s1 for equality yields
true, as expected, while s1 and s2 are not equal because of
the capital ‘T’.
To understand the output of the
lexicographical_compare( ) tests, you must remember two things:
first, the comparison is performed character-by-character, and second that
capital letters “precede” lowercase letters. In the first test,
s1 is compared to s1. These are exactly equivalent, thus one is
not lexicographically less than the other (which is what the comparison
is looking for) and thus the result is false. The second test is asking
“does s1 precede s2?” When the comparison gets to the
‘t’ in “test”, it discovers that the lowercase
‘t’ in s1 is “greater” than the uppercase
‘T’ in s2, so the answer is again false. However, if
we test to see whether s2 precedes s1, the answer is
true.
To further examine lexicographical
comparison, the next test in the above example compares s1 with s2
again (which returned false before). But this time it repeats the
comparison, trimming one character off the end of s1 (which is first
copied into s3) each time through the loop until the test evaluates to
true. What you’ll see is that, as soon as the uppercase
‘T’ is trimmed off of s3 (the copy of s1), then the
characters, which are exactly equal up to that point, no longer count and the
fact that s3 is shorter than s2 is what makes it lexicographically
precede s2.
The final test uses
mismatch( ). In order to capture the return value, you must first
create the appropriate pair p, constructing the template using the
iterator type from the first range and the iterator type from the second range
(in this case, both string::iterators). To print the results, the
iterator for the mismatch in the first range is p.first, and for the
second range is p.second. In both cases, the range is printed from the
mismatch iterator to the end of the range so you can see exactly where the
iterator points.
Because of the genericity of the STL, the
concept of removal is a bit constrained. Since elements can only be
“removed” via iterators, and iterators can point to arrays, vectors,
lists, etc., it is not safe or reasonable to actually try to destroy the
elements that are being removed, and to change the size of the input range
[first, last) (an array, for example, cannot have its size changed). So
instead, what the STL “remove” functions do is rearrange the
sequence so that the “removed” elements are at the end of the
sequence, and the “un-removed” elements are at the beginning of the
sequence (in the same order that they were before, minus the removed elements
– that is, this is a stable operation). Then the function will
return an iterator to the “new last” element of the sequence, which
is the end of the sequence without the removed elements and the beginning of the
sequence of the removed elements. In other words, if new_last is the
iterator that is returned from the “remove” function, then
[first, new_last) is the sequence without any of the removed elements,
and [new_last, last) is the sequence of removed
elements.
If you are simply using your sequence,
including the removed elements, with more STL algorithms, you can just use
new_last as the new past-the-end iterator. However, if you’re using
a resizable container c (not an array) and you actually want to eliminate
the removed elements from the container you can use erase( ) to do
so, for example:
c.erase(remove(c.begin(), c.end(), value), c.end());
The return value of remove( )
is the new_last iterator, so erase( ) will delete all the
removed elements from c.
The iterators in [new_last, last)
are dereferenceable but the element values are undefined and should not be
used.
ForwardIterator remove(ForwardIterator
first, ForwardIterator last, const T&
value);
ForwardIterator
remove_if(ForwardIterator first, ForwardIterator last,
Predicate
pred);
OutputIterator
remove_copy(InputIterator first, InputIterator last,
OutputIterator result, const T&
value);
OutputIterator
remove_copy_if(InputIterator first, InputIterator last,
OutputIterator result, Predicate
pred);
Each of the “remove” forms
moves through the range [first, last), finding values that match a
removal criterion and copying the un-removed elements over the removed elements
(thus effectively removing them). The original order of the un-removed elements
is maintained. The return value is an iterator pointing past the end of the
range that contains none of the removed elements. The values that this iterator
points to are unspecified.
The “if” versions pass each
element to pred( ) to determine whether it should be removed or not
(if pred( ) returns true, the element is removed). The
“copy” versions do not modify the original sequence, but instead
copy the un-removed values into a range beginning at result, and return
an iterator indicating the past-the-end value of this new
range.
ForwardIterator unique(ForwardIterator
first, ForwardIterator
last);
ForwardIterator
unique(ForwardIterator first, ForwardIterator last,
BinaryPredicate
binary_pred);
OutputIterator
unique_copy(InputIterator first, InputIterator last,
OutputIterator
result);
OutputIterator
unique_copy(InputIterator first, InputIterator last,
OutputIterator result,
BinaryPredicate binary_pred);
Each of the “unique”
functions moves through the range [first, last), finding adjacent values
that are equivalent (that is, duplicates) and “removing” the
duplicate elements by copying over them. The original order of the un-removed
elements is maintained. The return value is an iterator pointing past the end of
the range that has the adjacent duplicates removed.
Because only duplicates that are adjacent
are removed, it’s likely that you’ll want to call
sort( ) before calling a “unique” algorithm, since that
will guarantee that all the duplicates are removed.
The versions containing
binary_pred call, for each iterator value i in the input
range:
binary_pred(*i, *(i-1));
and if the result is true then
*(i-1) is considered a duplicate.
The “copy” versions do not
modify the original sequence, but instead copy the un-removed values into a
range beginning at result, and return an iterator indicating the
past-the-end value of this new range.
This example gives a visual demonstration
of the way the “remove” and “unique” functions
work.
//: C08:Removing.cpp // The removing algorithms // May be a bug here? //{L} ../TestSuite/Test //{-bor} //{-msc} //{-g++295} #include "PrintSequence.h" #include "Generators.h" #include <vector> #include <algorithm> #include <cctype> using namespace std; struct IsUpper { bool operator()(char c) { return isupper(c); } }; int main() { vector<char> v(50); generate(v.begin(), v.end(), CharGen()); print(v, "v", ""); // Create a set of the characters in v: set<char> cs(v.begin(), v.end()); set<char>::iterator it = cs.begin(); vector<char>::iterator cit; // Step through and remove everything: while(it != cs.end()) { cit = remove(v.begin(), v.end(), *it); cout << *it << "[" << *cit << "] "; print(v, "", ""); it++; } generate(v.begin(), v.end(), CharGen()); print(v, "v", ""); cit = remove_if(v.begin(), v.end(), IsUpper()); print(v.begin(), cit, "after remove_if", ""); // Copying versions are not shown for remove // and remove_if. sort(v.begin(), cit); print(v.begin(), cit, "sorted", ""); vector<char> v2; unique_copy(v.begin(), cit, back_inserter(v2)); print(v2, "unique_copy", ""); // Same behavior: cit = unique(v.begin(), cit, equal_to<char>()); print(v.begin(), cit, "unique", ""); } ///:~
The vector<char> v is filled
with randomly-generated characters and then copied into a set. Each
element of the set is used in a remove statement, but the entire
vector v is printed out each time so you can see what happens to the rest
of the range, after the resulting endpoint (which is stored in
cit).
To demonstrate remove_if( ),
the address of the Standard C library function isupper( ) (in
<cctype> is called inside of the function object class
IsUpper, an object of which is passed as the predicate for
remove_if( ). This only returns true if a character is
uppercase, so only lowercase characters will remain. Here, the end of the range
is used in the call to print( ) so only the remaining elements will
appear. The copying versions of remove( ) and
remove_if( ) are not shown because they are a simple variation on
the non-copying versions which you should be able to use without an
example.
The range of lowercase letters is sorted
in preparation for testing the “unique” functions (the
“unique” functions are not undefined if the range isn’t
sorted, but it’s probably not what you want). First,
unique_copy( ) puts the unique elements into a new vector
using the default element comparison, and then the form of unique( )
that takes a predicate is used; the predicate used is the built-in function
object equal_to( ), which produces the same results as the default
element comparison.
There is a significant category of STL
algorithms which require that the range they operate on be in sorted
order.
There is actually only one
“sort” algorithm used in the STL. This algorithm is presumably the
fastest one, but the implementer has fairly broad latitude. However, it comes
packaged in various flavors depending on whether the sort should be stable,
partial or just the regular sort. Oddly enough, only the partial sort has a
copying version; otherwise you’ll need to make your own copy before
sorting if that’s what you want. If you are working with a very large
number of items you may be better off transferring them to an array (or at least
a vector, which uses an array internally) rather than using them in some
of the STL containers.
Once your sequence is sorted, there are
many operations you can perform on that sequence, from simply locating an
element or group of elements to merging with another sorted sequence or
manipulating sequences as mathematical sets.
Each algorithm involved with sorting or
operations on sorted sequences has two versions of each function, the first that
uses the object’s own operator< to perform the comparison, and
the second that uses an additional StrictWeakOrdering object’s
operator( )(a, b) to compare two objects for a <
b. Other than this there are no differences, so the distinction will not
be pointed out in the description of each algorithm.
One STL container (list) has its
own built-in sort( ) function which is almost certainly going to be
faster than the generic sort presented here (especially since the list
sort just swaps pointers rather than copying entire objects around). This
means that you’ll only want to use the sort functions here if (a)
you’re working with an array or a sequence container that doesn’t
have a sort( ) function or (b) you want to use one of the other
sorting flavors, like a partial or stable sort, which aren’t supported by
list’s sort( ).
void sort(RandomAccessIterator first,
RandomAccessIterator last);
void
sort(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
Sorts [first, last) into ascending
order. The second form allows a comparator object to determine the
order.
void stable_sort(RandomAccessIterator
first, RandomAccessIterator last);
void
stable_sort(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
Sorts [first, last) into ascending
order, preserving the original ordering of equivalent elements (this is
important if elements can be equivalent but not identical). The second form
allows a comparator object to determine the order.
void partial_sort(RandomAccessIterator
first,
RandomAccessIterator middle,
RandomAccessIterator last);
void
partial_sort(RandomAccessIterator first,
RandomAccessIterator middle, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
Sorts the number of elements from
[first, last) that can be placed in the range [first, middle). The
rest of the elements end up in [middle, last), and have no guaranteed
order. The second form allows a comparator object to determine the
order.
RandomAccessIterator
partial_sort_copy(InputIterator first, InputIterator last,
RandomAccessIterator result_first,
RandomAccessIterator
result_last);
RandomAccessIterator
partial_sort_copy(InputIterator first,
InputIterator last, RandomAccessIterator result_first,
RandomAccessIterator result_last,
StrictWeakOrdering binary_pred);
Sorts the number of elements from
[first, last) that can be placed in the range [result_first,
result_last), and copies those elements into [result_first,
result_last). If the range [first, last) is smaller than
[result_first, result_last), then the smaller number of elements is used.
The second form allows a comparator object to determine the
order.
void nth_element(RandomAccessIterator
first,
RandomAccessIterator nth,
RandomAccessIterator last);
void
nth_element(RandomAccessIterator first,
RandomAccessIterator nth, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
Just like partial_sort( ),
nth_element( ) partially orders a range of elements. However,
it’s much “less ordered” than partial_sort( ). The
only thing that nth_element( ) guarantees is that whatever
location you choose will become a dividing point. All the elements in the
range [first, nth) will be less than (they could also be equivalent to)
whatever element ends up at location nth and all the elements in the
range (nth, last] will be greater than whatever element ends up location
nth. However, neither range is in any particular order, unlike
partial_sort( ) which has the first range in sorted
order.
If all you need is this very weak
ordering (if, for example, you’re determining medians, percentiles and
that sort of thing) this algorithm is faster than
partial_sort( ).
The StreamTokenizer class from the
previous chapter is used to break a file into words, and each word is turned
into an NString and added to a deque<NString>. Once the
input file is completely read, a vector<NString> is created from
the contents of the deque. The vector is then used to demonstrate
the sorting algorithms:
//: C08:SortTest.cpp // Test different kinds of sorting //{L} ../C07/StreamTokenizer ../TestSuite/Test //{-g++295} //{-msc} #include "../C07/StreamTokenizer.h" #include "NString.h" #include "PrintSequence.h" #include "Generators.h" #include "../require.h" #include <algorithm> #include <fstream> #include <queue> #include <vector> #include <cctype> using namespace std; // For sorting NStrings and ignore string case: struct NoCase { bool operator()( const NString& x, const NString& y) { /* Somthing's wrong with this approach but I can't seem to see it. It would be much faster: const string& lv = x; const string& rv = y; int len = min(lv.size(), rv.size()); for(int i = 0; i < len; i++) if(tolower(lv[i]) < tolower(rv[i])) return true; return false; } */ // Brute force: copy, force to lowercase: string lv(x); string rv(y); lcase(lv); lcase(rv); return lv < rv; } void lcase(string& s) { int n = s.size(); for(int i = 0; i < n; i++) s[i] = tolower(s[i]); } }; int main(int argc, char* argv[]) { char* fname = "SortTest.cpp"; if(argc > 1) fname = argv[1]; ifstream in(fname); assure(in, fname); StreamTokenizer words(in); deque<NString> nstr; string word; while((word = words.next()).size() != 0) nstr.push_back(NString(word)); print(nstr); // Create a vector from the contents of nstr: vector<NString> v(nstr.begin(), nstr.end()); sort(v.begin(), v.end()); print(v, "sort"); // Use an additional comparator object: sort(v.begin(), v.end(), NoCase()); print(v, "sort NoCase"); copy(nstr.begin(), nstr.end(), v.begin()); stable_sort(v.begin(), v.end()); print(v, "stable_sort"); // Use an additional comparator object: stable_sort(v.begin(), v.end(), greater<NString>()); print(v, "stable_sort greater"); copy(nstr.begin(), nstr.end(), v.begin()); // Partial sorts. The additional comparator // versions are obvious and not shown here. partial_sort(v.begin(), v.begin() + v.size()/2, v.end()); print(v, "partial_sort"); // Create a vector with a preallocated size: vector<NString> v2(v.size()/2); partial_sort_copy(v.begin(), v.end(), v2.begin(), v2.end()); print(v2, "partial_sort_copy"); // Finally, the weakest form of ordering: vector<int> v3(20); generate(v3.begin(), v3.end(), URandGen(50)); print(v3, "v3 before nth_element"); int n = 10; vector<int>::iterator vit = v3.begin() + n; nth_element(v3.begin(), vit, v3.end()); cout << "After ordering with nth = " << n << ", nth element is " << v3[n] << endl; print(v3, "v3 after nth_element"); } ///:~
The first class is a binary predicate
used to compare two NString objects while ignoring the case of the
strings. You can pass the object into the various sort routines to
produce an alphabetic sort (rather than the default lexicographic sort, which
has all the capital letters in one group, followed by all the lowercase
letters).
As an example, try the source code for
the above file as input. Because the occurrence numbers are printed along with
the strings you can distinguish between an ordinary sort and a stable sort, and
you can also see what happens during a partial sort (the remaining unsorted
elements are in no particular order). There is no “partial stable
sort.”
You’ll notice that the use of the
second “comparator” forms of the functions are not exhaustively
tested in the above example, but the use of a comparator is the same as in the
first part of the example.
The test of nth_element does not
use the NString objects because it’s simpler to see what’s
going on if ints are used. Notice that, whatever the nth element turns
out to be (which will vary from one run to another because of URandGen),
the elements before that are less, and after that are greater, but the elements
have no particular order other than that. Because of URandGen, there are
no duplicates but if you use a generator that allows duplicates you can see that
the elements before the nth element will be less than or equal to the nth
element.
Once a range is sorted, there are a group
of operations that can be used to find elements within those ranges. In the
following functions, there are always two forms, one that assumes the intrinsic
operator< has been used to perform the sort, and the second that must
be used if some other comparison function object has been used to perform the
sort. You must use the same comparison for locating elements as you do to
perform the sort, otherwise the results are undefined. In addition, if you try
to use these functions on unsorted ranges the results will be
undefined.
bool binary_search(ForwardIterator
first, ForwardIterator last, const T&
value);
bool binary_search(ForwardIterator
first, ForwardIterator last, const T& value,
StrictWeakOrdering
binary_pred);
Tells you whether value appears in
the sorted range [first, last).
ForwardIterator
lower_bound(ForwardIterator first, ForwardIterator last,
const T&
value);
ForwardIterator
lower_bound(ForwardIterator first, ForwardIterator last,
const T& value,
StrictWeakOrdering binary_pred);
Returns an iterator indicating the first
occurrence of value in the sorted range [first, last). Returns
last if value is not found.
ForwardIterator
upper_bound(ForwardIterator first, ForwardIterator last,
const T&
value);
ForwardIterator
upper_bound(ForwardIterator first, ForwardIterator last,
const T& value,
StrictWeakOrdering binary_pred);
Returns an iterator indicating one past
the last occurrence of value in the sorted range [first, last).
Returns last if value is not found.
pair<ForwardIterator,
ForwardIterator>
equal_range(ForwardIterator first, ForwardIterator last,
const T&
value);
pair<ForwardIterator,
ForwardIterator>
equal_range(ForwardIterator first, ForwardIterator last,
const T& value,
StrictWeakOrdering binary_pred);
Essentially combines
lower_bound( ) and upper_bound( ) to return a
pair indicating the first and one-past-the-last occurrences of
value in the sorted range [first, last). Both iterators indicate
last if value is not found.
Here, we can use the approach from the
previous example:
//: C08:SortedSearchTest.cpp //{L} ../C07/StreamTokenizer ../TestSuite/Test // Test searching in sorted ranges //{-g++295} //{-msc} #include "../C07/StreamTokenizer.h" #include "PrintSequence.h" #include "NString.h" #include "../require.h" #include <algorithm> #include <fstream> #include <queue> #include <vector> using namespace std; int main() { ifstream in("SortedSearchTest.cpp"); assure(in, "SortedSearchTest.cpp"); StreamTokenizer words(in); deque<NString> dstr; string word; while((word = words.next()).size() != 0) dstr.push_back(NString(word)); vector<NString> v(dstr.begin(), dstr.end()); sort(v.begin(), v.end()); print(v, "sorted"); typedef vector<NString>::iterator sit; sit it, it2; string f("include"); cout << "binary search: " << binary_search(v.begin(), v.end(), f) << endl; it = lower_bound(v.begin(), v.end(), f); it2 = upper_bound(v.begin(), v.end(), f); print(it, it2, "found range"); pair<sit, sit> ip = equal_range(v.begin(), v.end(), f); print(ip.first, ip.second, "equal_range"); } ///:~
The input is forced to be the source code
for this file because the word “include” will be used for a find
string (since “include” appears many times). The file is tokenized
into words that are placed into a deque (a better container when you
don’t know how much storage to allocate), and left unsorted in the
deque. The deque is copied into a vector via the
appropriate constructor, and the vector is sorted and
printed.
The binary_search( ) function
only tells you if the object is there or not; lower_bound( ) and
upper_bound( ) produce iterators to the beginning and ending
positions where the matching objects appear. The same effect can be produced
more succinctly using equal_range( ) (as shown in the previous
chapter, with multimap and multiset).
As before, the first form of each
function assumes the intrinsic operator< has been used to perform the
sort. The second form must be used if some other comparison function object has
been used to perform the sort. You must use the same comparison for locating
elements as you do to perform the sort, otherwise the results are undefined. In
addition, if you try to use these functions on unsorted ranges the results will
be undefined.
OutputIterator merge(InputIterator1
first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator
result);
OutputIterator
merge(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2
last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Copies elements from [first1,
last1) and [first2, last2) into result, such that the
resulting range is sorted in ascending order. This is a stable
operation.
void
inplace_merge(BidirectionalIterator first,
BidirectionalIterator middle,
BidirectionalIterator last);
void
inplace_merge(BidirectionalIterator first,
BidirectionalIterator middle,
BidirectionalIterator last,
StrictWeakOrdering binary_pred);
This assumes that [first, middle)
and [middle, last) are each sorted ranges. The two ranges are merged so
that the resulting range [first, last) contains the combined ranges in
sorted order.
It’s easier to see what goes on
with merging if ints are used; the following example also emphasizes how
the algorithms (and my own print template) work with arrays as well as
containers.
//: C08:MergeTest.cpp // Test merging in sorted ranges //{L} ../TestSuite/Test #include <algorithm> #include "PrintSequence.h" #include "Generators.h" using namespace std; int main() { const int sz = 15; int a[sz*2] = {0}; // Both ranges go in the same array: generate(a, a + sz, SkipGen(0, 2)); generate(a + sz, a + sz*2, SkipGen(1, 3)); print(a, a + sz, "range1", " "); print(a + sz, a + sz*2, "range2", " "); int b[sz*2] = {0}; // Initialize all to zero merge(a, a + sz, a + sz, a + sz*2, b); print(b, b + sz*2, "merge", " "); // set_union is a merge that removes duplicates set_union(a, a + sz, a + sz, a + sz*2, b); print(b, b + sz*2, "set_union", " "); inplace_merge(a, a + sz, a + sz*2); print(a, a + sz*2, "inplace_merge", " "); } ///:~
In main( ), instead of
creating two separate arrays both ranges will be created end-to-end in the same
array a (this will come in handy for the inplace_merge). The first
call to merge( ) places the result in a different array, b.
For comparison, set_union( ) is also called, which has the same
signature and similar behavior, except that it removes the duplicates. Finally,
inplace_merge( ) is used to combine both parts of
a.
Once ranges have been sorted, you can
perform mathematical set operations on them.
bool includes(InputIterator1 first1,
InputIterator1 last1,
InputIterator2
first2, InputIterator2 last2);
bool
includes (InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2
last2,
StrictWeakOrdering
binary_pred);
Returns true if [first2,
last2) is a subset of [first1, last1). Neither range is required to
hold only unique elements, but if [first2, last2) holds n elements
of a particular value, then [first1, last1) must also hold n
elements if the result is to be true.
OutputIterator
set_union(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2
last2, OutputIterator
result);
OutputIterator
set_union(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2
last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Creates the mathematical union of two
sorted ranges in the result range, returning the end of the output range.
Neither input range is required to hold only unique elements, but if a
particular value appears multiple times in both input sets, then the resulting
set will contain the larger number of identical values.
OutputIterator set_intersection
(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2
last2, OutputIterator
result);
OutputIterator set_intersection
(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2
last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Produces, in result, the
intersection of the two input sets, returning the end of the output range. That
is, the set of values that appear in both input sets. Neither input range is
required to hold only unique elements, but if a particular value appears
multiple times in both input sets, then the resulting set will contain the
smaller number of identical values.
OutputIterator set_difference
(InputIterator1 first1, InputIterator1 last1, InputIterator2 first2,
InputIterator2 last2, OutputIterator
result);
OutputIterator set_difference
(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2
last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Produces, in result, the
mathematical set difference, returning the end of the output range. All the
elements that are in [first1, last1) but not in [first2, last2)
are placed in the result set. Neither input range is required to hold only
unique elements, but if a particular value appears multiple times in both input
sets (n times in set 1 and m times in set 2), then the resulting
set will contain max(n-m, 0) copies of that value.
OutputIterator
set_symmetric_difference(InputIterator1 first1,
InputIterator1 last1, InputIterator2
first2, InputIterator2 last2,
OutputIterator result);
OutputIterator
set_symmetric_difference(InputIterator1 first1,
InputIterator1 last1, InputIterator2
first2, InputIterator2 last2,
OutputIterator result, StrictWeakOrdering binary_pred);
Constructs, in result, the set
containing:
Neither input range is
required to hold only unique elements, but if a particular value appears
multiple times in both input sets (n times in set 1 and m times in
set 2), then the resulting set will contain abs(n-m) copies of that
value, where abs( ) is the absolute value. The return value is the
end of the output range.
It’s easiest to see the set
operations demonstrated using simple vectors of characters, so you view the sets
more easily. These characters are randomly generated and then sorted, but the
duplicates are not removed so you can see what the set operations do when
duplicates are involved.
//: C08:SetOperations.cpp // Set operations on sorted ranges //{L} ../TestSuite/Test //{-msc} #include <vector> #include <algorithm> #include "PrintSequence.h" #include "Generators.h" using namespace std; int main() { vector<char> v(50), v2(50); CharGen g; generate(v.begin(), v.end(), g); generate(v2.begin(), v2.end(), g); sort(v.begin(), v.end()); sort(v2.begin(), v2.end()); print(v, "v", ""); print(v2, "v2", ""); bool b = includes(v.begin(), v.end(), v.begin() + v.size()/2, v.end()); cout << "includes: " << (b ? "true" : "false") << endl; vector<char> v3, v4, v5, v6; set_union(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v3)); print(v3, "set_union", ""); set_intersection(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v4)); print(v4, "set_intersection", ""); set_difference(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v5)); print(v5, "set_difference", ""); set_symmetric_difference(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v6)); print(v6, "set_symmetric_difference",""); } ///:~
After v and v2 are
generated, sorted and printed, the includes( ) algorithm is tested
by seeing if the entire range of v contains the last half of v,
which of course it does so the result should always be true. The vectors
v3, v4, v5 and v6 are created to hold the output of
set_union( ), set_intersection( ),
set_difference( ) and set_symmetric_difference( ), and
the results of each are displayed so you can ponder them and convince yourself
that the algorithms do indeed work as
promised.
The heap operations in the STL are
primarily concerned with the creation of the STL priority_queue, which
provides efficient access to the “largest” element, whatever
“largest” happens to mean for your program. These were discussed in
some detail in the previous chapter, and you can find an example
there.
As with the “sort”
operations, there are two versions of each function, the first that uses the
object’s own operator< to perform the comparison, the second
that uses an additional StrictWeakOrdering object’s
operator( )(a, b) to compare two objects for a <
b.
void make_heap(RandomAccessIterator
first, RandomAccessIterator last);
void
make_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
Turns an arbitrary range into a heap. A
heap is just a range that is organized in a particular way.
void push_heap(RandomAccessIterator
first, RandomAccessIterator last);
void
push_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
Adds the element *(last-1) to the
heap determined by the range [first, last-1). Yes, it seems like an odd
way to do things but remember that the priority_queue container presents
the nice interface to a heap, as shown in the previous chapter.
void pop_heap(RandomAccessIterator
first, RandomAccessIterator last);
void
pop_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
Places the largest element (which is
actually in *first, before the operation, because of the way heaps are
defined) into the position *(last-1) and reorganizes the remaining
range so that it’s still in heap order. If you simply grabbed
*first, the next element would not be the next-largest element so you
must use pop_heap( ) if you want to maintain the heap in its proper
priority-queue order.
void sort_heap(RandomAccessIterator
first, RandomAccessIterator last);
void
sort_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering
binary_pred);
This could be thought of as the
complement of make_heap( ), since it takes a range that is in heap
order and turns it into ordinary sorted order, so it is no longer a heap. That
means that if you call sort_heap( ) you can no longer use
push_heap( ) or pop_heap( ) on that range (rather, you
can use those functions but they won’t do anything sensible). This is not
a stable sort.
These algorithms move through the entire
range and perform an operation on each element. They differ in what they do with
the results of that operation: for_each( ) discards the return value
of the operation (but returns the function object that has been applied to each
element), while transform( ) places the results of each operation
into a destination sequence (which can be the original
sequence).
UnaryFunction for_each(InputIterator
first, InputIterator last, UnaryFunction f);
Applies the function object f to
each element in [first, last), discarding the return value from each
individual application of f. If f is just a function pointer then
you are typically not interested in the return value, but if f is an
object that maintains some internal state it can capture the combined return
value of being applied to the range. The final return value of
for_each( ) is f.
OutputIterator transform(InputIterator
first, InputIterator last,
OutputIterator
result, UnaryFunction f);
OutputIterator
transform(InputIterator1 first, InputIterator1 last,
InputIterator2 first2, OutputIterator
result, BinaryFunction f);
Like for_each( ),
transform( ) applies a function object f to each element in
the range [first, last). However, instead of discarding the result of
each function call, transform( ) copies the result (using
operator=) into *result, incrementing result after each
copy (the sequence pointed to by result must have enough storage,
otherwise you should use an inserter to force insertions instead of
assignments).
The first form of
transform( ) simply calls f( ) and passes it each object
from the input range as an argument. The second form passes an object from the
first input range and one from the second input range as the two arguments to
the binary function f (note the length of the second input range is
determined by the length of the first). The return value in both cases is the
past-the-end iterator for the resulting output range.
Since much of what you do with objects in
a container is to apply an operation to all of those objects, these are fairly
important algorithms and merit several illustrations.
First, consider for_each( ).
This sweeps through the range, pulling out each element and passing it as an
argument as it calls whatever function object it’s been given. Thus
for_each( ) performs operations that you might normally write out by
hand. In Stlshape.cpp, for example:
for(Iter j = shapes.begin(); j != shapes.end(); j++) delete *j;
If you look in your
compiler’s header file at the template defining for_each( ),
you’ll see something like this:
template <class InputIterator, class Function> Function for_each(InputIterator first, InputIterator last, Function f) { while (first != last) f(*first++); return f; }
Function f looks at
first like it must be a pointer to a function which takes, as an argument, an
object of whatever InputIterator selects. However, the above template
actually only says that you must be able to call f using parentheses and
an argument. This is true for a function pointer, but it’s also true for a
function object – any class that defines the appropriate
operator( ). The following example shows several different
ways this template can be expanded. First, we need a class that keeps track of
its objects so we can know that it’s being properly
destroyed:
//: C08:Counted.h // An object that keeps track of itself #ifndef COUNTED_H #define COUNTED_H #include <vector> #include <iostream> class Counted { static int count; char* ident; public: Counted(char* id) : ident(id) { count++; } ~Counted() { std::cout << ident << " count = " << --count << std::endl; } }; int Counted::count = 0; class CountedVector : public std::vector<Counted*> { public: CountedVector(char* id) { for(int i = 0; i < 5; i++) push_back(new Counted(id)); } }; #endif // COUNTED_H ///:~
The class Counted keeps a static
count of how many Counted objects have been created, and tells you as
they are destroyed. In addition, each Counted keeps a char*
identifier to make tracking the output easier.
The CountedVector is inherited
from vector<Counted*>, and in the constructor it creates some
Counted objects, handing each one your desired char*. The
CountedVector makes testing quite simple, as you’ll
see.
//: C08:ForEach.cpp // Use of STL for_each() algorithm //{L} ../TestSuite/Test //{-g++295} //{-g++3} //{-msc} #include "Counted.h" #include <iostream> #include <vector> #include <algorithm> using namespace std; // Simple function: void destroy(Counted* fp) { delete fp; } // Function object: template<class T> class DeleteT { public: void operator()(T* x) { delete x; } }; // Template function: template <class T> void wipe(T* x) { delete x; } int main() { CountedVector A("one"); for_each(A.begin(), A.end(), destroy); CountedVector B("two"); for_each(B.begin(),B.end(),DeleteT<Counted>()); CountedVector C("three"); for_each(C.begin(), C.end(), wipe<Counted>); } ///:~
In main( ), the first
approach is the simple pointer-to-function, which works but has the drawback
that you must write a new Destroy function for each different type. The
obvious solution is to make a template, which is shown in the second approach
with a templatized function object. On the other hand, approach three also makes
sense: template functions work as well.
Since this is obviously something you
might want to do a lot, why not create an algorithm to delete all the
pointers in a container? This was accomplished with the purge( )
template created in the previous chapter. However, that used explicitly-written
code; here, we could use transform( ). The value of
transform( ) over for_each( ) is that
transform( ) assigns the result of calling the function object into
a resulting range, which can actually be the input range. That case means a
literal transformation for the input range, since each element would be a
modification of its previous value. In the above example this would be
especially useful since it’s more appropriate to assign each pointer to
the safe value of zero after calling delete for that pointer.
Transform( ) can easily do this:
//: C08:Transform.cpp // Use of STL transform() algorithm //{L} ../TestSuite/Test //{-msc} #include "Counted.h" #include <iostream> #include <vector> #include <algorithm> using namespace std; template<class T> T* deleteP(T* x) { delete x; return 0; } template<class T> struct Deleter { T* operator()(T* x) { delete x; return 0; } }; int main() { CountedVector cv("one"); transform(cv.begin(), cv.end(), cv.begin(), deleteP<Counted>); CountedVector cv2("two"); transform(cv2.begin(), cv2.end(), cv2.begin(), Deleter<Counted>()); } ///:~
This shows both approaches: using a
template function or a templatized function object. After the call to
transform( ), the vector contains zero pointers, which is safer
since any duplicate deletes will have no effect.
One thing you cannot do is delete
every pointer in a collection without wrapping the call to delete inside
a function or an object. That is, you don’t want to say something like
this:
for_each(a.begin(), a.end(), ptr_fun(operator delete));
You can say it, but what you’ll get
is a sequence of calls to the function that releases the storage. You will not
get the effect of calling delete for each pointer in a, however;
the destructor will not be called. This is typically not what you want, so you
will need wrap your calls to delete.
In the previous example of
for_each( ), the return value of the algorithm was ignored. This
return value is the function that is passed in to for_each( ). If
the function is just a pointer to a function, then the return value is not very
useful, but if it is a function object, then that function object may have
internal member data that it uses to accumulate information about all the
objects that it sees during for_each( ).
For example, consider a simple model of
inventory. Each Inventory object has the type of product it represents
(here, single characters will be used for product names), the quantity of that
product and the price of each item:
//: C08:Inventory.h #ifndef INVENTORY_H #define INVENTORY_H #include <iostream> #include <cstdlib> #include <ctime> class Inventory { char item; int quantity; int value; public: Inventory(char it, int quant, int val) : item(it), quantity(quant), value(val) {} // Synthesized operator= & copy-constructor OK char getItem() const { return item; } int getQuantity() const { return quantity; } void setQuantity(int q) { quantity = q; } int getValue() const { return value; } void setValue(int val) { value = val; } friend std::ostream& operator<<( std::ostream& os, const Inventory& inv) { return os << inv.item << ": " << "quantity " << inv.quantity << ", value " << inv.value; } }; // A generator: struct InvenGen { InvenGen() { std::srand(std::time(0)); } Inventory operator()() { static char c = 'a'; int q = std::rand() % 100; int v = std::rand() % 500; return Inventory(c++, q, v); } }; #endif // INVENTORY_H ///:~
There are member functions to get the
item name, and to get and set quantity and value. An operator<<
prints the Inventory object to an ostream. There’s also a
generator that creates objects that have sequentially-labeled items and random
quantities and values. Note the use of the return value optimization in
operator( ).
To find out the total number of items and
total value, you can create a function object to use with
for_each( ) that has data members to hold the
totals:
//: C08:CalcInventory.cpp // More use of for_each() //{L} ../TestSuite/Test //{-g++295} //{-msc} #include "Inventory.h" #include "PrintSequence.h" #include <vector> #include <algorithm> using namespace std; // To calculate inventory totals: class InvAccum { int quantity; int value; public: InvAccum() : quantity(0), value(0) {} void operator()(const Inventory& inv) { quantity += inv.getQuantity(); value += inv.getQuantity() * inv.getValue(); } friend ostream& operator<<(ostream& os, const InvAccum& ia) { return os << "total quantity: " << ia.quantity << ", total value: " << ia.value; } }; int main() { vector<Inventory> vi; generate_n(back_inserter(vi), 15, InvenGen()); print(vi, "vi"); InvAccum ia = for_each(vi.begin(),vi.end(), InvAccum()); cout << ia << endl; } ///:~
InvAccum’s
operator( ) takes a single argument, as required by
for_each( ). As for_each( ) moves through its range, it
takes each object in that range and passes it to
InvAccum::operator( ), which performs calculations and saves the
result. At the end of this process, for_each( ) returns the
InvAccum object which you can then examine; in this case it is simply
printed.
You can do most things to the
Inventory objects using for_each( ). For example, if you
wanted to increase all the prices by 10%, for_each( ) could do this
handily. But you’ll notice that the Inventory objects have no way
to change the item value. The programmers who designed Inventory
thought this was a good idea, after all, why would you want to change the name
of an item? But marketing has decided that they want a “new,
improved” look by changing all the item names to uppercase; they’ve
done studies and determined that the new names will boost sales (well, marketing
has to have something to do ...). So for_each( ) will not
work here, but transform( ) will:
//: C08:TransformNames.cpp // More use of transform() //{L} ../TestSuite/Test //{-g++295} //{-msc} #include "Inventory.h" #include "PrintSequence.h" #include <vector> #include <algorithm> #include <cctype> using namespace std; struct NewImproved { Inventory operator()(const Inventory& inv) { return Inventory(toupper(inv.getItem()), inv.getQuantity(), inv.getValue()); } }; int main() { vector<Inventory> vi; generate_n(back_inserter(vi), 15, InvenGen()); print(vi, "vi"); transform(vi.begin(), vi.end(), vi.begin(), NewImproved()); print(vi, "vi"); } ///:~
Notice that the resulting range is the
same as the input range, that is, the transformation is performed
in-place.
Now suppose that the sales department
needs to generate special price lists with different discounts for each item.
The original list must stay the same, and there need to be any number of
generated special lists. Sales will give you a separate list of discounts for
each new list. To solve this problem we can use the second version of
transform( ):
//: C08:SpecialList.cpp // Using the second version of transform() //{L} ../TestSuite/Test //{-g++295} //{-msc} #include "Inventory.h" #include "PrintSequence.h" #include <vector> #include <algorithm> #include <cstdlib> #include <ctime> using namespace std; struct Discounter { Inventory operator()(const Inventory& inv, float discount) { return Inventory(inv.getItem(), inv.getQuantity(), inv.getValue() * (1 - discount)); } }; struct DiscGen { DiscGen() { srand(time(0)); } float operator()() { float r = float(rand() % 10); return r / 100.0; } }; int main() { vector<Inventory> vi; generate_n(back_inserter(vi), 15, InvenGen()); print(vi, "vi"); vector<float> disc; generate_n(back_inserter(disc), 15, DiscGen()); print(disc, "Discounts:"); vector<Inventory> discounted; transform(vi.begin(),vi.end(), disc.begin(), back_inserter(discounted), Discounter()); print(discounted, "discounted"); } ///:~
Discounter is a function object
that, given an Inventory object and a discount percentage, produces a new
Inventory with the discounted price. DiscGen just generates random
discount values between 1 and 10 percent to use for testing. In
main( ), two vectors are created, one for Inventory
and one for discounts. These are passed to transform( ) along with a
Discounter object, and transform( ) fills a new
vector<Inventory> called
discounted.
These algorithms are all tucked into the
header <numeric>, since they are primarily useful for performing
numerical calculations.
<numeric>
T
accumulate(InputIterator first, InputIterator last, T
result);
T accumulate(InputIterator first,
InputIterator last, T result,
BinaryFunction f);
The first form is a generalized
summation; for each element pointed to by an iterator i in [first,
last), it performs the operation result = result + *i, where
result is of type T. However, the second form is more general; it
applies the function f(result, *i) on each element *i in the range
from beginning to end. The value result is initialized in both cases by
resultI, and if the range is empty then resultI is
returned.
Note the similarity between the second
form of transform( ) and the second form of
accumulate( ).
<numeric>
T
inner_product(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, T
init);
T inner_product(InputIterator1
first1, InputIterator1 last1,
InputIterator2 first2, T init
BinaryFunction1 op1, BinaryFunction2 op2);
Calculates a generalized inner product of
the two ranges [first1, last1) and [first2, first2 + (last1 -
first1)). The return value is produced by multiplying the element from the
first sequence by the “parallel” element in the second sequence, and
then adding it to the sum. So if you have two sequences {1, 1, 2, 2} and {1, 2,
3, 4} the inner product becomes:
(1*1) + (1*2) + (2*3) + (2*4)
Which is 17. The init argument is
the initial value for the inner product; this is probably zero but may be
anything and is especially important for an empty first sequence, because then
it becomes the default return value. The second sequence must have at least as
many elements as the first.
While the first form is very specifically
mathematical, the second form is simply a multiple application of functions and
could conceivably be used in many other situations. The op1 function is
used in place of addition, and op2 is used instead of multiplication.
Thus, if you applied the second version of inner_product( ) to the
above sequence, the result would be the following operations:
init = op1(init, op2(1,1)); init = op1(init, op2(1,2)); init = op1(init, op2(2,3)); init = op1(init, op2(2,4));
Thus it’s similar to
transform( ) but two operations are performed instead of
one.
<numeric>
OutputIterator
partial_sum(InputIterator first, InputIterator last,
OutputIterator result);
OutputIterator partial_sum(InputIterator
first, InputIterator last,
OutputIterator result, BinaryFunction
op);
Calculates a generalized partial sum.
This means that a new sequence is created, beginning at result, where
each element is the sum of all the elements up to the currently selected element
in [first, last). For example, if the original sequence is {1, 1, 2,
2, 3} then the generated sequence is {1, 1 + 1, 1 + 1 + 2, 1 + 1 + 1 + 2
+ 2, 1 + 1 + 1 + 2 + 2 + 3}, that is, {1, 2, 4, 6,
9}.
In the second version, the binary
function op is used instead of the + operator to take all the
“summation” up to that point and combine it with the new value. For
example, if you use multiplies<int>( ) as the object for the
above sequence, the output is {1, 1, 2, 4, 12}. Note that the first
output value is always the same as the first input value.
The return value is the end of the output
range [result, result + (last - first) ).
<numeric>
OutputIterator
adjacent_difference(InputIterator first, InputIterator last,
OutputIterator
result);
OutputIterator
adjacent_difference(InputIterator first, InputIterator
last,
OutputIterator result,
BinaryFunction op);
Calculates the differences of adjacent
elements throughout the range [first, last). This means that in the new
sequence, the value is the value of the difference of the current element and
the previous element in the original sequence (the first value is the same). For
example, if the original sequence is {1, 1, 2, 2, 3}, the resulting
sequence is {1, 1 – 1, 2 – 1, 2 – 2, 3 – 2}, that
is: {1, 0, 1, 0, 1}.
The second form uses the binary function
op instead of the – operator to perform the
“differencing.” For example, if you use
multiplies<int>( ) as the function object for the above
sequence, the output is {1, 1, 2, 4, 6}.
The return value is the end of the output
range [result, result + (last - first) ).
This program tests all the algorithms in
<numeric> in both forms, on integer arrays. You’ll notice
that in the test of the form where you supply the function or functions, the
function objects used are the ones that produce the same result as form one so
the results produced will be exactly the same. This should also demonstrate a
bit more clearly the operations that are going on, and how to substitute your
own operations.
//: C08:NumericTest.cpp //{L} ../TestSuite/Test //{-g++295} #include "PrintSequence.h" #include <numeric> #include <algorithm> #include <iostream> #include <iterator> #include <functional> using namespace std; int main() { int a[] = { 1, 1, 2, 2, 3, 5, 7, 9, 11, 13 }; const int asz = sizeof a / sizeof a[0]; print(a, a + asz, "a", " "); int r = accumulate(a, a + asz, 0); cout << "accumulate 1: " << r << endl; // Should produce the same result: r = accumulate(a, a + asz, 0, plus<int>()); cout << "accumulate 2: " << r << endl; int b[] = { 1, 2, 3, 4, 1, 2, 3, 4, 1, 2 }; print(b, b + sizeof b / sizeof b[0], "b", " "); r = inner_product(a, a + asz, b, 0); cout << "inner_product 1: " << r << endl; // Should produce the same result: r = inner_product(a, a + asz, b, 0, plus<int>(), multiplies<int>()); cout << "inner_product 2: " << r << endl; int* it = partial_sum(a, a + asz, b); print(b, it, "partial_sum 1", " "); // Should produce the same result: it = partial_sum(a, a + asz, b, plus<int>()); print(b, it, "partial_sum 2", " "); it = adjacent_difference(a, a + asz, b); print(b, it, "adjacent_difference 1"," "); // Should produce the same result: it = adjacent_difference(a, a + asz, b, minus<int>()); print(b, it, "adjacent_difference 2"," "); } ///:~
Note that the return value of
inner_product( ) and partial_sum( ) is the past-the-end
iterator for the resulting sequence, so it is used as the second iterator in the
print( ) function.
Since the second form of each function
allows you to provide your own function object, only the first form of the
functions is purely “numeric.” You could conceivably do some things
that are not intuitively numeric with something like
inner_product( ).
Finally, here are some basic tools that
are used with the other algorithms; you may or may not use them directly
yourself.
<utility>
struct
pair;
make_pair( );
This was described and used in the
previous chapter and in this one. A pair is simply a way to package two
objects (which may be of different types) together into a single object. This is
typically used when you need to return more than one object from a function, but
it can also be used to create a container that holds pair objects, or to
pass more than one object as a single argument. You access the elements by
saying p.first and p.second, where p is the pair
object. The function equal_range( ), described in the last chapter
and in this one, returns its result as a pair of iterators. You can
insert( ) a pair directly into a map or
multimap; a pair is the value_type for those
containers.
If you want to create a pair, you
typically use the template function make_pair( ) rather than
explicitly constructing a pair object.
<iterator>
distance(InputIterator
first, InputIterator last);
Tells you the number of elements between
first and last. More precisely, it returns an integral value that
tells you the number of times first must be incremented before it is
equal to last. No dereferencing of the iterators occurs during this
process.
<iterator>
void
advance(InputIterator& i, Distance n);
Moves the iterator i forward by
the value of n (the iterator can also be moved backward for negative
values of n if the iterator is also a bidirectional iterator). This
algorithm is aware of bidirectional iterators, and will use the most efficient
approach.
<iterator>
back_insert_iterator<Container>
back_inserter(Container&
x);
front_insert_iterator<Container>
front_inserter(Container&
x);
insert_iterator<Container>
inserter(Container& x, Iterator i);
These functions are used to create
iterators for the given containers that will insert elements into the container,
rather than overwrite the existing elements in the container using operator=
(which is the default behavior). Each type of iterator uses a different
operation for insertion: back_insert_iterator uses
push_back( ), front_insert_iterator uses
push_front( ) and insert_iterator uses insert( )
(and thus it can be used with the associative containers, while the other two
can be used with sequence containers). These were shown in some detail in the
previous chapter, and also used in this chapter.
const LessThanComparable&
min(const LessThanComparable& a,
const LessThanComparable& b);
const
T& min(const T& a, const T& b, BinaryPredicate
binary_pred);
Returns the lesser of its two arguments,
or the first argument if the two are equivalent. The first version performs
comparisons using operator< and the second passes both arguments to
binary_pred to perform the comparison.
const LessThanComparable& max(const
LessThanComparable& a,
const
LessThanComparable& b);
const T&
max(const T& a, const T& b, BinaryPredicate
binary_pred);
Exactly like min( ), but
returns the greater of its two arguments.
void swap(Assignable& a,
Assignable& b);
void
iter_swap(ForwardIterator1 a, ForwardIterator2 b);
Exchanges the values of a and
b using assignment. Note that all container classes use specialized
versions of swap( ) that are typically more efficient than this
general version.
Once you become comfortable with the STL
algorithm style, you can begin to create your own STL-style algorithms. Because
these will conform to the format of all the other algorithms in the STL,
they’re easy to use for programmers who are familiar with the STL, and
thus become a way to “extend the STL vocabulary.”
The easiest way to approach the problem
is to go to the <algorithm> header file and find something similar
to what you need, and modify that (virtually all STL implementations provide the
code for the templates directly in the header files). For example, an algorithm
that stands out by its absence is copy_if( ) (the closest
approximation is partition( )), which was used in Binder1.cpp
at the beginning of this chapter, and in several other examples in this chapter.
This will only copy an element if it satisfies a predicate. Here’s an
implementation:
//: C08:copy_if.h // Roll your own STL-style algorithm #ifndef COPY_IF_H #define COPY_IF_H template<typename ForwardIter, typename OutputIter, typename UnaryPred> OutputIter copy_if(ForwardIter begin, ForwardIter end, OutputIter dest, UnaryPred f) { while(begin != end) { if(f(*begin)) *dest++ = *begin; begin++; } return dest; } #endif // COPY_IF_H ///:~
The return value is the past-the-end
iterator for the destination sequence (the copied sequence).
Now that you’re comfortable with
the ideas of the various iterator types, the actual implementation is quite
straightforward. You can imagine creating an entire additional library of your
own useful algorithms that follow the format of the
STL.
The goal of this chapter, and the
previous one, was to give you a programmer’s-depth understanding of the
containers and algorithms in the Standard Template Library. That is, to make you
aware of and comfortable enough with the STL that you begin to use it on a
regular basis (or at least, to think of using it so you can come back here and
hunt for the appropriate solution). It is powerful not only because it’s a
reasonably complete library of tools, but also because it provides a vocabulary
for thinking about problem solutions, and because it is a framework for creating
additional tools.
Although this chapter and the last did
show some examples of creating your own tools, I did not go into the full depth
of the theory of the STL that is necessary to completely understand all the STL
nooks and crannies to allow you to create tools more sophisticated than those
shown here. I did not do this partially because of space limitations, but mostly
because it is beyond the charter of this book; my goal here is to give you
practical understanding that will affect your day-to-day programming
skills.
There are a number of books dedicated
solely to the STL (these are listed in the appendices), but the two that I
learned the most from, in terms of the theory necessary for tool creation, were
first, Generic Programming and the STL by Matthew H. Austern,
Addison-Wesley 1999 (this also covers all the SGI extensions, which Austern was
instrumental in creating), and second (older and somewhat out of date, but still
quite valuable), C++ Programmer’s Guide to the Standard Template
Library by Mark Nelson, IDG press
1995.