My Public Notepad: STL

Showing posts with label STL. Show all posts

Tuesday, 12 June 2012

Appending string to STL stringstream

std::stringstream constructor has two optional parameters: string that this stream will be initialized with and the buffer mode. Default mode for stringstream is ios::in|ios::out. Put pointer (the one that points to the location in the buffer where the next output operation will start inserting characters) is always set to location 0 upon stream creation regardless of the buffer content. That explains why Test 2 in the code below yields overwriting of the content on the second output.

If we want to initialize stream during its creation and to make next output operation to append characters, we need to move put pointer to the end of the buffer explicitly - by setting buffer open mode to ios::ate (see Test 3).

main.cpp:

Output:

Test 1:
Contained string: "abc"; Output pointer position: 3
Contained string: "abcdef"; Output pointer position: 6

Test 2:
Contained string: "abc"; Output pointer position: 0
Contained string: "def"; Output pointer position: 3

Test 3:
Contained string: "abc"; Output pointer position: 3
Contained string: "abcdef"; Output pointer position: 6

Thursday, 3 May 2012

Observer pattern: from GoF implementation to events and delegates

Let us assume we have some class (Subject) with properties that can be changed. A set of current values of all properties defines Subject's state. Another class (Observer) must be notified each time Subject's state changes (each time some property changes). Observer pattern decouples direct dependencies between (concrete) subject and (concrete) observer by using interfaces.

The following code shows classic (Gang of Four) implementation of the Observer pattern applied to two subjects (PropertyOwner1 and PropertyOwner2) and their observers (PropertyObserver1 and PropertyObserver2):

main.cpp:

Output:

PropertyOwner1: property1's new value is: 1
PropertyObserver1: PropertyOwner1 property1's value is: 1
PropertyObserver2: PropertyOwner1 property2's value is: 0

PropertyOwner1: property2's new value is: 1.2
PropertyObserver1: PropertyOwner1 property1's value is: 1
PropertyObserver2: PropertyOwner1 property2's value is: 1.2

PropertyOwner2: property1's new value is: 1
PropertyObserver1: PropertyOwner2 property2's value is: 0
PropertyObserver2: PropertyOwner2 property1's value is: 1

PropertyOwner2: property2's new value is: 3.4
PropertyObserver1: PropertyOwner2 property2's value is: 3.4
PropertyObserver2: PropertyOwner2 property1's value is: 1

PropertyOwner1: property1's new value is: 2
PropertyObserver2: PropertyOwner1 property2's value is: 1.2

PropertyOwner1: property2's new value is: 4.5
PropertyObserver2: PropertyOwner1 property2's value is: 4.5

As we can see, observer is notified each time any property of the subject changes. E.g. if we look the test where PropertyOwner1's property1 changes to 2 but PropertyObserver2 is notified regardless the fact that it is interested only in PropertyOwner1's property2.

In this model client gets information that the subject's state (some property) has been changed but it does not know exactly which property has been changed. Of course, observer could maintain the history of obtained values for each property and compare new values with the previous ones and thus detect for which property the value has changed but this increases complexity of the observer.

Somehow we need to pass to the observer information about which property has changed. If we assign each property in a system a unique ID we could pass it to the observer as an additional parameter of the Subject::update() method:

main.cpp:

#include
#include
#include

class Subject;

enum PropertyID
{
Property1,
Property2
};

class Observer
{
public:
virtual ~Observer(){}
// pointer to Subject is passed so ConcreteObserver can distinct ConcreteSubjects
virtual void update(Subject*, PropertyID) = 0;
};

class Subject
{
public:
virtual ~Subject(){}

void registerObserver(Observer* const o)
{
std::list::const_iterator it = std::find(observers_.begin(), observers_.end(), o);
if(it != observers_.end())
throw std::runtime_error("Observer already registered");

observers_.push_back(o);
}

void unregisterObserver(Observer* const o)
{
std::list::const_iterator it = std::find(observers_.begin(), observers_.end(), o);
if(it != observers_.end())
observers_.remove(o);
}

void notify(PropertyID propertyID)
{
std::list::const_iterator it = observers_.begin();
for(; it != observers_.end(); ++it)
(*it)->update(this, propertyID);
}

protected:
std::list observers_;
};

// Concrete Subject
class PropertyOwner1 : public Subject
{
int property1_;
float property2_;

public:

PropertyOwner1() : property1_(0), property2_(0.0f){}

void property1(int n)
{
if(n != property1_)
{
property1_ = n;
std::cout << "\nPropertyOwner1: property1's new value is: " << property1_ << std::endl; notify(Property1); } } int property1() const { return property1_;} void property2(float n) { if(n != property2_) { property2_ = n; std::cout << "\nPropertyOwner1: property2's new value is: " << property2_ << std::endl; notify(Property2); } } float property2() const { return property2_;} }; class PropertyOwner2 : public Subject { bool property1_; double property2_; public: PropertyOwner2() : property1_(false), property2_(0.0){} void property1(bool n) { if(n != property1_) { property1_ = n; std::cout << "\nPropertyOwner2: property1's new value is: " << property1_ << std::endl; notify(Property1); } } bool property1() const { return property1_;} void property2(double n) { if(n != property2_) { property2_ = n; std::cout << "\nPropertyOwner2: property2's new value is: " << property2_ << std::endl; notify(Property2); } } double property2() const { return property2_;} }; // Concrete Observer // observes changes in property1 of ConcreteSubject1 and // property2 of ConcreteSubject2 class PropertyObserver1 : public Observer { // ConcreteObserver knows about ConcreteSubjects PropertyOwner1* pPropertyOwner1_; PropertyOwner2* pPropertyOwner2_; public: PropertyObserver1(PropertyOwner1* pConcreteSubject1, PropertyOwner2* pPropertyOwner2) : pPropertyOwner1_(pConcreteSubject1), pPropertyOwner2_(pPropertyOwner2){} void update(Subject* pSubject, PropertyID propertyID) { if(pSubject == pPropertyOwner1_) { if(propertyID == Property1) { int property1 = pPropertyOwner1_->property1();
std::cout << "\tPropertyObserver1: PropertyOwner1 property1's value is: " << property1 << std::endl; } } else if(pSubject == pPropertyOwner2_) { if(propertyID == Property2) { double property2 = pPropertyOwner2_->property2();
std::cout << "\tPropertyObserver1: PropertyOwner2 property2's value is: " << property2 << std::endl; } } } }; // Concrete Observer // observes changes in property2 of ConcreteSubject1 and // property1 of ConcreteSubject2 class PropertyObserver2 : public Observer { PropertyOwner1* pPropertyOwner1_; PropertyOwner2* pPropertyOwner2_; public: PropertyObserver2(PropertyOwner1* pPropertyOwner1, PropertyOwner2* pPropertyOwner2) : pPropertyOwner1_(pPropertyOwner1), pPropertyOwner2_(pPropertyOwner2){} void update(Subject* pSubject, PropertyID propertyID) { if(pSubject == pPropertyOwner1_) { if(propertyID == Property2) { float property2 = pPropertyOwner1_->property2();
std::cout << "\tPropertyObserver2: PropertyOwner1 property2's value is: " << property2 << std::endl; } } else if(pSubject == pPropertyOwner2_) { if(propertyID == Property1) { bool property1 = pPropertyOwner2_->property1();
std::cout << "\tPropertyObserver2: PropertyOwner2 property1's value is: " << property1 << std::endl; } } } };

Output shows that this time each observer knew exactly which property of which subject has been changed:

PropertyOwner1: property1's new value is: 1
PropertyObserver1: PropertyOwner1 property1's value is: 1

PropertyOwner1: property2's new value is: 1.2
PropertyObserver2: PropertyOwner1 property2's value is: 1.2

PropertyOwner2: property1's new value is: 1
PropertyObserver2: PropertyOwner2 property1's value is: 1

PropertyOwner2: property2's new value is: 3.4
PropertyObserver1: PropertyOwner2 property2's value is: 3.4

PropertyOwner1: property1's new value is: 2

PropertyOwner1: property2's new value is: 4.5
PropertyObserver2: PropertyOwner1 property2's value is: 4.5

There is a room for further improvements: notify() (and so observer's update()) function is called each time any property of the subject is changed. Yes, Property ID is passed so observer knows which property has been modified but still - can we avoid calling update() of the observer that is not interested in that particular property?

The solution is to granulate Observer pattern: instead of taking PropertyOwners as concrete subjects, let us move Observer pattern one level down and take PropertyOwners' properties as explicit concrete subjects. This implies that observers will need to register with properties, not with their owning classes - PropertyOwners. Each property will implement Subject interface and maintain its list of observers. When calling notify()/update() it needs to pass information on the type of its owner (PropertyOwner - "implicit concrete subject") so the observer knows which PropertyOwner has changed. On the observer's side we can do the same, granulate the pattern to smaller units which belong to particular class (PropertyObserverOwner - "implicit concrete observer") and which observe particular
property of the particular class. Templates come handy for this implementation:

main.cpp:

#include
#include
#include

// Abstract Observer
template
class PropertyObserver
{
public:
virtual void update(const TPropertyType& newValue) = 0;
};

// Concrete Observer (member of the class that observes a set of properties through a set of
// property concrete observers)
template
class PropertyObserverMember : public PropertyObserver
{
TPropertyObserverOwner* pPropertyObserverOwner_;

typedef void (TPropertyObserverOwner::*TPropertyChangeHandler)(const TPropertyType&);
TPropertyChangeHandler handler_;

public:
PropertyObserverMember(TPropertyObserverOwner* pPropertyObserverOwner, TPropertyChangeHandler handler) :
pPropertyObserverOwner_(pPropertyObserverOwner), handler_(handler){}

void update(const TPropertyType& newValue)
{
(pPropertyObserverOwner_->*handler_)(newValue);
}
};

// Subject
template
class ObservableProperty
{
public:
virtual ~ObservableProperty(){};
void registerObserver(PropertyObserver* const o)
{
std::list* const>::const_iterator it = std::find(observers_.begin(), observers_.end(), o);
if(it != observers_.end())
throw std::runtime_error("Observer already registered");
observers_.push_back(o);
}

void unregisterObserver(PropertyObserver* const o)
{
std::list* const>::const_iterator it = std::find(observers_.begin(), observers_.end(), o);
if(it != observers_.end())
observers_.remove(o);
}

void notify(const TPropertyType& newValue)
{
std::list* const>::const_iterator it = observers_.begin();
for(; it != observers_.end(); ++it)
(*it)->update(newValue);
}

protected:
std::list* const> observers_;
};

// Concrete Subject
template
class Property : public ObservableProperty
{
TPropertyType t_;
TPropertyOwner* owner_;
public:
Property(const TPropertyType& t, TPropertyOwner* owner) : t_(t), owner_(owner){}

void set(const TPropertyType& t)
{
if(t_ != t)
{
t_ = t;

std::cout << typeid(TPropertyOwner).name() << "'s property of type " << typeid(TPropertyType).name() << " has been set to " << t << std::endl; notify(t); } } TPropertyType get() const { return t_; } }; // class that owns concrete subjects struct PropertyOwner1 { Property property1;
Property property2;
//Property property3;

PropertyOwner1() :
property1(0, this),
property2(0.0f, this)/*,
property3("", this)*/
{}
};

// class that owns concrete subjects
struct PropertyOwner2
{
Property property1;
Property property2;

PropertyOwner2() :
property1(false, this),
property2(0.0, this)
{}
};

// class that owns concrete observers and through them
// observes changes in property1 of PropertyOwner1 and property2 of PropertyOwner2
struct PropertyObserverOwner1
{
PropertyObserverMember propertyOwner1_property1_Observer_;
PropertyObserverMember propertyOwner2_property2_Observer_;
//PropertyObserverMember propertyOwner1_property3_Observer_;

PropertyObserverOwner1() :
propertyOwner1_property1_Observer_(this, &PropertyObserverOwner1::OnPropertyOwner1Property1Changed),
propertyOwner2_property2_Observer_(this, &PropertyObserverOwner1::OnPropertyOwner2Property2Changed)/*,
propertyOwner1_property3_Observer_(this, &PropertyObserverOwner1::OnPropertyOwner1Property3Changed)*/
{}

void OnPropertyOwner1Property1Changed(const int& newValue)
{
std::cout << "\tPropertyObserverOwner1::OnPropertyOwner1Property1Changed(): \n\tnew value is: " << newValue << std::endl; } void OnPropertyOwner2Property2Changed(const double& newValue) { std::cout << "\tPropertyObserverOwner1::OnPropertyOwner2Property2Changed(): \n\tnew value is: " << newValue << std::endl; } /* void OnPropertyOwner1Property3Changed(const std::string& newValue) { std::cout << "\tPropertyObserverOwner1::OnPropertyOwner1Property3Changed(): \n\tnew value is: " << newValue << std::endl; } */ }; // class that owns concrete observers and through them // observes changes in property2 of PropertyOwner1 and property1 of PropertyOwner2 struct PropertyObserverOwner2 { PropertyObserverMember propertyOwner1_property2_Observer_;
PropertyObserverMember propertyOwner2_property1_Observer_;

PropertyObserverOwner2() :
propertyOwner1_property2_Observer_(this, &PropertyObserverOwner2::OnPropertyOwner1Property2Changed),
propertyOwner2_property1_Observer_(this, &PropertyObserverOwner2::OnPropertyOwner2Property1Changed){}

void OnPropertyOwner1Property2Changed(const float& newValue)
{
std::cout << "\tPropertyObserverOwner2::OnPropertyOwner1Property2Changed(): \n\tnew value is: " << newValue << std::endl; } void OnPropertyOwner2Property1Changed(const bool& newValue) { std::cout << "\tPropertyObserverOwner2::OnPropertyOwner2Property1Changed() \n\tnew value is: " << newValue << std::endl; } }; int main(int argc, char** argv) { PropertyOwner1 propertyOwner1; PropertyOwner2 propertyOwner2; PropertyObserverOwner1 propertyObserverOwner1; PropertyObserverOwner2 propertyObserverOwner2; // register propertyObserverOwner1's observers propertyOwner1.property1.registerObserver(&propertyObserverOwner1.propertyOwner1_property1_Observer_); propertyOwner2.property2.registerObserver(&propertyObserverOwner1.propertyOwner2_property2_Observer_); // register propertyObserverOwner2's observers propertyOwner1.property2.registerObserver(&propertyObserverOwner2.propertyOwner1_property2_Observer_); propertyOwner2.property1.registerObserver(&propertyObserverOwner2.propertyOwner2_property1_Observer_); propertyOwner1.property1.set(1); propertyOwner1.property2.set(1.2f); propertyOwner2.property1.set(true); propertyOwner2.property2.set(3.4); // unregister propertyObserverOwner1's observers propertyOwner1.property1.unregisterObserver(&propertyObserverOwner1.propertyOwner1_property1_Observer_); propertyOwner2.property2.unregisterObserver(&propertyObserverOwner1.propertyOwner2_property2_Observer_); propertyOwner1.property1.set(2); propertyOwner1.property2.set(4.5f); }

Output shows that this time a single update() call is made for a single property change:

struct PropertyOwner1's property of type int has been set to 1
PropertyObserverOwner1::OnPropertyOwner1Property1Changed():
new value is: 1
struct PropertyOwner1's property of type float has been set to 1.2
PropertyObserverOwner2::OnPropertyOwner1Property2Changed():
new value is: 1.2
struct PropertyOwner2's property of type bool has been set to 1
PropertyObserverOwner2::OnPropertyOwner2Property1Changed()
new value is: 1
struct PropertyOwner2's property of type double has been set to 3.4
PropertyObserverOwner1::OnPropertyOwner2Property2Changed():
new value is: 3.4
struct PropertyOwner1's property of type int has been set to 2
struct PropertyOwner1's property of type float has been set to 4.5
PropertyObserverOwner2::OnPropertyOwner1Property2Changed():
new value is: 4.5

I don't like the previous implementation. It is too complex and its scalability is questionable.

I was always a big fan of C# events and delegates. It is so easy to bind some event with its handler. I tried to implement something similar in C++ and here is the result:

main.cpp:

#include
#include
#include

// use base class to resolve the problem of how to put into collection objects of different types
template
struct PropertyChangedDelegateBase
{
virtual ~PropertyChangedDelegateBase(){};
virtual void operator()(const TPropertyType& t) = 0;
};

template
struct PropertyChangedDelegate : public PropertyChangedDelegateBase
{
THandlerOwner* pHandlerOwner_;

typedef void (THandlerOwner::*TPropertyChangeHandler)(const TPropertyType&);
TPropertyChangeHandler handler_;

public:
PropertyChangedDelegate(THandlerOwner* pHandlerOwner, TPropertyChangeHandler handler) :
pHandlerOwner_(pHandlerOwner), handler_(handler){}

void operator()(const TPropertyType& t)
{
(pHandlerOwner_->*handler_)(t);
}
};

template
class PropertyChangedEvent
{
public:
virtual ~PropertyChangedEvent(){};

void add(PropertyChangedDelegateBase* const d)
{
std::list* const>::const_iterator it = std::find(observers_.begin(), observers_.end(), d);
if(it != observers_.end())
throw std::runtime_error("Observer already registered");

observers_.push_back(d);
}

void remove(PropertyChangedDelegateBase* const d)
{
std::list* const>::const_iterator it = std::find(observers_.begin(), observers_.end(), d);
if(it != observers_.end())
observers_.remove(d);
}

// notify
void operator()(const TPropertyType& newValue)
{
std::list* const>::const_iterator it = observers_.begin();
for(; it != observers_.end(); ++it)
{
(*it)->operator()(newValue);
}
}

protected:
std::list* const> observers_;
};

// class that owns concrete subjects
class PropertyOwner1
{
int property1_;
float property2_;
public:
PropertyChangedEvent property1ChangedEvent;
PropertyChangedEvent property2ChangedEvent;

PropertyOwner1() :
property1_(0),
property2_(0.0f)
{}

int property1() const {return property1_;}
void property1(int n)
{
if(property1_ != n)
{
property1_ = n;
std::cout << "PropertyOwner1::property1(): property1_ set to " << property1_ << std::endl; property1ChangedEvent(property1_); } } float property2() const {return property2_;} void property2(float n) { if(property2_ != n) { property2_ = n; std::cout << "PropertyOwner1::property2(): property2_ set to " << property2_ << std::endl; property2ChangedEvent(property2_); } } }; // class that owns concrete subjects class PropertyOwner2 { bool property1_; double property2_; public: PropertyChangedEvent property1ChangedEvent;
PropertyChangedEvent property2ChangedEvent;

PropertyOwner2() :
property1_(false),
property2_(0.0)
{}

bool property1() const {return property1_;}
void property1(bool n)
{
if(property1_ != n)
{
property1_ = n;
std::cout << "PropertyOwner2::property1(): property1_ set to " << property1_ << std::endl; property1ChangedEvent(property1_); } } double property2() const {return property2_;} void property2(double n) { if(property2_ != n) { property2_ = n; std::cout << "PropertyOwner2::property2(): property2_ set to " << property2_ << std::endl; property2ChangedEvent(property2_); } } }; // class that observes changes in property1 of PropertyOwner1 and property1 of PropertyOwner2 struct PropertyObserver1 { void OnPropertyOwner1Property1Changed(const int& newValue) { std::cout << "\tPropertyObserver1::OnPropertyOwner1Property1Changed(): \n\tnew value is: " << newValue << std::endl; } void OnPropertyOwner2Property1Changed(const bool& newValue) { std::cout << "\tPropertyObserver1::OnPropertyOwner2Property1Changed(): \n\tnew value is: " << newValue << std::endl; } }; // class that observes changes in property2 of PropertyOwner1 and property2 of PropertyOwner2 struct PropertyObserver2 { void OnPropertyOwner1Property2Changed(const float& newValue) { std::cout << "\tPropertyObserver2::OnPropertyOwner1Property2Changed(): \n\tnew value is: " << newValue << std::endl; } void OnPropertyOwner2Property2Changed(const double& newValue) { std::cout << "\tPropertyObserver2::OnPropertyOwner2Property2Changed(): \n\tnew value is: " << newValue << std::endl; } }; int main(int argc, char** argv) { PropertyOwner1 propertyOwner1; PropertyOwner2 propertyOwner2; PropertyObserver1 propertyObserver1; PropertyObserver2 propertyObserver2; // register observers PropertyChangedDelegate delegate1(&propertyObserver1, &PropertyObserver1::OnPropertyOwner1Property1Changed);
propertyOwner1.property1ChangedEvent.add(&delegate1);

PropertyChangedDelegate delegate2(&propertyObserver2, &PropertyObserver2::OnPropertyOwner1Property2Changed);
propertyOwner1.property2ChangedEvent.add(&delegate2);

PropertyChangedDelegate delegate3(&propertyObserver1, &PropertyObserver1::OnPropertyOwner2Property1Changed);
propertyOwner2.property1ChangedEvent.add(&delegate3);

PropertyChangedDelegate delegate4(&propertyObserver2, &PropertyObserver2::OnPropertyOwner2Property2Changed);
propertyOwner2.property2ChangedEvent.add(&delegate4);

propertyOwner1.property1(1);
propertyOwner1.property2(1.2f);

propertyOwner2.property1(true);
propertyOwner2.property2(3.4);

// unregister PropertyObserver1
propertyOwner1.property1ChangedEvent.remove(&delegate1);
propertyOwner2.property1ChangedEvent.remove(&delegate3);

propertyOwner1.property1(2);
propertyOwner1.property2(4.5f);
}

Output:

PropertyOwner1::property1(): property1_ set to 1
PropertyObserver1::OnPropertyOwner1Property1Changed():
new value is: 1
PropertyOwner1::property2(): property2_ set to 1.2
PropertyObserver2::OnPropertyOwner1Property2Changed():
new value is: 1.2
PropertyOwner2::property1(): property1_ set to 1
PropertyObserver1::OnPropertyOwner2Property1Changed():
new value is: 1
PropertyOwner2::property2(): property2_ set to 3.4
PropertyObserver2::OnPropertyOwner2Property2Changed():
new value is: 3.4
PropertyOwner1::property1(): property1_ set to 2
PropertyOwner1::property2(): property2_ set to 4.5
PropertyObserver2::OnPropertyOwner1Property2Changed():
new value is: 4.5

This is exactly what I wanted: observer registered with particular property (event), notified when property's changed and with a knowledge of property's owner and a new value.

Removing STL list elements whilst iterating through it

I have already written about the traps and solutions when we try to remove elements from vectors and maps whilst iterating through them. Today I want to cover the same topic regarding lists.

Let us assume we have a list containing integers and we want to remove all elements that are odd numbers. If not aware of std::list::remove() side effects, the first thing we would try could be the following:

When run, this program removes element with the value 1 but the next time it tries to dereference iterator it crashes with the message:

Microsoft Visual Studio
Unhandled exception at 0x00d5c7a2 in Test.exe: 0xC0000005: Access violation reading location 0xfeeefef6.

The most often source of errors when working with STL containers are invalid iterators. Before calling some function on some STL container, we need to make sure we understand how it affects container's iterators. The latest draft of the new C++ standard (C++11) says (§ 23.3.5.5):

void remove(const T& value);
template void remove_if(Predicate pred);

Effects: Erases all the elements in the list referred by a list iterator i for which the following conditions
hold: *i == value, pred(*i) != false. Invalidates only the iterators and references to the erased
elements.

Throws: Nothing unless an exception is thrown by *i == value or pred(*i) != false.

This explains what happened in our program: remove() invalidated iterator and all subsequent operations on iterator (incrementing, dereferencing) yielded undefined behaviour (crash in our case).

A simple solution to this is to save iterator's value before removing an element, increment it and then remove element on the saved (previous) iterator value. In the next iteration iterator will be pointing to the next element after the removed one:

This time program works as expected and yields the following output:

display_elements()
list[0] = 1
list[1] = 2
list[2] = 3
list[3] = 4

display_elements()
list[0] = 2
list[1] = 4

Instead of using temporary variable to store previous value of the iterator, we can use post-increment operator (this works as function arguments are fully evaluated before a function is called):

Another function that removes an element (or a range of elements) from a list is std::list::erase(). Instead of the element value, an iterator is passes as its argument. Similar to list::remove(), it invalidates the iterator of the erased element.

This yields debug assertion error:

---------------------------
Microsoft Visual C++ Debug Library
---------------------------
Debug Assertion Failed!

Program: C:\test\Tests.exe
File: c:\program files\microsoft visual studio 10.0\vc\include\list
Line: 227

Expression: list iterator not incrementable

For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts.

(Press Retry to debug the application)
---------------------------
Abort Retry Ignore
---------------------------

Applying the same fix like before solves the problem:

If we want to remove from a list all elements that match certain criteria, the best way to achieve that is not to fiddle with iterators and loops but to use std::list::remove_if(). This function uses unary predicate to check the criteria for the current element. In our case the predicate returns true if the value is odd number:

Output:

display_elements()
list[0] = 1
list[1] = 2
list[2] = 3
list[3] = 4

display_elements()
list[0] = 2
list[1] = 4

NOTE: The signature of the predicate does not need to have const &, but the function must not modify the objects passed to it.

To make the code even smaller, we can move the code from a predicate to a lambda expression:

Monday, 30 January 2012

Thread and process synchronisation with semaphores

Semaphore is a synchronisation object that controls resource access (critical section execution) by maintaining the number (count, n) of accessors (threads or processes) (still) allowed to access the resource. While mutex strictly limits access to a single accessor at a time, semaphore allows up to N (N > 0) parallel accessors. N is defined when semaphore is being created and it represents maximum possible value of the semaphore count.

Semaphore count is changed through following operations:

wait() - which, on return, decreases count on successful return (minimum is 0)
release() - which increases count (maximum is N)

Semaphore has two states:

signalled - when 0 < n <= N; wait() does not block
non-signalled - when n == 0; wait() blocks till semaphore gets signalled (or returns on expired timeout)

There are two types of semaphores:

counting - based on the count 0 <= n <= N where N > 1
binary - (specialisation of counting) where N == 1; when in signalled state we say it is unlocked; when in non-signalled state we say it is locked

Semaphores are signalling accessors the right of way - just as traffic semaphores, but unlike them, accessors themselves are controlling when the light for others will become green - through wait() and release() operations. Accessors are wait()-ing for a semaphore. If semaphore is signalled (n > 0), wait() returns immediately, decreasing semaphore count. When accessor (this one or any other) is finished with the shared resource it calls release() on the semaphore, increasing its count. Accessor will block on wait()-ing if semaphore is non-signalled (when maximum allowed number of accessors are sharing the resource). As soon as some accessor finishes the work and releases the semaphore, accessor's wait() will unblock and it will be allowed to access the resource.

Obviously, if N is set to 1, semaphore (called binary semaphore in this case) logically behaves like a mutex, allowing only one accessor at a time. There is a difference between two of them though: only accessor that locked the mutex can unlock it (mutex is owned by the accessor), but any accessor can release semaphore.

Majority of semaphore examples on the internet are focused on consumer-producer problem. I wanted to show use of semaphore on the example of traffic control - something that resembles the real semaphore. So, let's say we have three single-lane, one way roads joining just before a bridge which is one way but has two lanes. To reduce congestion on the bridge, traffic from only two access roads is allowed at a time. There is a semaphore with the red and green light by the each road and once it turns green, it remains in that state for some time period T. First two opened roads get green light first. As soon as the timeout expires for one of those roads and semaphore shows red, semaphore will show green for traffic that has been waiting at the third road.

We can think of roads (traffic on them) as accessors and the bridge as a resource: two roads can lead traffic to the bridge at the same time (two accessors are allowed to access shared resource). Obviously, we will set the maximum value for the semaphore count to 2 in our model. Thread will wait() as long as count is 0 but as soon as it gets increased to 1, wait() returns (decreasing count to 0 again).

This example aims to show how semaphore limits parallel access to the shared resource and how accessors (threads in this case) themselves control semaphore by wait()-ing for the semaphore and release()-ing it.

To stop threads I tend to use event object - not a flag (volatile bool variable). They are thread-safe and thread callbacks don't need to return with delay of one additional wait() cycle in the case when termination has been requested.

I wrapped event and semaphore objects (handles) into RAII-compliant classes - CScopedEvent and CScopedSemaphore.

main.cpp:

Output:

Road 1 opened
0 road(s) is(are) generating traffic...
Road 1 got green light for the next 10 seconds. Generating traffic...
1 road(s) is(are) generating traffic...
Road 2 opened
1 road(s) is(are) generating traffic...
Road 2 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...
Road 3 opened
2 road(s) is(are) generating traffic...

Road 1 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 3 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 2 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 1 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 3 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 2 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 1 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 3 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 2 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 1 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 3 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 2 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 1 got red light. Waiting for a green light...
Road 3 got green light for the next 10 seconds. Generating traffic...
1 road(s) is(are) generating traffic...2 road(s) is(are) generating traffic...

Road 2 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 1 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 3 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 2 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 1 got red light. Waiting for a green light...
Road 3 got green light for the next 10 seconds. Generating traffic...
1 road(s) is(are) generating traffic...
2 road(s) is(are) generating traffic...

Road 2 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 1 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 3 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 2 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...
Closing road 3 ...
Road 3 got request to get closed
Road 3 closed
Closing road 2 ...

Road 1 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 1 got green light for the next 10 seconds. Generating traffic...
2 road(s) is(are) generating traffic...

Road 2 got red light. Waiting for a green light...
1 road(s) is(are) generating traffic...
Road 2 got request to get closed
Road 2 closed
Closing road 1 ...

Road 1 got red light. Waiting for a green light...
0 road(s) is(are) generating traffic...
Road 1 got request to get closed
Road 1 closed

The example above shows how semaphore controls access to the resource by multiple threads. But please note that resource was NOT made thread safe! Semaphore was just allowing up to N threads (2 in our case) to be active at a time (and generate the traffic towards the bridge). If we wanted to limit the number of vehicles on the bridge and to control traffic lights depending on the current bridge load, we would have had to limit the number of active threads to 1. In that case only one road would have green light at a time.

In the next example I want to show how to synchronise multiple processes in accessing shared resource. Let's say we have an app which writes some log into the file and does it in a loop. The code could look like this:

main.cpp:

If run with parameter of value e.g. "012345", this app will create text file with the following content:

test.log:

[PID = 4408] Iteration # 1 012345
[PID = 4408] Iteration # 2 012345
[PID = 4408] Iteration # 3 012345
[PID = 4408] Iteration # 4 012345
...

If we run simultaneously two or more instances of this process, they will all write into the same file, increasing its size with each write operation. Manipulator endl inserts new line character ('\n') at the end of the line and flushes the buffer to the disk. Obviously, before writing to the disk, our file stream object needs to know the current size of the file in order to move write pointer to the file end. If Process2 appends a new line to the file after Process1 reads file size but before Process1 writes into it, Process2 will increase file size but Process1 will know only about the previous file size and start writing at the position set accordingly, effectively overwriting Process2's last written line!

The following code is the content of the script (DOS batch file) which runs three instances of our application, providing each with different argument:

run_processes.bat:

@start Process.exe 012345
@start Process.exe ABCDEFHIJKLM
@start Process.exe 987654321

Arguments are of different length so we can easily detect the place of the corruption in the output file, like this one:

test.log:

...
[PID = 8280] Iteration # 214 ABCDEFHIJKLM
[PID = 8120] Iteration # 208 987654321
M
[PID = 8280] Iteration # 216 ABCDEFHIJKLM
...

What happened here? First of all, we need to know that on Windows, \n read from stream buffer is expanded to \r\n (CR-LF) before writing it to the file on disk. Process 8120 has updated its knowledge of file size but before it wrote iteration #208 log, process 8280 had written its log for iteration #215 so basically we had this:

test.log (showing hidden CR-LF characters):

...
[PID = 8280] Iteration # 214 ABCDEFHIJKLM\r\n
[PID = 8280] Iteration # 215 ABCDEFHIJKLM\r\n
...

Then process 8120 wrote its #208 log, but effectively overwriting 8280's #215, after what 8280 wrote its log #216:

test.log (showing hidden CR-LF characters):

...
[PID = 8280] Iteration # 214 ABCDEFHIJKLM\r\n
[PID = 8120] Iteration # 208 987654321\r\nM\r\n
[PID = 8280] Iteration # 216 ABCDEFHIJKLM\r\n
...

Obviosuly, we need to protect file so only one process is accessing it at a time. We can do that with semaphore or mutex which are shared between multiple processes (and therefore must be named).

In this article I will show how to achieve it with semaphore:

main.cpp:

All processes are competing to get semaphore signal. First process whose wait() returns (decreasing semaphore count by 1 - possibly to minimal value of 0 in which case all other processes block on their wait()) gets a exclusive access to a file and updates its content after which it releases semaphore (increasing its count to 1 again). All processes are competing again and the one whose wait() returns first gets its slot of exclusive access. There is no corruption in the file any more:

test.log:

...
[PID = 6036] Iteration # 213 987654321
[PID = 10940] Iteration # 213 012345
[PID = 11644] Iteration # 213 ABCDEFHIJKLM
[PID = 6036] Iteration # 214 987654321
[PID = 10940] Iteration # 214 012345
[PID = 11644] Iteration # 214 ABCDEFHIJKLM
[PID = 6036] Iteration # 215 987654321
[PID = 10940] Iteration # 215 012345
[PID = 11644] Iteration # 215 ABCDEFHIJKLM
...

Note: Although this example uses semaphore for process synchronisation, mutex is here more natural solution - we don't want to limit number of accessors to several (N) but only to 1. Only process that locks the mutex can unlock it and with binary semaphore we are just emulating this behaviour.

Links and References:
Semaphore Objects (MSDN)
Using Semaphore Objects (MSDN)
Semaphore (Wikipedia)
Windows Thread Synchronization - Synchronization Using Semaphores
Joseph M. Newcomer: Semaphores
Mutex or Semaphore for Performance?
Mutex vs Semaphore

Mutex vs Semaphores
Difference between binary semaphore and mutex

Wednesday, 3 August 2011

The order of elements in STL map

Have you ever been surprised by the order of elements when traversing the map? We insert elements with random keys but elements appear sorted by their keys when we traverse that map:

Output:

Map elements:

m["Austria"] = "Vienna"
m["Belgium"] = "Brussels"
m["Czech Republic"] = "Prague"

Maybe you expected that countries would be listed in the order of insertion, starting with Belgium but no, that wasn't the case. Let's unveil that magic which made our map sorted.

STL map is associative container so it provides an ability for fast retrieval of data based on keys. Fast lookup is provided by the container's implementation. Maps are usually implemented as binary search trees (ordered, sorted binary trees) where position of the new element depends on its key. Elements remain sorted after insertion; they are NOT stored in the order of insertion like those in vector! Insertion order is lost.

When a new element is being added to a map, its key is compared with keys of existing elements in the map. For each key type there must be some function (let's call it is_less) that accepts two objects of that type, compares them and returns true if left one has lower value than the right one. Hey, we used map but we never mentioned is_less in the code! Why? That's because std::map uses its default comparison function, or more precisely, functor.

Map constructor has four parameters of which last two are optional:

Traits argument is (according to this MSDN article) "The type that provides a function object that can compare two element values as sort keys to determine their relative order in the map. This argument is optional and the binary predicate less is the default value".

std::less is default functor used for keys comparison so, in our case, it was std::less⟨std::string⟩ that helped that wizzard to order elements during insertion.

There is NO way of preserving the order of insertion. The only thing we can do is to provide some custom comparison function/functor which will redefine the meaning of the 'is less' relation. It's a bit of cheating but this way we can insert our countries in reverse order. All we need is a functor 'is_less' which returns true when string1 is actually greater than string2:

With the map which uses this custom comparison function the output is:

Map elements:

m["Czech Republic"] = "Prague"
m["Belgium"] = "Brussels"
m["Austria"] = "Vienna"

Monday, 1 August 2011

Removing selected map elements whilst iterating through it

Couple of days ago I wrote about how to remove elements from vector while iterating through it and today I will focus on the same task but performed on STL map data structure.

Let's say we have a dictionary which keys are names of countries and values are names of their capitals. There was an error when capitals were typed so all their names begin with small letters. We want to correct this mistake and make those names to start in capital letters. On the top of this, for some reason, we want to remove those countries which capitals start with a letter 'B'.

Someone might write something like this:

When execution tries to move iterator to the next position after one element has been erased Debugger brings up Debug Assertion window:

In aforementioned article we learned that vector::erase invalidates iterators that point at or after element being deleted. Now we know what could have caused this assertion to fail - invalidated iterator. Documentation says that map::erase invalidates only iterator that points to the element being deleted. That was exactly what happened in our case - iterator became invalid after map::erase so it could not have been increased so to point to the next element.

The trick here is to increase iterator BEFORE calling erase:

Output:

Initial map elements:

m["Austria"] = "vienna"
m["Belgium"] = "brussels"
m["Czech Republic"] = "prague"
m["Denmark"] = "copenhagen"
m["Estonia"] = "tallinn"
m["Finland"] = "helsinki"
m["Greece"] = "athens"
m["Hungary"] = "budapest"
m["Iceland"] = "reykjavik"
m["Latvia"] = "riga"
m["Malta"] = "valletta"
m["United Kingdom"] = "london"

Map elements after deletions:

m["Austria"] = "vienna"
m["Czech Republic"] = "prague"
m["Denmark"] = "copenhagen"
m["Estonia"] = "tallinn"
m["Finland"] = "helsinki"
m["Greece"] = "athens"
m["Iceland"] = "reykjavik"
m["Latvia"] = "riga"
m["Malta"] = "valletta"
m["United Kingdom"] = "london"

Nice! No scary assertion dialogs, capitals with 'B' are removed...err...Why do names of capital cities still start with small letters??? We get object for a given dictionary key and we do CHANGE it, right? Do we? Which object is actually changed? In the example above strCapital is string object, initialized with string returned by dereferencing iterator...but strCapital object is totally independent from the object IN the dictionary! It is a COPY! We do change that local variable but that is just wasted effort as we don't change object in the dictionary! We need to access ORIGINAL object and we can do it either by using pointer to it or its reference. This is C++ program and natural choice is to use reference:

Output is now as expected:

Initial map elements:

m["Austria"] = "vienna"
m["Belgium"] = "brussels"
m["Czech Republic"] = "prague"
m["Denmark"] = "copenhagen"
m["Estonia"] = "tallinn"
m["Finland"] = "helsinki"
m["Greece"] = "athens"
m["Hungary"] = "budapest"
m["Iceland"] = "reykjavik"
m["Latvia"] = "riga"
m["Malta"] = "valletta"
m["United Kingdom"] = "london"

Map elements after deletions:

m["Austria"] = "Vienna"
m["Czech Republic"] = "Prague"
m["Denmark"] = "Copenhagen"
m["Estonia"] = "Tallinn"
m["Finland"] = "Helsinki"
m["Greece"] = "Athens"
m["Iceland"] = "Reykjavik"
m["Latvia"] = "Riga"
m["Malta"] = "Valletta"
m["United Kingdom"] = "London"

Friday, 29 July 2011

Removing selected vector elements whilst iterating through it

Have you ever tried to iterate through vector and remove some elements? For example, you need to remove all odd numbers from a collection stored in STL vector. You might have been tempted to write something like this:

When execution tries to move iterator to the next position after one element has been erased Debugger brings up Debug Assertion window:

Who does _Has_container() belong to and why did this assertion fail?

In a very simplified way, we can say that iterator contains two pointers: one that points to container it iterates through (this pointer is named as _Mycont in the STL code) and another one (_Myptr) which points to the current position (element) in the container.

When some element is removed from the vector, contiguity of its storage must be retained so all subsequent elements are reallocated in memory (shifted towards the empty position) invalidating any iterator pointing to them.

If you check what C++ standard says about this, you will be surpised: ISO/IEC 14882:2003 states (in paragraph 23.2.4.3) that vector::erase "Invalidates all the iterators and references after the point of the erase." But hey, the only iterator we were using is the one that points at the location of erased element, not AFTER it! So our program should be working, isn't it? The problem is that current C++ standard incorrectly describes effect of vector::erase. Draft of the new standard (N3225) is more accurate about this matter and says that vector::erase "invalidates iterators and references at or after the point of the erase."

Ok, so now we know that our iterator has been invalidated. A look at the _Mycont value shows 0 which means that this iterator is not associated with any container.

How can we resolve this situation? We need to look at vector::erase not as an enemy but as a good friend! C++ standard says: "The iterator returned from a.erase(q) points to the element immediately following q prior to the element being erased. If no such element exists, a.end() is returned." So, why don't we just assign our iterator to this value and keep iterating happily?
Following code achieves what we want:

Output:

Initial vector elements:
v[0] = 1
v[1] = 2
v[2] = 3
v[3] = 4
v[4] = 5
v[5] = 6
v[6] = 7
v[7] = 8
v[8] = 9
v[9] = 10

Vector elements after removing odd values:
v[0] = 2
v[1] = 4
v[2] = 6
v[3] = 8
v[4] = 10

I found here and here similar discussion about this topic.

You need to be aware of the fact that after erasing the last element in the vector iterator points to the vector end (a position past the last element), in which case you cannot dereference it, so make sure you check its value if you are going to use it before jumping back to the for loop condition check:

It is always a good idea to reuse ready-made routines which are generic and tested. Dig into 〈algorithm〉 header when solving STL-related problems. In our case, remove_if draws our attention for its name. Basically, this function iterates through original set and shifts towards beginning all elements that don't meet condition by copying them over all elements that meet the condition. Return value is iterator that points to the next element behind this set of 'survived' elements. Range passed to this function still has the same size but all 'survivors' are grouped at its beginning. Elements behind this group become irrelevant (unspecified, to say it accurately) and can be deleted. That is so-called Erase-Remove idiom:

Why does iterator get invalidated after erasing an element?

Erasing an element leaves a gap in the memory and because it is guaranteed that vector keeps all its elements in contiguous memory (like an array) it will shift remaining elements in order to fill that gap. Address of the element that has been deleted is stored inside iterator and that address is now not a valid one. That's why is iterator invalidated completely. Luckily, vector::erase() returns a new valid iterator that points to the next element after the deleted one (or to the vector end) and we can use it safely.

Pages