Friday, 29 July 2011

Removing selected vector elements whilst iterating through it

Have you ever tried to iterate through vector and remove some elements? For example, you need to remove all odd numbers from a collection stored in STL vector. You might have been tempted to write something like this:



When execution tries to move iterator to the next position after one element has been erased Debugger brings up Debug Assertion window:

dbgassertfailed1

Who does _Has_container() belong to and why did this assertion fail?

In a very simplified way, we can say that iterator contains two pointers: one that points to container it iterates through (this pointer is named as _Mycont in the STL code) and another one (_Myptr) which points to the current position (element) in the container.

When some element is removed from the vector, contiguity of its storage must be retained so all subsequent elements are reallocated in memory (shifted towards the empty position) invalidating any iterator pointing to them.

If you check what C++ standard says about this, you will be surpised: ISO/IEC 14882:2003 states (in paragraph 23.2.4.3) that vector::erase "Invalidates all the iterators and references after the point of the erase." But hey, the only iterator we were using is the one that points at the location of erased element, not AFTER it! So our program should be working, isn't it? The problem is that current C++ standard incorrectly describes effect of vector::erase. Draft of the new standard (N3225) is more accurate about this matter and says that vector::erase "invalidates iterators and references at or after the point of the erase."

Ok, so now we know that our iterator has been invalidated. A look at the _Mycont value shows 0 which means that this iterator is not associated with any container.

How can we resolve this situation? We need to look at vector::erase not as an enemy but as a good friend! C++ standard says: "The iterator returned from a.erase(q) points to the element immediately following q prior to the element being erased. If no such element exists, a.end() is returned." So, why don't we just assign our iterator to this value and keep iterating happily?
Following code achieves what we want:



Output:

Initial vector elements:
v[0] = 1
v[1] = 2
v[2] = 3
v[3] = 4
v[4] = 5
v[5] = 6
v[6] = 7
v[7] = 8
v[8] = 9
v[9] = 10

Vector elements after removing odd values:
v[0] = 2
v[1] = 4
v[2] = 6
v[3] = 8
v[4] = 10

I found here and here similar discussion about this topic.

You need to be aware of the fact that after erasing the last element in the vector iterator points to the vector end (a position past the last element), in which case you cannot dereference it, so make sure you check its value if you are going to use it before jumping back to the for loop condition check:



It is always a good idea to reuse ready-made routines which are generic and tested. Dig into 〈algorithm〉 header when solving STL-related problems. In our case, remove_if draws our attention for its name. Basically, this function iterates through original set and shifts towards beginning all elements that don't meet condition by copying them over all elements that meet the condition. Return value is iterator that points to the next element behind this set of 'survived' elements. Range passed to this function still has the same size but all 'survivors' are grouped at its beginning. Elements behind this group become irrelevant (unspecified, to say it accurately) and can be deleted. That is so-called Erase-Remove idiom:



Why does iterator get invalidated after erasing an element?

Erasing an element leaves a gap in the memory and because it is guaranteed that vector keeps all its elements in contiguous memory (like an array) it will shift remaining elements in order to fill that gap. Address of the element that has been deleted is stored inside iterator and that address is now not a valid one. That's why is iterator invalidated completely. Luckily, vector::erase() returns a new valid iterator that points to the next element after the deleted one (or to the vector end) and we can use it safely.

Resource Hacker

Yesterday I needed to add version info to NSIS installer executable and came across some unexpected behaviour of VIProductVersion command. Anders mentioned one handy tool - Resource Hacker so can view file version header (among other resources).

ResourceHacker by Bojan Komazec
ResourceHacker, a photo by Bojan Komazec on Flickr.

Thursday, 28 July 2011

File Version VS Product Version

Right click on a file, select Properties, click on Details tab and you will see File properties. File Version and Product Version are among them but what is the difference between these two?


Software product shipped to customers often comprises multiple files. Files have their own life, updates and patches so their version numbers are independent from product versions. Their numbering scheme can be different from one of the product (package) and this is very often the case when file is shared between multiple products. Product Version depends on changes in its components though.

File Version can increase with each build or after code changes (new features added, bug fixes). Changes in components are reflected in final package version: Major Product Version number increases after new features has been added and its Minor number is usually increased for patch releases.


Wednesday, 27 July 2011

MFC ASSERT trap

ASSERT page in MSDN says that "In the Release version of MFC, ASSERT does not evaluate the expression and thus will not interrupt the program. If the expression must be evaluated regardless of environment, use the VERIFY macro in place of ASSERT". This can be somewhat misleading as ASSERT will definitely not bring up Assert Dialog Box in Release build of your program but can break its logic if expression within ASSERT macro changes program's state. Line with ASSERT is removed in Release build and expression is never executed!

This kind of logical errors can be avoided by using macro VERIFY: it works as ASSERT in Debug mode and expression in its argument is executed in Release mode.


This line is not executed in Release build:

This is safer as foo() will be executed:

VERIFY macro is safe and compact - foo() is always executed: