17 August 2006

A Rant about Const

WARNING: THE FOLLOWING MAY CAUSE BRAIN DAMAGE

This is an unfinished essay that sat in draft form in my bog, unpublished, for several years. It seems very unfinished, but I decided to go ahead and publish it anyway. I'm not sure it actually makes sense; it certainly needs some revision. It's possibly my thinking has just been too altered by working with pure functional programming languages such as Haskell to really understand the C++ mindset anymore. There may indeed be a point buried in here somewhere, if only I can determine what it is, but it might be so trivial or pedantic as to be uninteresting to anyone but me. But I'll let you the reader decide that.

A while back I was thinking about why I couldn't qualify the return value of a function -- in this case an object -- with const. When compiling some code with GCC, such a qualifier produced a warning, somehting like "type qualifier on return type is meaningless." GCC was calling my code meaningless.

Maybe it isn't sensible, and maybe the compiler can't enforce it, but it seemed to express what I meant, so I looked this up in the comp.lang.c++ newsgroups. What I keep reading is that returning a const value, when the value is int or float or some other plain built-in data type, as opposed to a class type, is not "meaningful."

Either I'm too dumb to understand this argument or the argument isn't actually coherent, it is just widely propagated because everyone mistakes the existing C/C++ runtime behavior for a truly sensible model of language semantics.

What they seem to be actually saying is "according to the language standard, C++ is not allowed to enforce meaningful const semantics on return types that are not reference or pointer types. And pointer types are hopelessly unsafe. But references can be pretty unsafe too."

There was a thread about this on comp.lang.c++, specifically about why you can't return a const int. One of the comments was:

"A value returned by a function is a rvalue, unless it is a reference. The notion of being (or not being) const-qualified is not applicable to rvalues of non-class types ('int' in your case). A rvalue cannot be modified just because it is a rvlaue. That's it. It doesn't make any difference whether you const-qualify it or not. That's what your compiler is trying to tell you."

See: http://groups.google.com/group/comp.lang.c++/browse_thread/thread/4e50f8abf9c11f71/6039d5b2e6612df0%236039d5b2e6612df0

As far as I can tell, the C++ standard specifies this because of the way returning values from functions has usually been implemented in C/C++ (by copying on the stack). So the caller isn't actually getting the "same" object or integer value, and in fact the original will be gone.

What this means is actually that you can freely change the return value:

int y = f(x);
y = y + 1;

But this is not actually "changing the rvalue" because the rvalue was actually copied into y; you're just changing a copy. So the statement above is correct in that the rvalue "cannot be modified." (It means that only indirectly).

In cases even when the compiler puts the return values in registers, or wherever, the standard rules about lvalues and rvalues still apply, for sanity and safety and consistency.

But it is not what I want to express when I use const with a return value -- consider it an annotation, if you will, rather than something the compiler will enforce. In the methods that generated warnings, I was using const return values to mean:

Case 1: what I'm returning is the result of a mathematical function call: f(x) always returns the same value for the given x, in a functional programming sense, so it wouldn't make sense for you to change the value I'm returning to you. If my function f is f(x) = 1 + x, then f(1) = 2. It doesn't make any sense to then say f(1) = 3 later. But of course this makes more sense in Haskell, where variables don't "vary," than in C or C++.

Case 2: what I'm returning is the result of an iterator and represents the state of another object at a given moment in the life of the program. Therefore, you shouldn't change the value itself; if you want to update it, call the iterator to get a new updated value. If say iterator_state = iterator:GetNextValue() and get a new iterator_state back, it doesn't make any sense to say iterator_state = iterator_state + 1 without calling GetNextValue(). Of course, it expresses something in C++, but your program is confusing. It would be less confusing, readability-wise, if you explicitly copied iterator_state to a working variable before altering it.

It seems to me that I ought to be able to expressthe notion that the caller of one of my object's methods should not update the object I'm returning, whether "returning" means binding it via a pointer, a reference (hidden pointer), or by copying the rvalue.

If it is a const u32_t I'm returning, the compiler should insist that the value be assigned only to a const u32_t. (It doesn't).

If I'm returning a const object, the compiler, it seems to me that the compiler should be able to require that it is assigned to a const variable. (It doesn't, but at least it doesn't generate the warning in those cases). This is because returning an object on the stack does not constitute returning an "rvalue."

This is, as far as I can tell, a completely arbitrary distinction, with the sole exception that copying an object gets to invoke the copy constructor (default, or one you provide), whereas copying a built-in type just does it for you.

In a similar thread Greg Comeau wrote:

"Also, returning a const T, where T IS NOT A BUILTIN TYPE many not be meaningless, because you may end up calling member functions on that object, and the member function may need to be a const member function."

It is "meaningful" in the sense that you may want to put the returned object into a const variable to _prevent_ calling any non-const member functions. But Greg phrased it backwards. You can always call const member functions on non-const objects. He should have said "you may end up calling member functions on that object, and you might want to prevent the calling of non-const member functions."

Example:

static AZMMultiZoneRef_c const MakeFromZonesBitmap( u32_t zones_bmp );
const AZMMultiZoneRef_c zones_ref = AZMMultiZoneRef_c::MakeFromZonesBitmap( p_data->zones_bitmap );
u32_t AZMMultiZoneRef_c::MutateMe( void_t );
zones_ref.MutateMe();

The compiler will (rightly) complain about this. But it will not complain about calling const member functions on a non-const object. The intent is that you should not be able to mutate an object that is const, except using the mutable workaround to allow hidden state changes.

However, the rules for returned objects let us completely get around this! If I don't like that restriction, I can just assign the return value to a non-const object. So this is legal:

AZMMultiZoneRef_c zones_ref = AZMMultiZoneRef_c::MakeFromZonesBitmap( p_data->zones_bitmap );
zones_ref_2.MutateMe();

Not much of a restriction, then, is it?

OK, so let's say we're more serious about enforcing const. So, we'll use a pointer or reference instead, because that will maintain the const enforcement restrictions, right? BUT... the semantics are actually quite different, and there is a serious safety hole introduced:

"When returning references, a reference bound to the return value is not valid after object deletion; when returning values, a reference bound to the return value is valid after object deletion."

And, of course, if what you are returning a pointer or reference to is a an object in a local variable, that is a seriously broken program right there. So it is kind of a "damned if I do, damned if I don't" situation.

C++ is actively hostile to functional programming. In fact, the "safety features" of the language can't even be used consistently. And with this compiler, when you try to express what you mean (taken as a given that the compiler won't enforce it), it treats it as a warning!

No comments: