Looking under the hood…

Over the last year now, I’ve been attending the UK BSI C++ Panel meetings in London, and it’s been an eye-opening experience. I must confess that I often sit there listening and taking lots of notes of all the stuff I want to look up when I get home (it’s a long list…). But I was having a chat with a friend of mine, to whom I mentioned “I want to get to look under the hood of the language, so I can follow these discussions…”

The glint in his eye should have been my cue to run away, but I didn’t see it, so eager was I to learn more. He gave me some helpful hints as to where to start, and that’s how this blog post came about. I’m going to be writing about reference and value semantics.

Let me say right now, I’m not a deep expert; this article is the result of a lot of reading, and of trying to get my own head around it all.

C++ has value semantics by default, whereas other languages such as Java, Python etc have reference semantics by default. This clearly marks C++ as a different breed of programming language, which also raises some interesting questions (which my new mentor challenged me to think about…):

  • What does it mean to have value semantics by default?
  • How does this make a difference in C++?
  • How does this make C++ different to other languages?
  • What are the implications of these differences in regards to:
    • Performance?
    • Memory usage and allocation?
    • Resource management?

What does it mean to have value semantics by default?

Maybe a good starting point would be “What does value semantics mean?” In its simplest terms, value semantics describes a programming language that is primarily concerned with the value of an object, rather than with the object itself. The objects are used to denote values; we don’t really care about the identity of the object in such a programming language.

Now, it’s important to note that when I speak of objects in C++, I don’t mean the Java or Python definition of an object, which is something that’s an instance of a class, and has methods and such. In C++, an object is a piece of memory that has:

  • an address (@0FC349 for example)
  • a type (int)
  • a value that it stores (42)
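To make those three properties concrete, here is a minimal sketch. The names ObjectInfo and describe are my own inventions for illustration, not anything from the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the three properties of an int object.
struct ObjectInfo {
    const void* address;  // where the object lives in memory
    std::size_t size;     // how many bytes its type (int) occupies
    int value;            // the value it currently holds
};

// Takes the object by reference so we inspect the caller's object, not a copy.
ObjectInfo describe(const int& object) {
    return ObjectInfo{&object, sizeof(object), object};
}

// Small usage example.
void check_describe() {
    int answer = 42;
    ObjectInfo info = describe(answer);
    assert(info.address == &answer);  // same location in memory
    assert(info.size == sizeof(int)); // size follows from the type
    assert(info.value == 42);         // the stored value
}
```

The actual address will differ from run to run, which is why the example compares it with &answer rather than printing a fixed number.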

This leads to another important point to consider: the same sequence of bits stored in memory can be interpreted differently depending on its type. For example, the binary value 01000001 can be seen as 65 if the type is an integer type, yet it can also be interpreted as ‘A’ if the type is a char (assuming ASCII).
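As a rough sketch of that idea, the snippet below stores the bit pattern 01000001 (0x41 in hexadecimal) in one byte and reads it through two types. The function names are hypothetical, and the 'A' result assumes an ASCII-compatible character set.

```cpp
#include <cassert>
#include <cstdint>

// One byte holding the bit pattern 01000001 (0x41 in hexadecimal).
constexpr std::uint8_t kBits = 0x41;

// Read as a number, the pattern is 65.
int as_number() { return static_cast<int>(kBits); }

// Read as a character, the same pattern is 'A'
// (assumption: an ASCII-compatible character set).
char as_character() { return static_cast<char>(kBits); }

// Small usage example.
void check_interpretations() {
    assert(as_number() == 65);
    assert(as_character() == 'A');
}
```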

Now it’s important to note that C++ has value semantics by default. That is to say, there are no keywords or special symbols you need to use, to tell the language that you’re using value semantics.

Consider the following code snippet:

x = y;

What’s going on here? Well the = isn’t an equality operator in C++, it’s an assignment operator. And in this context, the value of y is being copied to x.
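A small sketch of what that copy means in practice: once the assignment is done, the two objects are independent. The function names here are made up for the example.

```cpp
#include <cassert>
#include <vector>

// Built-in types: assignment copies the value, then the objects go their own way.
bool int_copy_is_independent() {
    int y = 42;
    int x = 0;
    x = y;            // x now holds its own copy of 42
    assert(x == 42);
    y = 7;            // changing y afterwards...
    return x == 42 && y == 7;  // ...does not touch x
}

// Standard library containers behave the same way: the whole vector is copied.
bool vector_copy_is_independent() {
    std::vector<int> y{1, 2, 3};
    std::vector<int> x;
    x = y;            // copies all three elements
    y.push_back(4);   // y grows, x does not
    return x.size() == 3 && y.size() == 4;
}
```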

But x isn’t a value; it’s an object. So why isn’t x a value? Well, it’s because it can hold 1 at one moment and 45 at another. So if we want to know the value of x, we need to read the memory at the address where x is held.

So why did C++ go down the route of having value semantics over reference semantics?

  • Allocating on the stack is faster than allocating on the heap.
  • Local values are good for cache locality. If C++ had no value semantics, it wouldn’t be possible to have a contiguous std::vector; you’d simply have an array of pointers to separately allocated objects, which costs an extra indirection per element and scatters the data across memory.
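The two layouts in the second point can be sketched like this. Point, stored_contiguously and make_boxed_points are hypothetical names of mine, and the "boxed" version is only my approximation of what a reference-only language would do behind the scenes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Point { int x; int y; };

// Value semantics: the Points themselves live side by side in one block.
bool stored_contiguously(const std::vector<Point>& points) {
    for (std::size_t i = 1; i < points.size(); ++i) {
        if (&points[i] != &points[i - 1] + 1) return false;
    }
    return true;
}

// Reference-style layout: the vector holds only pointers, and every Point
// is a separate free store allocation that must be followed to reach the data.
std::vector<std::unique_ptr<Point>> make_boxed_points(int count) {
    std::vector<std::unique_ptr<Point>> boxed;
    for (int i = 0; i < count; ++i) {
        boxed.push_back(std::make_unique<Point>(Point{i, i}));
    }
    return boxed;
}

// Small usage example.
void check_layouts() {
    std::vector<Point> by_value{{0, 0}, {1, 1}, {2, 2}};
    assert(stored_contiguously(by_value));
    auto boxed = make_boxed_points(3);
    assert(boxed[2]->x == 2);  // one extra pointer hop per element
}
```

Nothing here measures speed; it only shows where the data ends up, which is what the cache locality argument rests on.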

Why use value semantics then?

You get value semantics by default in C++, but you need to make a specific effort to use reference semantics, by adding a reference or pointer symbol (& or *) to the type.
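Here is a minimal sketch of the three spellings side by side. Outcome and value_vs_reference are names I made up for the example.

```cpp
#include <cassert>

struct Outcome { int original; int copy; };

Outcome value_vs_reference() {
    int original = 1;
    int copy = original;       // value semantics: a brand new int
    int& alias = original;     // reference: another name for `original`
    int* pointer = &original;  // pointer: holds the address of `original`

    copy = 10;      // changes only the copy
    alias = 20;     // changes `original` through the reference
    *pointer += 1;  // changes `original` again through the pointer

    return Outcome{original, copy};
}

// Small usage example.
void check_value_vs_reference() {
    Outcome result = value_vs_reference();
    assert(result.original == 21);  // 20 via the reference, +1 via the pointer
    assert(result.copy == 10);      // never affected by the other two
}
```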

Using value semantics, we avoid a number of memory management issues:

  • No dangling references to a non-existent object
  • No expensive and unnecessary free store allocations.
  • No memory leaks.
  • No smart/dumb pointers
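As a sketch of what that looks like in code, the hypothetical Playlist type below is created, returned and modified purely by value. There is no new or delete anywhere, and the members clean up after themselves when each object goes out of scope.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical type whose members are all values.
struct Playlist {
    std::string name;
    std::vector<std::string> tracks;
};

// Returned by value: the caller gets its own Playlist, nothing can dangle.
Playlist make_playlist() {
    Playlist p{"Practice", {"Scales", "Chords"}};
    return p;
}

// Takes a copy, changes the copy, and hands it back.
Playlist with_extra_track(Playlist p, const std::string& track) {
    p.tracks.push_back(track);
    return p;
}

// Small usage example.
void check_playlists() {
    Playlist first = make_playlist();
    Playlist second = with_extra_track(first, "Solo");
    assert(first.tracks.size() == 2);   // untouched
    assert(second.tracks.size() == 3);  // has the extra track
}
```

The string and vector members may still allocate internally; the point is that the programmer never manages that memory by hand.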

It also helps to avoid reference aliasing issues in multi-threaded environments. Passing by value and ensuring each thread has its own copy of the value helps to prevent data races.

You also don’t need to synchronise on such values, so programs can run faster, and more safely too, since there are no locks on that data to deadlock on.

It’s also beneficial for referential transparency. This means that we don’t get shocks or surprises from a value being changed behind the scenes.

And pass by value is often safer than pass by reference, because the function works on its own copy and cannot accidentally modify the caller’s variables. This makes the language simpler to use, since you don’t have to worry about the variables you pass to a function: you know they won’t be changed, and this is often what’s expected.
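A minimal sketch of that guarantee, with a made-up function name: the parameter is a copy, so whatever the function does to it stays inside the function.

```cpp
#include <cassert>
#include <string>

// `text` is a copy of the caller's string, so edits here stay local.
std::string with_exclamation(std::string text) {
    text += '!';
    return text;
}

// Small usage example.
void check_with_exclamation() {
    std::string greeting = "hello";
    assert(with_exclamation(greeting) == "hello!");
    assert(greeting == "hello");  // the caller's variable is unchanged
}
```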

Then when do we use Reference Semantics?

We use reference semantics when something has to be in the same location in memory each time.  A good example of this would be something like std::cout or any such global.
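Streams are a handy illustration, since standard streams cannot be copied at all and so have to be passed by reference. The sketch below uses a hypothetical greet function and a std::ostringstream so that the output can be checked.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// The stream parameter is a reference: we write to the caller's stream itself.
void greet(std::ostream& out, const std::string& name) {
    out << "Hello, " << name << '\n';
}

// Small usage example.
void check_greet() {
    std::ostringstream captured;
    greet(captured, "world");
    assert(captured.str() == "Hello, world\n");
}
```

With <iostream> included, the same function works with the global stream: greet(std::cout, "world").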

You also use reference semantics when you want a function to modify the value you pass to it, and this is made explicit in C++ by declaring the parameter as a reference (or a pointer).

e.g.

void foo::do_something(int & some_value) {
...
}
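For something you can actually compile, here is a small sketch with free functions. The names add_ten and plus_ten are my own and are not meant to be the body of foo::do_something above.

```cpp
#include <cassert>

// Non-const reference: the function is allowed to change the caller's variable.
void add_ten(int& some_value) {
    some_value += 10;
}

// Const reference: no copy is made, but the function cannot modify the argument.
int plus_ten(const int& some_value) {
    return some_value + 10;
}

// Small usage example.
void check_references() {
    int n = 5;
    add_ten(n);
    assert(n == 15);            // changed through the reference
    assert(plus_ten(n) == 25);
    assert(n == 15);            // untouched by the const reference version
}
```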

This is just a starter-for-10 type article; I will go deeper into this as time goes on and I learn more 🙂

In the meantime, happy coding.

 

About welshboy2008

In my spare time, I read, listen to music, play my guitar among other stuff. I love to write computer programs too. In various languages. But mostly C# and Java. I'm currently learning: PHP, Javascript, XML, ASP.NET, Python, C and Assembler.

Posted on 09/09/2018, in Uncategorized.
