C++ lvalues and rvalues notes

On a very basic level:

  • an lvalue is an expression referring to an object
  • an object is a region of storage
  • an rvalue is simply an expression that’s not an lvalue

For example:

int n; // a definition for an integer object named n
n = 1; // an assignment expression

n in this case is a sub-expression referring to an int object. It’s an lvalue. 1 is a sub-expression not referring to an object. It’s an rvalue.

x[i + 1] = abs(p->value);

x[i + 1] is an expression. So is abs(p->value). For the assignment to be valid, the left operand must be an lvalue: it must refer to an object. The right operand can be either an lvalue or an rvalue, so it can be any expression.
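The rules so far can be checked at compile time. This is a minimal sketch, assuming a C++11 compiler. The IS_LVALUE macro and the function name are hypothetical helpers, not part of the standard library; the macro relies on decltype((e)) producing a reference type only when e is an lvalue.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper macro (not from the post): true when expression e is an
// lvalue. decltype((e)) is T& for an lvalue and a non-reference T for a prvalue.
#define IS_LVALUE(e) (std::is_lvalue_reference<decltype((e))>::value)

// Returns true when every check agrees with the rules described above.
inline bool check_basic_categories() {
    int n = 0;
    int x[4] = {0, 0, 0, 0};
    int i = 1;

    n = 1;         // okay: the left operand n is an lvalue
    x[i + 1] = n;  // okay: x[i + 1] designates an element of x
    assert(x[2] == 1);

    return IS_LVALUE(n)         // n refers to an object
        && !IS_LVALUE(1)        // a literal does not
        && IS_LVALUE(x[i + 1])  // a subscript expression does
        && !IS_LVALUE(i + 1);   // an arithmetic result does not
}
```

Calling check_basic_categories() from main should return true on a conforming compiler.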

Why do we even need a distinction between lvalues and rvalues?

  • Compilers can assume that rvalues don’t necessarily occupy storage
  • This offers considerable freedom in generating code for rvalue expressions

Considering:

int n; // declaration for an integer object named n
n = 1; // assignment expression

A compiler might represent 1 as named data storage initialized with the value 1 (as if 1 were an lvalue):

one:
    .word 1

The compiler would generate code to copy from that initialized storage to the storage allocated for n:

mov n, one

Many machines, however, have immediate-mode addressing, where a source operand value can be part of an instruction:

mov n, #1

In this case, the rvalue 1 never appears as an object in the data space. Rather, it appears as part of an instruction in the code space.

On some machines the way to put 1 into an object may be to clear the object and then increment it.

clr n
inc n

The data representing the values of 0 and 1 do not appear in either the source or object code.

Now suppose we write something obviously wrong like:

1 = n; // obviously an error

Why exactly does C++ reject it as an error? It breaks the rule that the left operand of an assignment must be an lvalue, and 1 is an rvalue.

An lvalue can appear on either side of an assignment, as in:

int m, n;
m = n;

Obviously, you can assign the value in n to the object designated by m. This assignment uses the lvalue expression n as an rvalue. Officially, C++ performs an lvalue-to-rvalue conversion.

Both operands of the binary addition operator must be expressions (with suitable types). But each operand can be either an lvalue or an rvalue:

int x;

x + 2 // lvalue + rvalue
2 + x // rvalue + lvalue

What about the result? An expression such as m + n actually places its result in a compiler-generated temporary object, often a CPU register. Such temporary objects are rvalues.

For example, this is obviously an error:

m + 1 = n; // error

The + operator has higher precedence than =, so the assignment expression is equivalent to (m + 1) = n;, which is an error because m + 1 yields an rvalue.
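A small sketch of the same point, assuming C++11. It uses the standard std::is_assignable trait to model the two assignment shapes without writing code that fails to compile; the function and variable names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

inline bool check_sum_is_rvalue() {
    int m = 1, n = 2;
    int sum = m + n;  // okay: an rvalue may appear on the right of =
    // (m + 1) = n;   // would not compile: the left operand is an rvalue
    assert(sum == 3);

    // decltype((e)) is a non-reference type when e is a prvalue.
    constexpr bool sum_is_prvalue =
        std::is_same<decltype((m + n)), int>::value;
    // is_assignable<int &, int> models "lvalue = rvalue";
    // is_assignable<int, int> models "rvalue = rvalue".
    constexpr bool lvalue_on_left = std::is_assignable<int &, int>::value;
    constexpr bool rvalue_on_left = std::is_assignable<int, int>::value;

    return sum_is_prvalue && lvalue_on_left && !rvalue_on_left;
}
```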

Another example to consider is the unary & operator. &e is a valid expression only if e is an lvalue. So &3 is an error: 3 does not refer to an object, so it’s not addressable. Although the operand must be an lvalue, the result is an rvalue. For example:

int n, *p;

p = &n; // okay as n is an lvalue
&n = p; // error as &n is an rvalue

In contrast, the unary * yields an lvalue. A pointer p can point to an object, so *p is an lvalue.

int a[N];
int *p = a;
char *s = nullptr;

*p = 3; // okay, *p is an lvalue
*s = '\0'; // undefined behavior

*s is an lvalue even if s is null. If s is null, evaluating *s causes undefined behavior.
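The asymmetry between unary & and unary * can be confirmed with decltype. This is a sketch assuming C++11; the function name is hypothetical, and it deliberately avoids dereferencing a null pointer.

```cpp
#include <cassert>
#include <type_traits>

inline bool check_address_and_indirection() {
    int n = 0;
    int *p = &n;  // okay: n is an lvalue; &n is an rvalue of type int *
    *p = 3;       // okay: *p is an lvalue designating n
    assert(n == 3);

    // &n has a non-reference type, so it is an rvalue.
    constexpr bool address_is_rvalue =
        std::is_same<decltype((&n)), int *>::value;
    // *p has type int &, so it is an lvalue.
    constexpr bool indirection_is_lvalue =
        std::is_same<decltype((*p)), int &>::value;
    return address_is_rvalue && indirection_is_lvalue;
}
```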

In theory, rvalues don’t occupy data storage in the object program. In reality, some might. C++ however insists that we program as if non-class rvalues don’t occupy storage. Conceptually, lvalues occupy data storage. In truth, the optimizer might eliminate some of them. C++ lets us assume that lvalues always do occupy storage.

Additionally, not all lvalues can appear on the left of an assignment. An lvalue is non-modifiable if it has a const-qualified type. For example:

char const name[] = "dan";
name[0] = 'D'; // error - name[0] is const

Each element of a const array is itself const.
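As a rough check, assuming C++11, the element type can be inspected with decltype and std::is_assignable. The function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

inline bool check_const_elements() {
    char const name[] = "dan";
    // name[0] = 'D';  // would not compile: name[0] is a non-modifiable lvalue
    assert(name[0] == 'd');

    // The element is an lvalue (a reference type) whose type is const.
    constexpr bool element_is_const_lvalue =
        std::is_same<decltype((name[0])), char const &>::value;
    constexpr bool element_is_assignable =
        std::is_assignable<decltype((name[0])), char>::value;
    return element_is_const_lvalue && !element_is_assignable;
}
```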

Lvalues and rvalues also provide a vocabulary for describing subtle behavioral differences such as between enumeration constants and const objects.

For example:

enum { MAX = 100 };

This MAX is a constant of an unnamed enumeration type. Unscoped enumeration values implicitly convert to integer. So, when MAX appears in an expression, it yields an integer rvalue. Thus you can’t assign to it, nor can you take its address.

MAX += 3; // error, MAX is an rvalue
int *p = &MAX; // error, MAX is an rvalue

On the other hand, suppose MAX is a const-qualified object:

int const MAX = 100;

When it appears in an expression, it’s a non-modifiable lvalue. Thus, you can’t assign to it, but you can take its address:

MAX += 3; // error, MAX is non-modifiable
int const *p = &MAX; // this is okay, MAX is an lvalue
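The two flavors of MAX can be compared side by side. In this sketch (assuming C++11) the names MAX_ENUM and MAX_OBJECT are hypothetical, chosen only so both declarations can live in one file.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical names so both flavors of MAX can coexist in one file.
enum { MAX_ENUM = 100 };
int const MAX_OBJECT = 100;

inline bool check_enum_versus_const() {
    int const *p = &MAX_OBJECT;   // okay: MAX_OBJECT is an lvalue
    // int const *q = &MAX_ENUM;  // would not compile: MAX_ENUM is an rvalue
    assert(*p == 100);

    // The enumerator has a non-reference type, so it is an rvalue.
    constexpr bool enumerator_is_rvalue =
        !std::is_reference<decltype((MAX_ENUM))>::value;
    // The const object is an lvalue of const-qualified type.
    constexpr bool object_is_const_lvalue =
        std::is_same<decltype((MAX_OBJECT)), int const &>::value;
    return enumerator_is_rvalue && object_is_const_lvalue;
}
```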

All of these concepts of lvalues and rvalues help explain C++ reference types. References provide an alternative to pointers as a way of associating names with objects. C++ libraries often use references instead of pointers as function parameters and return types.

Consider the following:

int i; // define i as an integer object
int &ri = i; // define ri as a reference to int

Here, the reference ri is an alias for i.

A reference is essentially a pointer that’s automatically dereferenced each time it’s used. You can rewrite most, if not all, code that uses a reference as code that uses a const pointer, as in:

reference notation    equivalent pointer notation
int &ri = i;          int *const cpi = &i;
ri = 4;               *cpi = 4;
int j = ri + 2;       int j = *cpi + 2;

A reference acts like a const pointer that’s dereferenced whenever you touch it. A reference yields an lvalue.
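A minimal runnable sketch of that equivalence, using the same names as the comparison above. The function name is hypothetical.

```cpp
#include <cassert>

inline bool check_reference_as_pointer() {
    int i = 0;
    int &ri = i;          // reference notation
    int *const cpi = &i;  // equivalent pointer notation

    ri = 4;               // writes through the reference
    assert(i == 4);
    *cpi = 7;             // writes through the pointer
    assert(ri == 7);

    return &ri == cpi;    // both designate the same object
}
```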

So why even use references?

References can provide better function interfaces. More specifically, C++ has references so that overloaded operators can look like built-in operators.
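To illustrate, here is a sketch with a hypothetical Vec2 type (not from the post). Reference parameters let the overloaded operators be called with the same syntax as the built-in ones, with no & at the call site.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Vec2 {
    int x;
    int y;
};

// The left operand is a non-const reference so the caller's object is updated;
// returning a reference makes the result an lvalue, as with built-in +=.
inline Vec2 &operator+=(Vec2 &lhs, Vec2 const &rhs) {
    lhs.x += rhs.x;
    lhs.y += rhs.y;
    return lhs;
}

// Returns a new value, so a + b is an rvalue, as with built-in +.
inline Vec2 operator+(Vec2 lhs, Vec2 const &rhs) {
    lhs += rhs;
    return lhs;
}

inline bool check_vec2_operators() {
    Vec2 a = {1, 2};
    Vec2 b = {3, 4};
    a += b;          // reads like the built-in operator
    Vec2 c = a + b;  // a and b are left unchanged by +
    assert(a.x == 4 && a.y == 6);
    return c.x == 7 && c.y == 10;
}
```

With pointer parameters instead, the call would have to be spelled something like &a += &b, which no longer resembles arithmetic on built-in types.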

A common decision when writing a function is whether to pass an argument x by reference to const or by value. Either way, calling the function won’t alter the actual argument x. If x is passed by value, the function only has access to a copy of x, not x itself. If it is passed by reference to const, the function parameter is declared as non-modifiable. So how do we choose?

Well, it all depends on performance. Passing by reference to const might be much more efficient than passing by value. It depends on the cost to make a copy.
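One way to see the cost difference is to count copies. This is a sketch with a hypothetical Tracked type whose copy constructor increments a counter; real measurements depend on the type and the optimizer.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied.
struct Tracked {
    static int copies;
    int payload;
    explicit Tracked(int p) : payload(p) {}
    Tracked(Tracked const &other) : payload(other.payload) { ++copies; }
};
int Tracked::copies = 0;

inline int by_value(Tracked t) { return t.payload; }
inline int by_reference_to_const(Tracked const &t) { return t.payload; }

inline bool check_copy_counts() {
    Tracked x(42);
    Tracked::copies = 0;

    int a = by_reference_to_const(x);    // binds directly to x, no copy
    int after_reference = Tracked::copies;
    int b = by_value(x);                 // copy-constructs the parameter from x
    int after_value = Tracked::copies;

    assert(a == 42 && b == 42);
    return after_reference == 0 && after_value == 1;
}
```

For a type as small as int the copy is essentially free, so passing by value is usually fine; for a large container the single copy counted here may dominate the call.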

Whereas an lvalue reference declaration uses &, an rvalue reference declaration uses &&. For example, as seen in the RAII post, modern C++ uses rvalue references to implement move operations that can avoid unnecessary copying.
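A short sketch of how overload resolution tells the two apart, assuming C++11. The function names are hypothetical; std::string and std::move are the standard ones.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Overload chosen for lvalue arguments.
inline bool binds_to_rvalue(std::string const &) { return false; }
// Overload chosen for rvalue arguments (temporaries and std::move results).
inline bool binds_to_rvalue(std::string &&) { return true; }

inline bool check_rvalue_reference_overloads() {
    std::string s = "named";

    bool from_lvalue = binds_to_rvalue(s);                      // lvalue overload
    bool from_temporary = binds_to_rvalue(std::string("tmp"));  // rvalue overload
    bool from_move = binds_to_rvalue(std::move(s));             // rvalue overload

    // Binding alone moves nothing; the move happens in a move constructor.
    std::string t = std::move(s);  // t takes over the contents of s
    assert(t == "named");

    return !from_lvalue && from_temporary && from_move;
}
```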

C++11 also introduces a further classification:

  • glvalue - a generalized lvalue
  • prvalue - a pure rvalue
  • xvalue - an expiring value
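
The finer categories can be told apart with decltype, which yields T& for an lvalue, T&& for an xvalue, and plain T for a prvalue. The macros below are hypothetical helpers, offered as a sketch assuming C++11.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper macros (not part of the standard library).
#define IS_LVALUE_EXPR(e)  (std::is_lvalue_reference<decltype((e))>::value)
#define IS_XVALUE_EXPR(e)  (std::is_rvalue_reference<decltype((e))>::value)
#define IS_PRVALUE_EXPR(e) (!std::is_reference<decltype((e))>::value)

inline bool check_value_categories() {
    int n = 0;
    (void)n;  // n is only inspected inside unevaluated decltype operands

    return IS_LVALUE_EXPR(n)              // lvalue: a glvalue that is not expiring
        && IS_PRVALUE_EXPR(n + 1)         // prvalue: a pure rvalue
        && IS_XVALUE_EXPR(std::move(n));  // xvalue: both a glvalue and an rvalue
}
```
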
This post is licensed under CC BY 4.0 by the author.
