Universal References

I recently had occasion to view Scott Meyers' presentation on MSDN Channel 9 about ‘universal references’; he also has a written version of his talk on the Standard C++ Foundation Web site. The ‘universal reference’ is not a concept you will see defined in the C++ standard, nor is it something that has any objective existence in the language or the compilation process. It is a construct defined by Meyers in an attempt to make sense of behavior in the language that he presents as unexpected or even mysterious. On closer inspection, however, I find that the supposedly mysterious behavior is readily explained and has an existing analog in behavior that is already intuitively understood.

The Mystery

In a nutshell, Meyers himself summarizes the mystery in the subtitle of his article:

“T&& Doesn't Always Mean ‘Rvalue Reference’”

In other words, if you see “T&&” then this may be an rvalue reference but, under certain circumstances, will actually be an lvalue reference. One example given to illustrate his point: with the declaration

template<typename T> void f(T &&t);

then in a call of f() where the actual argument is an rvalue:

f(10);

the type of the formal argument is indeed an rvalue reference T&&; whereas in the following call, where the actual argument is an lvalue:

int i = 11;
f(i);

the type of the formal argument is actually an lvalue reference, int&. At first blush this perhaps seems difficult to believe, but this is correct behavior that can be readily verified in a source debugger. In fact, two different functions are instantiated: f<int>(int&&) for the first call and f<int&>(int&) for the second.

However, given

template<typename T> void g(std::vector<T> &&param);

the parameter is always an rvalue reference. So, what gives?

Note that the ‘mystery’ pertains to the type of the argument; it has nothing to do with the fact that inside the function, a reference argument used as an expression is always an lvalue (because it is named), whatever kind of reference it was declared as.

Universal References

Meyers' attempt to bring order to this confusion involves a new concept he calls a ‘universal reference’: a new kind of reference which, depending on context, may ‘become’ an lvalue or an rvalue reference. He provides a rule of thumb for when they appear:

If a variable or parameter is declared to have type T&& for some deduced type T, that variable or parameter is a universal reference.

It isn't a complete definition, for one thing because it uses the term ‘deduced type’, which is not itself defined in his article and which I don't find intuitive. For instance, in a template declaration that is parameterized on some typename T, T itself would be considered ‘deduced’ whereas std::vector<T> would not. Since it seems to me that it is absolutely possible for the compiler to deduce that std::vector<T> would instantiate as std::vector<int> in some particular use of that declaration, the distinction is somewhat lost on me.

The properties of the ‘universal reference’ are such that it becomes the kind of reference that it is initialized with. Given the declaration of the function template f() above, where T is considered to be a ‘deduced type’, T&& is a ‘universal reference’. Therefore, in f(10), the type of the function argument is the rvalue reference int&& because 10 is an rvalue. Similarly, in f(i) the argument type is the lvalue reference int& because i is an lvalue.

The Truth

At the end of his article, Meyers does eventually give enough of an explanation so that some research allows one to see what is actually going on: §8.3.2(6) of the N3242 draft standard explains ‘reference collapsing’ rules and gives the following example using typedef:

int i;
typedef int& LRI;
typedef int&& RRI;
LRI& r1 = i; // r1 has the type int&
const LRI& r2 = i; // r2 has the type int&
const LRI&& r3 = i; // r3 has the type int&
RRI& r4 = i; // r4 has the type int&
RRI&& r5 = 5; // r5 has the type int&& (an rvalue reference cannot bind to the lvalue i)

Notice how the language allows the reference to be specified both on the typedef definition and in the declarator in which the typedef is used. Applying an lvalue reference in either place (“typedef int &LRI” or “RRI &r4”) produces an lvalue reference type; when rvalue references are used in both places (“typedef int &&RRI; RRI &&r5”), an rvalue reference type results. Meyers quotes Stephan Lavavej as saying that “lvalue references are contagious,” because when they appear they override (or ‘infect’) rvalue references.

The same thing happens when a reference type is substituted for a template type parameter, so now it becomes clear what is going on in the function template example:

#include <cstdio>

template<typename T> void f(
	T		&&t
	) {
	printf("in f<>() with %d\n", static_cast<int>(t));
	}

template<> void f<int>(
	int		&&x
	) {
	printf("in f<int>(%d)\n", x);
	}

template<> void f<int&>(
	int		&x
	) {
	printf("in f<int&>(%d)\n", x);
	}

int main(
	int		argc,
	char		*argv[]
	)
{
// is f<int> because T=int and int && = int&&
f(10);
// is f<int&> because T=int& so int& && = int&
int i = 11;
f(i);
}

which produces the output

in f<int>(10)
in f<int&>(11)

The key here is that the reference collapsing rules allow the instantiation of f<int&>, with T = int&, to do something meaningful: the function argument type T&& becomes int&. Essentially, the compiler uses the flexibility it has to pick a type T, and hence an argument type, that allows the template to be instantiated for the call.

This is why function templates with arguments like g(std::vector<T>&&) won't accept lvalues: whatever the compiler picks for T, std::vector<T> is never itself a reference type, so there is nothing for the && to collapse with and the argument type remains an rvalue reference; and an lvalue actual parameter won't bind to an rvalue reference type.

This is Nothing New

This may seem familiar, because the same thing happens with const. I don't know if there is a word for this in the standard, but consider that constness collapses in a similar way:

typedef const int CI;
typedef int NCI;
CI r1 = 0; // r1 has the type 'const int'
const CI r2 = 0; // r2 has the type 'const int'
const NCI r4 = 0; // r4 has the type 'const int'
// ERROR: cannot assign to const variable
// r4 = 1;
NCI r5 = 0; // r5 has the type non-const 'int'
r5 = 1;

Exactly as in the lvalue/rvalue reference example before, the language allows constness to be specified both on the typedef declaration and in the declarator in which it's used. Applying const in either place produces a const type. Only when no const is used in either place (“typedef int NCI; NCI r5 = 0;”) does a non-const type result. In this sense, const is ‘contagious’ in the same way that lvalue references are.

To complete the analogy, in

template<typename T> void f(T &t);

the argument type T& doesn't always mean ‘reference to non-const’. The program

#include <cstdio>

struct X {
	int		i;
	
			X(int ii) : i(ii) {}
	
	operator	int() const { return i; }
	};

template<typename T> void f(T &t) {
	printf("in f<>()\n");
	}

template<> void f<X>(X&) {
	printf("in f<X>(X&)\n");
	}

template<> void f<const X>(const X&) {
	printf("in f<const X>(const X&)\n");
	}

int main(
	int		argc,
	char		*argv[]
	)
{
X x(1);
f(x);	// is f<X>
const X xc(2);
f(xc);	// is f<const X>

return 0;
}

produces the following output:

in f<X>(X&)
in f<const X>(const X&)

In other words, once again the compiler used the flexibility it had to pick a type T for which the instantiated function template lets the call f(xc) compile. It just so happens that T was const X.

It Really Wasn't That Complicated

Were we really surprised that T could be const in the body of f() even though we didn't specifically say so? Should we therefore also be surprised that T&& could be an lvalue reference even though we didn't say so? The ‘contagious’ state with constness is syntactically more obvious because we have to ask for it with a keyword, and the non-contagious state is marked by the absence of that keyword. In the case of references, the non-contagious state (rvalue reference), perhaps unfortunately, has explicit syntax of its own, and that is where the symmetry breaks down.

The mistake is to read much meaning into a template argument type before the template has been instantiated.

Summary

Meyers valuably points out a subtlety in the language that could easily be overlooked. However, inventing a new concept that doesn't exist in the language needlessly obfuscates the underlying reality defined by the standard, which is actually sensible and deserves to be understood. Furthermore, when he writes in his article unambiguously that “‘&&’ does not mean rvalue reference” in the declaration

template<typename T> void f(T &&param);

that statement is correct only if you accept his definition of ‘universal reference’ as something real and distinct from other terms defined in the standard. Approaching his article from a ‘standards-compliant’ point of view, his statement is not correct because, as we've seen, there absolutely are cases in which that's exactly what it means.

Posted on 2013/04/06