This is going to be my first post so I chose a rather simple concept: Assignment vs Initialization in C++. I will try to keep the post as practical as possible and share keywords in case the reader wants to do in-depth research. So buckle up and enjoy the ride folks!
int x;      // Define x
x = 3;      // Assign 3 to x
int y{3};   // Define and initialize y with 3
The statements above leave both x and y with a value of 3, which leads to the common misconception that assignment and initialization are identical.
Let's wear the ISO C++ Standards Committee hat
According to the recently published C++20 standard, initialization is covered in section "9.4 Initializers", whereas assignment for class types is covered in section "11.4.5 Copy/move assignment operator". How dare you call them identical, you peasant!
That was a little bit harsh. Perhaps we should wear the C++ compiler implementer hat
GCC 10.2 produces identical assembly for the two functions below, with or without optimizations (the listing shown is from the unoptimized build).
int getX() {
    int x{3};
    return x;
}

int getY() {
    int y;
    y = 3;
    return y;
}
get():
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], 3
        mov     eax, DWORD PTR [rbp-4]
        pop     rbp
        ret
That was a bit anticlimactic, I guess. My guess is that when the data type is scalar, the compiler can simply store the value in both cases, as cppreference.com suggests with its "builtin direct assignment", but I couldn't find the relevant section (direct assignment) in the C++ standard. Perhaps we should try a non-scalar data type, for example std::string.
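To make the scalar versus non-scalar distinction concrete, here is a minimal sketch using the standard type traits. The helper name is_scalar_type is made up for illustration; only std::is_scalar comes from the standard library.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper (not from the post): true when T is a scalar type,
// i.e. an arithmetic, enumeration, pointer, pointer-to-member or nullptr type.
template <typename T>
constexpr bool is_scalar_type() {
    return std::is_scalar<T>::value;
}

// int is scalar: initializing it and assigning to it both store a plain value.
static_assert(is_scalar_type<int>(), "int is a scalar type");
// Pointers are scalar as well.
static_assert(is_scalar_type<const char*>(), "pointers are scalar types");
// std::string is a class type: constructors and operator= are separate functions.
static_assert(!is_scalar_type<std::string>(), "std::string is not a scalar type");
```

The checks run at compile time, so the file compiling at all is the result.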
#include <string>

std::string getX() {
    std::string x{"bugra"};
    return x;
}

std::string getY() {
    std::string x;
    x = "bugra";
    return x;
}
Let's see what GCC 10.2 produces for getX and getY with optimizations enabled.
getX[abi:cxx11]():
        lea     rdx, [rdi+16]
        mov     BYTE PTR [rdi+20], 97
        mov     rax, rdi
        mov     QWORD PTR [rdi], rdx
        mov     DWORD PTR [rdi+16], 1919382882
        mov     QWORD PTR [rdi+8], 5
        mov     BYTE PTR [rdi+21], 0
        ret
.LC0:
        .string "bugra"
getY[abi:cxx11]():
        push    r12
        mov     r8d, 5
        mov     ecx, OFFSET FLAT:.LC0
        xor     edx, edx
        push    rbp
        xor     esi, esi
        mov     r12, rdi
        push    rbx
        lea     rbx, [rdi+16]
        mov     QWORD PTR [rdi], rbx
        mov     QWORD PTR [rdi+8], 0
        mov     BYTE PTR [rdi+16], 0
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)
        mov     rax, r12
        pop     rbx
        pop     rbp
        pop     r12
        ret
        mov     rbp, rax
        jmp     .L2
getY[abi:cxx11]() [clone .cold]:
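I am no expert, but I think getX (the initializing version) is a lot better than getY (the assigning version): getX stores the five characters and the length directly into the returned string, while getY first sets up an empty string and then calls _M_replace to fill it. Since we have established that assignment and initialization can produce different output, let's try to understand the difference between them. We need to wear the formal hat again.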
The "6.7.7 Temporary objects" section of the C++20 standard gives this example: the expression a = f() requires a temporary object for the result of f(), which is materialized so that the reference parameter of X::operator=(const X&) can bind to it.
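The effect of that temporary can be observed with a small counting type. This is only a sketch under stated assumptions: X, f, Stats and the measure_* functions are hypothetical names, and the counts assume C++17, where the result of f() initializes a directly without an intermediate object.

```cpp
#include <cassert>

// Hypothetical bookkeeping for this sketch.
struct Stats {
    int constructed = 0;  // objects created by any constructor
    int assigned = 0;     // calls to operator=
    int destroyed = 0;    // destructor calls
};

Stats stats;

struct X {
    X() { ++stats.constructed; }
    X(const X&) { ++stats.constructed; }
    X& operator=(const X&) { ++stats.assigned; return *this; }
    ~X() { ++stats.destroyed; }
};

X f() { return X{}; }

// Initialization: the result of f() becomes `a` itself.
Stats measure_initialization() {
    stats = Stats{};
    {
        X a = f();
        (void)a;  // silence unused-variable warnings
    }
    return stats;
}

// Assignment: `a` is constructed first, then f() produces a temporary
// that operator=(const X&) binds to; the temporary is destroyed afterwards.
Stats measure_assignment() {
    stats = Stats{};
    {
        X a;
        a = f();
    }
    return stats;
}

// Usage: one object in the first case, two objects plus an assignment in the second.
void demo_temporaries() {
    assert(measure_initialization().constructed == 1);
    assert(measure_assignment().constructed == 2);
}
```

For a trivial type like this the optimizer may well erase the difference, but for types that own resources the extra object and the assignment are real work.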
The C++ Core Guidelines also advise preferring initialization to assignment (rule C.49, "Prefer initialization to assignment in constructors").
Let's finish with the regular developer hat
TLDR: Prefer initialization to assignment!
#include <string>
using std::string;

class A {   // Good
    string s1;
public:
    A(string p) : s1{p} { }   // GOOD: directly construct
};

class B {   // BAD
    string s1;
public:
    B(const char* p) { s1 = p; }   // BAD: default constructor followed by assignment
};
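Besides avoiding the extra default construction, the member initializer list is the only option for members that cannot be assigned to at all. The sketch below illustrates that with a const member and a reference member; the Session class and open_session_log function are invented for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class: id_ is const and log_ is a reference, so neither can be
// assigned in the constructor body. Both must be set in the initializer list.
class Session {
    const int id_;
    std::string& log_;
    std::string user_;
public:
    Session(int id, std::string& log, std::string user)
        : id_{id}, log_(log), user_{std::move(user)} {
        // The body runs after every member is already initialized.
        log_ += "opened:" + user_ + ";";
    }
    int id() const { return id_; }
    const std::string& user() const { return user_; }
};

// Builds a session and returns what it wrote to the log.
std::string open_session_log() {
    std::string log;
    Session s{7, log, "bugra"};
    assert(s.id() == 7);
    return log;
}
```

Writing id_ = id; inside the constructor body instead would not compile, because by then id_ already exists and is const.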
Keywords: copy assignment, Builtin direct assignment for scalar types, copy initialization, direct initialization, list-initialization, temporary objects
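For readers who want to look these keywords up, here is a small sketch that puts a few of them side by side. The function names are made up for illustration; the syntax forms themselves are standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy-initialization: despite the '=', this is initialization, not assignment.
int copy_init() { int a = 3; return a; }

// Direct-initialization.
int direct_init() { int b(3); return b; }

// List-initialization: also rejects narrowing, e.g. int c{3.5}; does not compile.
int list_init() { int c{3}; return c; }

// Declaration without an initializer, followed by built-in assignment.
int assigned() { int d; d = 3; return d; }

// For class types the form can select a different constructor.
std::size_t braces_size() { std::vector<int> v{3}; return v.size(); }  // one element, value 3
std::size_t parens_size() { std::vector<int> v(3); return v.size(); }  // three elements, all 0

// Usage: all four int variants end up with the same value.
void demo_forms() {
    assert(copy_init() == 3 && direct_init() == 3);
    assert(list_init() == 3 && assigned() == 3);
}
```

The std::vector pair is a reminder that the choice of initialization syntax is not always cosmetic.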
Top comments (3)
Just to add to this part:

That's not the only problem we face in getY for the string example. In fact, we have to understand how local variables are initialized. When you create a local integer (int i;), memory is reserved for the new variable, but primitive data types are not given a value by default. They hold an indeterminate value, in practice whatever happens to be in the allocated memory (on the stack).

On the other hand, std::string is not a primitive data type, and objects of class type are default-initialized through their default constructor. If they don't have one, well, such code wouldn't compile.

So, getting back to getY for the string example: the line std::string x; creates a variable x which is initialized to an empty string. Then on the next line, x = "bugra";, you assign a new value to x. x is given a value twice! (The integer i was given a value only once!)

Yet another problem is that "bugra" is not a std::string. It's a C-style string literal (it decays to a const char*) whose characters first have to be, implicitly, turned into std::string contents, and you pay for it. Hence the immense difference in the ASM code. If we want to avoid that cost, and we have access to C++14, we should use a std::string literal (the s suffix). Then the generated ASM code becomes similar:
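Assuming the suggestion above refers to the C++14 s suffix from std::string_literals, a minimal sketch of such a getY could look like this. The function names getY_with_literal and getX_initialized are hypothetical, and no claim is made here about the exact assembly a given compiler emits for them.

```cpp
#include <cassert>
#include <string>

// Assignment variant, but with a std::string literal on the right-hand side.
std::string getY_with_literal() {
    using namespace std::string_literals;  // enables the "..."s suffix (C++14)
    std::string x;      // default-initialized to an empty string
    assert(x.empty());
    x = "bugra"s;       // the right-hand side is already a std::string, so this is a move assignment
    return x;
}

// Initialization variant for comparison.
std::string getX_initialized() {
    std::string x{"bugra"};
    return x;
}
```

Both functions return the same string; whether the generated code matches depends on the compiler and its optimization level.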

But whatever optimizations are turned on, there is no reason in circumstances like these to split declaration from initialization. For example, splitting them doesn't let you declare your variables const.
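The const point can be sketched as follows. The greeting function and its strings are invented for illustration.

```cpp
#include <cassert>
#include <string>

std::string greeting(bool formal) {
    // Initialized in one step, so it can be const.
    const std::string name{"bugra"};

    // A conditional expression keeps prefix const too, instead of declaring
    // it first and assigning to it inside an if/else.
    const std::string prefix = formal ? "Dear " : "Hi ";

    // The split version cannot be const:
    //   const std::string later;
    //   later = "x";   // error: assignment to a const object

    return prefix + name;
}

// Usage
void demo_greeting() {
    assert(greeting(true) == "Dear bugra");
    assert(greeting(false) == "Hi bugra");
}
```

When the initial value needs more logic than a conditional expression, a helper function or an immediately invoked lambda can serve the same purpose.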
Here is a great talk on this
Thanks for the great feedback Sandor. I really appreciate it :)
Cool, before I got involved with move semantics I had no idea of the difference between assignment and initialization. I believe that many people don't even try to understand :3
constructor -> initialization
operator= -> assignment
[google translator]
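The constructor versus operator= mapping, including the move variants mentioned in this comment, can be traced with a type that logs which special member function runs. This is a sketch; Widget, events and trace are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical global log of which special member function ran.
std::vector<std::string> events;

struct Widget {
    Widget()                         { events.push_back("default ctor"); }
    Widget(const Widget&)            { events.push_back("copy ctor"); }
    Widget(Widget&&)                 { events.push_back("move ctor"); }
    Widget& operator=(const Widget&) { events.push_back("copy assign"); return *this; }
    Widget& operator=(Widget&&)      { events.push_back("move assign"); return *this; }
};

std::vector<std::string> trace() {
    events.clear();
    Widget a;                 // initialization -> default constructor
    Widget b = a;             // initialization -> copy constructor (the '=' is not assignment)
    Widget c = std::move(a);  // initialization -> move constructor
    b = c;                    // assignment     -> copy assignment operator
    b = std::move(c);         // assignment     -> move assignment operator
    return events;
}

// Usage: five statements, five logged calls.
void demo_trace() {
    assert(trace().size() == 5);
}
```

Every statement that declares a new object ends up in a constructor, and every statement that targets an existing object ends up in operator=.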