Given a void pointer to a "blob" of raw memory, there are two ways of writing something onto it.
The first way is to use placement new. This method has the advantage of calling the ctor automagically when we are dealing with class-types. However, when I deal with non-class types, would it be better to do a cast instead? I imagine it could possibly be faster.
(pLocation is a void pointer to a blob of memory.)
// ----- Is this better -----
*reinterpret_cast<char*>(pLocation) = pattern;
// ----- Or is this better -----
::new(pLocation) char(pattern);
I had a look at the generated assembly for each of these techniques, using the following program:
#include <new>
char blob[128];
int main() {
void *pLocation = blob;
char pattern = 'x';
#ifdef CAST
*reinterpret_cast<char*>(pLocation) = pattern;
#else
::new(pLocation) char(pattern);
#endif
}
I'm using g++ 4.4.3 on 64-bit Linux with the default compiler flags.
The relevant part of the asm for placement new:
movb $120, -1(%rbp)
movq -16(%rbp), %rax
movq %rax, %rsi
movl $1, %edi
call _ZnwmPv
movq %rax, %rdx
testq %rdx, %rdx
je .L5
movzbl -1(%rbp), %edx
movb %dl, (%rax)
.L5:
From what I gather, this actually calls the placement new operator and checks its return value, even though the call always succeeds. It then proceeds to write the value 'x' into the returned memory.
And for the reinterpret_cast:
movb $120, -1(%rbp)
movq -16(%rbp), %rax
movzbl -1(%rbp), %edx
movb %dl, (%rax)
Note that these instructions are identical to the first two and the last two of the placement new version.
Using -O1, both pieces of code generate identical assembly:
movb $120, blob(%rip)
So, if you're worried about performance, don't be. Any other sane compiler will probably reduce both to the same code as well.
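For readers who want to check the equivalence without reading assembly, here is a minimal sketch that wraps each technique in a function and compares the byte that ends up in memory. The helper names write_by_cast, write_by_placement_new and same_result are made up for this illustration and do not appear in the original program.

```cpp
#include <cassert>  // used by the usage line shown after this block
#include <new>      // declares the placement form of operator new

// Hypothetical helper: store the byte through a cast and a plain assignment.
inline char write_by_cast(void* pLocation, char pattern) {
    *reinterpret_cast<char*>(pLocation) = pattern;
    return *static_cast<char*>(pLocation);
}

// Hypothetical helper: store the byte by constructing a char in place.
inline char write_by_placement_new(void* pLocation, char pattern) {
    char* p = ::new (pLocation) char(pattern);
    return *p;
}

// Writes the same pattern with both helpers into separate one-byte buffers
// and reports whether the stored bytes are equal.
inline bool same_result(char pattern) {
    char a[1];
    char b[1];
    return write_by_cast(a, pattern) == write_by_placement_new(b, pattern);
}
```

A call such as assert(same_result('x')); should pass. This only checks the observable result, not the generated instructions, so it complements the assembly comparison rather than replacing it.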
While casting raw memory into objects might work in practice, officially it invokes undefined behavior and as a result of that, according to the C++ standard, your code might do anything.
Placement new, OTOH, is a technique to invoke a constructor at a particular address, and construction is what officially turns raw memory into valid objects. That's why I would prefer placement new.
Just to make sure, I would also have the destructor called for such objects. While you say that you only need this for PODs, and a POD's destruction is a no-op, many bugs I have seen in my career were in code that was written with a set of restrictions in mind, but later had some of those restrictions lifted and suddenly found itself in an environment it was unable to cope with.
Also note that there might be platforms out there on which not all possible bit patterns are valid values, even for a built-in type. Such platforms might also trap on access to values with such a pattern. For example, it could be that an all-zero bit pattern is not a valid value for a floating-point type, so even zeroing the memory beforehand could not prevent a hardware exception.
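As a rough sketch of what pairing placement new with an explicit destructor call looks like once the type is no longer a POD: the Tracked type and the construct_and_destroy function below are invented for this example, and the code assumes the caller passes storage that is large enough and suitably aligned for Tracked.

```cpp
#include <cassert>  // used by the usage lines shown after this block
#include <new>      // declares the placement form of operator new
#include <string>

// Hypothetical example type with a non-trivial constructor and destructor.
struct Tracked {
    static int live;   // number of Tracked objects currently alive
    std::string name;
    explicit Tracked(const std::string& n) : name(n) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Constructs a Tracked in caller-provided storage, then destroys it explicitly.
// Returns the live count observed while the object existed.
// Assumption: pLocation is aligned for Tracked and holds at least sizeof(Tracked) bytes.
inline int construct_and_destroy(void* pLocation) {
    Tracked* p = ::new (pLocation) Tracked("blob");  // runs the constructor
    int during = Tracked::live;
    p->~Tracked();  // runs the destructor; the storage itself is not freed
    return during;
}
```

With a buffer declared as alignas(Tracked) unsigned char storage[sizeof(Tracked)]; the call construct_and_destroy(storage) should return 1 and leave Tracked::live at 0. If the explicit destructor call were dropped, the std::string member would never be cleaned up, which is the kind of latent bug described above.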