C++ function parameters: use a reference or a pointer (and then dereference)?

开发者 (devze.com) https://www.devze.com 2022-12-10 01:27 Source: the web
I was given some code in which some of the parameters are pointers, and then the pointers are dereferenced to provide values. I was concerned that the pointer dereferencing would cost cycles, but after looking at a previous StackOverflow question, "How expensive is it to dereference a pointer?", perhaps it doesn't matter.

Here are some examples:


bool MyFunc1(int * val1, int * val2)
{
    *val1 = 5;
    *val2 = 10;
    return true;
}

bool MyFunc2(int &val1, int &val2)
{
    val1 = 5;
    val2 = 10;
    return true;
}

I personally prefer pass-by-reference as a matter of style, but is one version better (in terms of processor cycles) than the other?


My rule of thumb is to pass by pointer if the parameter can be NULL (i.e. it is optional in the cases above), and by reference if the parameter should never be NULL.
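A minimal sketch of that rule of thumb. The function Divide and its parameters are hypothetical, not from the question: the required output is a reference, the optional output is a pointer that may be null.

```cpp
#include <cassert>

// Hypothetical helper: computes a quotient and, optionally, a remainder.
// `quotient` is required, so it is a reference (it cannot be null).
// `remainder` is optional, so it is a pointer that the caller may leave null.
bool Divide(int total, int divisor, int& quotient, int* remainder = nullptr)
{
    if (divisor == 0)
        return false;              // nothing is written on failure

    quotient = total / divisor;
    if (remainder != nullptr)      // only the optional parameter needs a check
        *remainder = total % divisor;
    return true;
}

// Example calls showing both ways of using the optional parameter.
void DivideExample()
{
    int q = 0;
    int r = 0;

    bool ok = Divide(17, 5, q);    // caller does not care about the remainder
    assert(ok && q == 3);

    ok = Divide(17, 5, q, &r);     // caller asks for the remainder as well
    assert(ok && r == 2);
    (void)ok;                      // silences unused warnings when NDEBUG is set
}
```

Only the parameter that is allowed to be absent needs a null check inside the function.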


From a performance point of view, it probably doesn't matter. Others have already answered that.

Having said that, I have not yet found a situation where an added instruction in this case would make a noticeable difference. I do realize that for a function that is called billions of times, it could make a difference. As a rule, you shouldn't adapt your programming style for this kind of "optimization".


You can get the assembly code from your compiler and compare them. At least in GCC, they produce identical code.


This will get voted down since it's Old Skool, but I often prefer pointers since it's easier to just glance at code and see if my objects that I am passing to a function could get modified, especially if they are simple datatypes like int and float.


There are different guidelines on using reference vs. pointer parameters out there, tailored to different requirements. In my opinion, the most meaningful rule that should be applied in generic C++ development is the following:

  1. Use reference parameters when overloading operators. (In this case you actually have no choice. This is what references were introduced for in the first place.)

  2. Use const-reference for composite (i.e. logically "large") input parameters. That is, input parameters should be passed either by value ("atomic" values) or by const-reference ("aggregate" values).

  3. Use pointers for output parameters and input-output parameters. Do not use references for output parameters.

Taking the above into account, the overwhelming majority of reference parameters in your program should be const-references. If you have a non-const reference parameter and it is not an operator, consider using a pointer instead.

Following the above convention, you'll be able to see at the point of the call whether the function might modify one of its arguments: the potentially modified arguments will be passed with explicit & or as already-existing pointers.
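The convention above might look like the following sketch. CountLongWords and its parameter names are made up for illustration: the large input is a const-reference, the small input is passed by value, and the output is a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical function illustrating the convention:
//  - `words` is a "large" input, so it is passed by const-reference;
//  - `min_length` is an "atomic" input, so it is passed by value;
//  - `count` is an output, so it is passed by pointer.
void CountLongWords(const std::vector<std::string>& words,
                    std::size_t min_length,
                    std::size_t* count)
{
    *count = 0;
    for (const std::string& word : words)
        if (word.size() >= min_length)
            ++*count;
}

void CountLongWordsExample()
{
    const std::vector<std::string> words = {"a", "pointer", "reference", "int"};
    std::size_t n = 0;
    CountLongWords(words, 7, &n);  // the & marks n as the argument that may change
    assert(n == 2);                // "pointer" and "reference" have 7+ characters
}
```

At the call site, `words` and `7` read as inputs, and `&n` stands out as the only argument the function might modify.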

There's another popular rule out there that states that something that can be null should be passed as a pointer, while something that can't be null should be passed as a reference. I can imagine that this might make sense in some very narrow and very specific circumstances, but in general this is a major anti-rule. Just don't do it this way. If you want to express the fact that some pointer must not be null, put a corresponding assertion as the very first line of your function.
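As a rough sketch of that last suggestion, here is the pointer version from the question with the non-null precondition asserted up front. The name MyFunc1Checked is hypothetical. Note that assert is compiled out when NDEBUG is defined, so this documents and checks the contract only in debug builds.

```cpp
#include <cassert>

// Pointer version of MyFunc1 from the question, with the precondition
// "neither pointer may be null" stated on the very first line.
bool MyFunc1Checked(int* val1, int* val2)
{
    assert(val1 != nullptr && val2 != nullptr);
    *val1 = 5;
    *val2 = 10;
    return true;
}
```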

As for the performance considerations, there's absolutely no performance difference between passing by pointer and passing by reference. Both kinds of parameters are exactly the same thing at the physical level. Even when the function gets inlined, a modern compiler should be smart enough to preserve the equivalence.
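A small sketch of what "the same thing at the physical level" means in practice. The two helpers are hypothetical. This says nothing about the generated machine code; it only shows that both styles end up writing through the address of the caller's object.

```cpp
#include <cassert>

// Hypothetical helpers: each returns the address it actually wrote through,
// so the two passing styles can be compared.
int* WriteThroughPointer(int* val)
{
    *val = 5;
    return val;
}

int* WriteThroughReference(int& val)
{
    val = 5;
    return &val;   // the address of a reference is the address of its referent
}

void SameAddressExample()
{
    int a = 0;
    int b = 0;
    assert(WriteThroughPointer(&a) == &a);    // wrote to the caller's a
    assert(WriteThroughReference(b) == &b);   // wrote to the caller's b
    assert(a == 5 && b == 5);
}
```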


Here's the difference in the generated assembly with g++. a.cpp is pointers, b.cpp is references.

$ g++ -S a.cpp

$ g++ -S b.cpp

$ diff a.S b.S
1c1
<       .file   "a.cpp"
---
>       .file   "b.cpp"
4,6c4,6
< .globl __Z7MyFunc1PiS_
<       .def    __Z7MyFunc1PiS_;        .scl    2;      .type   32;     .endef
< __Z7MyFunc1PiS_:
---
> .globl __Z7MyFunc1RiS_
>       .def    __Z7MyFunc1RiS_;        .scl    2;      .type   32;     .endef
> __Z7MyFunc1RiS_:

Just the function name is slightly different; the contents are identical. I had identical results when I used g++ -O3.


References are very similar to pointers, with one big difference: references cannot be NULL. So you do not need to check whether they refer to actual usable objects (as you do for pointers).

Therefore I assume that compilers will produce the same code.


All the other answers already point out that neither function is superior to the other in terms of runtime performance.

However, I think that the former function is superior in terms of readability, because a call like

f( &a, &b );

clearly expresses that a reference to some variable is passed (which, to me, rings the 'this object might be modified by the function' bell). The version which takes references instead of pointers just looks like

f( a, b );

It would be fairly surprising to me to see that a changed after the call, because I cannot tell from the invocation that the variable is passed by reference.


From a performance perspective, any competent compiler should wipe out the issue, which is highly unlikely to be a bottleneck in any case. If you're genuinely working at that low a level, assembly code analysis and performance profiling on realistic data are going to be essential parts of your toolkit anyway.

From a maintenance perspective, you really shouldn't allow someone to pass in parameters as pointers without checking them for nullity.

So you end up writing a ton of null check code, for no good reason.

Basically it goes from:

  1. Create object
  2. Pass in as non-const reference

To:

  1. Create object
  2. Get address of object
  3. Pass in address
  4. Check if address points to null
  5. Dereference back to a non-const reference for use throughout the function

While the compiler will parse all of that junk out, the reader of the code won't. It'll be extra code to consider when extending the functions or working out what the code does. Also more code is written, which increases the number of lines that can contain a bug. And as it's pointers, the bugs are more likely to be quirky undefined behaviour bugs.
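The two sequences of steps above can be sketched side by side. Both function names are hypothetical. The pointer version carries the null check, the extra failure path, and the dereference back to a reference.

```cpp
#include <cassert>

// Reference version: the callee uses the argument directly.
void ResetByReference(int& value)
{
    value = 0;
}

// Pointer version: check for null, then dereference once into a reference
// that the rest of the function uses.
bool ResetByPointer(int* value_ptr)
{
    if (value_ptr == nullptr)
        return false;          // extra failure path the caller must handle
    int& value = *value_ptr;   // back to a non-const reference
    value = 0;
    return true;
}

void ResetExample()
{
    int a = 7;
    ResetByReference(a);             // create object, pass it in
    assert(a == 0);

    int b = 7;
    bool ok = ResetByPointer(&b);    // take address, pass it, check the result
    assert(ok && b == 0);
    (void)ok;                        // silences unused warnings when NDEBUG is set
}
```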

It also encourages weaker programmers to write code like this:

int* a = new int(4); // Don't understand why this has to be a pointer
int* b = new int(5); // Don't understand why this has to be a pointer
MyFunc1(a, b);
int& a_r = *a;
int& b_r = *b;

Yes, that's terrible, but I've seen it from new coders who don't really understand the pointer model.

At the same time, I'd argue that the whole "I can see whether it's going to be modified without looking at the actual header" point is a bit of a false advantage, considering the potential losses. If you can't immediately tell which are the output parameters from the context of the code, then that's your problem, and no amount of pointers is going to save you. If you must have an & to identify your output parameters, may I introduce:

MyFunc2(/*&*/a, /*&*/b)

All the "readability" of the ampersand, none of the associated pointer risks.

However, when it comes to maintenance, consistency is king. If there is existing code that you're integrating with, that passes as pointers (e.g. the other functions of the class or library), there's no good reason to be a crazy rebel and go your own way. That's really going to cause confusion.


You should look at the generated assembly code for the target machine. Keep in mind that the overhead of the function call itself is the same whichever way the parameters are passed, and on current machines that overhead is usually negligible.


If you need to do things like the example below to remember that "a" is going to be changed by the function, what will stop you from swapping parameters that have the same type, or from making some other nasty error? When you call a function, you have to keep its prototype in your head, or use an IDE that shows it in a tooltip. There is no excuse for not doing that. "I don't like references because I can't know whether the function will change the argument" is not a valid argument.

MyFunc2(/*&*/a, /*&*/b)