If-else-if versus map_问答_开发者_运维开发者技术经验分享

Suppose I have such an if/else-if chain:

if( x.GetId() == 1 )
{
}
else if( x.GetId() == 2 )
{
}
// ... 50 more e开发者_运维技巧lse if statements

What I wonder is, if I keep a map, will it be any better in terms of performance? (assuming keys are integers)

Maps (usually) are implemented using red-black trees which gives O(log N) lookups as the tree is constantly kept in balance. Your linear list of if statements will be O(N) worst case. So, yes a map would be significantly faster for lookup.

Many people are recommending using a switch statement, which may not be faster for you, depending on your actual if statements. A compiler can sometimes optimize switch by using a jump table which would be O(1), but this is only possible for values that an undefined criteria; hence this behavior can be somewhat nondeterministic. Though there is a great article with a few tips on optimizing switch statements here Optimizing C and C++ Code.

You technically could even formulate a balanced tree manually, this works best for static data and I happened to just recently create a function to quickly find which bit was set in a byte (This was used in an embedded application on an I/O pin interrupt and had to be quick when 99% of the time only 1 bit would be set in the byte):

unsigned char single_bit_index(unsigned char bit) {
    // Hard-coded balanced tree lookup
    if(bit > 0x08)
        if(bit > 0x20)
            if(bit == 0x40)
                return 6;
            else
                return 7;
        else
            if(bit == 0x10)
                return 4;
            else
                return 5;
    else
        if(bit > 0x02)
            if(bit == 0x04)
                return 2;
            else
                return 3;
        else
            if(bit == 0x01)
                return 0;
            else
                return 1;
}

This gives a constant lookup in 3 steps for any of the 8 values which gives me very deterministic performance, a linear search -- given random data -- would average 4 step lookups, with a best-case of 1 and worst-case of 8 steps.

This is a good example of a range that a compiler would probably not optimize to a jump table since the 8 values I am searching for are so far apart: 1, 2, 4, 8, 16, 32, 64, and 128. It would have to create a very sparse 128 position table with only 8 elements containing a target, which on a PC with a ton of RAM might not be a big deal, but on a microcontroller it'd be killer.

why dont you use a a switch ?

swich(x.GetId())
{
  case 1: /* do work */ break; // From the most used case
  case 2: /* do work */ break; 
  case ...: // To the less used case
}

EDIT:

Put the most frequently used case in the top of the switch (This can have some performance issue if x.GetId is generally equal to 50)

switch is the best thing I think

The better solution would be a switch statement. This will allow you to check the value of x.GetId() just once, rather than (on average) 25 times as your code is doing now.

If you want to get fancy, you can use a data structure containing pointers to functions that handle whatever it is that's in the braces. If your ID values are consecutive (i.e. numbers between 1 and 50) then an array of function pointers would be best. If they are spread out, then a map would be more appropriate.

The answer, as with most performance related questions, is maybe.

If the IDs are in a fortunate range, a switch might become a jump-table, providing constant time lookups to all IDs. You won't get much better than this, short of redesigning. Alternatively, if the IDs are consecutive but you don't get a jump-table out of the compiler, you can force the issue by filling an array with function pointers.

[from here on out, switch refers to a generic if/else chain]

A map provides worst-case logarithmic lookup for any given ID, while a switch can only guarantee linear. However, if the IDs are not random, sorting the switch cases by usage might ensure the worst-case scenario is sufficiently rare that this doesn't matter.

A map will incur some initial overhead when loading the IDs and associating them with the functions, and then incur a the overhead of calling a function pointer every time you access an ID. A switch incurs additional overhead when writing the routine, and possibly significant overhead when debugging it.

Redesigning might allow you to avoid the question all together. No matter how you implement it, this smells like trouble. I can't help but think there's a better way to handle this.

If I really had a potential switch of fifty possibilities, I'd definitely think about a vector of pointers to functions.

#include <cstdio>
#include <cstdlib>
#include <ctime>

const unsigned int Max = 4;

void f1();
void f2();
void f3();
void f4();
void (*vp[Max])();

int main()
{
    vp[ 0 ] = f1;
    vp[ 1 ] = f2;
    vp[ 2 ] = f3;
    vp[ 3 ] = f4;

    srand( std::time( NULL ) );
    vp[( rand() % Max )]();
}

void f1()
{
    std::printf( "Hello from f1!\n" );
}

void f2()
{
    std::printf( "Hello from f2!\n" );
}

void f3()
{
    std::printf( "Hello from f3!\n" );
}

void f4()
{
    std::printf( "Hello from f4!\n" );
}

There are a lot of suggestions involving switch-case. In terms of efficiency, this might be better, might be the same. Won't be worse.

But if you're just setting/returning a value or name based on the ID, then YES. A map is exactly what you need. STL containers are optimised, and if you think you can optimise better, then you are either incredibly smart or staggeringly dumb.

e.g A single call using a std::map called mymap,

thisvar = mymap[x.getID()];

is much better than 50 of these

if(x.getID() == ...){thisvar = ...;}

because it's more efficient as the number of IDs increases. If you're interested in why, search for a good primer on data structures.

But what I'd really look at here is maintenance/fixing time. If you need to change the name of the variable, or change from using getID() or getName(), or make any kind of minor change, you've got to do it FIFTY TIMES in your example. And you need a new line every time you add an ID.

The map reduces that to one code change NO MATTER HOW MANY IDs YOU HAVE.

That said, if you're actually carrying out different actions for each ID, a switch-case might be better. With switch-case rather than if statements, you can improve performance and readability. See here: Advantage of switch over if-else statement I'd avoid pointers to functions unless you're very clear on how they'd improve your code, because if you're not 100% certain what you're doing, the syntax can be messed up, and it's overkill for anything you'd feasibly use a map for.

Basically, I'd be interested in the problem you're trying to solve. You might be better off with a map or a switch-case, but if you think you can use a map, that is ABSOLUTELY what you should be using instead.