Which technique for breaking out of a 'visitor' loop is easier for compilers to analyze?

https://www.devze.com 2023-03-18 06:56 Source: web

I'm currently generating code intended for a C++ compiler. I'm generating a class that will take a visitor object and call the visitor's accept method with several objects in turn. However, I want the visitor to have the ability to 'break'; that is, the visitor should have a way of indicating that it wants to stop the rest of the accept calls because it's lost interest.

I see two ways to accomplish this, and assuming the compiler has full visibility of the dispatching class methods and the visitor's methods (so inlining is possible), I'm curious which way is easier for the compiler to optimize. Obviously, different compilers will produce different results, but I'd like my code generator to produce code that requires the least amount of sophistication from the compiler to produce fast code.

The first way is to generate a doVisiting method that expects the accept method on the visitor to return a bool indicating whether visiting should continue:

template<class VisitorT>
void doVisiting(VisitorT& visitor)
{
    if(visitor.accept(object1)) {
        if(visitor.accept(object2)) {
            if(visitor.accept(object3)) {
                visitor.accept(object4);
            }
        }
    }
}

One advantage of this way is that if any of the accept methods are hardcoded to return false or true, I expect the compiler's constant propagation to kick in and prune the if checks and the unreachable accept calls. I think this is a reasonable assumption because constant propagation is something pretty much any optimizing compiler has to implement.
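
As a concrete illustration of that pruning claim (all names here are hypothetical, not from the original code), consider a visitor whose object_type2 overload is hardcoded to return false: after inlining, constant propagation can prove that the calls for object3 and object4 are dead, and behaviourally the visit simply stops after the second call:

```cpp
#include <cassert>

struct object_type1 {}; struct object_type2 {};
struct object_type3 {}; struct object_type4 {};

template<class VisitorT>
void doVisiting(VisitorT& visitor,
                object_type1& o1, object_type2& o2,
                object_type3& o3, object_type4& o4)
{
    if (visitor.accept(o1))
        if (visitor.accept(o2))
            if (visitor.accept(o3))
                visitor.accept(o4);
}

// Hypothetical visitor: the object_type2 overload always returns false,
// so an optimizer that inlines these bodies can fold away the checks
// and calls for o3 and o4 entirely.
struct two_step_visitor
{
    int calls = 0;
    bool accept(object_type1&) { ++calls; return true;  }
    bool accept(object_type2&) { ++calls; return false; }
    bool accept(object_type3&) { ++calls; return true;  }
    bool accept(object_type4&) { ++calls; return true;  }
};
```

Even without optimization the semantics are the same: only the first two accept calls execute.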

The second way would be to ignore the return value and count on the visitor to maintain a bool internally indicating whether it wants to accept the rest of the iterations. At the top of each of its accept methods it would check the bool to decide whether to do any processing:

struct myvisitor
{
    bool stop;
    void accept(object_type1& o)
    {
        if(stop)
            return;

        // do work
    }

    void accept(object_type2& o)
    {
        if(stop)
            return;

        // do work

        if(some_break_worthy_condition)
            // if we were returning bool instead of void,
            // we would have a return false here.
            stop = true;
    }

    // .. other accept methods
};
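
For completeness, here is a runnable sketch of this variant, reduced to two overloads (the object types and the stop condition are stand-ins): the caller-side doVisiting issues every call unconditionally, and the flag short-circuits each body instead:

```cpp
#include <cassert>

struct object_type1 {}; struct object_type2 {};

// Flag-based visitor: each accept checks `stop` before doing work.
struct myvisitor
{
    bool stop = false;
    int  work_done = 0;

    void accept(object_type1&)
    {
        if (stop) return;
        ++work_done;   // do work
        stop = true;   // some_break_worthy_condition assumed true here
    }
    void accept(object_type2&)
    {
        if (stop) return;
        ++work_done;   // do work
    }
};

// The caller ignores return values; every accept is always invoked,
// and the branch happens inside each body.
template<class VisitorT>
void doVisiting(VisitorT& visitor, object_type1& o1, object_type2& o2)
{
    visitor.accept(o1);
    visitor.accept(o2);
}
```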

In the first method the ifs are outside the accept calls; in the second they are inside them. The second also necessarily involves storing a bit of state. My intuition is that the latter requires more sophisticated analysis, but maybe I underestimate compilers.

It also goes without saying I'm interested in other suggestions compilers would handle even better ;)


On the first option your unoptimized simplified code will look something like this:

call visitor.accept o1
branch on result is false to end
call visitor.accept o2
branch on result is false to end
call visitor.accept o3
branch on result is false to end
call visitor.accept o4

So how can the compiler optimize that? Well, I have no idea. It seems pretty efficient already. So you need to dig deeper.

How difficult will the "call visitor.accept" part be? Say that visitor is not accessed through a pointer; then the target address of the call instruction is known to the compiler at compile time, and it can be emitted as a simple call instruction. The compiler just needs to place the argument oX somewhere the called function can find it.

The compiler COULD further optimize this by inlining the code from your myvisitor struct into the code above. Then no call instruction would be needed at all, but it would increase the code size, increasing the risk that your code no longer fits in your CPU's instruction cache, etc. So there is certainly some discussion out there about whether this is a suitable thing for a compiler to do, especially since calls are very fast on modern CPUs.

What does your other unoptimized but simplified code look like?

call visitor.accept o1
call visitor.accept o2
...

In every call to the visitor, the following will additionally happen:

branch on stop is true to end
[Some do-Stuff-code]

So in this code the visitor's accept method is called 4 times and the branch instruction is executed 4 times in any case. With the first method you outlined, the first branch may already be taken, so you get away with many fewer instructions, while here you never have a chance to gain that advantage.

So what can the compiler do for you in the second example? You might expect the compiler to figure out that once the stop variable is set to true, it can jump over all the remaining call statements. But that seriously requires a sophisticated compiler. Even if the compiler can figure it out, it may still not do anything, because in a multi-threaded environment it cannot be sure the flow of instructions is free from side effects that change stop.
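
If you want that transformation without relying on the optimizer, you can hand it to the compiler yourself. A hypothetical sketch (two overloads, stand-in object types): hoist the flag test to the call site, so the generated caller contains explicit branches much like the bool-returning variant, with no interprocedural analysis needed:

```cpp
#include <cassert>

struct object_type1 {}; struct object_type2 {};

struct flag_visitor
{
    bool stop = false;
    int  calls = 0;
    void accept(object_type1&) { ++calls; stop = true; } // do work, then decide to stop
    void accept(object_type2&) { ++calls; }              // do work
};

// The generated caller tests the flag before every call, so later
// accept calls are skipped outright once `stop` is set.
template<class VisitorT>
void doVisiting(VisitorT& visitor, object_type1& o1, object_type2& o2)
{
    if (!visitor.stop) visitor.accept(o1);
    if (!visitor.stop) visitor.accept(o2);
}
```

This does expose the flag as part of the visitor's public interface, which is the trade-off against the flag-inside-the-body version.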

Thus I believe the first example will be faster, with compiler optimization or without. But disclaimer: I am not a compiler-optimization professional. Maybe someone else will come up with something extremely smart :-).

If you are further interested in what compilers can do for you, you may enjoy the talk "Know your compiler" by Felix von Leitner (2007).
