开发者

Firefox JavaScript arithmetics performance oddity

开发者 https://www.devze.com 2023-04-12 20:20 出处:网络
Please run this test on firefox. http://jsperf.com/s开发者_JAVA技巧tatic-arithmetic How would you explain the results?

Please run this test on firefox.

http://jsperf.com/s开发者_JAVA技巧tatic-arithmetic

How would you explain the results?

This

b = a + 5*5;
b = a + 6/2;
b = a + 7+1;

executes much faster than

b = a + 25;
b = a + 3;
b = a + 8;

Why?


First of all, your test is slightly flawed.

You should compare the following:

  • b = a + 8 - 2; vs b = a + 6

  • b = a + 8 + 2; vs b = a + 10

  • b = a + 8 / 2; vs b = a + 4

  • b = a + 8 * 2; vs b = a + 16

You'll notice something interesting: only problems that have + or - in the second pair of terms are slower (division and multiplication are fine). There must be a clear difference between the implementation of addition/subtraction and multiplication/division. And indeed there is:

So lets look at addition and multiplication (jsparse.cpp):

    JSParseNode *
    Parser::addExpr()
    {
        JSParseNode *pn = mulExpr();
        while (pn &&
               (tokenStream.matchToken(TOK_PLUS) ||
                tokenStream.matchToken(TOK_MINUS))) {
            TokenKind tt = tokenStream.currentToken().type;
            JSOp op = (tt == TOK_PLUS) ? JSOP_ADD : JSOP_SUB;
            pn = JSParseNode::newBinaryOrAppend(tt, op, pn, mulExpr(), tc);
        }
        return pn;
    }

    JSParseNode *
    Parser::mulExpr()
    {
        JSParseNode *pn = unaryExpr();
        while (pn && (tokenStream.matchToken(TOK_STAR) || tokenStream.matchToken(TOK_DIVOP))) {
            TokenKind tt = tokenStream.currentToken().type;
            JSOp op = tokenStream.currentToken().t_op;
            pn = JSParseNode::newBinaryOrAppend(tt, op, pn, unaryExpr(), tc);
        }
        return pn;
    }

But, as we can tell, there isn't a huge difference here. Both are implemented in a similar fashion and both call on newBinaryOrAppend().. so what exactly IS in this function?

(Spoiler: Its namesake may betray why addition/subtraction is more costly. Again, looking at jsparse.cpp)

JSParseNode *
JSParseNode::newBinaryOrAppend(TokenKind tt, JSOp op, JSParseNode *left, JSParseNode *right,
                               JSTreeContext *tc)
{
    JSParseNode *pn, *pn1, *pn2;

    if (!left || !right)
        return NULL;

    /*
     * Flatten a left-associative (left-heavy) tree of a given operator into
     * a list, to reduce js_FoldConstants and js_EmitTree recursion.
     */
    if (PN_TYPE(left) == tt &&
        PN_OP(left) == op &&
        (js_CodeSpec[op].format & JOF_LEFTASSOC)) {
        if (left->pn_arity != PN_LIST) {
            pn1 = left->pn_left, pn2 = left->pn_right;
            left->pn_arity = PN_LIST;
            left->pn_parens = false;
            left->initList(pn1);
            left->append(pn2);
            if (tt == TOK_PLUS) {
                if (pn1->pn_type == TOK_STRING)
                    left->pn_xflags |= PNX_STRCAT;
                else if (pn1->pn_type != TOK_NUMBER)
                    left->pn_xflags |= PNX_CANTFOLD;
                if (pn2->pn_type == TOK_STRING)
                    left->pn_xflags |= PNX_STRCAT;
                else if (pn2->pn_type != TOK_NUMBER)
                    left->pn_xflags |= PNX_CANTFOLD;
            }
        }
        left->append(right);
        left->pn_pos.end = right->pn_pos.end;
        if (tt == TOK_PLUS) {
            if (right->pn_type == TOK_STRING)
                left->pn_xflags |= PNX_STRCAT;
            else if (right->pn_type != TOK_NUMBER)
                left->pn_xflags |= PNX_CANTFOLD;
        }
        return left;
    }

    /*
     * Fold constant addition immediately, to conserve node space and, what's
     * more, so js_FoldConstants never sees mixed addition and concatenation
     * operations with more than one leading non-string operand in a PN_LIST
     * generated for expressions such as 1 + 2 + "pt" (which should evaluate
     * to "3pt", not "12pt").
     */
    if (tt == TOK_PLUS &&
        left->pn_type == TOK_NUMBER &&
        right->pn_type == TOK_NUMBER) {
        left->pn_dval += right->pn_dval;
        left->pn_pos.end = right->pn_pos.end;
        RecycleTree(right, tc);
        return left;
    }

    pn = NewOrRecycledNode(tc);
    if (!pn)
        return NULL;
    pn->init(tt, op, PN_BINARY);
    pn->pn_pos.begin = left->pn_pos.begin;
    pn->pn_pos.end = right->pn_pos.end;
    pn->pn_left = left;
    pn->pn_right = right;
    return (BinaryNode *)pn;
}

Given the above, and in particular the constant folding:

if (tt == TOK_PLUS &&
    left->pn_type == TOK_NUMBER &&
    right->pn_type == TOK_NUMBER) {
    left->pn_dval += right->pn_dval;
    left->pn_pos.end = right->pn_pos.end;
    RecycleTree(right, tc);
    return left;
}

And considering that when formulating the problem like

  • b = Number(a) + 7 + 2; vs b = Number(a) + 9;

... the problem disappears altogether (although it's obviously much slower since we're invoking a static method), I'm tempted to believe that either constant folding is broken (which doesn't seem likely since the parenthetical folding appears to work fine), that Spidermonkey isn't categorizing numeric literals (or numeric expressions, i.e. b = a + ( 7 + 2 )) as TOK_NUMBER (at least at the first parsing level), which is also unlikely, or that we're descending somewhere recursively too deep.

I haven't worked with the Spidermonkey codebase, but my Spidey sense is telling me we're getting lost somewhere and I have a feeling it's in RecycleTree().


In Firefox, it looks like it has something to do with floating point math vs. integer math where the floating point is a lot faster. When I add some floating point math, you can see the difference: http://jsperf.com/static-arithmetic/14.

This is a lot faster:

b = a + 26.01;
b = a + 3.1;
b = a + 8.2;

than this:

b = a + 25;
b = a + 3;
b = a + 8;

All I can guess is that Firefox has some floating point optimizations that don't apply to integer math or the code somehow just takes a different path when floating point numbers are involved.

So, extrapolating this info to your original answer, the + 5*5 must be using the faster float path where as the + 25 is not. See the referenced jsPerf for more details.

Once you make everything floats, the + (5.1 * 5.1) option is slower than the + 26.01 option like we would expect.


Firefox versions 4-8 have two different JITs: Tracemonkey (tracejit) and JaegerMonkey (methodjit). TraceMonkey is much better on simple numeric code; JaegerMonkey is much better on branchy code of various sorts.

There is a heuristic that's used to decide which JIT to use. It looks at a bunch of factors most of which are irrelevant here, but the one that matters for this testcase is that the more arithmetic ops there are in the loop body the more likely TraceMonkey is to be used.

You can test this by changing the values of the javascript.options.tracejit.content and javascript.options.methodjit.content to force code to run under one or the other JIT and then seeing how that affects performance.

Looks like constant-folding is not saving the day in terms of making the testcases behave identically because Spidermonkey can't constant-fold a + 7 + 1 = (a + 7) + 1 to a + 8 because it doesn't know what a is (for example, "" + 7 + 1 == "71" whereas "" + 8 == "8"). If you write it as a + (7 + 1) then suddenly you get the other JIT running on this code.

All of which proves the danger of extrapolating from microbenchmarks to actual code. ;)

Oh, and Firefox 9 only has one JIT (JaegerMonkey with optimizations based on Brian Hackett's type inference work that make it also fast on arithmetic code of this sort).


Testing in Firefox 3.6.23 on Windows XP Test Ops/sec assign arithmetic

b = a + 5*5;
b = a + 6/2;
b = a + 7+1;

67,346,939 ±0.83%11% slower assign plain


b = a + 25;
b = a + 3;
b = a + 8;

75,530,913 ±0.51%fastest


Not true in Chrome.

For me:

b = a + 5*5;
b = a + 6/2;
b = a + 7+1;

Result: 267,527,019, ±0.10%, 7% slower

And

b = a + 25;
b = a + 3;
b = a + 8;

Result: 288,678,771, ±0.06%, fastest

So, not really... No idea why it does that on Firefox.

(Testing in Chrome 14.0.835.202 x86 on Windows Server 2008 R2 / 7 x64)

0

精彩评论

暂无评论...
验证码 换一张
取 消