I'm trying to build a code sample that shows how the compiler optimizes multiplication by a power of 2. Yet when I turn on "Optimize code", the IL remains essentially the same. Any ideas what I'm doing wrong here?
The code:
int nr;
int result;
var stopwatch = new Stopwatch();
nr = 5;
stopwatch.Start();
result = nr * 4;
stopwatch.Stop();
Console.WriteLine(result);
Console.WriteLine(stopwatch.Elapsed.ToString() + "ms ellapsed");
stopwatch.Reset();
stopwatch.Start();
result = nr << 2;
stopwatch.Stop();
Console.WriteLine(result);
Console.WriteLine(stopwatch.Elapsed.ToString() + "ms ellapsed");
Non-optimized IL:
.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
// Code size 130 (0x82)
.maxstack 2
.locals init ([0] int32 nr,
[1] int32 result,
[2] class [System]System.Diagnostics.Stopwatch stopwatch,
[3] valuetype [mscorlib]System.TimeSpan CS$0$0000,
[4] valuetype [mscorlib]System.TimeSpan CS$0$0001)
IL_0000: newobj instance void [System]System.Diagnostics.Stopwatch::.ctor()
IL_0005: stloc.2
IL_0006: ldc.i4.5
IL_0007: stloc.0
IL_0008: ldloc.2
IL_0009: callvirt instance void [System]System.Diagnostics.Stopwatch::Start()
IL_000e: ldloc.0
IL_000f: ldc.i4.4
IL_0010: mul
IL_0011: stloc.1
IL_0012: ldloc.2
IL_0013: callvirt instance void [System]System.Diagnostics.Stopwatch::Stop()
IL_0018: ldloc.1
IL_0019: call void [mscorlib]System.Console::WriteLine(int32)
IL_001e: ldloc.2
IL_001f: callvirt instance valuetype [mscorlib]System.TimeSpan [System]System.Diagnostics.Stopwatch::get_Elapsed()
IL_0024: stloc.3
IL_0025: ldloca.s CS$0$0000
IL_0027: constrained. [mscorlib]System.TimeSpan
IL_002d: callvirt instance string [mscorlib]System.Object::ToString()
IL_0032: ldstr "ms ellapsed"
IL_0037: call string [mscorlib]System.String::Concat(string,
string)
IL_003c: call void [mscorlib]System.Console::WriteLine(string)
IL_0041: ldloc.2
IL_0042: callvirt instance void [System]System.Diagnostics.Stopwatch::Reset()
IL_0047: ldloc.2
IL_0048: callvirt instance void [System]System.Diagnostics.Stopwatch::Start()
IL_004d: ldloc.0
IL_004e: ldc.i4.2
IL_004f: shl
IL_0050: stloc.1
IL_0051: ldloc.2
IL_0052: callvirt instance void [System]System.Diagnostics.Stopwatch::Stop()
IL_0057: ldloc.1
IL_0058: call void [mscorlib]System.Console::WriteLine(int32)
IL_005d: ldloc.2
IL_005e: callvirt instance valuetype [mscorlib]System.TimeSpan [System]System.Diagnostics.Stopwatch::get_Elapsed()
IL_0063: stloc.s CS$0$0001
IL_0065: ldloca.s CS$0$0001
IL_0067: constrained. [mscorlib]System.TimeSpan
IL_006d: callvirt instance string [mscorlib]System.Object::ToString()
IL_0072: ldstr "ms ellapsed"
IL_0077: call string [mscorlib]System.String::Concat(string,
string)
IL_007c: call void [mscorlib]System.Console::WriteLine(string)
IL_0081: ret
} // end of method Program::Main
Optimized IL:
.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
// Code size 130 (0x82)
.maxstack 2
.locals init ([0] int32 nr,
[1] int32 result,
[2] class [System]System.Diagnostics.Stopwatch stopwatch,
[3] valuetype [mscorlib]System.TimeSpan CS$0$0000,
[4] valuetype [mscorlib]System.TimeSpan CS$0$0001)
IL_0000: newobj instance void [System]System.Diagnostics.Stopwatch::.ctor()
IL_0005: stloc.2
IL_0006: ldc.i4.5
IL_0007: stloc.0
IL_0008: ldloc.2
IL_0009: callvirt instance void [System]System.Diagnostics.Stopwatch::Start()
IL_000e: ldloc.0
IL_000f: ldc.i4.4
IL_0010: mul
IL_0011: stloc.1
IL_0012: ldloc.2
IL_0013: callvirt instance void [System]System.Diagnostics.Stopwatch::Stop()
IL_0018: ldloc.1
IL_0019: call void [mscorlib]System.Console::WriteLine(int32)
IL_001e: ldloc.2
IL_001f: callvirt instance valuetype [mscorlib]System.TimeSpan [System]System.Diagnostics.Stopwatch::get_Elapsed()
IL_0024: stloc.3
IL_0025: ldloca.s CS$0$0000
IL_0027: constrained. [mscorlib]System.TimeSpan
IL_002d: callvirt instance string [mscorlib]System.Object::ToString()
IL_0032: ldstr "ms ellapsed"
IL_0037: call string [mscorlib]System.String::Concat(string,
string)
IL_003c: call void [mscorlib]System.Console::WriteLine(string)
IL_0041: ldloc.2
IL_0042: callvirt instance void [System]System.Diagnostics.Stopwatch::Reset()
IL_0047: ldloc.2
IL_0048: callvirt instance void [System]System.Diagnostics.Stopwatch::Start()
IL_004d: ldloc.0
IL_004e: ldc.i4.2
IL_004f: shl
IL_0050: stloc.1
IL_0051: ldloc.2
IL_0052: callvirt instance void [System]System.Diagnostics.Stopwatch::Stop()
IL_0057: ldloc.1
IL_0058: call void [mscorlib]System.Console::WriteLine(int32)
IL_005d: ldloc.2
IL_005e: callvirt instance valuetype [mscorlib]System.TimeSpan [System]System.Diagnostics.Stopwatch::get_Elapsed()
IL_0063: stloc.s CS$0$0001
IL_0065: ldloca.s CS$0$0001
IL_0067: constrained. [mscorlib]System.TimeSpan
IL_006d: callvirt instance string [mscorlib]System.Object::ToString()
IL_0072: ldstr "ms ellapsed"
IL_0077: call string [mscorlib]System.String::Concat(string,
string)
IL_007c: call void [mscorlib]System.Console::WriteLine(string)
IL_0081: ret
} // end of method Program::Main
I thought the compiler would optimize the mul statement to a shl statement?
My knowledge of IL is very limited (if not nonexistent).
This is the code generated by the jitter in the Release build:
0000003e mov ecx,14h
The value 14h is 20, the precomputed result of 5 * 4. The optimizer is far too smart to generate code for a multiplication when it knows the operand values. If you replace nr = 5; with nr = int.Parse("5"); so that the jitter cannot know the operand values, it generates this code for the multiplication:
0000005c lea ebx,[rdi*4+00000000h]
This takes advantage of the multiplier built into the address generation logic of the CPU, which allows the instruction to overlap with another instruction that uses the ALU. That makes the multiplication essentially free. This is the output of the 64-bit jitter; the 32-bit jitter generates this:
0000004d shl edi,2
This is what you were hoping for. I documented the kinds of optimizations performed by the jitter in this post.
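For reference, here is a minimal sketch of the question's sample with the int.Parse change applied. The loop, the iteration count, and the label strings are my own additions for illustration and are not part of the original code. A single multiplication is far below the resolution of Stopwatch, so the loop only makes the timing less meaningless; the jitter may still hoist the loop-invariant expression, so treat any difference between the two timings as noise rather than proof.

```csharp
using System;
using System.Diagnostics;

static class Program
{
    static void Main()
    {
        // Parsing at run time means neither the C# compiler nor the jitter
        // can know the operand value and fold nr * 4 into a constant.
        int nr = int.Parse("5");

        // Hypothetical iteration count, small enough that the sum fits in an int.
        const int iterations = 10000000;
        int result = 0;

        var stopwatch = new Stopwatch();
        stopwatch.Start();
        for (int i = 0; i < iterations; i++)
        {
            result += nr * 4;   // compiles to the IL "mul" instruction
        }
        stopwatch.Stop();
        Console.WriteLine(result);
        Console.WriteLine(stopwatch.Elapsed.TotalMilliseconds + " ms elapsed (multiply)");

        result = 0;
        stopwatch.Reset();
        stopwatch.Start();
        for (int i = 0; i < iterations; i++)
        {
            result += nr << 2;  // compiles to the IL "shl" instruction
        }
        stopwatch.Stop();
        Console.WriteLine(result);
        Console.WriteLine(stopwatch.Elapsed.TotalMilliseconds + " ms elapsed (shift)");
    }
}
```

To see what the jitter really emits, run the Release build and look at the disassembly rather than the IL; in Visual Studio that typically means disabling "Suppress JIT optimization on module load" in the debugger options first.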
The "optimize" flag doesn't do an awful lot in the C# to IL compilation phase. It does make a difference, but not for this sort of thing.
I would expect that sort of optimization to be handled by the JIT compiler instead.