I have come across a lot of optimization tips which say that you should mark your classes as sealed to get extra performance benefits.
I ran some tests to check the performance differential and found none. Am I doing something wrong? Am I missing the case where sealed classes will give better results?
Has anyone run tests and seen a difference?
Help me learn :)
The answer was no, sealed classes do not perform better than non-sealed.
2021: The answer is now yes, there are performance benefits to sealing a class.
Sealing a class may not always provide a performance boost, but the dotnet team are adopting the rule of sealing all internal classes to give the optimiser the best chance.
For details you can read https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-6/#peanut-butter
Old answer below.
The issue comes down to the call vs callvirt IL op codes. call is faster than callvirt, and callvirt is mainly used when you don't know if the object has been subclassed. So people assume that if you seal a class, all the op codes will change from callvirt to call and the code will be faster.
Unfortunately, callvirt does other things that make it useful too, like checking for null references. This means that even if a class is sealed, the reference might still be null, and thus a callvirt is needed. You can get around this (without needing to seal the class), but it becomes a bit pointless.
Structs use call because they cannot be subclassed and are never null.
See this question for more information:
Call and callvirt
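The null-check side of callvirt can be observed from C# itself. Here is a minimal sketch, assuming a hypothetical sealed class Greeter (not from the answer above) whose method is non-virtual and never touches this. The call through a null reference still throws, which is the behavior the callvirt op code provides:

```csharp
using System;

// Hypothetical type for illustration; not part of the original answer.
public sealed class Greeter
{
    // Non-virtual, and it never reads any instance state.
    public string Hello()
    {
        return "hello";
    }
}

public static class NullCheckDemo
{
    // Calls Hello() through a null reference and reports what happens.
    public static string CallOnNull()
    {
        Greeter g = null;
        try
        {
            return g.Hello();
        }
        catch (NullReferenceException)
        {
            // Reached even though Hello() itself would never dereference `this`.
            return "NullReferenceException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(CallOnNull()); // prints "NullReferenceException"
    }
}
```

Sealing Greeter makes no difference to this result; the null check is required either way.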
The JITter will sometimes use non-virtual calls to methods in sealed classes since there is no way they can be extended further.
There are complex rules regarding calling type, virtual/nonvirtual, and I don't know them all so I can't really outline them for you, but if you google for sealed classes and virtual methods you might find some articles on the topic.
Note that any kind of performance benefit you would obtain from this level of optimization should be regarded as a last resort; always optimize at the algorithmic level before you optimize at the code level.
Here's one link mentioning this: Rambling on the sealed keyword
Update: As of .NET Core 2.0 and .NET Desktop 4.7.1, the CLR now supports devirtualization. It can take methods in sealed classes and replace virtual calls with direct calls - and it can also do this for non-sealed classes if it can figure out it's safe to do so.
In such a case (a sealed class that the CLR couldn't otherwise detect as safe to devirtualise), a sealed class should actually offer some kind of performance benefit.
That said, I wouldn't think it'd be worth worrying about unless you had already profiled the code and determined that you were in a particularly hot path being called millions of times, or something like that:
https://blogs.msdn.microsoft.com/dotnet/2017/06/29/performance-improvements-in-ryujit-in-net-core-and-net-framework/
Original Answer:
I made the following test program, and then decompiled it using Reflector to see what MSIL code was emitted.
public class NormalClass {
public void WriteIt(string x) {
Console.WriteLine("NormalClass");
Console.WriteLine(x);
}
}
public sealed class SealedClass {
public void WriteIt(string x) {
Console.WriteLine("SealedClass");
Console.WriteLine(x);
}
}
public static void CallNormal() {
var n = new NormalClass();
n.WriteIt("a string");
}
public static void CallSealed() {
var n = new SealedClass();
n.WriteIt("a string");
}
In all cases, the C# compiler (Visual studio 2010 in Release build configuration) emits identical MSIL, which is as follows:
L_0000: newobj instance void <NormalClass or SealedClass>::.ctor()
L_0005: stloc.0
L_0006: ldloc.0
L_0007: ldstr "a string"
L_000c: callvirt instance void <NormalClass or SealedClass>::WriteIt(string)
L_0011: ret
The oft-quoted reason that people say sealed provides performance benefits is that the compiler knows the class isn't overridden, and thus can use call instead of callvirt as it doesn't have to check for virtuals, etc. As proven above, this is not true.
My next thought was that even though the MSIL is identical, perhaps the JIT compiler treats sealed classes differently?
I ran a release build under the Visual Studio debugger and viewed the decompiled x86 output. In both cases, the x86 code was identical, with the exception of class names and function memory addresses (which of course must be different). Here it is:
// var n = new NormalClass();
00000000 push ebp
00000001 mov ebp,esp
00000003 sub esp,8
00000006 cmp dword ptr ds:[00585314h],0
0000000d je 00000014
0000000f call 70032C33
00000014 xor edx,edx
00000016 mov dword ptr [ebp-4],edx
00000019 mov ecx,588230h
0000001e call FFEEEBC0
00000023 mov dword ptr [ebp-8],eax
00000026 mov ecx,dword ptr [ebp-8]
00000029 call dword ptr ds:[00588260h]
0000002f mov eax,dword ptr [ebp-8]
00000032 mov dword ptr [ebp-4],eax
// n.WriteIt("a string");
00000035 mov edx,dword ptr ds:[033220DCh]
0000003b mov ecx,dword ptr [ebp-4]
0000003e cmp dword ptr [ecx],ecx
00000040 call dword ptr ds:[0058827Ch]
// }
00000046 nop
00000047 mov esp,ebp
00000049 pop ebp
0000004a ret
I then thought perhaps running under the debugger causes it to perform less aggressive optimization?
I then ran a standalone release build executable outside of any debugging environments, and used WinDBG + SOS to break in after the program had completed, and viewed the disassembly of the JIT-compiled x86 code.
As you can see from the code below, when running outside the debugger the JIT compiler is more aggressive, and it has inlined the WriteIt method straight into the caller.
The crucial thing however is that it was identical when calling a sealed vs non-sealed class. There is no difference whatsoever between a sealed or nonsealed class.
Here it is when calling a normal class:
Normal JIT generated code
Begin 003c00b0, size 39
003c00b0 55 push ebp
003c00b1 8bec mov ebp,esp
003c00b3 b994391800 mov ecx,183994h (MT: ScratchConsoleApplicationFX4.NormalClass)
003c00b8 e8631fdbff call 00172020 (JitHelp: CORINFO_HELP_NEWSFAST)
003c00bd e80e70106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c00c2 8bc8 mov ecx,eax
003c00c4 8b1530203003 mov edx,dword ptr ds:[3302030h] ("NormalClass")
003c00ca 8b01 mov eax,dword ptr [ecx]
003c00cc 8b403c mov eax,dword ptr [eax+3Ch]
003c00cf ff5010 call dword ptr [eax+10h]
003c00d2 e8f96f106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c00d7 8bc8 mov ecx,eax
003c00d9 8b1534203003 mov edx,dword ptr ds:[3302034h] ("a string")
003c00df 8b01 mov eax,dword ptr [ecx]
003c00e1 8b403c mov eax,dword ptr [eax+3Ch]
003c00e4 ff5010 call dword ptr [eax+10h]
003c00e7 5d pop ebp
003c00e8 c3 ret
Vs a sealed class:
Normal JIT generated code
Begin 003c0100, size 39
003c0100 55 push ebp
003c0101 8bec mov ebp,esp
003c0103 b90c3a1800 mov ecx,183A0Ch (MT: ScratchConsoleApplicationFX4.SealedClass)
003c0108 e8131fdbff call 00172020 (JitHelp: CORINFO_HELP_NEWSFAST)
003c010d e8be6f106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c0112 8bc8 mov ecx,eax
003c0114 8b1538203003 mov edx,dword ptr ds:[3302038h] ("SealedClass")
003c011a 8b01 mov eax,dword ptr [ecx]
003c011c 8b403c mov eax,dword ptr [eax+3Ch]
003c011f ff5010 call dword ptr [eax+10h]
003c0122 e8a96f106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c0127 8bc8 mov ecx,eax
003c0129 8b1534203003 mov edx,dword ptr ds:[3302034h] ("a string")
003c012f 8b01 mov eax,dword ptr [ecx]
003c0131 8b403c mov eax,dword ptr [eax+3Ch]
003c0134 ff5010 call dword ptr [eax+10h]
003c0137 5d pop ebp
003c0138 c3 ret
To me, this provides solid proof that there cannot be any performance improvement between calling methods on sealed vs non-sealed classes... I think I'm happy now :-)
As far as I know, there is no guarantee of a performance benefit. But there is a chance of reducing the performance penalty under some specific conditions with a sealed method. (A sealed class makes all of its methods sealed.)
But it's up to the compiler implementation and the execution environment.
Details
Many modern CPUs use long pipelines to increase performance. Because the CPU is far faster than memory, it has to prefetch code from memory to keep the pipeline fed. If the code is not ready in time, the pipeline sits idle.
There is a big obstacle called dynamic dispatch which disrupts this prefetching optimization. You can think of it as a kind of conditional branch.
// Value of `v` is unknown,
// and can be resolved only at runtime.
// CPU cannot know which code to prefetch.
// Therefore, just prefetch any one of a() or b().
// This is *speculative execution*.
int v = random();
if (v==1) a();
else b();
The CPU cannot prefetch the next code to execute in this case, because the next code position is unknown until the condition is resolved. This creates a hazard that leaves the pipeline idle, and the performance penalty of an idle pipeline is usually large.
A similar thing happens with method overriding. The compiler may be able to determine which override a given call will reach, but sometimes that is impossible. In that case, the proper method can be determined only at runtime. This is also a case of dynamic dispatch, and a main reason why dynamically typed languages are generally slower than statically typed languages.
Some CPUs (including recent Intel x86 chips) use a technique called speculative execution to keep the pipeline busy even in this situation: they simply prefetch one of the execution paths. But the hit rate of this technique is not always high, and a failed speculation causes a pipeline stall, which also carries a large performance penalty. (This depends entirely on the CPU implementation; some mobile CPUs are known not to perform this kind of optimization, to save energy.)
Basically, C# is a statically compiled language, but not always. I don't know the exact conditions, and this is entirely up to the compiler implementation. Some compilers can eliminate the possibility of dynamic dispatch by preventing method overriding if the method is marked as sealed. Stupid compilers may not.
This is the performance benefit of sealed.
This answer (Why is it faster to process a sorted array than an unsorted array?) describes branch prediction a lot better.
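To make the dynamic dispatch point concrete, here is a small sketch using hypothetical Shape, Circle and Square types (none of them from the answer above). In Describe the target of Name() is known only at runtime; in DescribeSquare the static type is sealed, so only one target is possible. Whether a given JIT actually turns the second call into a direct call depends on the runtime version, so treat this as an illustration of the opportunity rather than a guarantee:

```csharp
using System;

// Hypothetical types for illustration only.
public class Shape
{
    public virtual string Name() { return "shape"; }
}

public class Circle : Shape
{
    public override string Name() { return "circle"; }
}

public sealed class Square : Shape
{
    public override string Name() { return "square"; }
}

public static class DispatchDemo
{
    // The static type is Shape, so which Name() runs is decided at runtime.
    public static string Describe(Shape s)
    {
        return s.Name();
    }

    // The static type is the sealed Square: no subclass can exist, so
    // Square.Name() is the only possible target of this call.
    public static string DescribeSquare(Square s)
    {
        return s.Name();
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Circle()));       // prints "circle"
        Console.WriteLine(Describe(new Square()));       // prints "square"
        Console.WriteLine(DescribeSquare(new Square())); // prints "square"
    }
}
```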
<off-topic-rant>
I loathe sealed classes. Even if the performance benefits are astounding (which I doubt), they destroy the object-oriented model by preventing reuse via inheritance. For example, the Thread class is sealed. While I can see that one might want threads to be as efficient as possible, I can also imagine scenarios where being able to subclass Thread would have great benefits. Class authors, if you must seal your classes for "performance" reasons, please provide an interface at the very least so we don't have to wrap-and-replace everywhere that we need a feature you forgot.
Example: SafeThread had to wrap the Thread class because Thread is sealed and there is no IThread interface; SafeThread automatically traps unhandled exceptions on threads, something completely missing from the Thread class. [and no, the unhandled exception events do not pick up unhandled exceptions in secondary threads].
</off-topic-rant>
Marking a class sealed should have no performance impact.
There are cases where csc might have to emit a callvirt opcode instead of a call opcode. However, it seems those cases are rare.
And it seems to me that the JIT should be able to emit the same non-virtual function call for callvirt that it would for call, if it knows that the class doesn't have any subclasses (yet). If only one implementation of the method exists, there's no point loading its address from a vtable; just call the one implementation directly. For that matter, the JIT can even inline the function.
It's a bit of a gamble on the JIT's part, because if a subclass is later loaded, the JIT will have to throw away that machine code and compile the code again, emitting a real virtual call. My guess is this doesn't happen often in practice.
(And yes, VM designers really do aggressively pursue these tiny performance wins.)
Sealed classes should provide a performance improvement. Since a sealed class cannot be derived, any virtual members can be turned into non-virtual members.
Of course, we're talking really small gains. I wouldn't mark a class as sealed just to get a performance improvement unless profiling revealed it to be a problem.
I consider "sealed" classes the normal case and I ALWAYS have a reason to omit the "sealed" keyword.
The most important reasons for me are:
a) Better compile time checks (casting to interfaces not implemented will be detected at compile time, not only at runtime)
and, top reason:
b) Abuse of my classes is not possible that way
I wish Microsoft would have made "sealed" the standard, not "unsealed".
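Point a) can be seen with a short sketch. OpenType and ClosedType are hypothetical names, and neither implements IDisposable. The cast on the unsealed type compiles, because some subclass could implement the interface, and only fails at runtime; the same cast on the sealed type is rejected by the compiler:

```csharp
using System;

public class OpenType { }          // hypothetical, not sealed
public sealed class ClosedType { } // hypothetical, sealed

public static class CastDemo
{
    // Attempts the interface cast on the unsealed type and reports the outcome.
    public static string CastOpen()
    {
        OpenType open = new OpenType();
        try
        {
            // Compiles: a subclass of OpenType could implement IDisposable.
            IDisposable d = (IDisposable)open;
            return "cast succeeded: " + d.GetType().Name;
        }
        catch (InvalidCastException)
        {
            // The mistake only shows up here, at runtime.
            return "InvalidCastException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(CastOpen()); // prints "InvalidCastException"

        // The equivalent cast on the sealed type does not compile:
        // ClosedType closed = new ClosedType();
        // IDisposable d = (IDisposable)closed; // error CS0030
    }
}
```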
sealed classes will be at least a tiny bit faster, but sometimes can be waayyy faster... if the JIT Optimizer can inline calls that would have otherwise been virtual calls. So, where there's oft-called methods that are small enough to be inlined, definitely consider sealing the class.
However, the best reason to seal a class is to say "I didn't design this to be inherited from, so I'm not going to let you get burned by assuming it was designed to be so, and I'm not going to burn myself by getting locked into an implementation because I let you derive from it."
I know some here have said they hate sealed classes because they want the opportunity to derive from anything... but that is OFTEN not the most maintainable choice... because exposing a class to derivation locks you in a lot more than not exposing all that. It's similar to saying "I loathe classes that have private members... I often can't make the class do what I want because I don't have access." Encapsulation is important... sealing is one form of encapsulation.
To really see them you need to analyze the JIT-Compiled code (last one).
C# Code
public sealed class Sealed
{
public string Message { get; set; }
public void DoStuff() { }
}
public class Derived : Base
{
public sealed override void DoStuff() { }
}
public class Base
{
public string Message { get; set; }
public virtual void DoStuff() { }
}
static void Main()
{
Sealed sealedClass = new Sealed();
sealedClass.DoStuff();
Derived derivedClass = new Derived();
derivedClass.DoStuff();
Base BaseClass = new Base();
BaseClass.DoStuff();
}
MSIL Code
.method private hidebysig static void Main() cil managed
{
.entrypoint
// Code size 41 (0x29)
.maxstack 8
IL_0000: newobj instance void ConsoleApp1.Program/Sealed::.ctor()
IL_0005: callvirt instance void ConsoleApp1.Program/Sealed::DoStuff()
IL_000a: newobj instance void ConsoleApp1.Program/Derived::.ctor()
IL_000f: callvirt instance void ConsoleApp1.Program/Base::DoStuff()
IL_0014: newobj instance void ConsoleApp1.Program/Base::.ctor()
IL_0019: callvirt instance void ConsoleApp1.Program/Base::DoStuff()
IL_0028: ret
} // end of method Program::Main
JIT- Compiled Code
--- C:\Users\Ivan Porta\source\repos\ConsoleApp1\Program.cs --------------------
{
0066084A in al,dx
0066084B push edi
0066084C push esi
0066084D push ebx
0066084E sub esp,4Ch
00660851 lea edi,[ebp-58h]
00660854 mov ecx,13h
00660859 xor eax,eax
0066085B rep stos dword ptr es:[edi]
0066085D cmp dword ptr ds:[5842F0h],0
00660864 je 0066086B
00660866 call 744CFAD0
0066086B xor edx,edx
0066086D mov dword ptr [ebp-3Ch],edx
00660870 xor edx,edx
00660872 mov dword ptr [ebp-48h],edx
00660875 xor edx,edx
00660877 mov dword ptr [ebp-44h],edx
0066087A xor edx,edx
0066087C mov dword ptr [ebp-40h],edx
0066087F nop
Sealed sealedClass = new Sealed();
00660880 mov ecx,584E1Ch
00660885 call 005730F4
0066088A mov dword ptr [ebp-4Ch],eax
0066088D mov ecx,dword ptr [ebp-4Ch]
00660890 call 00660468
00660895 mov eax,dword ptr [ebp-4Ch]
00660898 mov dword ptr [ebp-3Ch],eax
sealedClass.DoStuff();
0066089B mov ecx,dword ptr [ebp-3Ch]
0066089E cmp dword ptr [ecx],ecx
006608A0 call 00660460
006608A5 nop
Derived derivedClass = new Derived();
006608A6 mov ecx,584F3Ch
006608AB call 005730F4
006608B0 mov dword ptr [ebp-50h],eax
006608B3 mov ecx,dword ptr [ebp-50h]
006608B6 call 006604A8
006608BB mov eax,dword ptr [ebp-50h]
006608BE mov dword ptr [ebp-40h],eax
derivedClass.DoStuff();
006608C1 mov ecx,dword ptr [ebp-40h]
006608C4 mov eax,dword ptr [ecx]
006608C6 mov eax,dword ptr [eax+28h]
006608C9 call dword ptr [eax+10h]
006608CC nop
Base BaseClass = new Base();
006608CD mov ecx,584EC0h
006608D2 call 005730F4
006608D7 mov dword ptr [ebp-54h],eax
006608DA mov ecx,dword ptr [ebp-54h]
006608DD call 00660490
006608E2 mov eax,dword ptr [ebp-54h]
006608E5 mov dword ptr [ebp-44h],eax
BaseClass.DoStuff();
006608E8 mov ecx,dword ptr [ebp-44h]
006608EB mov eax,dword ptr [ecx]
006608ED mov eax,dword ptr [eax+28h]
006608F0 call dword ptr [eax+10h]
006608F3 nop
}
0066091A nop
0066091B lea esp,[ebp-0Ch]
0066091E pop ebx
0066091F pop esi
00660920 pop edi
00660921 pop ebp
00660922 ret
While the creation of the objects is the same, the instructions executed to invoke the methods of the sealed and the derived/base classes are slightly different. After loading the object reference into ecx (mov instruction), the call on the sealed class performs a comparison on dword ptr [ecx],ecx (cmp instruction, which acts as a null check) and then calls the method at a fixed address, while the calls on the derived/base classes execute two more mov instructions to load the method's address from the type's method table and then call it indirectly.
According to the report by Torbjörn Granlund, Instruction latencies and throughput for AMD and Intel x86 processors, the speeds of the following instructions on an Intel Pentium 4 are:
- mov: has 1 cycle as latency and the processor can sustain 2.5 instructions per cycle of this type
- cmp: has 1 cycle as latency and the processor can sustain 2 instructions per cycle of this type
Link: https://gmplib.org/~tege/x86-timing.pdf
This means that, ideally, the time needed to invoke a sealed method is 2 cycles, while the time needed to invoke a derived or base class method is 3 cycles.
Compiler optimizations have made the performance difference between sealed and non-sealed classes so small that we are talking about processor cycles, and for this reason it is irrelevant for the majority of applications.
Starting from .NET 6.0 the answer is yes.
Sealing a class can help the JIT de-virtualize calls, resulting in less overhead when calling a method. This has additional benefits, because the de-virtualized call can be inlined by the JIT if necessary, which can also lead to constant folding.
For example, in this code from the MSDN article:
[Benchmark(Baseline = true)]
public int NonSealed() => _nonSealed.M() + 42;
[Benchmark]
public int Sealed() => _sealed.M() + 42;
public class BaseType
{
public virtual int M() => 1;
}
public class NonSealedType : BaseType
{
public override int M() => 2;
}
public sealed class SealedType : BaseType
{
public override int M() => 2;
}
The "NonSealed" benchmark runs in 0.9837ns, but the "Sealed" method doesn't take more time than a function that simply returns a constant value. This is due to constant folding.
Type checking sealed classes also has performance benefits, like in this code from the MSDN article:
private object _o = "hello";
[Benchmark(Baseline = true)]
public bool NonSealed() => _o is NonSealedType;
[Benchmark]
public bool Sealed() => _o is SealedType;
public class NonSealedType { }
public sealed class SealedType { }
Checking against a non-sealed type takes ~1.76ns, while checking the sealed type is only ~0.07ns.
In fact, the .NET team made a policy to seal all the private and internal classes that can be sealed.
Notice that we're dealing with saving less than 2 nanoseconds on a call, so the overhead of calling a virtual method is not gonna be the bottleneck most of the time. I think it's more appropriate for simple virtual getters or very short methods.
Run this code and you'll see that sealed classes are 2 times faster:
class Program
{
static void Main(string[] args)
{
Console.ReadLine();
var watch = new Stopwatch();
watch.Start();
for (int i = 0; i < 10000000; i++)
{
new SealedClass().GetName();
}
watch.Stop();
Console.WriteLine("Sealed class : {0}", watch.Elapsed.ToString());
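// Restart() zeroes the elapsed time first; Start() would resume and add to the sealed loop's time.
watch.Restart();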
for (int i = 0; i < 10000000; i++)
{
new NonSealedClass().GetName();
}
watch.Stop();
Console.WriteLine("NonSealed class : {0}", watch.Elapsed.ToString());
Console.ReadKey();
}
}
sealed class SealedClass
{
public string GetName()
{
return "SealedClass";
}
}
class NonSealedClass
{
public string GetName()
{
return "NonSealedClass";
}
}
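output: Sealed class : 00:00:00.1897568 NonSealed class : 00:00:00.3826678
Note that the NonSealed figure was taken without resetting the stopwatch, so it includes the time of the sealed loop; the NonSealed loop alone accounts for about 0.193 s, roughly the same as the sealed loop.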
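One way to keep timings like these from running into each other is to give every measurement its own stopwatch. The following is only a sketch: Measure is a hypothetical helper, the two classes mirror the ones above, and a plain Stopwatch loop remains sensitive to JIT warm-up and inlining, so the numbers it prints should be read with caution:

```csharp
using System;
using System.Diagnostics;

sealed class SealedClass
{
    public string GetName() { return "SealedClass"; }
}

class NonSealedClass
{
    public string GetName() { return "NonSealedClass"; }
}

public static class TimingDemo
{
    // Hypothetical helper: times one action with a fresh stopwatch,
    // so one measurement can never include another.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 10000000;

        TimeSpan sealedTime = Measure(() =>
        {
            for (int i = 0; i < iterations; i++) { new SealedClass().GetName(); }
        });

        TimeSpan nonSealedTime = Measure(() =>
        {
            for (int i = 0; i < iterations; i++) { new NonSealedClass().GetName(); }
        });

        Console.WriteLine("Sealed class : {0}", sealedTime);
        Console.WriteLine("NonSealed class : {0}", nonSealedTime);
    }
}
```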