At the end of my previous post on C# and code explosion within MSIL, I mentioned that when it comes to type constructors (more commonly known as “static constructors” or “class constructors”), the beforefieldinit flag can considerably boost the performance of your types. This is the first post in a two-part series in which I will try to analyze the impact of declaring type constructors on the performance of an application.
If a type defines a static constructor, the JIT compiler decides when to call it. A developer does not have the liberty to decide when the constructor code is executed, unlike instance constructors, which a developer calls explicitly from her code.
Let me begin by showing you the problem. In the following code sample I have declared two static classes. In one of them I have explicitly declared a static constructor, whereas the other does not have an explicit type constructor. Both classes have an integer field which is exposed to callers within the assembly (bad design, I know, but the intention is to keep things as simple as possible for this example).
Next I simply run a loop where all I do is set this field to 0 for both types and I log the time this loop takes for both types. Have a look at the code sample below. What do you think the output will be?
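A minimal sketch of the sample described above, reconstructed from the prose. The type names come from this post; the field name, the initial value, and the loop count are assumptions on my part, and the timings you see will depend on your machine and runtime version.

```csharp
using System;
using System.Diagnostics;

internal static class WithExplicitConstructor
{
    internal static int Field; // field name is an assumption

    // Explicit type constructor: the C# compiler does not mark
    // this type with the beforefieldinit flag.
    static WithExplicitConstructor()
    {
        Field = 1;
    }
}

internal static class WithoutConstructor
{
    // Field initializer only: the compiler generates the type
    // constructor itself and marks the type beforefieldinit.
    internal static int Field = 1;
}

internal static class Program
{
    private const int Iterations = 1000000000; // assumed loop count

    private static void Main()
    {
        Method1();
    }

    private static void Method1()
    {
        // Time the loop that writes to the type with an explicit type constructor.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            WithExplicitConstructor.Field = 0;
        }
        Console.WriteLine("With explicit constructor: {0} ms", watch.ElapsedMilliseconds);

        // Time the same loop against the type without one.
        watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            WithoutConstructor.Field = 0;
        }
        Console.WriteLine("Without explicit constructor: {0} ms", watch.ElapsedMilliseconds);
    }
}
```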
Here’s a screenshot of the output on my machine.
As the result shows, the type with an explicit type constructor took considerably longer to execute than the type without one. This performance hit is constant and repeatable. That is, every call to Method1 incurs this penalty. This is not a “one off hit”.
Before we move on to the discussion of why exactly this is happening, let us have a quick look at the MSIL that is generated for the code above. Notice that even though I have not provided an explicit type constructor for the WithoutConstructor class, the compiler adds one which contains the same code as the one for the WithExplicitConstructor class. There is one subtle difference in the MSIL, however, which I have highlighted in the screenshot. Any guesses what that might be 😉
Right, so now we know that using type constructors can severely penalize the performance of a type. Let us now understand why this happens. Before we do that, it is necessary to understand how the CLR treats type constructors and the implicit assurances it provides.
When it comes to type constructors or static constructors the .NET runtime does some pretty clever stuff.
The CLR guarantees that a type constructor will be called only once per type in an AppDomain, and it further guarantees thread safety while calling these type constructors. The CLR also guarantees that the type constructor for a type will always have run before any static field of that type is accessed or any instance of it is constructed.
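The once-only guarantee can be observed with a small sketch. The `Counted` type and its members are hypothetical names made up for this illustration; the idea is simply to hit the same type from many threads and count how often its type constructor body runs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical type used only for this illustration.
internal static class Counted
{
    internal static int ConstructorRuns;
    internal static int Value;

    static Counted()
    {
        // Count every execution of the type constructor body.
        Interlocked.Increment(ref ConstructorRuns);
        Value = 42;
    }
}

internal static class Program
{
    private static void Main()
    {
        // Many threads touch the type at roughly the same time.
        Parallel.For(0, 16, i =>
        {
            int observed = Counted.Value;
            if (observed != 42)
            {
                // Would mean a thread saw the field before the constructor finished.
                throw new InvalidOperationException("Field read before initialization.");
            }
        });

        Console.WriteLine("Type constructor ran {0} time(s).", Counted.ConstructorRuns);
    }
}
```

However many threads race to read `Counted.Value`, the counter should end up at 1, and no thread should ever observe the field before the constructor has finished.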
This actually makes type constructors very good places to do any kind of singleton initialization. But be careful! Singleton initialization in a static constructor can lead to other problems. I will dedicate a post to that next.
JIT Compiler and Thread Execution
Let us peek under the hood of the JIT compiler and see how it generates code for a class that defines a type constructor. Please remember that we are now looking at the native code generated by the JIT compiler, not MSIL. When the JIT compiler compiles a method that uses a type with a type constructor, it checks whether that type’s constructor has already been called in this AppDomain. If it has not, the JIT compiler emits a call to the type constructor in the native code. If the type constructor has already been called, the JIT compiler leaves that call out.
The JIT compiler selectively produces the native machine code that runs the type constructor’s code.
At some point after this, a thread (or more than one thread) will start executing the native code compiled by the JIT compiler. As I mentioned above, the CLR guarantees that the type constructor for a type is only called once per AppDomain. Thus when a thread reaches this point of execution it acquires a mutually exclusive lock, verifies that the code within the type constructor has not been executed, and then runs the type constructor. Once this thread is done, the other waiting threads wake up. When the next thread reaches this point it checks whether the type constructor has been called. This time the thread finds that the type constructor has already run, so it simply returns without executing the code again.
Before a thread runs the JIT compiled code for a type constructor the thread verifies whether the type constructor’s code has been run or not. If it has been run then the thread skips executing the type constructor’s code.
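The lock, check, and run sequence described above can be modelled in plain C#. This is only a conceptual sketch: the CLR performs this check in native code, and `TypeInitGuard` and `EnsureInitialized` are hypothetical names invented for the illustration, not runtime APIs.

```csharp
using System;

// Hypothetical model of the check a thread performs before a type constructor runs.
internal sealed class TypeInitGuard
{
    private readonly object _sync = new object();
    private readonly Action _typeConstructor;
    private bool _hasRun;

    internal TypeInitGuard(Action typeConstructor)
    {
        _typeConstructor = typeConstructor;
    }

    internal void EnsureInitialized()
    {
        // Acquire the mutually exclusive lock first.
        lock (_sync)
        {
            // A previous thread already ran the constructor: skip it.
            if (_hasRun)
            {
                return;
            }

            _typeConstructor();
            _hasRun = true;
        }
    }
}

internal static class Program
{
    private static void Main()
    {
        var guard = new TypeInitGuard(
            () => Console.WriteLine("Type constructor body runs"));

        // Only the first call executes the body; later calls just pay for the check.
        for (int i = 0; i < 3; i++)
        {
            guard.EnsureInitialized();
        }
    }
}
```

The point to take away is that the check itself is not free. If it ends up inside a hot loop, every iteration pays for it even though the constructor body only ever runs once.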
When does the JIT compiler call the type constructor then?
So the JIT compiler automatically generates instructions to call a type’s type constructor once per AppDomain, in a thread-safe manner, before any static field of the type is accessed or any instance of it is created. How does the JIT compiler decide when to call the constructor? As it turns out, it has two options:
- The JIT compiler can emit the call sometime before the running code first invokes an instance constructor, references a static field, or calls a static or instance method. This is called “Before-Field-Init” semantics, since the only guarantee the CLR provides is that the type constructor will run “at some point” before a static field of the type is first accessed.
- Or the JIT compiler can emit the call immediately before the code creates the first instance of the type or accesses the first member of the class. This is called “precise” semantics because the CLR calls the constructor precisely at the right time.
The CLR prefers Before-Field-Init semantics because they give it a lot of flexibility about when to call the type constructor, and the CLR takes advantage of that flexibility to produce optimized code that runs faster. However, it comes down to the language compiler which semantics it asks the CLR to use. Compilers indicate this by marking the metadata of a particular type with the “beforefieldinit” flag. The presence of this flag tells the CLR to use Before-Field-Init semantics, and its absence tells the CLR to use precise semantics.
Compilers set the “beforefieldinit” flag in the type’s row of the type definition metadata table, which tells the CLR to use Before-Field-Init semantics.
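If you would rather not open the MSIL, the flag can also be read through reflection. The sketch below redeclares the two types from this post (the field name and initial value are assumptions) and tests `TypeAttributes.BeforeFieldInit` on each; `HasBeforeFieldInit` is a helper name I made up for this example.

```csharp
using System;
using System.Reflection;

internal static class WithExplicitConstructor
{
    internal static int Field; // field name is an assumption

    static WithExplicitConstructor()
    {
        Field = 1;
    }
}

internal static class WithoutConstructor
{
    internal static int Field = 1;
}

internal static class Program
{
    // Hypothetical helper: true when the type's metadata carries beforefieldinit.
    private static bool HasBeforeFieldInit(Type type)
    {
        return (type.Attributes & TypeAttributes.BeforeFieldInit) != 0;
    }

    private static void Main()
    {
        Console.WriteLine("WithExplicitConstructor: {0}",
            HasBeforeFieldInit(typeof(WithExplicitConstructor)));
        Console.WriteLine("WithoutConstructor: {0}",
            HasBeforeFieldInit(typeof(WithoutConstructor)));
    }
}
```

With the C# compiler I would expect False for the type with the explicit type constructor and True for the other one, which matches the difference in the MSIL.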
Important bits so far
For the sake of clarity, let me reiterate the important points so far:
- For type constructors (static constructors) the CLR chooses when to call them. A developer cannot directly influence this decision.
- The CLR has two semantics for executing type constructors: precise and Before-Field-Init. Its preference is Before-Field-Init.
- The CLR will call the type constructor code exactly once per type per AppDomain, and it ensures thread safety while calling this code.
- While generating native code, the JIT compiler also emits instructions that make the executing thread verify whether the type constructor has already been called. If a previous thread has called it, the current thread does not execute that code.
Problem Code Revisited
Let us revisit the code in the light of the above discussion. I have shared the same code again but this time I have added another method (which is the exact copy of the first method) and I have added in-line comments to explain how CLR treats this execution chain.
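A sketch of the extended sample, reconstructed from the description above with the same assumptions as before (field name, initial value, loop count). The comments restate the behaviour this post describes; the actual timings depend on the runtime version and the JIT compiler in use.

```csharp
using System;
using System.Diagnostics;

internal static class WithExplicitConstructor
{
    internal static int Field; // field name is an assumption

    // Explicit type constructor: no beforefieldinit, so precise semantics apply.
    static WithExplicitConstructor()
    {
        Field = 1;
    }
}

internal static class WithoutConstructor
{
    // Compiler-generated type constructor: beforefieldinit is set.
    internal static int Field = 1;
}

internal static class Program
{
    private const int Iterations = 1000000000; // assumed loop count

    private static void Main()
    {
        Method1(); // compiled first: neither type constructor has run yet
        Method2(); // compiled later: both type constructors have already run
    }

    private static void Method1()
    {
        // Precise semantics: the call to the type constructor, and the check
        // guarding it, are emitted inside this loop.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            WithExplicitConstructor.Field = 0;
        }
        Console.WriteLine("Method1, explicit constructor: {0} ms", watch.ElapsedMilliseconds);

        // Before-Field-Init semantics: the JIT compiler is free to run the
        // type constructor ahead of the loop, so the loop body stays lean.
        watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            WithoutConstructor.Field = 0;
        }
        Console.WriteLine("Method1, no explicit constructor: {0} ms", watch.ElapsedMilliseconds);
    }

    private static void Method2()
    {
        // Exact copy of Method1. By the time this method is compiled the
        // type constructors have run, so no check is emitted in either loop.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            WithExplicitConstructor.Field = 0;
        }
        Console.WriteLine("Method2, explicit constructor: {0} ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            WithoutConstructor.Field = 0;
        }
        Console.WriteLine("Method2, no explicit constructor: {0} ms", watch.ElapsedMilliseconds);
    }
}
```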
When the code starts executing, the JIT compiler adopts precise semantics for the WithExplicitConstructor class because the C# compiler (CSC) did not add the beforefieldinit flag to this type. Since Method1 is the first method that uses these types, the JIT compiler emits the call to the type constructor of WithExplicitConstructor inside the for loop and, as explained above, the code that verifies whether the constructor still needs to run is also emitted in this loop! For the other type, the JIT compiler optimises the generated code so that no verification is needed inside the loop, which results in the performance boost.
When the JIT compiler starts producing the native code for Method2, it is smart enough to figure out that the type constructors for both types have already been called during the execution of Method1. It therefore omits the code that verifies and executes the type constructors for these types in Method2, which results in both types executing in about the same time.
Following is a screenshot from my laptop displaying the execution times for the above sample.
As a last note, please bear in mind that the performance of the methods is determined by the execution order of the code at JIT compilation time. If you switch the two calls, Method2 will become the slower method, as the JIT compiler will emit the verification code in Method2 instead.
I hope this article has succeeded in explaining the hidden performance penalties that can be incurred when implementing static constructors for classes. As I mentioned above, these constructors can also affect object initialization and disposal timelines when used with the Singleton pattern, which can lead to objects staying alive on the heap for prolonged periods. I will try to explain this in the next post.
As always I have uploaded all the code used in this blog post. The archived file is available to download here