Inspecting aspects and interception in .NET, part 2
Last time we evaluated a number of ways to add instrumentation to your code while keeping your code maintainable and legible. We landed on using interception and the cliffhanger question from last time was: what’s the overhead? Is it fast enough? In this post, we will start looking at ways to answer this question.
Benchmarking is hard
It’s not easy to measure how fast your code runs benchmark your code. There are misconceptions about how to measure how long something takes and what that means. You might think to capture the current date and time, run something in a loop, and again capture the current date and time. Take the difference, divide it by the number of iterations, and you have a relevant number. And you’d be wrong.
Don’t measure using DateTime
Naively measuring your code’s run-time duration looks like this:
DateTime startTime = DateTime.Now;
LongRunningOperation();
DateTime endTime = DateTime.Now;
TimeSpan elapsed = endTime - startTime;
This is wrong for a number of reasons.
DateTime
does not have enough accuracy to measure small increments;- Assuming you’re doing this in production code, you might have gone into or out of daylight savings time, giving you huge or negative readings;
- Even if you’re using
DateTime.UtcNow
, the system time might have been adjusted—perhaps by an automatic time synchronization.
There are probably more of these falsehoods programmers believe about time that apply to measuring how long your code took to execute.
They can all be easily fixed by using something designed for the task: System.Diagnostics.Stopwatch
. It is a highly accurate, monotonically increasing counter which does not suffer from changes to date or time at all. But even Stopwatch
does not have an infinite amount of accuracy.
Don’t run your code just once
Let’s try to measure the run-time of a particularly small piece of code:
int x = 1;
var sw = Stopwatch.StartNew();
x++;
sw.Stop();
On my machine (YMMV), for about half the times, the elapsed time is zero, while the rest of the times, I get 300 nanoseconds. Neither of those sound plausible.
Let’s run it a whole lot of times and extrapolate a mean running time.
int x = 1;
var sw = Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{
x++;
}
sw.Stop();
Now I’m getting a total run-time of around 67 000 nanoseconds. But there seems to be a lot of jitter in the results. Here are the results of ten consecutive runs.
74 666,61 ns
51 357,98 ns
36 740,71 ns
79 802,41 ns
76 641,91 ns
41 086,39 ns
82 962,90 ns
117 728,30 ns
41 481,45 ns
40 296,26 ns
My number of iterations is probably still too small. I’d need to do some trial runs to figure out how many iterations I need to reduce the effects of jitter.
Unforeseen consequences
There are a lot of issues that crop up when you try to benchmark your code.
- The JIT compiler might add a delay to the first few calls to your code;
- The garbage collector might randomly pause your code to clean up memory;
- The benchmarking code itself also takes time to execute; you don’t want that included in the test results;
- You might be interested in more statistics than the mean execution time (think median, percentiles, standard deviation...);
- Compilers are pretty good at finding dead code, so you need to make sure your code is not being optimized away;
- Different runtimes might provide different results, so you probably want to see what running in x64 mode does, or when using (or not using) RyuJIT.
There are a thousand other problems you might encounter if you try to write your own benchmarking framework. Benefit from the blood, sweat and tears of other people and use an existing, battle-hardened one. My library of choice is BenchmarkDotNet.
BenchmarkDotNet
Let’s examine what a benchmark looks like when using BenchmarkDotNet. For the sake of example, we’ll be benchmarking string concatenation. I want to compare the speed of string.Concat
, string.Format
and using a StringBuilder
. Yes, I am aware that the comparison is totally unfair. Just play along.
class StringConcatenation
{
[Benchmark]
public string UsingStringConcat()
{
return string.Concat("Hello,", " ", "world!");
}
[Benchmark]
public string UsingStringFormat()
{
return string.Format("{0}{1}{2}", "Hello", " ", "world!");
}
[Benchmark]
public string UsingStringBuilder()
{
var builder = new StringBuilder();
builder.Append("Hello,");
builder.Append(" ");
builder.Append("world!");
return builder.ToString();
}
}
Each method decorated with [Benchmark]
is a candidate to be benchmarked, and when you start the process, like so:
BenchmarkRunner.Run<StringConcatenation>();
... BenchmarkDotNet will start running benchmarks. It will go through a number of steps to find the right iteration count, measure its own overhead and warm up the environment.
After a bit of thinking it will produce something like this:
// * Summary *
Host Process Environment Information:
BenchmarkDotNet=v0.9.8.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-6700HQ CPU 2.60GHz, ProcessorCount=8
Frequency=2531249 ticks, Resolution=395.0619 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1586.0
Type=StringConcatenation Mode=Throughput GarbageCollection=Concurrent Workstation
Method | Median | StdDev |
------------------- |------------ |---------- |
UsingStringConcat | 27.7441 ns | 0.5363 ns |
UsingStringFormat | 125.0597 ns | 1.6980 ns |
UsingStringBuilder | 53.8680 ns | 1.9694 ns |
Let’s look at the table at the bottom, which, incidentally, is valid GFM.
Method | Median | StdDev |
---|---|---|
UsingStringConcat |
27,7441 ns | 0,5363 ns |
UsingStringFormat |
125,0597 ns | 1,6980 ns |
UsingStringBuilder |
53,8680 ns | 1,9694 ns |
This is the final performance report. By default it lists the median duration and the standard deviation. We can see from the median duration that in this run, on my machine, using the regular x86 JIT, string.Concat
is almost twice as fast as StringBuilder
, and from the standard deviation that the duration is also much more stable.
I’ve heard and read rumors that StringBuilder
will be faster than string.Concat
if you concatenate more than 5 substrings. Let’s find out.
class StringConcatenation
{
[Benchmark]
public string UsingStringConcat()
{
return string.Concat("Hello,", " ", "world!", " Something ", "else ", "entirely");
}
[Benchmark]
public string UsingStringFormat()
{
return string.Format("{0}{1}{2}{3}{4}{5}", "Hello", " ", "world!", " Something ", "else ", "entirely");
}
[Benchmark]
public string UsingStringBuilder()
{
var builder = new StringBuilder();
builder.Append("Hello,");
builder.Append(" ");
builder.Append("world!");
builder.Append(" Something ");
builder.Append("else ");
builder.Append("entirely");
return builder.ToString();
}
}
Method | Median | StdDev |
---|---|---|
UsingStringConcat |
125,4313 ns | 2,3578 ns |
UsingStringFormat |
250,0233 ns | 4,5334 ns |
UsingStringBuilder |
164,9556 ns | 2,2380 ns |
string.Format
is still trailing behind, but the gap between string.Concat
and StringBuilder
has shrunk significantly. Let’s see what happens if we add more arguments.
Method | Median | StdDev |
---|---|---|
UsingStringConcat |
183,3770 ns | 11,9913 ns |
UsingStringFormat |
361,9582 ns | 5,2223 ns |
UsingStringBuilder |
198,7081 ns | 4,9657 ns |
Now things are getting interesting. The two methods are almost as fast, but it’s notable that string.Concat
’s standard deviation is much higher than that of StringBuilder
.
Method | Median | StdDev |
---|---|---|
UsingStringConcat |
262,7727 ns | 69,2226 ns |
UsingStringFormat |
504,8921 ns | 15,0032 ns |
UsingStringBuilder |
290,8394 ns | 4,0046 ns |
The relative gap between the two now seems to be increasing, whereas the standard deviation is only increasing for string.Concat
, and decreasing for StringBuilder
. Let’s go all out and double the number of arguments.
Method | Median | StdDev |
---|---|---|
UsingStringConcat |
500,1520 ns | 31,3080 ns |
UsingStringFormat |
882,1520 ns | 37,3244 ns |
UsingStringBuilder |
475,3793 ns | 14,2873 ns |
I ran this one a number of times to be sure, but it looks like StringBuilder
is finally faster than string.Concat
! The conclusion? It takes around 28 substrings to be concatenated for StringBuilder
to be faster than string.Concat
on my machine, in x86 mode, using .NET 4.6.2. I didn’t measure the exact tipping point, but I guesstimate it to be around 25 substrings. Interestingly enough, when using RyuJIT in x64 mode, string.Concat
was still winning out.
What does this tell us?
Very little. At most, it tells us that there is little to no reason to use StringBuilder
over string.Concat
for scenarios where you know at compile-time which strings you will be concatenating, and you’re not using loops.
It’s important to realize that benchmarks can give you an indication of performance, but they are no replacement for a run-time metric or a profiler when it comes to finding hotspots. Also, YMMV; when you benchmark a piece of code using a certain runtime on a certain architecture, your results are only valid on that specific environment. Nowhere else.
This, however, is what micro-benchmarks are about. They tell you if algorithm A is faster than algorithm B under specific circumstances, and by how much. When the difference is small, unless you have extremely stringent performance requirements, at most you might say: this won’t hurt us.
Now that we’ve looked at micro-benchmarking, next time we’ll look at actually putting some DI frameworks through their paces, combining it with interception, and trying to explain what we find.