paint-brush
A Detailed Guide to String Concatenation in .NETby@fairday
37,204 reads
37,204 reads

A Detailed Guide to String Concatenation in .NET

by AlekseiFebruary 20th, 2024
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

Delve into the efficiency of string interpolation handlers in C#, uncovering compiler optimizations and performance benchmarks. Learn how to enhance code readability and optimize memory allocation for improved application performance. TLDR (Summary): The use of string interpolation handlers in C# offers significant performance benefits over traditional methods like String.Format, reducing memory allocation and optimizing code execution. Leveraging DefaultInterpolatedStringHandler or StringBuilder in multiple concatenation scenarios enhances efficiency, while custom handlers can be developed for specific use cases, promoting code readability and performance in .NET development.
featured image - A Detailed Guide to String Concatenation in .NET
Aleksei HackerNoon profile picture

Introduction

Everyone who develops apps and services with C# uses string concatenation. Whenever we need to build a formatted message with different types of data or combine several strings to show helpful information anywhere, we usually enlist the help of string interpolation.


In the following code snippet, we have a very primitive example of string interpolation


int orderAmount = 150;
string orderNumber = "ORDER-13";
Console.WriteLine($"The order with number {orderNumber} has amount of {orderAmount} items");


If we execute the code above, the console shows the following output


The order with number ORDER-13 has amount of 150 items


But, what is happening under the hood?

How does the compiler optimize our code?


In general, high-level programming languages offer many abstract programming constructs such as functions, loops, conditional statements, and many other useful things, which help us to be productive and write readable code. Of course, it has a significant drawback — a potential performance decrease. But should developers care about the cost of using such nice abstractions instead of focusing on writing understandable and maintainable code? Ideally, not. For this reason, compilers attempt to optimize our code to improve its performance.


For instance, for the code snippet above and using C# 10, the compiler transforms it into the following code:


int value = 150;
string value2 = "ORDER-13";
DefaultInterpolatedStringHandler defaultInterpolatedStringHandler = new DefaultInterpolatedStringHandler(41, 2);
defaultInterpolatedStringHandler.AppendLiteral("The order with number ");
defaultInterpolatedStringHandler.AppendFormatted(value2);
defaultInterpolatedStringHandler.AppendLiteral(" has amount of ");
defaultInterpolatedStringHandler.AppendFormatted(value);
defaultInterpolatedStringHandler.AppendLiteral(" items");
Console.WriteLine(defaultInterpolatedStringHandler.ToStringAndClear());


And when using C# 9, it translates the same code but using string.Format method instead:

int num = 150;
string arg = "ORDER-13";
Console.WriteLine(string.Format("The order with number {0} has amount of {1} items", arg, num));


As we can see, in the later C# version, the compiler utilizes a new feature, interpolated string handlers.


This article brilliantly explains in-depth how it works.


In a nutshell, interpolated string handlers optimize string-building to avoid performance problems with usingstring.Format methods such as unnecessary allocation of object[] on the heap, boxing of arguments, and intermediate string generations. Moreover, performance improvements come almost without any changes in existing codebases. Also, the exciting side of the new strings-building design is the possibility of skipping allocating any memory for particular conditions.

Approach and Results

Let’s compare the performance of string concatenation with different options:

  1. String.Format

  2. String.Concat

  3. String.Join

  4. String interpolation

  5. StringBuilder

  6. DefaultInterpolatedStringHandler

  7. Enumerable.Aggregate


I usually use Benchmark.DotNet for benchmarking different solutions. I will benchmark concatenation with a mix of value and reference types and only immutable strings.


//case #1
int orderAmount = 150;
string orderNumber = "ORDER-13";


For the first case, we have the following results:

BenchmarkDotNet v0.13.7, Windows 11 (10.0.22621.2134/22H2/2022Update/SunValley2)
AMD Ryzen 7 5700U with Radeon Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 7.0.202
  [Host]   : .NET 7.0.4 (7.0.423.11508), X64 RyuJIT AVX2
  .NET 7.0 : .NET 7.0.4 (7.0.423.11508), X64 RyuJIT AVX2

Job=.NET 7.0  Runtime=.NET 7.0  

|                           Method |      Mean |    Error |   StdDev |   Gen0 | Allocated |
|--------------------------------- |----------:|---------:|---------:|-------:|----------:|
|                     StringFormat |  79.34 ns | 1.002 ns | 0.783 ns | 0.0573 |     120 B |
|              StringInterpolation |  54.01 ns | 0.922 ns | 0.906 ns | 0.0459 |      96 B |
|                     StringConcat |  51.08 ns | 0.208 ns | 0.173 ns | 0.0918 |     192 B |
|                       StringJoin |  74.55 ns | 0.593 ns | 0.526 ns | 0.1032 |     216 B |
|                    StringBuilder |  84.85 ns | 0.311 ns | 0.305 ns | 0.2104 |     440 B |
| DefaultInterpolatedStringHandler |  50.56 ns | 0.431 ns | 0.360 ns | 0.0459 |      96 B |
|              EnumerableAggregate | 150.56 ns | 1.761 ns | 1.648 ns | 0.2716 |     568 B |


As we can see, StringFormat is 30% slower and allocates much more memory than using StringInterpolation or DefaultInterpolatedStringHandler, which are the same after compilator optimizations.


Here is the link for the benchmark source code.


string StringFormat()
{
    int orderAmount = 150;
    string orderNumber = "ORDER-13";
    return string.Format("Order number {0} has {1} items.", orderNumber, orderAmount);
}

string StringInterpolation()
{
    int orderAmount = 150;
    string orderNumber = "ORDER-13";
    return $"Order number {orderNumber} has {orderAmount} items.";
}

string DefaultInterpolatedStringHandler()
{
    int orderAmount = 150;
    string orderNumber = "ORDER-13";
    DefaultInterpolatedStringHandler defaultInterpolatedStringHandler = new DefaultInterpolatedStringHandler(25, 2);
    defaultInterpolatedStringHandler.AppendLiteral("Order number ");
    defaultInterpolatedStringHandler.AppendFormatted(orderNumber);
    defaultInterpolatedStringHandler.AppendLiteral(" has ");
    defaultInterpolatedStringHandler.AppendFormatted(orderAmount);
    defaultInterpolatedStringHandler.AppendLiteral(" items.");
    return defaultInterpolatedStringHandler.ToStringAndClear();
}

[System.Runtime.CompilerServices.NullableContext(1)]
[CompilerGenerated]
internal static string <<Main>$>g__StringFormat|0_0()
{
    int num = 150;
    string arg = "ORDER-13";
    return string.Format("Order number {0} has {1} items.", arg, num);
}

[System.Runtime.CompilerServices.NullableContext(1)]
[CompilerGenerated]
internal static string <<Main>$>g__StringInterpolation|0_1()
{
    int value = 150;
    string value2 = "ORDER-13";
    DefaultInterpolatedStringHandler defaultInterpolatedStringHandler = new DefaultInterpolatedStringHandler(25, 2);
    defaultInterpolatedStringHandler.AppendLiteral("Order number ");
    defaultInterpolatedStringHandler.AppendFormatted(value2);
    defaultInterpolatedStringHandler.AppendLiteral(" has ");
    defaultInterpolatedStringHandler.AppendFormatted(value);
    defaultInterpolatedStringHandler.AppendLiteral(" items.");
    return defaultInterpolatedStringHandler.ToStringAndClear();
}

[System.Runtime.CompilerServices.NullableContext(1)]
[CompilerGenerated]
internal static string <<Main>$>g__DefaultInterpolatedStringHandler|0_2()
{
    int value = 150;
    string value2 = "ORDER-13";
    DefaultInterpolatedStringHandler defaultInterpolatedStringHandler = new DefaultInterpolatedStringHandler(25, 2);
    defaultInterpolatedStringHandler.AppendLiteral("Order number ");
    defaultInterpolatedStringHandler.AppendFormatted(value2);
    defaultInterpolatedStringHandler.AppendLiteral(" has ");
    defaultInterpolatedStringHandler.AppendFormatted(value);
    defaultInterpolatedStringHandler.AppendLiteral(" items.");
    return defaultInterpolatedStringHandler.ToStringAndClear();
}


The curious observation is that the use of StringBuilder is slower than string.Format, but actually, StringBuilder starts to show a drastically better performance in multiple concatenation statements.


[Benchmark]
public string StringBuilder()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 100; i++)
    {
        sb.Append(i);
    }
    return sb.ToString();
}

[Benchmark]
public string StringConcat()
{
    string result = string.Empty;
    for (int i = 0; i < 100; i++)
    {
        result += i;
    }
    return result;
}

[Benchmark]
public string StringInterpolation()
{
    string result = string.Empty;
    for (int i = 0; i < 100; i++)
    {
        result += $"{i}";
    }
    return result;
}

[Benchmark]
public string StringInterpolationHandler()
{
    var handler = new DefaultInterpolatedStringHandler(0, 100);
    for (int i = 0; i < 100; i++)
    {
        handler.AppendFormatted(i);
    }
    return handler.ToStringAndClear();
}


BenchmarkDotNet v0.13.7, Windows 11 (10.0.22621.2134/22H2/2022Update/SunValley2)
AMD Ryzen 7 5700U with Radeon Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 7.0.202
  [Host]   : .NET 7.0.4 (7.0.423.11508), X64 RyuJIT AVX2
  .NET 7.0 : .NET 7.0.4 (7.0.423.11508), X64 RyuJIT AVX2

Job=.NET 7.0  Runtime=.NET 7.0  

|                     Method |       Mean |     Error |    StdDev |    Gen0 | Allocated |
|--------------------------- |-----------:|----------:|----------:|--------:|----------:|
|              StringBuilder |   837.6 ns |  16.78 ns |  47.88 ns |  0.6733 |    1408 B |
|               StringConcat | 2,774.6 ns |  55.47 ns | 106.87 ns | 11.3487 |   23736 B |
|        StringInterpolation | 6,534.7 ns | 170.19 ns | 491.05 ns | 11.4594 |   23976 B |
| StringInterpolationHandler |   681.5 ns |  13.69 ns |  33.05 ns |  0.1945 |     408 B |


As we can see, the application of StringBuilder completely outperforms string.Concat and interpolation approaches, but playbacks to DefaultInterpolatedStringHandler in this scenario.

Now, for completeness, let’s test our second case by only using immutable strings in concatenation.


//case #2
string orderAmount = "150";
string orderNumber = "ORDER-13";


After running benchmarks, we have the following results:

BenchmarkDotNet v0.13.7, Windows 11 (10.0.22621.2134/22H2/2022Update/SunValley2)
AMD Ryzen 7 5700U with Radeon Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 7.0.202
  [Host]   : .NET 7.0.4 (7.0.423.11508), X64 RyuJIT AVX2
  .NET 7.0 : .NET 7.0.4 (7.0.423.11508), X64 RyuJIT AVX2

Job=.NET 7.0  Runtime=.NET 7.0  

|                           Method |      Mean |    Error |   StdDev |    Median |   Gen0 | Allocated |
|--------------------------------- |----------:|---------:|---------:|----------:|-------:|----------:|
|                     StringFormat |  80.89 ns | 1.662 ns | 1.847 ns |  79.89 ns | 0.0459 |      96 B |
|              StringInterpolation |  44.67 ns | 0.319 ns | 0.283 ns |  44.70 ns | 0.0459 |      96 B |
|                       StringJoin |  48.15 ns | 0.168 ns | 0.141 ns |  48.13 ns | 0.0765 |     160 B |
|                    StringBuilder |  80.18 ns | 1.538 ns | 3.656 ns |  78.35 ns | 0.2104 |     440 B |
| DefaultInterpolatedStringHandler |  46.24 ns | 0.931 ns | 1.108 ns |  46.23 ns | 0.0459 |      96 B |
|              EnumerableAggregate | 135.90 ns | 2.160 ns | 1.686 ns | 135.51 ns | 0.2563 |     536 B |


As we can see, using DefaultInterpolatedStringHandler is still the most efficient way to concatenate strings.


High-performance logging

Logging is crucial for software, offering transparency and aiding in debugging complex failures. In developing low-level libraries or optimizing performance-critical paths, reducing memory allocation is key for enhancing performance. Boxing, the conversion of value types (like int, double, char) to reference types stored on the heap, involves additional work for the Garbage Collector (GC), impacting performance.


The standardILogger extension method LogInformation accepts arguments as objects, which means we often need to log value types such as counts, dates, and numbers. This process can involve boxing, which affects performance.


public static void LogInformation(this ILogger logger, string? message, params object?[] args)
{
    logger.Log(LogLevel.Information, message, args);
}


To make our lives easier, Microsoft provides set of useful extensions to overcome this disadvantage.


LoggerMessage provides the following performance advantages over Logger extension methods:

  1. Logger extension methods require "boxing" (converting) value types, such as int, into object. The LoggerMessage pattern avoids boxing by using static Action fields and extension methods with strongly typed parameters.
  2. Logger extension methods must parse the message template (named format string) every time a log message is written. LoggerMessage only requires parsing a template once when the message is defined.


Under the hood, it utilises predefined functions with generic parameters, which allows runtime to optimise invocation and avoid unnecessary boxing.


/// <summary>
/// Creates a delegate which can be invoked for logging a message.
/// </summary>
/// <typeparam name="T1">The type of the first parameter passed to the named format string.</typeparam>
/// <typeparam name="T2">The type of the second parameter passed to the named format string.</typeparam>
/// <typeparam name="T3">The type of the third parameter passed to the named format string.</typeparam>
/// <param name="logLevel">The <see cref="LogLevel"/></param>
/// <param name="eventId">The event id</param>
/// <param name="formatString">The named format string</param>
/// <param name="options">The <see cref="LogDefineOptions"/></param>
/// <returns>A delegate which when invoked creates a log message.</returns>
public static Action<ILogger, T1, T2, T3, Exception?> Define<T1, T2, T3>(LogLevel logLevel, EventId eventId, string formatString, LogDefineOptions? options)
{
    LogValuesFormatter formatter = CreateLogValuesFormatter(formatString, expectedNamedParameterCount: 3);

    void Log(ILogger logger, T1 arg1, T2 arg2, T3 arg3, Exception? exception)
    {
        logger.Log(logLevel, eventId, new LogValues<T1, T2, T3>(formatter, arg1, arg2, arg3), exception, LogValues<T1, T2, T3>.Callback);
    }

    if (options != null && options.SkipEnabledCheck)
    {
        return Log;
    }

    return (logger, arg1, arg2, arg3, exception) =>
    {
        if (logger.IsEnabled(logLevel))
        {
            Log(logger, arg1, arg2, arg3, exception);
        }
    };
}


Let us create a simple benchmark to measure two scenarios: standard logging and one utilising LoggerMessage


Of course, it will involve logging of value types. First, I use the NullLogger class to avoid additional overhead produced by logging into the Console, for instance.


[SimpleJob(RuntimeMoniker.Net80)]
[MarkdownExporterAttribute.GitHub]
[MemoryDiagnoser]
public class Logging
{
    private ILogger _logger = default!;

    private static readonly Action<ILogger, decimal, int, bool, Exception> _logInformation = LoggerMessage.Define<decimal, int, bool>(
        LogLevel.Information,
        new EventId(1, nameof(Logging)),
        "Information message, {0}, {1}, {2}");

    private static readonly Action<ILogger, bool, char, double, Exception> _logDebug = LoggerMessage.Define<bool, char, double>(
        LogLevel.Debug,
        new EventId(1, nameof(Logging)),
        "Information debug, {0}, {1}, {2}");

    private static readonly Action<ILogger, string, int, double, Exception> _logWarning = LoggerMessage.Define<string, int, double>(
        LogLevel.Warning,
        new EventId(1, nameof(Logging)),
        "Information warning, {0}, {1}, {2}");

    [GlobalSetup]
    public void Setup()
    {
        var loggerFactory = NullLoggerFactory.Instance;
        _logger = loggerFactory.CreateLogger<Logging>();
    }

    [Benchmark]
    public void LogInformation_WithMultipleValueTypes()
    {
        _logger.LogInformation("Information message, {0}, {1}, {2}", 15m, 137, true);
        _logger.LogDebug("Information debug, {0}, {1}, {2}", true, 'o', 150e6d);
        _logger.LogWarning("Information warning, {0}, {1}, {2}", "test", 9, 13.1d);
    }

    [Benchmark]
    public void LoggerMessage_WithMultipleValueTypes()
    {
        _logInformation(_logger, 15m, 137, true, null);
        _logDebug(_logger, true, 'o', 150e6, null);
        _logWarning(_logger, "test", 9, 13.1d, null);
    }
}


The results indicate that LoggerMessage performs better than the standard logging extensions.


BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
13th Gen Intel Core i7-13700HX, 1 CPU, 24 logical and 16 physical cores
.NET SDK 8.0.101
  [Host]   : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2 [AttachedDebugger]
  .NET 8.0 : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2

Job=.NET 8.0  Runtime=.NET 8.0

| Method                                | Mean       | Error     | StdDev     | Median     | Gen0   | Allocated |
|-------------------------------------- |-----------:|----------:|-----------:|-----------:|-------:|----------:|
| LogInformation_WithMultipleValueTypes | 176.656 ns | 7.0850 ns | 20.7790 ns | 182.843 ns | 0.0219 |     344 B |
| LoggerMessage_WithMultipleValueTypes  |   4.574 ns | 0.3076 ns |  0.9069 ns |   4.939 ns |      - |         - |


And the same but with the Console

[GlobalSetup]
public void Setup()
{
    var loggerFactory = LoggerFactory.Create(builder => builder.AddConsole());
    _logger = loggerFactory.CreateLogger<Logging>();
}


As we can see here, the Logger message also performs better while allocating 37% less memory.


BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
13th Gen Intel Core i7-13700HX, 1 CPU, 24 logical and 16 physical cores
.NET SDK 8.0.101
  [Host]   : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2 [AttachedDebugger]
  .NET 8.0 : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2

Job=.NET 8.0  Runtime=.NET 8.0

| Method                                | Mean     | Error   | StdDev  | Allocated |
|-------------------------------------- |---------:|--------:|--------:|----------:|
| LogInformation_WithMultipleValueTypes | 326.8 us | 6.45 us | 6.90 us |     955 B |
| LoggerMessage_WithMultipleValueTypes  | 309.9 us | 6.14 us | 8.19 us |     608 B |

Conclusion

  1. With C# 10 and higher, use string interpolation instead of string.Format, it allocates much less extra memory besides the final string
  2. Use DefaultInterpolatedStringHandler or StringBuilder in multiple concatenation statements
  3. Try to avoid using DefaultInterpolatedStringHandler directly since it could reduce code readability
  4. Develop custom string interpolation handler in scenarios like Debug.Assert, usually for hot paths and libraries
  5. Utilise LoggerMessage in hot path scenarios or low-level libraries to avoid unnecessary allocation due to boxing of value types


See you next time!