String Concatenation in .NET – what really goes on?

There are a number of ways to concatenate strings in C# and other .NET languages. Is there a *best* way?

Let’s look at how the C# compiler translates our code into IL to see the differences.

Here is a simple little console application to concatenate a string via String.Concat, String.Format, System.String’s (+) operator, System.String’s Join, and StringBuilder’s Append method.


    using System;
    using System.Text;

    class Program
    {
        static void Main(string[] args)
        {
            string first = "The cake";
            string second = " is a";
            string third = " lie.";

            string fromConcat = Concat(first, second, third);
            string fromFormat = Format(first, second, third);
            string fromPluses = Pluses(first, second, third);
            string fromBuilder = Builder(first, second, third);
            string fromJoiner = Joiner(first, second, third);

            Console.WriteLine(fromConcat);
            Console.WriteLine(fromFormat);
            Console.WriteLine(fromPluses);
            Console.WriteLine(fromBuilder);
            Console.WriteLine(fromJoiner);

            Console.ReadLine();
        }

        static string Concat(params string[] strings)
        {
            return String.Concat(strings);
        }

        static string Format(params string[] strings)
        {
            return string.Format("{0}{1}{2}", strings);
        }

        static string Pluses(params string[] strings)
        {
            return strings[0] + strings[1] + strings[2];
        }

        static string Builder(params string[] strings)
        {
            StringBuilder sb = new StringBuilder();
            sb.Append(strings[0]);
            sb.Append(strings[1]);
            sb.Append(strings[2]);
            return sb.ToString();
        }

        static string Joiner(params string[] strings)
        {
            return string.Join("", strings);
        }
    }

Which of these do you think will require the most intermediary code? I’d say it’s StringBuilder, since we have to instantiate an object and call a method on that object a number of times. I’d like to go over these individually. The code in Main() works the same for all methods: a new string array is created and passed to the method which returns a string into a new variable. Here are the methods:

Concat

  .method private hidebysig static string 
          Concat(string[] strings) cil managed
  {
    .param [1]
    .custom instance void [mscorlib]System.ParamArrayAttribute::.ctor() = ( 01 00 00 00 )
    // Code size       12 (0xc)
    .maxstack  1
    .locals init ([0] string CS$1$0000)
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  call       string [mscorlib]System.String::Concat(string[])
    IL_0007:  stloc.0
    IL_0008:  br.s       IL_000a
    IL_000a:  ldloc.0
    IL_000b:  ret
  } // end of method Program::Concat

The C# compiler converts String.Concat into the smallest amount of code (12 bytes of IL) and, I assume, the best as far as performance goes. It couldn’t get much simpler than this method: it loads the array argument (ldarg.0), passes it to String.Concat(string[]), stores the result in a local variable, and returns it.

Format

  .method private hidebysig static string 
          Format(string[] strings) cil managed
  {
    .param [1]
    .custom instance void [mscorlib]System.ParamArrayAttribute::.ctor() = ( 01 00 00 00 )
    // Code size       17 (0x11)
    .maxstack  2
    .locals init ([0] string CS$1$0000)
    IL_0000:  nop
    IL_0001:  ldstr      "{0}{1}{2}"
    IL_0006:  ldarg.0
    IL_0007:  call       string [mscorlib]System.String::Format(string,  object[])
    IL_000c:  stloc.0
    IL_000d:  br.s       IL_000f
    IL_000f:  ldloc.0
    IL_0010:  ret
  } // end of method Program::Format

String.Format (my favorite of all the methods) comes in at a code size of 17 bytes. The additional 5 bytes are the ldstr instruction that loads the string’s template, which of course isn’t necessary with String.Concat, but the template allows you to perform a number of useful operations on a string. For instance, String.Concat will only string the strings together. If you want to add a space between them, you’d have to either create one string to represent a space and concat that along (which gives you the same amount of code as String.Format anyway), or do something like:

    string overDoingIt = String.Concat(strings[0], " probably", strings[1], " big fat", strings[2]);

Not to mention, String.Format allows you to easily apply formatting rules without individually creating new objects. Granted, the formatting work still has to happen at run time inside String.Format (using a number formatter, for instance), but it makes our jobs as developers much easier.
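
As a rough illustration of the formatting rules mentioned above, the sketch below rounds a number and formats a date in a single String.Format call. The class name FormatSketch, the method Describe, and the sample values are made up for this example; CultureInfo.InvariantCulture is passed so the output does not depend on the machine’s regional settings.

```csharp
using System;
using System.Globalization;

static class FormatSketch
{
    // Hypothetical helper: one template handles text, a number, and a date.
    public static string Describe(string item, decimal price, DateTime when)
    {
        // {1:F2} rounds to two decimals; {2:yyyy-MM-dd} is a custom date pattern.
        return string.Format(CultureInfo.InvariantCulture,
            "{0} costs {1:F2} on {2:yyyy-MM-dd}", item, price, when);
    }

    static void Main()
    {
        Console.WriteLine(Describe("The cake", 3.5m, new DateTime(2010, 1, 31)));
        // prints "The cake costs 3.50 on 2010-01-31"
    }
}
```

Doing the same with String.Concat would mean calling ToString with a format string on each value before passing it along.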

Pluses

  .method private hidebysig static string 
          Pluses(string[] strings) cil managed
  {
    .param [1]
    .custom instance void [mscorlib]System.ParamArrayAttribute::.ctor() = ( 01 00 00 00 )
    // Code size       20 (0x14)
    .maxstack  4
    .locals init ([0] string CS$1$0000)
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  ldc.i4.0
    IL_0003:  ldelem.ref
    IL_0004:  ldarg.0
    IL_0005:  ldc.i4.1
    IL_0006:  ldelem.ref
    IL_0007:  ldarg.0
    IL_0008:  ldc.i4.2
    IL_0009:  ldelem.ref
    IL_000a:  call       string [mscorlib]System.String::Concat(string, string, string)
    IL_000f:  stloc.0
    IL_0010:  br.s       IL_0012
    IL_0012:  ldloc.0
    IL_0013:  ret
  } // end of method Program::Pluses

I see the (+) operator used a lot for string concatenation. A number of articles claim that it performs worse than String.Concat. Leaving actual performance aside, the IL shows that the compiler turns the + chain into a single call to String.Concat(string, string, string), but first has to load each of the three array elements onto the stack, so the method comes to 20 bytes of IL against 12 for the String.Concat version.

If you’re writing an application that requires thousands of string concatenation operations, I’d suggest using String.Concat. Even if you have to hard-code single characters as constants and occasionally pass a new string into the mix, it should still offer a great deal less generated code (and presumably better performance) than the (+) operator.
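
IL size by itself says nothing definite about speed, so one way to check the presumption above is to time both forms. The following is a minimal sketch, not a rigorous benchmark: the class and method names and the iteration count are arbitrary choices for this example, and Stopwatch timings vary from run to run, so no expected output is shown.

```csharp
using System;
using System.Diagnostics;

static class ConcatTiming
{
    public static string WithPluses(string a, string b, string c)
    {
        return a + b + c;
    }

    public static string WithConcat(string a, string b, string c)
    {
        return String.Concat(a, b, c);
    }

    // Runs the given concatenation repeatedly and returns elapsed milliseconds.
    public static long Time(Func<string, string, string, string> concat, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            concat("The cake", " is a", " lie.");
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int iterations = 1000000; // arbitrary value for illustration
        Console.WriteLine("+ operator:    {0} ms", Time(WithPluses, iterations));
        Console.WriteLine("String.Concat: {0} ms", Time(WithConcat, iterations));
    }
}
```

Run it from a release build and repeat it a few times before drawing conclusions from the numbers.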

Builder

  .method private hidebysig static string 
          Builder(string[] strings) cil managed
  {
    .param [1]
    .custom instance void [mscorlib]System.ParamArrayAttribute::.ctor() = ( 01 00 00 00 )
    // Code size       48 (0x30)
    .maxstack  3
    .locals init ([0] class [mscorlib]System.Text.StringBuilder sb, [1] string CS$1$0000)
    IL_0000:  nop
    IL_0001:  newobj     instance void [mscorlib]System.Text.StringBuilder::.ctor()
    IL_0006:  stloc.0
    IL_0007:  ldloc.0
    IL_0008:  ldarg.0
    IL_0009:  ldc.i4.0
    IL_000a:  ldelem.ref
    IL_000b:  callvirt   instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    IL_0010:  pop
    IL_0011:  ldloc.0
    IL_0012:  ldarg.0
    IL_0013:  ldc.i4.1
    IL_0014:  ldelem.ref
    IL_0015:  callvirt   instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    IL_001a:  pop
    IL_001b:  ldloc.0
    IL_001c:  ldarg.0
    IL_001d:  ldc.i4.2
    IL_001e:  ldelem.ref
    IL_001f:  callvirt   instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    IL_0024:  pop
    IL_0025:  ldloc.0
    IL_0026:  callvirt   instance string [mscorlib]System.Object::ToString()
    IL_002b:  stloc.1
    IL_002c:  br.s       IL_002e
    IL_002e:  ldloc.1
    IL_002f:  ret
  } // end of method Program::Builder

I see this used a lot in ASP.NET for injecting startup JavaScript scripts (even for 5 lines of code!). Aside from the obvious problems with that, I’d like to mention that in .NET 3.0 and higher, Page.ClientScript.RegisterStartupScript now has an overload which adds script tags. That means you don’t have to worry about the curly braces that would otherwise mess up your String.Format.
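
The curly-brace problem mentioned above comes from String.Format treating { and } as placeholder delimiters, so literal braces in a JavaScript snippet have to be doubled. A small sketch, where the class name, the BuildAlert method, and the script text are all invented for illustration:

```csharp
using System;

static class ScriptFormatSketch
{
    // Hypothetical helper; the message is inserted as-is, with no JavaScript escaping.
    public static string BuildAlert(string message)
    {
        // "{{" and "}}" emit literal braces; "{0}" is the placeholder.
        return string.Format("function greet() {{ alert('{0}'); }}", message);
    }

    static void Main()
    {
        Console.WriteLine(BuildAlert("hi"));
        // prints "function greet() { alert('hi'); }"
    }
}
```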

Anyway, what’s going on here is: the StringBuilder is instantiated at IL_0001 and stored in the local ‘sb’, and at IL_000b, IL_0015, and IL_001f we call the method Append on it. Each string is loaded onto the stack and passed into the method, and because Append returns the builder itself, each call is followed by a pop to discard that return value. At the end, we still have to call ToString() on our object. At 48 bytes of IL, this is obviously way more work than is necessary. Granted, there are times when StringBuilder comes in handy, but simple concatenation really shouldn’t use it.
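
One case where StringBuilder does come in handy is building a string across loop iterations, when the number of pieces isn’t known up front. The sketch below is an assumed example of that situation; the NumberedLines helper is hypothetical and not part of the console application above.

```csharp
using System;
using System.Text;

static class BuilderSketch
{
    // Hypothetical helper: prefixes each item with its 1-based position.
    public static string NumberedLines(string[] items)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.Length; i++)
        {
            // Append returns the same builder, so the calls can be chained.
            sb.Append(i + 1).Append(". ").Append(items[i]).Append('\n');
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.Write(NumberedLines(new[] { "The cake", "is a", "lie" }));
    }
}
```

Here the builder is reused for every iteration, which is the kind of work a single Concat or Format call can’t express.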

Joiner

  .method private hidebysig static string 
          Joiner(string[] strings) cil managed
  {
    .param [1]
    .custom instance void [mscorlib]System.ParamArrayAttribute::.ctor() = ( 01 00 00 00 )
    // Code size       17 (0x11)
    .maxstack  2
    .locals init ([0] string CS$1$0000)
    IL_0000:  nop
    IL_0001:  ldstr      ""
    IL_0006:  ldarg.0
    IL_0007:  call       string [mscorlib]System.String::Join(string, string[])
    IL_000c:  stloc.0
    IL_000d:  br.s       IL_000f
    IL_000f:  ldloc.0
    IL_0010:  ret
  } // end of method Program::Joiner

String.Join is interesting in that it’s sort of between String.Concat and String.Format. I usually overlook String.Join because, like I said, I prefer String.Format. But, look at how the Joiner method requires the same amount of IL code to be generated as String.Format. There really isn’t any additional formatting required here, so String.Join is probably the second best solution.

Assume for a second that we didn’t have any spacing in our strings, and instead had “The”, “cake”, “is”, “a”, “lie”. Instead of passing these to String.Concat with a space such as:

    string space = " ";
    // typing the following line gets boring very quickly:
    return String.Concat(strings[0], space, strings[1], space, strings[2], space // etc...

we could use String.Join:

    return String.Join(" ", strings);
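
For completeness, here is a runnable version of the String.Join idea using the five words from the paragraph above. The class name JoinSketch and the Sentence method are placeholders chosen for this sketch.

```csharp
using System;

static class JoinSketch
{
    // Hypothetical helper: joins any number of words with a single space.
    public static string Sentence(params string[] words)
    {
        return String.Join(" ", words);
    }

    static void Main()
    {
        Console.WriteLine(Sentence("The", "cake", "is", "a", "lie"));
        // prints "The cake is a lie"
    }
}
```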

Conclusion
Here is a quick summary of the findings of this little exploration:

  • String.Concat – Good for joining strings without any additional processing
  • String.Join – Good for joining strings with a specified delimiter
  • String.Format – Same amount of code as String.Join, but allows for additional formatting of strings
  • String (+) operator – Unnecessary amount of overhead when used to simply combine supplied strings
  • StringBuilder – Overly bloated method for simple concatenation