StringBuilder vs String concatenation

Instances of String are immutable in Java and, of course, in Groovy too. That isn’t very apparent when the code looks like this:

someString += 'new string part'

Under the hood the JVM creates an entirely new String instance and points someString at it; the original instance is never modified. Most of the time it is perfectly fine to ignore this implementation detail, but if you have a use case where a more efficient approach is worth having, there is the appropriately named StringBuilder. It is a mutable sequence of characters - basically a buffer object.
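To make the "entirely new instance" point concrete, here is a minimal sketch in plain Java. The class and method names are made up for illustration. It keeps a second reference to the original String, applies +=, and shows that the second reference still sees the old value, whereas two references to one StringBuilder both see the appended text.

```java
// Hypothetical demo class; the names are illustrative only.
class ImmutabilityDemo {

    // Returns { value seen through the old reference, value of the reassigned variable }.
    static String[] concatWithPlusEquals() {
        String someString = "Foo ";
        String alias = someString;        // second reference to the same String instance
        someString += "new string part";  // builds a new String and rebinds someString to it
        return new String[] { alias, someString };
    }

    // Returns what a second reference sees after the builder has been appended to.
    static String appendWithBuilder() {
        StringBuilder sb = new StringBuilder("Foo ");
        StringBuilder alias = sb;         // second reference to the same builder
        sb.append("new string part");     // changes the one shared buffer in place
        return alias.toString();
    }

    public static void main(String[] args) {
        String[] result = concatWithPlusEquals();
        System.out.println("alias after +=:      '" + result[0] + "'");
        System.out.println("someString after +=: '" + result[1] + "'");
        System.out.println("builder alias:       '" + appendWithBuilder() + "'");
    }
}
```

The alias still holds 'Foo ' after the +=, while the builder alias holds 'Foo new string part'.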

Because each append writes into the builder's internal buffer instead of creating a new String, StringBuilder is typically much faster and allocates less memory than repeated String concatenation.
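One rough way to see where the saving comes from is to count characters copied. The Java sketch below is a simplified model, not a measurement. It assumes every += copies the whole string built so far into a new instance, and that a sufficiently pre-sized builder writes each character once. The class and method names are hypothetical.

```java
// Hypothetical cost model; it counts characters only and ignores everything else the JVM does.
class CopyCostModel {

    // Characters written if each += allocates a new String holding the old content plus the new part.
    static long charsCopiedByPlusEquals(int initialLength, int partLength, int iterations) {
        long length = initialLength;
        long copied = 0;
        for (int i = 0; i < iterations; i++) {
            length += partLength; // the new String is this long...
            copied += length;     // ...and all of it has to be written
        }
        return copied;
    }

    // Characters written into a builder that never has to grow: each one exactly once.
    static long charsWrittenByBuilder(int initialLength, int partLength, int iterations) {
        return initialLength + (long) partLength * iterations;
    }

    public static void main(String[] args) {
        // 'Foo ' is 4 characters, 'Bar ' is 4 characters, appended 500 times.
        System.out.println("+= model:      " + charsCopiedByPlusEquals(4, 4, 500));
        System.out.println("builder model: " + charsWrittenByBuilder(4, 4, 500));
    }
}
```

For a 4-character start and 500 appends of 4 characters, the model gives 503000 characters for += against 2004 for the builder. Real timings will not follow that ratio exactly, since allocation, garbage collection and runtime optimisations all play a part.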

The Code Comparasaurus

Here is an example and a performance comparison using StringBuilder:

String stringConcatenation() {
	String s = 'Foo '
	for (int i = 0; i < 500; i++) {
		s += 'Bar '
	}
	return s
}

String stringBuilderConcatenation() {
	// We know roughly how long the final string will be, so pre-size the buffer.
	StringBuilder sb = new StringBuilder(2500) // 'Foo ' plus 500 * 'Bar ' is 2004 chars; 2500 leaves headroom
	sb.append('Foo ')
	// If you have no idea how big a buffer you might need, use the String constructor instead:
	// StringBuilder sb = new StringBuilder('Foo ')
	for (int i = 0; i < 500; i++) {
		sb.append('Bar ')
	}
	return sb.toString()
}

long start = System.nanoTime()
String s1 = stringConcatenation()
long end = System.nanoTime()

long time1 = end - start

start = System.nanoTime()
String s2 = stringBuilderConcatenation()
end = System.nanoTime()

long time2 = end - start

println 'They are equivalent strings? ' + (s1 == s2)

println "Time taken by String concatenation:        $time1 ns"
println "Time taken by StringBuilder concatenation:  $time2 ns"

// Assigning to a Long truncates the fractional part of the percentage.
Long percentSlower = ((time1 - time2)/time2) * 100
println "Meaning string += is ~${percentSlower}% slower than using a buffer approach"

// It is easy to forget that there are a LOT of ns per ms (1,000,000 of them).
Long timeDiffMs = (time1 - time2)/1000000
println "In real time terms it is really not much... in this case it is only ${timeDiffMs}ms slower" 

Give that a run in a GroovyConsole.

If you have Groovy installed locally, running groovyConsole should fire it up. There is also a web version available at https://groovyconsole.appspot.com/

Output will vary quite a bit, but here is a fairly typical result from the web console mentioned above:

They are equivalent strings? true
Time taken by String concatenation:        1263077 ns
Time taken by StringBuilder concatenation:  187213 ns
Meaning string += is ~574% slower than using a buffer approach
In real time terms it is really not much... in this case it is only 1ms slower
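The pre-sizing comments in stringBuilderConcatenation() can be checked with StringBuilder's length() and capacity() methods. The Java sketch below (class and method names are hypothetical) repeats the same appends and reports how much of the 2500-character buffer is used. It also shows the starting capacity when the builder is seeded from a String, which the StringBuilder Javadoc describes as 16 plus the length of the argument.

```java
// Hypothetical demo class for inspecting StringBuilder buffer sizes.
class CapacityDemo {

    // Builds the same content as the comparison, using a buffer pre-sized to 2500.
    static StringBuilder buildPresized() {
        StringBuilder sb = new StringBuilder(2500);
        sb.append("Foo ");
        for (int i = 0; i < 500; i++) {
            sb.append("Bar ");
        }
        return sb;
    }

    // Starting capacity when the builder is created from an initial String.
    static int seededInitialCapacity() {
        return new StringBuilder("Foo ").capacity();
    }

    public static void main(String[] args) {
        StringBuilder sb = buildPresized();
        System.out.println("length used:        " + sb.length());
        System.out.println("capacity reserved:  " + sb.capacity());
        System.out.println("seeded capacity:    " + seededInitialCapacity());
    }
}
```

The content ends up 2004 characters long and the capacity stays at 2500, so the pre-sized builder never has to grow. The seeded builder starts at a capacity of 20 and would have to enlarge its buffer several times on the way to 2004 characters.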

Conclusion

Okay, so 500 or 600% sounds like a lot, but the big caveat is that in absolute terms it isn’t much, even in a contrived and somewhat extreme example such as this. Memory usage is almost certainly lower too, but my guess is that it follows the same pattern: significant as a percentage, but probably not significant relative to the memory footprint of a large JVM application built on something like Spring or Grails. I didn’t have enough ambition today to measure that accurately.

Now if you were using something like Micronaut?!

Here is the other odd thing. The JVM manages memory for you. Automatic garbage collection is kinda nice, but I got such wildly varying results with this test case that I can’t help but suspect it as the culprit. The variation was especially exaggerated when running via the web console:

They are equivalent strings? true
Time taken by String concatenation:        40640908 ns
Time taken by StringBuilder concatenation:   254922 ns
Meaning string += is ~15842% slower than using a buffer approach
In real time terms it is really not much... in this case it is only 40ms slower

One take-away from that high degree of variation: the ol’ += approach can sometimes, unpredictably, be a lot, LOT slower than StringBuilder, so if you need a method with more consistent performance characteristics or throughput, then StringBuilder is probably the way to go.
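The timings above come from a single cold run of each method, so they likely include one-off costs such as class loading and JIT compilation as well as any garbage collection pauses. A rough way to reduce that noise is to run each method a number of times unmeasured, then time many runs and take the median. The Java sketch below does that. The class name, the helper, and the warm-up and run counts are all arbitrary choices for illustration, and a dedicated harness such as JMH would be the more rigorous option.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical helper; a rough sketch rather than a rigorous benchmark harness.
class RepeatedTiming {

    // Written to on every run so that the result of each call is actually used.
    static volatile int sink;

    static String stringConcatenation() {
        String s = "Foo ";
        for (int i = 0; i < 500; i++) {
            s += "Bar ";
        }
        return s;
    }

    static String stringBuilderConcatenation() {
        StringBuilder sb = new StringBuilder(2500);
        sb.append("Foo ");
        for (int i = 0; i < 500; i++) {
            sb.append("Bar ");
        }
        return sb.toString();
    }

    // Runs the task `warmups` times unmeasured, then returns the middle sample of `runs` timed runs.
    static long medianNanos(Supplier<String> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink = task.get().length();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink = task.get().length();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // 1000 warm-ups and 101 timed runs are arbitrary choices for this sketch.
        long concat = medianNanos(RepeatedTiming::stringConcatenation, 1000, 101);
        long builder = medianNanos(RepeatedTiming::stringBuilderConcatenation, 1000, 101);
        System.out.println("Median += time:            " + concat + " ns");
        System.out.println("Median StringBuilder time: " + builder + " ns");
    }
}
```

The median is used rather than the mean so that a single slow run, like the 40 ms outlier above, does not dominate the reported figure.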