Wednesday, January 14, 2009

Speed up your string comparisons

One of the first software engineery things I was tasked with at the new job was to do a code review for a peer. It was a good opportunity to look at some of our code. One of the things that stood out to me, however, was the way string comparisons were being performed. I saw a lot of this:

if(stringVariable1.ToUpper() == stringVariable2.ToUpper()) { ... }

One of my comments back on the code review was that it might be better to do this:

if(stringVariable1.Equals( stringVariable2, StringComparison.OrdinalIgnoreCase)) { ... }
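For anyone who wants to see the two forms side by side, here is a minimal sketch. The class name, method names and sample values are made up for illustration and are not from the code under review. It uses the static string.Equals overload, which behaves like the instance call above but does not throw when the first string is null.

```csharp
using System;

class ComparisonSketch
{
    // Approach 1: builds two new upper-cased strings, then compares them.
    // ToUpper() uses the current culture, so some cultures (Turkish is the
    // usual example) can give different answers for certain letters.
    public static bool ViaToUpper(string a, string b)
    {
        return a.ToUpper() == b.ToUpper();
    }

    // Approach 2: ordinal, case-insensitive comparison with no temporary strings.
    // The static overload also copes with null arguments instead of throwing.
    public static bool ViaEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        // Hypothetical sample values
        Console.WriteLine(ViaToUpper("Hello-World", "HELLO-world")); // True
        Console.WriteLine(ViaEquals("Hello-World", "HELLO-world"));  // True
        Console.WriteLine(ViaEquals(null, "HELLO-world"));           // False
    }
}
```

Both approaches agree on ordinary input like this. The difference is how much work each one does to get there.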

My peer agreed it might be a good idea but said the coding standard in the shop was the first approach. Not having a second software engineery thing to look at yet, I figured I would do a quick benchmark between the two approaches. I also decided to include a comparison using .ToLower(), just to be fair, since that was in the code also.

To perform the benchmark test I used the SimpleTimer class Bill Wert outlined on his blog.
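If you don't have that class handy, something along these lines should be close enough to run the test. To be clear, this is a hypothetical stand-in built on System.Diagnostics.Stopwatch, not Bill Wert's actual code. The method names (StartTimer, StopTimer, Result) are taken from how the benchmark calls the timer, the output format is guessed from the results printed in this post, and ElapsedSeconds is an extra of my own.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for SimpleTimer, NOT the original class.
// Only the three methods the benchmark calls are mirrored here.
class SimpleTimer
{
    private readonly Stopwatch stopwatch = new Stopwatch();

    public void StartTimer()
    {
        // Reset clears any previous measurement so one instance
        // can be reused across several tests.
        stopwatch.Reset();
        stopwatch.Start();
    }

    public void StopTimer()
    {
        stopwatch.Stop();
    }

    // Elapsed time of the most recent measurement, in seconds.
    public double ElapsedSeconds
    {
        get { return stopwatch.Elapsed.TotalSeconds; }
    }

    // Prints the elapsed time and the resulting throughput.
    public void Result(long iterations)
    {
        double seconds = ElapsedSeconds;
        Console.WriteLine(
            "{0} iterations took {1:F3} seconds, resulting in {2:F3} iterations per second",
            iterations, seconds, iterations / seconds);
    }
}
```

There is no Main here on purpose: the class is meant to be compiled alongside the benchmark program, which already has one.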

The test itself was just a console application that builds a List<string> with the number of entries identified by the person running the test. So the tester can enter 1 to whatever long.MaxValue is. I fill that List with the appropriate number of Guid.NewGuid().ToString() values. Then I time how long it takes to invoke an if using the different comparisons.

The code for all this is below, but I'm all about getting to the results...

The Results
I started small with a sample size of only 8,000:

ToUpper
8000 iterations took 0.008 seconds, resulting in 1012736.624 iterations per second

ToLower
8000 iterations took 0.007 seconds, resulting in 1199852.721 iterations per second

Equals
8000 iterations took 0.001 seconds, resulting in 9669591.016 iterations per second

Not bad. Already we see the .Equals method performs faster than changing the case and doing the == thing. I decided to skip going for a medium size test and go right for big.

Here are the results using a sample size of 8,000,000:

ToUpper
8000000 iterations took 4.608 seconds, resulting in 1736263.497 iterations per second

ToLower
8000000 iterations took 4.187 seconds, resulting in 1910494.970 iterations per second

Equals
8000000 iterations took 0.318 seconds, resulting in 25181978.355 iterations per second

Oh sure, the test takes longer - but I think it was worth it. In case you missed it - 3 tenths of a second is less than 4.6 seconds. That makes .Equals roughly 14 times faster than ToUpper and about 13 times faster than ToLower.

I'm going to see if we can get that coding standard changed...

The Code
Sorry the syntax highlighting is missing but the blog editor isn't good about that. I've changed the color on the comments but you'll want to paste this into the IDE of your choice if you want the whole enchilada.
using System;
using System.Collections.Generic;

namespace StringComparisonPerformance
{
    class Program
    {
        static void Main(string[] args)
        {
            const long defaultSampleSize = 10;
            long testRecordCount = defaultSampleSize;

            Console.WriteLine("Enter sample size");
            string inputValue = Console.ReadLine();

            if (string.IsNullOrEmpty(inputValue))
            {
                Console.WriteLine("No value provided, using default sample size of {0}", testRecordCount.ToString());
            }
            else
            {
                if (!long.TryParse(inputValue, out testRecordCount))
                {
                    testRecordCount = defaultSampleSize;
                    Console.WriteLine("Unrecognized sample size provided. Using default sample size of {0}", testRecordCount.ToString());
                }
            }

            List<string> testValues = GetTestValues(testRecordCount);
            string comparisonValue = testValues[1].ToUpper(); //some value

            //SimpleTimer is the class from Bill Wert's blog mentioned above
            SimpleTimer timer = new SimpleTimer();

            Console.WriteLine();
            Console.WriteLine();
            Console.WriteLine("ToUpper");
            timer.StartTimer();
            ToUpperTest(comparisonValue, testValues);
            timer.StopTimer();
            timer.Result(testRecordCount);

            //now recast all the test values to upper case to account for the fact
            //GetTestValues returns lower case values. This just ensures the test
            //is fair between ToUpper and ToLower
            for (int i = 0; i < testValues.Count; i++)
                testValues[i] = testValues[i].ToUpper();

            Console.WriteLine();
            Console.WriteLine();
            Console.WriteLine("ToLower");
            timer.StartTimer();
            ToLowerTest(comparisonValue, testValues);
            timer.StopTimer();
            timer.Result(testRecordCount);

            Console.WriteLine();
            Console.WriteLine();
            Console.WriteLine("Equals");
            timer.StartTimer();
            EqualsTest(comparisonValue, testValues);
            timer.StopTimer();
            timer.Result(testRecordCount);

            Console.ReadLine();
        }

        //builds the list of lower case Guid strings used as test data
        static List<string> GetTestValues(long testLength)
        {
            List<string> testValues = new List<string>();

            for (int i = 0; i < testLength; i++)
                testValues.Add(Guid.NewGuid().ToString());

            return testValues;
        }

        //the empty if bodies are intentional, only the comparison is being timed
        static void ToUpperTest(string comparisonValue, List<string> testValues)
        {
            foreach (string testValue in testValues)
                if (testValue.ToUpper() == comparisonValue.ToUpper()) { }
        }

        static void ToLowerTest(string comparisonValue, List<string> testValues)
        {
            foreach (string testValue in testValues)
                if (testValue.ToLower() == comparisonValue.ToLower()) { }
        }

        static void EqualsTest(string comparisonValue, List<string> testValues)
        {
            foreach (string testValue in testValues)
                if (testValue.Equals(comparisonValue, StringComparison.OrdinalIgnoreCase)) { }
        }
    }
}