Tests That Involve Time Are Hard

I've written a lot of automated tests that involve time and time-series data. I've also maintained many similar tests written by others. Given that these tests involve time and sometimes subtle timing, they can often be less deterministic than tests that only involve static data. I've written tests that passed consistently on my workstation but failed consistently when run on the automated build machine. I've seen tests that failed when they ran across an hour or a midnight boundary. I've seen tests that passed consistently in the Pacific Time Zone but failed consistently in the Eastern Time Zone. I've also seen tests that occasionally just failed and no one was certain why. This always seems to result in running the test again and if it passed we'd assume everything was ok.

I hate tests that fail sporadically. Without a doubt one of these tests will fail when you are trying to pass the gated check-in before leaving for the day. Or when you are trying to complete a release and push an important patch to production. Tests that sporadically fail also erode confidence. The tests themselves and the code that they cover start to feel fragile. The situation is even worse if it is a test that you didn't write. If this test regularly fails your gated check-in, you'll start to get pretty annoyed. Tests that involve time, especially unreliable ones, can be difficult to reason about. What did the author of this test intend? And is this just an unreliable test or is it really alerting me to some subtle timing bug related to initialization or concurrency?

With a bit of extra effort, however, it is possible to make a test based on time as robust and reliable as any other test.

Keep Assertions About Time and Format Separate

Let's write a test for the following C# class that formats a message with the current time at the beginning of the message.

public static class OutputMessage
{
    public static string Format(string message)
    {
        DateTime time = DateTime.UtcNow;
        return string.Format("{0} | {1}", time.ToString("MM/dd/yyyy HH:mm:ss.fff tt"), message);
    }
}

Let's start the test by formatting a message, recording the time before and after. Then we'll split the formatted message based on the expected delimiter between the timestamp and the message and verify that our message was printed.

var startTime = DateTime.UtcNow;

string message = "Test message";
string outputMessage = OutputMessage.Format(message);

var endTime = DateTime.UtcNow;

var splitMessage = outputMessage.Split(new string[] { " | " }, StringSplitOptions.None);
Assert.AreEqual(message, splitMessage[1]);

Now we want to ensure that the timestamp is formatted correctly and that it represents the correct time, especially since we intend to output UTC time rather than local time. To write robust tests that involve time, it is important to keep these two assertions separate.

First, we'll validate that the time is correct. We should not reason about the correctness of the time using the timestamp string. Instead we should convert the string to an object where we can make more deterministic assertions. We'll parse the timestamp into an object and assert that the timestamp is greater than or equal to the start time and less than or equal to the end time.

DateTime parsedTime = DateTime.Parse(timestamp);
parsedTime = new DateTime(parsedTime.Ticks, DateTimeKind.Utc);

Assert.IsTrue(parsedTime.CompareTo(startTime) >= 0);
Assert.IsTrue(parsedTime.CompareTo(endTime) <= 0); < pre>

If the timestamp format is invalid the DateTime.Parse method will throw an exception. It is important, however, to not rely exclusively on this method to validate the timestamp format as it could be another valid format (e.g., "dd MMM yyyy HH:mm:ss.fff") but not the one we expect.

Let's explicitly assert that the timestamp format is correct by using a regular expression.

string timestamp = splitMessage[0];
Regex regex = new Regex("^[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\\.[0-9]{3} [AP]M$");
Assert.IsTrue(regex.IsMatch(timestamp));

This regular expression validates that the entire string (note the ^ and $) is in the expected format. If the timestamp is accidentally changed, for example, such that it no longer includes milliseconds or is no longer terminated by AM or PM, our test will fail and alert us to this fact.

Conclusion

And there we have it. A reliable and robust test that asserts that the time and the format are correct. It makes assertions about time using objects and assertions about format using strings and regular expressions. But this test sometimes fails. What? How did that happen? What went wrong?

The Real Conclusion

I started this post by commenting that tests that involve time are difficult to make robust due to subtle timing issues and that's exactly what we have encountered here. We carefully designed this test to be robust, keeping assertions about format and time independent, and yet this test is still not entirely reliable.

The reason is related to truncation. The time is actually represented in terms of the DateTime.Ticks property. In the Format method, when the time is formatted as a string, it includes milliseconds but beyond this the time is truncated. The start time recorded in the test, however, is not truncated. The result is that the start time can be greater than the time parsed from the message due to truncation.

This is easy to fix. We'll just truncate the start time at the milliseconds range.

var startTime = DateTime.UtcNow;
startTime = startTime.AddTicks(-startTime.Ticks % TimeSpan.TicksPerMillisecond);

Our complete test method is as follows. This test will always be reliable.

[TestMethod]
public void OutputMessageFormatHasCorrectFormat()
{
    var startTime = DateTime.UtcNow;
    startTime = startTime.AddTicks(-startTime.Ticks % TimeSpan.TicksPerMillisecond);

    string message = "My message";
    string outputMessage = OutputMessage.Format(message);

    var endTime = DateTime.UtcNow;

    var splitMessage = outputMessage.Split(new string[] { " | " }, StringSplitOptions.None);
    Assert.AreEqual(message, splitMessage[1]);

    string timestamp = splitMessage[0];
    Regex regex = new Regex("^[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\\.[0-9]{3} [AP]M$");
    Assert.IsTrue(regex.IsMatch(timestamp));

    DateTime parsedTime = DateTime.Parse(timestamp);
    parsedTime = new DateTime(parsedTime.Ticks, DateTimeKind.Utc);

    Assert.IsTrue(parsedTime.CompareTo(startTime) >= 0);
    Assert.IsTrue(parsedTime.CompareTo(endTime) <= 0); } < pre>

This is a bit of a contrived example, but it highlights some of the challenges in writing tests that involve time. It is important to keep assertions about time and format independent, but, as we've seen, even that is not always enough. We need to go the extra mile to make these tests robust. I think the best way to do this is live with these test for a while. Run them many times and in different environments and make sure that they are reliable. If a test sometimes fails, there is something wrong.