My Initial Impressions of Scala

Sep 13, 2015

I am going to be using the Scala programming language for some upcoming projects. This will be my first time using a functional programming language and my first time developing for the JVM. Up to this point in my career, I've primarily used C++ and C#. I'm looking forward to the challenge of learning a new language and new ways of approaching problems.

I've spent the last couple of weeks reading the O'Reilly Learning Scala book by Jason Swartz. I enjoyed this book. It is concise, but thorough, and was a good introduction to the language and the Scala ecosystem (e.g., IDE support, build configuration, unit testing frameworks) for someone who already has professional programming experience in another language.

Scala has some very powerful features for pattern matching and transforming data. Being new to the language, I don't feel I can do them justice yet, so I won't discuss them here. Instead, this article details my initial impressions of Scala coming from C++ and C#. I'll give my overall impressions, then describe three areas where I think Scala really excels: data transfer objects, error handling, and test writing.

Overall Impressions

Learning Scala talks a lot about functions in Scala being first-class objects, in that they can be used in any aspect of the language, just like any other data type (e.g., stored in a variable or used as a parameter). For someone coming from a strictly imperative language, this may be new, but functions have been first-class objects in both C++ (lambda expressions, std::function) and C# (lambda expressions, Func, Action) for some time, so this seems natural to me.
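To make this concrete, here is a minimal sketch of my own (the names are hypothetical and not taken from the book) showing a function stored in a value, passed as a parameter, and returned from a method.

```scala
object FirstClassFunctions {
  // A function literal stored in a value, like any other data.
  val double: Int => Int = x => x * 2

  // A function taken as a parameter.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // A function returned from a method.
  def adder(n: Int): Int => Int = x => x + n

  def main(args: Array[String]): Unit = {
    println(applyTwice(double, 3)) // 12
    println(adder(10)(5))          // 15
  }
}
```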

Scala has some very powerful ways to transform collections with methods like map, reduce, filter, partition, and sortBy, among others. Because I have used LINQ in C#, an expressive way to transform collections also seems natural to me. This may not be true for someone coming to Scala straight from an imperative language, but even someone familiar with the standard algorithms in C++ may find this a natural transition. Some of these transformations will take me some time to get my head around, and I think there will always be some doubt as to whether a more straightforward or more efficient transformation exists. I've always liked how the ReSharper Visual Studio plugin suggests LINQ refactorings, so I'm hoping there will be something similar for Scala.
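As a rough sketch of what these methods look like in practice (the word list is made up for illustration), each call below returns a new collection or value and leaves the original list untouched.

```scala
object CollectionTransforms {
  val words = List("scala", "java", "kotlin", "c", "go")

  // map: transform every element.
  val lengths: List[Int] = words.map(_.length)          // List(5, 4, 6, 1, 2)

  // filter: keep the elements that satisfy a predicate.
  val longer: List[String] = words.filter(_.length > 2) // List(scala, java, kotlin)

  // partition: split into (matching, not matching).
  val (shortOnes, rest) = words.partition(_.length <= 2)

  // sortBy: order by a derived key.
  val byLength: List[String] = words.sortBy(_.length)   // List(c, go, java, scala, kotlin)

  // reduce: combine all elements into one value.
  val totalLength: Int = lengths.reduce(_ + _)          // 18
}
```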

Immutability has become a focus of software development best practices. It is something I've tried to incorporate, as much as possible, when I write applications, in order to make the code more robust and easier to reason about. But decorating variables and methods with const in C++ can take a lot of discipline and it tends to detract from readability. Some of the conventions for immutability in C# have always seemed a little strange to me. For example, how does one specify a method parameter as immutable? Scala makes writing immutable code easier than in other languages; it is a central focus of the language.
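A small sketch of how this shows up in everyday Scala code, with names I made up for illustration: values are declared with val, mutability has to be requested with var, the default List is immutable, and method parameters cannot be reassigned.

```scala
object ImmutabilitySketch {
  val limit = 10           // val: cannot be reassigned
  var counter = 0          // var: mutable, but has to be asked for explicitly

  val base = List(1, 2, 3) // the default List is immutable
  val extended = 0 :: base // builds a new list; base is unchanged

  // Method parameters behave like vals, so this would not compile:
  //   def bump(n: Int): Int = { n = n + 1; n }
  def bump(n: Int): Int = n + 1
}
```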

There may be a temptation to write very terse code in Scala. Even some of the features of the language, like implicit parameters—where arguments can be automatically provided, rather than explicitly passed—can reduce readability and should probably be used sparingly. Despite the powerful features of the language, one's focus should remain on the readability and maintainability of the code.
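To show what I mean, here is a minimal sketch of an implicit parameter. The Separator type and join method are hypothetical, invented only for this illustration. The first call reads cleanly, but nothing at the call site tells the reader where the separator comes from.

```scala
object ImplicitSketch {
  // Hypothetical settings type, used only for this illustration.
  case class Separator(value: String)

  // The second parameter list is filled in by the compiler when an
  // implicit Separator is in scope at the call site.
  def join(items: List[String])(implicit sep: Separator): String =
    items.mkString(sep.value)

  implicit val defaultSeparator: Separator = Separator(", ")

  // Argument supplied automatically from defaultSeparator.
  val implicitCall: String = join(List("a", "b", "c"))                 // "a, b, c"

  // The same parameter can still be passed explicitly.
  val explicitCall: String = join(List("a", "b", "c"))(Separator("-")) // "a-b-c"
}
```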

Coming from C++ and C#, one of the biggest adjustments is the diversity of the open-source software community. With C++, most of what I have needed has been provided by either the standard library or Boost. With C#, the .NET platform or other Microsoft frameworks have provided essentially everything that I've required. With Scala, choosing the correct library or framework seems a bit more challenging. Perhaps this is just the nature of open-source software development. But as the language continues to evolve, hopefully more of the fundamental libraries will become part of the standard library, or at least become de facto standards in the community.

Overall, Scala seems to be a very simple yet powerful language, with a focus on expressions, data transformation, and immutability. There are no operators, just methods. There are no primitive types, as every value is an instance of a class. There are basically classes, broadly defined, and expressions. But they can be combined in flexible ways that make the language very expressive.
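The following sketch illustrates the "no operators, just methods" point. The Money class is a hypothetical type of my own; the rest uses only standard Int methods.

```scala
object OperatorsAreMethods {
  val infix: Int = 1 + 2      // operator notation
  val dotted: Int = (1).+(2)  // the same call written as an ordinary method call

  // Hypothetical type that defines its own method named "+".
  case class Money(cents: Long) {
    def +(other: Money): Money = Money(cents + other.cents)
  }
  val total: Money = Money(150) + Money(250) // Money(400)

  // Values of what other languages call primitive types have methods too.
  val text: String = 42.toString // "42"
  val bigger: Int = 3.max(7)     // 7
}
```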

Case Classes for Data Transfer Objects

Scala greatly simplifies defining classes to be used as data transfer objects. These classes are a collection of fields and contain no business logic. They are typically used for passing data among expressions in a program, or passing data between processes. A typical application defines a number of these classes.

For example, in C# one might define the following Customer class for passing customer data around.

class Customer
{
    public string Name { get; set; }
    public long Id { get; set; }
    public string Address { get; set; }
}

This class is usually sufficient for passing customer data to a method, or for serializing or deserializing customer data, but the default Equals method only provides reference equality, and the default ToString method is of no value for logging messages, as it only prints the name of the class. Overriding the default Equals and ToString methods can be important for application code, but even if this is not the case, overriding these methods can often make it easier to write effective tests for the business logic surrounding this data.

Overriding these methods adds a significant amount of code to maintain.

class Customer
{
    public string Name { get; set; }
    public long Id { get; set; }
    public string Address { get; set; }

    protected bool Equals(Customer other)
    {
        return string.Equals(Name, other.Name)
            && Id == other.Id
            && string.Equals(Address, other.Address);
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        return obj.GetType() == GetType() && Equals((Customer)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            var hashCode = (Name != null ? Name.GetHashCode() : 0);
            hashCode = (hashCode * 397) ^ Id.GetHashCode();
            hashCode = (hashCode * 397) ^ (Address != null ? Address.GetHashCode() : 0);
            return hashCode;
        }
    }

    public static bool operator ==(Customer left, Customer right)
    {
        return Equals(left, right);
    }

    public static bool operator !=(Customer left, Customer right)
    {
        return !Equals(left, right);
    }

    public override string ToString()
    {
        return string.Format("Name: {0}, Id: {1}, Address: {2}", Name, Id, Address);
    }
}

Making a class like this immutable also poses a challenge. One needs to define a constructor and add read-only properties with read-only backing variables. It has been my experience that most people don't even bother making these classes immutable.

class Customer
{
    private readonly string _name;
    private readonly long _id;
    private readonly string _address;

    public Customer(string name, long id, string address)
    {
        _name = name;
        _id = id;
        _address = address;
    }

    public string Name
    {
        get { return _name; }
    }

    public long Id
    {
        get { return _id; }
    }

    public string Address
    {
        get { return _address; }
    }

    protected bool Equals(Customer other)
    {
        return string.Equals(Name, other.Name)
            && Id == other.Id
            && string.Equals(Address, other.Address);
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        return obj.GetType() == GetType() && Equals((Customer)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            var hashCode = (Name != null ? Name.GetHashCode() : 0);
            hashCode = (hashCode * 397) ^ Id.GetHashCode();
            hashCode = (hashCode * 397) ^ (Address != null ? Address.GetHashCode() : 0);
            return hashCode;
        }
    }

    public static bool operator ==(Customer left, Customer right)
    {
        return Equals(left, right);
    }

    public static bool operator !=(Customer left, Customer right)
    {
        return !Equals(left, right);
    }

    public override string ToString()
    {
        return string.Format("Name: {0}, Id: {1}, Address: {2}", Name, Id, Address);
    }
}

This simple data transfer class is not so simple anymore. With all of the additional boilerplate code, it becomes hard to even distinguish the data the class encapsulates.

Scala's case classes make defining data transfer objects trivial. First, all of the fields in a case class are immutable by default. Second, the compiler automatically generates a hashCode method, an equals method that compares every field, and a toString method that prints the class name and all of the fields. This means that I can define a one-line case class that accomplishes everything that the final version of the C# class does.

case class Customer(name: String, id: Long, address: String)

This removes a lot of code and also provides consistency across these classes, as they will all have the same methods and behaviours.
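Here is a short sketch of the generated behaviour in use. The Customer definition is repeated so the example is self-contained, and the copy method shown at the end is another compiler-generated convenience that I have only just started exploring.

```scala
object CaseClassSketch {
  case class Customer(name: String, id: Long, address: String)

  // No "new" is needed; the compiler also generates a factory method.
  val a = Customer("Arthur Dent", 1, "Earth")
  val b = Customer("Arthur Dent", 1, "Earth")

  val sameValues: Boolean = a == b                 // true: equals compares the fields
  val sameHash: Boolean = a.hashCode == b.hashCode // true: consistent with equals
  val text: String = a.toString                    // "Customer(Arthur Dent,1,Earth)"

  // Fields are vals, so "a.address = ..." does not compile;
  // copy builds a modified instance and leaves the original alone.
  val moved: Customer = a.copy(address = "Magrathea")
}
```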

Error Handling

Scala provides an interesting and type-safe approach to error handling using monads. Scala does support the try-catch-finally construct, but its use is generally discouraged. Instead, Scala encourages the use of monadic types such as Option, Either, and Try, which make the possibility of failure explicit in the type and allow the compiler to check that it is handled. For the purposes of this article, these can be thought of as types that can be transformed like a collection, but that hold no more than one element.

Rather than returning null from an expression or passing null as a parameter to a method, an Option can be used to explicitly communicate the presence or absence of a value. An Option can either have the value Some(value) or None. The following is an example of using an Option for divide-by-zero error handling in a divide method.

def divide(numerator: Long, denominator: Long): Option[Double] = {
  if (denominator == 0) None
  // Convert before dividing; Long / Long would truncate the quotient.
  else Some(numerator.toDouble / denominator)
}
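To show how a caller might consume the result, here is a sketch that repeats a divide method of the same shape so it is self-contained. The describe helper is hypothetical.

```scala
object OptionSketch {
  def divide(numerator: Long, denominator: Long): Option[Double] =
    if (denominator == 0) None
    else Some(numerator.toDouble / denominator)

  // Pattern matching makes the caller consider both cases.
  def describe(result: Option[Double]): String = result match {
    case Some(value) => s"Result: $value"
    case None        => "No result"
  }

  val ok: String = describe(divide(10, 4))     // "Result: 2.5"
  val failed: String = describe(divide(10, 0)) // "No result"

  // map and getOrElse treat the Option like a collection of at most one element.
  val doubled: Option[Double] = divide(10, 4).map(_ * 2) // Some(5.0)
  val fallback: Double = divide(10, 0).getOrElse(0.0)    // 0.0
}
```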

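An Either is similar to an Option, but can be used when one needs to communicate specific information for expected errors. By convention, a Left holds the error and a Right holds the successful result.

def divide(numerator: Long, denominator: Long): Either[String, Double] = {
  if (denominator == 0) Left("Divide by zero")
  // Convert before dividing; Long / Long would truncate the quotient.
  else Right(numerator.toDouble / denominator)
}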
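A caller can then handle both outcomes explicitly. The sketch below follows the common convention of Left for the error and Right for the result, and uses only pattern matching and fold. The describe helper is hypothetical.

```scala
object EitherSketch {
  def divide(numerator: Long, denominator: Long): Either[String, Double] =
    if (denominator == 0) Left("Divide by zero")
    else Right(numerator.toDouble / denominator)

  def describe(result: Either[String, Double]): String = result match {
    case Right(value)  => s"Result: $value"
    case Left(message) => s"Error: $message"
  }

  val ok: String = describe(divide(9, 2))     // "Result: 4.5"
  val failed: String = describe(divide(9, 0)) // "Error: Divide by zero"

  // fold handles both sides in one expression:
  // the first function is applied to a Left, the second to a Right.
  val orZero: Double = divide(9, 0).fold(_ => 0.0, value => value) // 0.0
}
```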

A Try can be used to wrap code that can throw an exception and safely return the result as either a Success or Failure. In the case of an exception, the specific exception will be communicated in the Failure.

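import scala.util.Try
// Long division throws ArithmeticException on a zero denominator; Try captures it as a Failure.
def divide(numerator: Long, denominator: Long): Try[Long] = {
  Try(numerator / denominator)
}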
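As a sketch of how the result might be consumed, the example below uses integer division, since that is what actually throws on a zero denominator (floating-point division would return Infinity or NaN instead). The describe helper is hypothetical.

```scala
import scala.util.{Failure, Success, Try}

object TrySketch {
  // Long division throws ArithmeticException when the denominator is zero.
  def divide(numerator: Long, denominator: Long): Try[Long] =
    Try(numerator / denominator)

  def describe(result: Try[Long]): String = result match {
    case Success(value) => s"Result: $value"
    case Failure(error) => s"Failed: ${error.getClass.getSimpleName}"
  }

  val ok: String = describe(divide(10, 2))     // "Result: 5"
  val failed: String = describe(divide(10, 0)) // "Failed: ArithmeticException"

  // getOrElse supplies a default; recover maps selected exceptions to a value.
  val fallback: Long = divide(10, 0).getOrElse(-1L)
  val recovered: Try[Long] =
    divide(10, 0).recover { case _: ArithmeticException => 0L }
}
```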

My initial impression is that this expressive and type-safe error handling is very attractive. I'm looking forward to experimenting with these monads, but I have not used them enough to appreciate how they ultimately improve the robustness of applications. I also think it will take some practice to learn when it is appropriate to use an Option, versus using a Try or an Either.

Expressive Tests

I've written previously about my interest in automated testing and my preference for testing one thing per test method. I haven't written many tests in Scala yet, but my first impression is that the ScalaTest testing framework seems very expressive and should help in writing effective tests. So far I've looked at the FlatSpec and the FeatureSpec style traits.

FlatSpec facilitates a behaviour-driven style of development (BDD), where the tests are enumerated with text that specifies the behaviour that the test verifies. My experience has been that people often hesitate to write tests with descriptive names, because it just feels wrong to write a test method with a name as long as CustomerClassEqualsMethodShouldReturnTrueIfAllOfTheFieldsAreTheSame. The result is that the tests are not self-documenting and there is a temptation to test more than one thing per test method, making the tests less effective and less maintainable. I think it will feel more natural to write test specifications in proper English sentences with FlatSpec.

As an example, here are some FlatSpec tests that exercise the toString and equals methods of the Customer case class.

import org.scalatest.{FlatSpec, Matchers}

class CustomerTests extends FlatSpec with Matchers {

  "Customer class toString method" should "print a useful string" in {
    val customer = Customer("Arthur Dent", 1, "Earth")
    customer.toString should equal("Customer(Arthur Dent,1,Earth)")
  }

  "Customer class equals method" should "return true if all the fields are the same" in {
    val customer1 = Customer("Arthur Dent", 1, "Earth")
    val customer2 = Customer("Arthur Dent", 1, "Earth")
    customer1 == customer2 shouldBe true
  }

  it should "return false if one of the fields is different" in {
    val customer1 = Customer("Arthur Dent", 1, "Earth")
    val customer2 = Customer("Arthur Dent", 2, "Earth")
    customer1 == customer2 shouldBe false
  }

}

I think the FlatSpec style of testing encourages testing only one thing per test method and I like how it makes the tests self-documenting.

FeatureSpec is for writing acceptance tests where one describes features and scenarios. It still requires the tester to write the correct assertions, but FeatureSpec is straightforward enough that it would allow a product manager, for example, to formulate the acceptance tests in English as part of the development process.

import org.scalatest._

class Account(customer: Customer, active: Boolean) {
  private var _active: Boolean = active
  def isActive = _active
  def deactivate() = { _active = false }
  def activate() = { _active = true }
}

class AccountSpecification extends FeatureSpec with GivenWhenThen {

  info("As an account administrator")
  info("I want to be able to disable an account without deleting it")
  info("So that I can maintain the identity in the system")
  info("And have the option of enabling it again in the future")

  feature("Account deactivation") {
    scenario("Deactivating an active account") {

      Given("An account that is enabled")
      val account = new Account(Customer("Arthur Dent", 1, "Earth"), true)
      assert(account.isActive)

      When("the account is deactivated")
      account.deactivate()

      Then("the isActive method should return false")
      assert(!account.isActive)

    }

    scenario("Reactivating a deactivated account") {

      Given("An account that is disabled")
      val account = new Account(Customer("Arthur Dent", 1, "Earth"), false)
      assert(!account.isActive)

      When("the account is reactivated")
      account.activate()

      Then("the isActive method should return true")
      assert(account.isActive)

    }
  }
}

If used correctly, ScalaTest makes the intent of each test very clear and produces very readable test reports.