November 11, 2009

A Look at How Scala Compiles to Java

Consider this contrived example, based on an example from Beginning Scala. The point of the snippet was to demonstrate the congruency between using the higher-order functions map, flatMap, foreach, and filter (see Iterable), and performing the same operations inside a for comprehension.

object App {
	
	def isEven(i: Int) = i % 2 == 0
	
	def isOdd(i: Int) = i % 2 == 1
	
	def main(args: Array[String]): Unit = {
		val n = (1 to 10).toList
		n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))
	}
	
}

Save this code to a file named App.scala, or any name you like (Scala doesn’t share Java’s rule that a file be named after the public class it contains). Assuming you chose App.scala, compile it:

clewis$ scalac App.scala

Now check out the generated class files (ls App*class). You should see the following:

App$$anonfun$main$1.class
App$$anonfun$main$2$$anonfun$apply$1.class
App$$anonfun$main$2$$anonfun$apply$2.class
App$$anonfun$main$2.class
App$.class
App.class

Why were 6 classes compiled from this one singleton definition? Let’s start with App.class and App$.class.

The Singleton Object

Scala does away with Java’s statics. Instead we get singleton objects, which are declared with a syntax similar to that used for class declarations, but using the object keyword instead. When a singleton object shares its name with a class, it becomes that class’s “companion” object. Companion objects have special privileges, including access to private fields on instances of the companion class. In this example, App is simply a singleton object, since we haven’t also defined a class named App.
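To make the companion privilege concrete, here is a minimal sketch. Account and balanceOf are hypothetical names I made up for illustration; they are not part of the App example.

```scala
// Hypothetical class, used only to illustrate companion access.
class Account(private val balance: Int)

// Same name as the class and defined alongside it, so this is its companion.
object Account {
  // The companion may read the private field of an Account instance.
  def balanceOf(a: Account): Int = a.balance
}

object CompanionDemo {
  def main(args: Array[String]): Unit =
    println(Account.balanceOf(new Account(42)))
}
```

If you rename the object so that it no longer matches the class, the access to a.balance should stop compiling.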

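The Scala compiler implements an object as an ordinary class whose name is the object’s name followed by a dollar sign: [object name]$. The single instance lives in a static field of that class named MODULE$. When the object is a companion, the companion class keeps the plain name. For a standalone singleton object the compiler also generates a class with the plain name, which holds static methods that forward to the singleton (this is what lets the JVM find App.main). The trailing $ resembles the way the Java compiler names nested and anonymous classes, but App$ is a top-level class, not an inner one. Because we created an object named App, we get 2 classes: App and App$.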
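You can see the generated class name from inside Scala, too. This is a small sketch with a hypothetical object named Greeter; it only asks the singleton for its runtime class.

```scala
// Hypothetical singleton, used only to look at its runtime class.
object Greeter {
  def greet(name: String): String = "Hello, " + name
}

object SingletonClassDemo {
  // The runtime class of the singleton is the generated class,
  // whose name carries the trailing dollar sign.
  val runtimeClassName: String = Greeter.getClass.getName

  def main(args: Array[String]): Unit =
    println(runtimeClassName)
}
```

The exact prefix depends on where the object is defined, but the name should end in $.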

Higher Order Functions and Their Function Arguments

In Scala, functions are objects: instances of one of the FunctionN traits. This isn’t obvious from the syntax, and that’s part of the beauty: it just feels right. The compiler compiles each function down to an anonymous class nested in the containing class. (Compilers from Scala 2.12 onward encode function literals differently, so the class files listed above are what the older compilers produce.) If you understand Java’s scoping rules for inner classes, then you already have some understanding of how Scala implements closures.

Methods in Scala are different from functions. Methods are defined as part of a class, trait, or object definition with the def keyword. Functions are instances of one of the FunctionN traits, and the Scala syntax provides several different ways to express them tersely. Methods are not objects themselves (in that sense they are the only “primitives” in Scala), but the compiler makes it easy to promote methods to function instances. Back to our example, notice that we define the methods isEven and isOdd in our singleton object. They are methods, not functions. Because the compiler recognizes that methods often want to be treated as functions, it makes provisions; one such provision is that we can pass a method to a higher-order function.
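Here is a minimal sketch of the distinction. isEven mirrors the method from the example; isEvenFn, promoted, and xs are names I introduced for illustration.

```scala
object MethodVsFunctionDemo {
  // A method: a member defined with def. It is not a value by itself.
  def isEven(i: Int): Boolean = i % 2 == 0

  // A function: a value, here an instance of Function1[Int, Boolean].
  val isEvenFn: Int => Boolean = (i: Int) => i % 2 == 0

  // Promotion: the expected type Int => Boolean tells the compiler
  // to wrap the method in a function instance.
  val promoted: Int => Boolean = isEven

  val xs: List[Int] = List(1, 2, 3, 4)

  def main(args: Array[String]): Unit = {
    println(isEvenFn.apply(4))   // functions are objects with an apply method
    println(xs.filter(isEvenFn)) // passing a function value
    println(xs.filter(promoted)) // passing the promoted method
    println(xs.filter(isEven))   // passing the method name directly
  }
}
```

All three filter calls give the same list; only the last one leaves the promotion entirely to the compiler.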

Method Promotions

In order for the compiler to promote a method to a function instance, it must create an anonymous class for that instance. In our example, we first pass the method isEven as an argument to the higher-order function filter, and so the compiler generated the class App$$anonfun$main$1, which is our promoted method. Note that we also pass isOdd to a subsequent call to filter, a promotion for which the compiler generated the class App$$anonfun$main$2$$anonfun$apply$1.
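Written out by hand, a promotion looks roughly like the sketch below. This is my approximation of what the compiler does for us, not its literal output; PromotionDemo, sugared, and explicit are illustrative names.

```scala
object PromotionDemo {
  def isEven(i: Int): Boolean = i % 2 == 0

  val n: List[Int] = (1 to 10).toList

  // What we write: a method name where a function is expected.
  val sugared: List[Int] = n.filter(isEven)

  // Roughly what the promotion amounts to: an anonymous Function1
  // whose apply method forwards to the isEven method.
  val explicit: List[Int] = n.filter(new Function1[Int, Boolean] {
    def apply(i: Int): Boolean = isEven(i)
  })

  def main(args: Array[String]): Unit =
    println(sugared == explicit)
}
```

Both values hold the even numbers from 1 to 10, so the comparison prints true.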

Function Literals

We’ve covered all but 2 of the generated classes. As it turns out, there are still 2 functions we haven’t discussed in the series of transformations. Take another look:

n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))

Did you see them? If you’re new to Scala, you may not spot them at first because they are using function-literal syntax. The functions flatMap and map also take functions as arguments, and so we define them literally, embedding the one passed to map inside the one passed to flatMap. For these two function literals, the compiler generated the anonymous classes App$$anonfun$main$2 and App$$anonfun$main$2$$anonfun$apply$2.

To seal in what’s happening here, fire up the Scala REPL and run this code:

List(1,2,3).map(j => j + 1)

Now run this code:

List(1,2,3).map(new Function1[Int, Int] {
	override def apply(j: Int) = j + 1
})

The two do exactly the same thing, because the literal is just shorthand for the anonymous Function1 instance.

The literal syntax may look strange, even cryptic, at first. I thought so too, but it didn’t take long for me to appreciate it and to recognize it as easily as you might recognize a class definition.

A Word on the Class Names

You may have noticed a pattern in the anonymous function class names. For starters, they’re prefixed with App$$anonfun$main$ and then continue with 1.class and 2.class. There’s also another level nested inside the first level. If you inspect each of these using javap -c [class], you’ll notice that the ordering and nesting follow the order in which we use function objects (by passing them as arguments to other functions).

  • isEven is the first function we pass, so it is the first function for which an anonymous class is generated, yielding App$$anonfun$main$1.
  • The second is the literal function passed to flatMap, so App$$anonfun$main$2 is generated for it.
  • The third is isOdd, yielding the class App$$anonfun$main$2$$anonfun$apply$1. This function is used inside the function passed to flatMap. It’s the first function in this scope, and it’s nested, so its name reflects its nesting and order in the sequence.
  • Finally we arrive at the literal function passed to map, resulting in App$$anonfun$main$2$$anonfun$apply$2. It’s the second function referenced inside the function passed to flatMap, and so its name also reflects its nesting and order.
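To line the four names up with the source, here is the whole expression written out with explicit Function1 instances. Treat it as a hand-made sketch: DesugaredDemo is a hypothetical object, and the comments only mark which generated class each instance corresponds to in the App example. Anonymous classes written by hand like this would get different names of their own.

```scala
object DesugaredDemo {
  def isEven(i: Int): Boolean = i % 2 == 0
  def isOdd(i: Int): Boolean = i % 2 == 1

  val n: List[Int] = (1 to 10).toList

  // The expression as written in the example.
  val original: List[Int] =
    n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))

  // The same expression with every function spelled out.
  val desugared: List[Int] =
    n.filter(new Function1[Int, Boolean] {       // App$$anonfun$main$1
      def apply(i: Int): Boolean = isEven(i)
    }).flatMap(new Function1[Int, List[Int]] {   // App$$anonfun$main$2
      def apply(i: Int): List[Int] =
        n.filter(new Function1[Int, Boolean] {   // ...main$2$$anonfun$apply$1
          def apply(j: Int): Boolean = isOdd(j)
        }).map(new Function1[Int, Int] {         // ...main$2$$anonfun$apply$2
          def apply(j: Int): Int = i * j
        })
    })

  def main(args: Array[String]): Unit =
    println(original == desugared)
}
```

The two inner instances sit inside the apply method of the flatMap function, which is where the $$anonfun$apply$ part of their names comes from.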

This ordering and nesting is deliberate. In Java, an inner class has access to its enclosing scope, which includes the enclosing instance’s fields and, if the inner class was created in a method, that method’s final local variables. This is what makes closures in Scala possible. Look again at the literal function passed to map:

n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))

The function multiplies a “free” variable i by its single argument j. Where did i come from? Again, this function is defined inside another literal function passed to flatMap. That function receives a single argument, which it labels i. Because the function passed to map is nested inside the one passed to flatMap, it has access to that scope.
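One way to picture the capture is to write the inner function as a named class that receives i when it is created. This is only a sketch of the idea under that assumption: MultiplyBy is a hypothetical stand-in for the generated class, and I have shortened the lists to keep the result small.

```scala
object ClosureDemo {
  // Hypothetical named stand-in for the class generated for `j => i * j`.
  // The captured value i is modelled as a constructor parameter.
  class MultiplyBy(i: Int) extends Function1[Int, Int] {
    def apply(j: Int): Int = i * j
  }

  val odds: List[Int] = List(1, 3, 5)

  // Closure form: i is free inside the inner literal.
  val viaClosure: List[Int] =
    List(2, 4).flatMap(i => odds.map(j => i * j))

  // Explicit form: i is handed to the function object when it is created.
  val viaClass: List[Int] =
    List(2, 4).flatMap(i => odds.map(new MultiplyBy(i)))

  def main(args: Array[String]): Unit =
    println(viaClosure == viaClass)
}
```

A fresh MultiplyBy is created for each i, which mirrors how the inner function literal is instantiated anew on every call of the outer one.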
