November 11, 2009

A Look at How Scala Compiles to Java

Consider this contrived example, based on an example from Beginning Scala. The point of the snippet was to demonstrate the congruency between using the higher-order functions map, flatMap, foreach, and filter (see Iterable), and performing the same operations inside a for comprehension.

object App {
	
	def isEven(i: Int) = i % 2 == 0
	
	def isOdd(i: Int) = i % 2 != 0
	
	def main(args: Array[String]): Unit = {
		val n = (1 to 10).toList
		n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))
	}
	
}
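For comparison, here is a minimal sketch of the same pipeline next to its for comprehension form. The object name ForComprehensionSketch and its member names are made up for illustration; the rest of this post works with the App object above, not with this sketch.

```scala
object ForComprehensionSketch {

  def isEven(i: Int): Boolean = i % 2 == 0

  def isOdd(i: Int): Boolean = i % 2 != 0

  val n: List[Int] = (1 to 10).toList

  // The chained higher-order functions from the example.
  val chained: List[Int] =
    n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))

  // The same computation written as a for comprehension.
  val comprehension: List[Int] =
    for {
      i <- n if isEven(i)
      j <- n if isOdd(j)
    } yield i * j

  def main(args: Array[String]): Unit = {
    // Compare the two results element by element.
    println(chained == comprehension)
  }
}
```

Both expressions should yield the same list of products, since the compiler rewrites the for comprehension into calls much like the chained ones.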

Save the App object to a file named App.scala, or any name you like (Scala doesn’t impose Java’s rule that the file name must match the class name). Assuming you chose App.scala, compile it:

clewis$ scalac App.scala

Now check out the generated class files (ls App*class). You should see the following:

App$$anonfun$main$1.class
App$$anonfun$main$2$$anonfun$apply$1.class
App$$anonfun$main$2$$anonfun$apply$2.class
App$$anonfun$main$2.class
App$.class
App.class

Why were 6 classes compiled from this one singleton definition? Let’s start with App.class and App$.class.

The Singleton Object

Scala does away with Java’s statics. Instead we get singleton objects, which are declared with a syntax similar to that used for class declarations, but with the object keyword. When a singleton object shares the same name as a class, it becomes that class’s “companion” object. Companion objects have special privileges, including access to private members of instances of the companion class. In this example, App is simply a singleton object, since we haven’t also defined a class named App.

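The Scala compiler implements a singleton object as a separate class named [object name]$, which holds the object’s methods and keeps its single instance in a static field named MODULE$. For a companion object, that class sits alongside the companion class (the class of the same name as the object). For a standalone singleton object, the compiler also generates the class with the plain name, and fills it with static methods that forward to the single instance. The $ in the name resembles the way the Java compiler names nested and anonymous classes, although App$ is an ordinary top-level class. Because we created an object named App, we get 2 classes: App and App$.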
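One way to see the $ class from inside a running program is reflection. This is only a sketch, and it assumes that SingletonNames (a name made up for this example) is compiled as a top-level object in its own file; nested or REPL-defined objects get longer names and may not have the static field.

```scala
object SingletonNames {

  // The runtime class of a singleton object is the generated "$" class.
  def runtimeClassName: String = SingletonNames.getClass.getName

  def main(args: Array[String]): Unit = {
    println(runtimeClassName)

    // MODULE$ is the static field that holds the single instance.
    val instance = SingletonNames.getClass.getField("MODULE$").get(null)

    // Reference equality: the field points at the object itself.
    println(instance eq SingletonNames)
  }
}
```

Running javap on the two generated class files shows the same split from the outside: the static forwarders live in the plain class, the real methods in the $ class.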

Higher Order Functions and Their Function Arguments

In Scala, functions are objects: instances of one of the FunctionN traits. Of course this isn’t obvious from the Scala syntax, and that’s part of the beauty: it just feels right. The compiler compiles each function down to an anonymous class inside the containing class. If you understand Java’s scoping rules for inner classes, then you should now have some understanding of how Scala implements closures.

Methods in Scala are also different from functions. Methods are functions defined as part of a class definition with the def keyword. Functions are instances of one of the FunctionN traits, and the Scala syntax provides several different ways to express them tersely. Methods are the only primitives (non-objects) in Scala, but the compiler makes it easy to promote methods to function instances. Back to our example, notice that we define the methods isEven and isOdd in our singleton object. They are method primitives, not functions. Because the compiler recognizes that methods often want to be treated as functions, it makes provisions; one such provision is that we can pass a method to a higher-order function.
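A small sketch of the difference, with names (FunctionValueSketch, twice, twiceIsEven) invented for the example: a function value is an object with members of its own, such as apply and andThen, while a method only becomes such an object when the compiler promotes it.

```scala
object FunctionValueSketch {

  // A method: defined with def, not an object in its own right.
  def isEven(i: Int): Boolean = i % 2 == 0

  // A function value: an instance of Function1[Int, Int].
  val twice: Int => Int = j => j * 2

  // andThen is a member of Function1. Passing isEven here makes the
  // compiler promote the method to a function instance.
  val twiceIsEven: Int => Boolean = twice.andThen(isEven)

  def main(args: Array[String]): Unit = {
    println(twice.isInstanceOf[Function1[_, _]])
    println(twice.apply(21) == twice(21)) // calling a function is calling apply
    println(twiceIsEven(3))
  }
}
```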

Method Promotions

In order for the compiler to promote a method to a function instance, it must create an anonymous class for that instance. In our example, we first pass the method isEven as an argument to the higher-order function filter, and so the compiler generated the class App$$anonfun$main$1, which is our promoted method. Note that we also pass isOdd to a subsequent call to filter, a promotion for which the compiler generated the class App$$anonfun$main$2$$anonfun$apply$1.
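Written out by hand, a promotion looks roughly like the following. This is a sketch of the idea, not the compiler’s literal output, and the names PromotionSketch and promotedByHand are made up for the example.

```scala
object PromotionSketch {

  def isEven(i: Int): Boolean = i % 2 == 0

  // An anonymous Function1 whose apply method simply calls the method.
  val promotedByHand: Int => Boolean = new Function1[Int, Boolean] {
    def apply(i: Int): Boolean = isEven(i)
  }

  val n: List[Int] = (1 to 10).toList

  def main(args: Array[String]): Unit = {
    // Passing the method directly and passing the hand-made function
    // should filter the list identically.
    println(n.filter(isEven) == n.filter(promotedByHand))
  }
}
```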

Function Literals

We’ve covered all but 2 of the generated classes. As it turns out, there are still 2 functions we haven’t discussed in the series of transformations. Take another look:

n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))

Did you see them? If you’re new to Scala, you may not spot them at first because they are using function-literal syntax. The functions flatMap and map also take functions as arguments, and so we define them literally, embedding the one passed to map inside the one passed to flatMap. For these two function literals, the compiler generated the anonymous classes App$$anonfun$main$2 and App$$anonfun$main$2$$anonfun$apply$2.

To seal in what’s happening here, fire up the Scala REPL and run this code:

List(1,2,3).map(j => j + 1)

Now run this code:

List(1,2,3).map(new Function1[Int, Int] {
	override def apply(j: Int) = j + 1
})

The two do the exact same thing, because they are exactly the same.

The literal syntax may look strange, and even non-obvious at first. I thought so too, but it didn’t take long for me to greatly appreciate and recognize it as easily as you might recognize a class definition.

A Word on the Class Names

You may have noticed a pattern in the anonymous function class names. For starters, they’re prefixed with App$$anonfun$main$ and then continue with 1.class and 2.class. There’s also another level nested inside the first level. If you inspect each of these using javap -c [class], you’ll notice that the ordering and nesting follow the order in which we use function objects (by passing them as arguments to other functions).

  • isEven is the first function we pass, so it is the first function for which an anonymous class is generated, yielding App$$anonfun$main$1.
  • The second is the literal function passed to flatMap, so App$$anonfun$main$2 is generated for it.
  • The third is isOdd, yielding the class App$$anonfun$main$2$$anonfun$apply$1. This function is used inside the function passed to flatMap. It’s the first function in this scope, and it’s nested, so its name reflects its nesting and order in the sequence.
  • Finally we arrive at the literal function passed to map, resulting in App$$anonfun$main$2$$anonfun$apply$2. It’s the second function referenced inside the function passed to flatMap, and so its name also reflects its nesting and order.

This ordering and nesting is deliberate. In Java, an inner class has access to its enclosing scope, which includes the enclosing instance’s fields and, if the inner class was created in a method, that method’s final local variables. This is what makes closures in Scala possible. Look again at the literal function passed to map:

n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))

The function multiplies a “free” variable i by its single argument j. Where did i come from? Again, this function is defined inside another literal function passed to flatMap. That function receives a single argument, which it labels i. Because the function passed to map is nested inside the one passed to flatMap, it has access to that scope.
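To make the nesting visible, here is the same expression with both literals written as explicit anonymous classes. It is a sketch of the shape rather than the compiler’s actual output; ClosureSketch, outer and inner are names chosen for the example, and the filters are inlined to keep it short.

```scala
object ClosureSketch {

  val n: List[Int] = (1 to 10).toList

  // Stands in for the literal passed to flatMap.
  val outer: Int => List[Int] = new Function1[Int, List[Int]] {
    def apply(i: Int): List[Int] = {
      // Stands in for the literal passed to map. It is created inside
      // outer's apply method, so it can see the parameter i.
      val inner = new Function1[Int, Int] {
        def apply(j: Int): Int = i * j
      }
      n.filter(_ % 2 != 0).map(inner)
    }
  }

  val result: List[Int] = n.filter(_ % 2 == 0).flatMap(outer)

  def main(args: Array[String]): Unit = {
    println(result.take(5)) // List(2, 6, 10, 14, 18)
  }
}
```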

12:43am  |   permalink
  
FILED UNDER: scala java 
August 16, 2009

Dependency Injection Using Language-only Constructs

I’ve been exploring dependency injection in Scala. Digesting this article by Debasish Ghosh, I was inspired to play with the code. At the time of the article’s posting it needed a few pieces to compile. You can find the inferred working source here. As an exploratory exercise, I ported it to Java. Chew on it a bit; it clarified some things for me about both the technique and the workings of Scala abstractions.

10:29pm  |   permalink
FILED UNDER: scala java dependencyinjection ioc 
March 23, 2009

Legacy PHP on Quercus: Checklist

Since Friday I’ve been playing with Quercus off and on: running pieces of my company’s legacy code, integrating and accessing Hibernate in PHP code (pretty sweet), and then adding some simple (but sane) Hibernate management (initialization, plus an implementation of the Hibernate session-per-request pattern for PHP scripts).

Throughout my work today, I experimented with deploying a full legacy application to Jetty. For the uninitiated (like me), Jetty can be something of a bear, but it is extremely fast and very flexible. It’s also a bit complicated when you’re coming from mod_php’s narcissistic, share-nothing paradigm to Java’s threaded, share-everything opinion. In deploying our applications, we’ve got several things to deal with, apart from the wholesale switch from C-PHP to Quercus:

  • Virtual hosts. Our applications (well over 20) all run on our dedicated server, served by Apache and set up as virtual hosts. Some of them are mapped to their own IP address, others share one.
  • SSL. Most, if not all of our applications, require SSL.
  • URL rewriting. Most of our newer applications rely on Apache’s mod_rewrite to clean up URLs, so something like http://foo.com/index.php?page=MyAccount becomes http://foo.com/MyAccount.
  • Symbolic links. Most, if not all of our applications make a ridiculously large use of symlinks.
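The URL rewriting item boils down to a mapping like the one below. This is a minimal sketch in Scala of the rule itself, not UrlRewrite’s configuration format; the object name, the rewrite function and the regular expression are hypothetical, and it assumes a clean URL is a single path segment of letters and digits.

```scala
object RewriteSketch {

  // Hypothetical rule: one path segment of letters and digits.
  private val CleanPath = "^/([A-Za-z0-9]+)$".r

  // Map a clean path back to the script URL; leave anything else alone.
  def rewrite(path: String): String = path match {
    case CleanPath(page) => "/index.php?page=" + page
    case other           => other
  }

  def main(args: Array[String]): Unit = {
    println(rewrite("/MyAccount"))    // /index.php?page=MyAccount
    println(rewrite("/css/site.css")) // left unchanged
  }
}
```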

All this alongside digesting the JEE architecture of servlets and how Jetty implements it. I don’t intend to suggest to my company that old-school servlet technology is the way to go; quite the opposite in fact (when in straight Java I’m a T5 guy). However, when getting into an environment where concurrency matters and there’s such a thing as an actual execution lifecycle, it would be prudent to have a thorough understanding of what’s going on, where, when and why.

Assuming my company elects this route, and I sincerely hope they do, I’ll be documenting the journey. The server side tool-chain will be Jetty, Quercus, and UrlRewrite.

8:10pm  |   permalink
FILED UNDER: java php programming jetty quercus 
March 4, 2009
RubyJax: March 24, 2009 - Groovy/Grails/Java MVC

Bring the nerdy.

9:36pm  |   permalink
FILED UNDER: programming java ruby 
March 2, 2009
Actors, Mina, and Naggati

Interesting article on concurrent programming in Scala using some functional voodoo.

8:59pm  |   permalink
FILED UNDER: programming java scala 

A Java Interface to the Bit.ly API

I’ve just pushed some alpha code to a Google Code project site, implementing a Java interface to the bit.ly URL shortening service API. There’s still a bit to do in the way of negative testing and data access for the /info and /stats calls, but it’s a start. Give it a hack if you’re so inclined: http://code.google.com/p/bitlyj/.

9:16am  |   permalink
FILED UNDER: java programming 
January 19, 2009

Ganymede, run-jetty-run, and Tapestry 5

Eclipse Ganymede 3.4.1 is so far a great environment for Tapestry 5, when paired with the m2eclipse plugin and run-jetty-run launcher (gorgeous live class reloading). A couple of quick notes:

  1. Jetty’s slf4j breaks Tapestry’s. This is easily corrected by adding this VM argument to the launch configuration: -Dorg.mortbay.jetty.webapp.parentLoaderPriority=true (wiki article).
  2. If using run-jetty-run 1.0.1, know that it will not add the servlet API jar to the classpath. The funny thing is that it does include this jar as a dependency, so it ends up in your Maven repository (servlet-api-2.5-6.1.9.jar for example). You can easily “Add external jar” to include it, and be on your way.

12:00am  |   permalink
FILED UNDER: java software tapestry 5