Saturday, February 23, 2008

ARM Blocks in Scala, Part 3: The Concession

Update: Here's a better approach to Automatic Resource Management in Scala.

After a couple of attempts (part 1, part 2) at implementing Automatic Resource Management in Scala, I've decided not to "reinvent the wheel" here. I will defer instead to the implementation found in Scalax. I did not know about their ManagedResource class before my first post (many thanks to the commenters who pointed me there), and if I had known about it I may not have made the attempt. I'm glad I did, however, because it gave me a chance to improve my Scala skills. I'm not "there" yet, though, as I still find myself writing Java-like code in Scala. When I catch myself doing so, I am usually able to refactor it into the Scala style, which ends up being more compact and elegant. Come to think of it, that is why I recommend ManagedResource over my approach. I would say mine is the Java-like approach, while ManagedResource is more elegant and more consistent with Scala style.

Let's look at the same example from my first two posts using ManagedResource:


import java.io.{BufferedReader, BufferedWriter, FileReader, FileWriter}
//ManagedResource itself is provided by Scalax

def createReader = ManagedResource(new BufferedReader(new FileReader("test.txt")))
def createWriter = ManagedResource(new BufferedWriter(new FileWriter("test_copy.txt")))

//copy a file, line by line
for(reader <- createReader; writer <- createWriter) {
  var line = reader.readLine
  while (line != null) {
    writer.write(line)
    writer.newLine
    line = reader.readLine
  }
}



Why is ManagedResource better?
It may be clear from the example, but let's discuss what makes ManagedResource cleaner than my previous approach. ManagedResource works with for-comprehensions, and that alone solves many of the problems I encountered. For example, I needed a way to define and initialize a resource inside the managed scope while still being able to reference it inside the block of code. For that reason, I had to define an initialization function (in part 2), but "for" takes care of this nicely: for(a <- ManagedResource(new SomeResource())) .... It also handles the many-resources problem elegantly, without using varargs: for(a <- createA; b <- createB; ...). In short, "for" seems like the right tool for the job.
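
To make the point about "for" concrete, here is a minimal sketch of why it fits. The compiler rewrites for(a <- x; b <- y) {...} into nested foreach calls, so a resource type only needs a foreach that opens the resource, runs the body, and closes it. The names Managed, Fake and log are hypothetical and exist only for this illustration; this is not Scalax's actual ManagedResource implementation.

```scala
import java.io.Closeable

// Hypothetical stand-in for a managed resource; NOT Scalax's actual class.
final class Managed[A <: Closeable](open: () => A) {
  // `for (a <- m) body` is rewritten by the compiler to `m.foreach(a => body)`.
  def foreach(body: A => Unit): Unit = {
    val resource = open()
    try body(resource) finally resource.close()
  }
}

object ManagedDemo {
  // Records the order of events so the nesting is visible.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Fake resource that logs when it is opened and closed.
  final class Fake(name: String) extends Closeable {
    log += s"open $name"
    def close(): Unit = log += s"close $name"
  }

  def run(): List[String] = {
    log.clear()
    for (a <- new Managed(() => new Fake("a")); b <- new Managed(() => new Fake("b"))) {
      log += "body"
    }
    log.toList
  }
}
```

Calling ManagedDemo.run() gives open a, open b, body, close b, close a: the second resource is opened inside the scope of the first, and both are released in reverse order.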

That being said, I think ManagedResource does have some room for improvement. For example, consider the following code segment:


def createReader = ManagedResource(new BufferedReader(new FileReader("test.txt")))
def createWriter = ManagedResource(new BufferedWriter(new FileWriter("test_copy.txt")))

try {
  //copy a file, line by line
  for(reader <- createReader; writer <- createWriter) {
    try {
      var line = reader.readLine
      while (line != null) {
        writer.write(line)
        writer.newLine
        line = reader.readLine
      }
    } catch {
      case e: IOException => println("Exception thrown while copying: " + e.getMessage)
    }
  }
} catch {
  case e: IOException => println("Exception thrown upon open or close: " + e.getMessage)
}


As you can see, we have total control inside of the block of code, but if an exception occurs while initializing or disposing we have no way of knowing which of the two steps was the culprit. This is probably not an issue most of the time, but I could see a possible need for handling exceptions in initialization differently from exceptions in disposal. Maybe this can be improved (it would have to be non-intrusive for the more common, general case), or maybe some level of control must be sacrificed.
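
One possible direction, sketched here under my own assumptions, is to wrap the exception in a type that records the phase in which it occurred. PhasedResource, OpenFailed and CloseFailed are hypothetical names, not part of Scalax. Note the sacrifice this sketch makes: an exception thrown while closing replaces any exception thrown by the body.

```scala
import java.io.{Closeable, IOException}

// Hypothetical wrappers that record which phase failed.
final class OpenFailed(cause: Throwable) extends RuntimeException("open failed", cause)
final class CloseFailed(cause: Throwable) extends RuntimeException("close failed", cause)

// Hypothetical resource wrapper; not Scalax's ManagedResource.
final class PhasedResource[A <: Closeable](open: () => A) {
  def foreach(body: A => Unit): Unit = {
    val resource =
      try open()
      catch { case e: IOException => throw new OpenFailed(e) }
    try body(resource)
    finally {
      // A failure here masks an exception thrown by the body.
      try resource.close()
      catch { case e: IOException => throw new CloseFailed(e) }
    }
  }
}

object PhasedDemo {
  // Resource whose close() always fails.
  final class BadClose extends Closeable {
    def close(): Unit = throw new IOException("disk full")
  }

  // Reports which phase failed, or "ok" if neither did.
  def phaseOf(open: () => Closeable): String =
    try {
      for (r <- new PhasedResource(open)) ()
      "ok"
    } catch {
      case _: OpenFailed  => "open"
      case _: CloseFailed => "close"
    }
}
```

The caller can then handle the two cases separately, while code that does not care can still catch the common supertype.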

Also, at the time of writing, ManagedResource exposes methods for opening (initializing) and closing (disposing) a resource: "unsafeOpen" and "unsafeClose". These cannot be called from within the block of a for-comprehension, but I see no need to make them public; "protected[control]" should be the maximum visibility - if that. Making them public is a mistake because it allows for the same type of resource leaks we set out to quash. In fact, anyone who is calling these methods externally requires more control over where and when resources are initialized and disposed, and should not be using ManagedResource to begin with. If there is a legitimate reason for exposing these methods, I would like to see it.

Overall, however, I have to congratulate Scalax (Jamie Webb in particular) for getting it right. Scalax is still in a very early stage, so maybe the concerns raised here will be addressed by the time it is ready for release. These concerns are relatively minor anyway, so I recommend using ManagedResource as is.

Friday, February 15, 2008

ARM Blocks in Scala, Part 2

Update: Here's a better approach to Automatic Resource Management in Scala.

I didn't plan on writing a follow-up to my first post on ARM Blocks in Scala, but I think I will be able to improve upon some things. I don't plan on any future posts on this topic, but I will not hesitate to do another one if sufficient progress can be made.

Here's where we left off with Arm:


object Arm {
  type CloseType = { def close() }

  def manage(resources: CloseType*)(block: => Unit)
            (implicit exceptionHandler: (Exception) => Unit) = {
    try {
      block
    } finally {
      resources.foreach( resource => {
        try {
          resource.close()
        } catch {
          case e: Exception =>
            try {
              exceptionHandler(e)
            } catch {
              case fatal: Throwable => fatal.printStackTrace() //last resort
            }
        }
      })
    }
  }
}


Motivation

I chose to write a follow-up because of a couple of important omissions that I made in my previous post. The first and foremost is that of resource initialization (as "helium" pointed out). The problem is that resource initialization happens outside of the scope of the Arm.manage method. This violates the principal motivation for ARM Blocks by forcing the developer to correctly handle resource disposal in the case that one or more resources are initialized successfully and others are not.
For example:

val reader = new BufferedReader(new FileReader("test.txt"))
val writer = new BufferedWriter(new FileWriter("test_copy.txt"))

If reader initializes successfully and writer does not, reader needs to be closed. This is exactly the type of problem we were trying to avoid with ARM Blocks in the first place. I didn't have a solution for this problem at the time, but I forgot to mention it in my post - a mistake on my part.
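
To illustrate what the developer is left to write by hand in that case, here is a small sketch. Tracked and open are hypothetical stand-ins for the real reader and writer, so the failure can be simulated without touching the file system.

```scala
import java.io.{Closeable, IOException}

object PartialInitDemo {
  // Hypothetical resource that records whether close() was called.
  final class Tracked(val name: String) extends Closeable {
    var closed = false
    def close(): Unit = { closed = true }
  }

  // Simulates an open that can fail, like `new FileWriter(...)`.
  def open(name: String, fail: Boolean): Tracked =
    if (fail) throw new IOException(s"cannot open $name") else new Tracked(name)

  // The hand-written nesting needed when resources are created outside of manage.
  def copy(reader: Tracked, writerFails: Boolean): Unit = {
    try {
      val writer = open("test_copy.txt", writerFails)
      try { /* copy lines here */ } finally writer.close()
    } finally reader.close() // without this, reader leaks when the writer fails to open
  }
}
```

Every additional resource adds another level of try/finally, which is exactly the boilerplate an ARM Block is supposed to remove.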

Secondly, I didn't mention how an exception within the block of code itself should/would be handled. This means that one would have to wrap the entire call to Arm.manage in a try/catch block in order to handle any exceptions that may occur inside the block itself. Not ideal.

In this post, I hope to address these two concerns. The solution I present here is by no means the best solution, and I am confident it can be improved upon as well.

For starters, let's take care of the second issue (exceptions thrown inside the block), which is the easier of the two:

object Arm {
  type CloseType = { def close() }

  def manage(resources: CloseType*)(block: => Unit)
            (implicit exceptionHandler: (Exception) => Unit) = {
    try {
      try {
        block
      } catch {
        case e: Exception => exceptionHandler(e)
      }
    } finally {
      resources.foreach( resource => {
        try {
          if (resource != null)
            resource.close()
        } catch {
          case e: Exception =>
            try {
              exceptionHandler(e)
            } catch {
              case fatal: Throwable => fatal.printStackTrace() //last resort
            }
        }
      })
    }
  }
}

Here we reuse the existing exception handler function. This is more convenient than declaring two different exception handlers, but it leaves us unable to tell whether the exception being handled occurred upon closing or in the block itself. We will correct this after we address the initialization issue.

The issue of initialization is a more complicated one. The approach taken here is to define an initialization function, much like our exception handling function. It should be implicit as well in the case that resources have already been initialized and no initialization is necessary. Here's an attempt:

object Arm {
  implicit def defaultInitializer() = {}
  implicit def defaultExceptionHandler(e: Exception) = {}

  type CloseType = { def close() }

  def manage(resources: (() => CloseType)*)(block: => Unit)
            (implicit initializer: () => Unit, exceptionHandler: (Exception) => Unit) = {
    try {
      try {
        initializer()
        block
      } catch {
        case e: Exception => exceptionHandler(e)
      }
    } finally {
      resources.foreach( resource => {
        try {
          val value = resource()
          if (value != null)
            value.close()
        } catch {
          case e: Exception =>
            try {
              exceptionHandler(e)
            } catch {
              case fatal: Throwable => fatal.printStackTrace() //last resort
            }
        }
      })
    }
  }
}

Here we define a default initializer as well as a default exception handler. Defining the exception handler is not really necessary because of the implicit identity function in Scala's Predef, but defining the default initializer may be helpful. In Arm.manage, the initialization and exception handling functions are included in the list of implicit parameters. Only one such list is allowed in the Scala language, and it must be the last parameter list. Also, if we are to use an initialization function, the resources themselves cannot be passed as arguments because they would be passed call-by-value (i.e. in Arm.manage, they would remain uninitialized). Instead, we defer evaluation, much as call-by-name would, by accepting an arbitrary number of zero-argument functions that return resources - remember that methods can be passed as functions in Scala. This is a necessary change, but it adds a bit of clutter when it comes to usage. We will save an example until later because of a deficiency in this implementation.
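
The difference between passing the resource itself and passing a function that returns it can be seen in isolation. In this sketch a String stands in for the resource, and byValue, byThunk and initReader are hypothetical names used only here.

```scala
object ThunkDemo {
  // Stands in for a resource variable that starts out uninitialized.
  var reader: String = null

  // Receives the value as it was at the call site, before init has run.
  def byValue(resource: String)(init: () => Unit): String = { init(); resource }

  // Receives a function, so the variable is read only after init has run.
  def byThunk(resource: () => String)(init: () => Unit): String = { init(); resource() }

  def initReader(): Unit = { reader = "opened" }
}
```

With reader still null at the call site, byValue(reader)(...) returns null while byThunk(() => reader)(...) returns "opened", which is why Arm.manage has to accept functions once it is responsible for initialization.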

In this version of Arm.manage, when an exception occurs and our handler is called, we have no idea if the exception happened in initialization, in the block of code, or when trying to close the resources.

Let's address this issue:

object Arm {
  abstract class ManagementException(val cause: Exception) extends Exception(cause) {
    override def getCause() = cause
  }
  class InitializationException(cause: Exception) extends ManagementException(cause)
  class ExecutionException(cause: Exception) extends ManagementException(cause)
  class ClosingException(cause: Exception) extends ManagementException(cause)

  implicit def defaultInitializer() = {}
  implicit def defaultExceptionHandler(e: ManagementException) = {}

  type CloseType = { def close() }

  def manage(resources: (() => CloseType)*)(block: => Unit)
            (implicit initializer: () => Unit, exceptionHandler: (ManagementException) => Unit) = {
    try {
      var executeBlock = true

      try {
        initializer() //initialize the resources
      } catch {
        //forward exceptions to the handler
        case e: Exception =>
          executeBlock = false //do not continue if initialization fails
          exceptionHandler(new InitializationException(e))
      }

      if (executeBlock) {
        try {
          block // execute the block
        } catch {
          //forward exceptions to the handler
          case e: Exception => exceptionHandler(new ExecutionException(e))
        }
      }
    } finally {
      //close all of the resources properly
      resources.foreach( resource => {
        try {
          val value = resource()
          if (value != null)
            value.close()
        } catch {
          case e: Exception =>
            try {
              exceptionHandler(new ClosingException(e))
            } catch {
              case fatal: Throwable => fatal.printStackTrace() //last resort
            }
        }
      })
    }
  }
}

So now, our exception handler takes a ManagementException, which has the original exception as its cause - accessible by e.cause or e.getCause(). We know, based on the type of ManagementException, where it is coming from. Alternatively, we could define three different exception handling methods, but I think it's easier to use if we only have to define (at most) one exception handler.

Let's see an example:

var reader: BufferedReader = null
var writer: BufferedWriter = null

def getReader() = reader
def getWriter() = writer

def initResources() = {
  reader = new BufferedReader(new FileReader("test.txt"))
  writer = new BufferedWriter(new FileWriter("test_copy.txt"))
}

def handle(e: ManagementException) = {
  e match {
    case ie: InitializationException =>
      println("Failed to initialize: " + ie.getCause())
    case ee: ExecutionException =>
      println("Could not copy files: " + ee.getCause())
    case ce: ClosingException =>
      println("Failed to close a resource: " + ce.getCause())
  }
}

//copy a file, line by line
manage(getReader, getWriter) {
  var line = reader.readLine
  while (line != null) {
    writer.write(line)
    writer.newLine
    line = reader.readLine
  }
} (initResources, handle)

This one passes the methods for initialization and exception handling as explicit parameters.

Here's the same example with a different approach:

var reader: BufferedReader = null
var writer: BufferedWriter = null

implicit def initResources() = {
  reader = new BufferedReader(new FileReader("test.txt"))
  writer = new BufferedWriter(new FileWriter("test_copy.txt"))
}

implicit def handle(e: ManagementException) = {
  e match {
    case ie: InitializationException =>
      println("Failed to initialize: " + ie.getCause())
    case ee: ExecutionException =>
      println("Could not copy files: " + ee.getCause())
    case ce: ClosingException =>
      println("Failed to close a resource: " + ce.getCause())
  }
}

//copy a file, line by line
manage(() => reader, () => writer) {
  var line = reader.readLine
  while (line != null) {
    writer.write(line)
    writer.newLine
    line = reader.readLine
  }
}

This one uses implicit defs for our initialization and exception handling functions. For this to work, we cannot import the implicit defs of these functions from Arm because they would conflict. I should point out that if you provide one function explicitly, you have to provide the other explicitly as well. Also, this example defines getters for reader and writer "inline".

Room for Improvement

The new syntax is not nearly as easy to read as the original. This concession was necessary, though, for the sake of completeness. The problem of an exception thrown from the handler still remains from the first post, as well.

A couple of readers pointed me toward scalax, whose ManagedResource class provides the same functionality using for-comprehensions. I took a look at it and it looks pretty well done, but I haven't yet had an opportunity to use it so I can't comment on how it works in practice. I would be interested to see an example of this which manages more than one resource.

Tuesday, February 12, 2008

ARM Blocks in Scala

Update: Here's a better approach to Automatic Resource Management in Scala.

Currently, there is a proposal by Joshua Bloch for the addition of Automatic Resource Management Blocks (ARM Blocks) to Java, possibly as soon as version 7. The idea is that Java should take the responsibility of correctly disposing of resources out of the hands of the developer, much like it did with memory management in its initial design. Personally, I think it is a great idea that wouldn't add too much baggage to the language.

After seeing a couple of implementations of this concept in Scala (here and here), I decided to give it a try myself. I think you'll see that Scala is a very flexible language that already facilitates things like ARM Blocks without explicit language support - mostly thanks to functional programming constructs like closures. In this post, I will assume you have some basic to intermediate knowledge of Scala, including a basic knowledge of its function syntax: "(ArgType1, ArgType2, ...) => ReturnType".

We will construct an Arm object (singleton) which will provide a usable syntax for an ARM Block and offer correct resource disposal.

Here is a first pass:


object Arm {
  type CloseType = { def close() }

  def manage(resource: CloseType)(block: => Unit) = {
    try {
      block
    } finally {
      resource.close()
    }
  }
}


...and an example:


import java.io.{BufferedReader, FileReader}
import Arm.manage

//read in a file
val reader = new BufferedReader(new FileReader("test.txt"))
manage(reader) {
  var line: String = reader.readLine
  while (line != null) {
    println(line)
    line = reader.readLine
  }
}


Arm.manage takes a resource to manage, and a block of code (no-arg function that does not return anything) to execute. The only restriction on the resource is that it has a close() method that does not return anything (implicit return type of Unit). The definition of CloseType is just for readability. This is a form of statically checked "duck typing". Notice, in the example, that the block of code passed into Arm.manage looks seamless - as if manage were a Scala keyword. The only extra bit of clutter that is necessary is the "import Arm.manage" (or Arm._ if you prefer), but I can live with that.

This is simple and does exactly what we need, but it is limited to only one resource. We can do better:


object Arm {
  type CloseType = { def close() }

  def manage(resources: CloseType*)(block: => Unit) = {
    try {
      block
    } finally {
      resources.foreach( resource => {
        try {
          resource.close()
        } catch {
          case e: Exception => e.printStackTrace()
        }
      })
    }
  }
}
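
The important property of the loop above is that a failure to close one resource does not stop the remaining resources from being closed. The following sketch checks that property in isolation. It uses java.io.Closeable rather than the structural CloseType, and it collects the exceptions instead of printing them; Res and closeAll are hypothetical names chosen for this illustration.

```scala
import java.io.{Closeable, IOException}

object CloseAllDemo {
  // Hypothetical resource that can be told to fail when closed.
  final class Res(val name: String, failOnClose: Boolean) extends Closeable {
    var closed = false
    def close(): Unit = {
      if (failOnClose) throw new IOException(s"$name failed to close")
      closed = true
    }
  }

  // Closes every resource, remembering failures instead of stopping at the first one.
  def closeAll(resources: Closeable*): List[Exception] = {
    val failures = scala.collection.mutable.ListBuffer.empty[Exception]
    resources.foreach { resource =>
      try resource.close()
      catch { case e: Exception => failures += e }
    }
    failures.toList
  }
}
```

If the first of two resources throws from close(), the second is still closed and the returned list holds the single failure.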


Here we use Scala's syntax for varargs: CloseType*. Passing zero parameters is allowed here because resources.foreach() does not execute if resources is empty, but I could see an argument for requiring at least one param. This version of Arm.manage solves the problem of managing several resources, but introduces a new problem - what to do when the call to close() throws an exception. These difficulties are mentioned in the ARM proposal, with no solution settled upon. I propose using an optional callback, which is easy enough to do in Scala (and a variant of that would probably work in Java too, now that I think about it):


object Arm {
  type CloseType = { def close() }

  def manage(resources: CloseType*)(block: => Unit)(implicit exceptionHandler: (Exception) => Unit) = {
    try {
      block
    } finally {
      resources.foreach( resource => {
        try {
          resource.close()
        } catch {
          case e: Exception =>
            try {
              exceptionHandler(e)
            } catch {
              case fatal: Throwable => fatal.printStackTrace() //last resort
            }
        }
      })
    }
  }
}


...with an example:


val reader = new BufferedReader(new FileReader("test.txt"))
val writer = new BufferedWriter(new FileWriter("test_copy.txt"))

//copy a file, with exception handling
manage(reader, writer) {
  var line = reader.readLine
  while (line != null) {
    writer.write(line)
    writer.newLine
    line = reader.readLine
  }
} {e => /* handle it */ }


...and another:


val reader = new BufferedReader(new FileReader("test.txt"))
val writer = new BufferedWriter(new FileWriter("test_copy.txt"))

//copy a file, no exception handler
manage(reader, writer) {
  var line = reader.readLine
  while (line != null) {
    writer.write(line)
    writer.newLine
    line = reader.readLine
  }
}


Here is Arm in its first non-beta version. The exception handler is declared as "implicit", which means that it is not required - as in the second example. The callback has to take an Exception as a parameter, however, which is not as specific as one could hope for. In Scala, this does not present too much of a problem because of pattern matching. The real problem is what to do when the exception handler throws an exception and we need to keep closing the resources. I don't have a good answer for that one, but the "give up and just print a stack trace" approach is not unprecedented. If you don't believe me, just run the following Java code (I think I saw this concept in a Java Puzzler once):


Thread currentThread = Thread.currentThread();
currentThread.getThreadGroup().uncaughtException(currentThread,
    new RuntimeException("Oh no!"));
System.out.println("Moving on");


Furthermore, we have no way to associate an exception with the object which was the source of the exception. There is no doubt in my mind that Arm.manage in this form can be improved upon, but it satisfies our goals. It even gives us control over what to do when an exception is thrown during disposal (within limitations).
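
As a small illustration of the callback idea on its own, here is a sketch of a handler supplied through an implicit parameter. It assumes a current Scala compiler (2.13 or 3), where I would expect a suitable implicit value to have to be in scope at the call site; closeQuietly and Handler are hypothetical names, and java.io.Closeable is used in place of CloseType.

```scala
import java.io.Closeable

object ImplicitHandlerDemo {
  // The callback type used for close() failures.
  type Handler = Exception => Unit

  // Closes the resource and forwards any failure to whichever handler is in implicit scope.
  def closeQuietly(resource: Closeable)(implicit handler: Handler): Unit =
    try resource.close()
    catch { case e: Exception => handler(e) }
}
```

A caller declares something like implicit val record: Exception => Unit = ... once, and every call to closeQuietly in that scope picks it up without naming it; a handler can still be passed explicitly in the second parameter list.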

One alternative to this implementation would be to declare an overloaded Arm.manage method for each type of resource that one would like to manage (e.g. java.io.Closeable, java.sql.Statement, java.sql.Connection, etc.). This gives the benefit of dealing with a more specific type of exception in the callback method, and the possibility of supporting resources that are disposed by a method other than close(). This comes at the cost of a bit of clutter and rigidity, since it does not support an arbitrary object with a close() method. It is definitely worth considering, but in the end I think the flexibility of the "duck typing" approach wins out here.
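
For comparison, here is a rough sketch of what that alternative might look like with two overloads. TypedArm and the Disposable trait are hypothetical; the handler is an ordinary parameter here to keep the sketch short.

```scala
import java.io.{Closeable, IOException}

object TypedArm {
  // Hypothetical resource type whose cleanup method is not called close().
  trait Disposable { def dispose(): Unit }

  // Overload for java.io.Closeable: the handler sees an IOException specifically.
  def manage(resource: Closeable)(block: => Unit)(onError: IOException => Unit): Unit =
    try block
    finally {
      try resource.close()
      catch { case e: IOException => onError(e) }
    }

  // Overload for Disposable: calls dispose() and reports runtime exceptions.
  def manage(resource: Disposable)(block: => Unit)(onError: RuntimeException => Unit): Unit =
    try block
    finally {
      try resource.dispose()
      catch { case e: RuntimeException => onError(e) }
    }
}
```

The compiler picks the overload from the static type of the resource, so each call site gets a handler typed for that kind of resource, at the price of one method per supported type.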

Another technique that can be used with Arm.manage is implicit type conversion (aka views). This could be useful if you would like Arm.manage to call a method other than close(). Of course it won't call a method other than close(), but you can achieve the same effect. For example:


class Disposable {
  def dispose(): Boolean = {
    println("Disposing")
    true
  }
}

abstract class AbstractCloseType {
  def close()
}

//adapt a Disposable to the close() method that Arm.manage expects
implicit def disposable2closeable(disposable: Disposable): CloseType = {
  new AbstractCloseType() {
    //declared without "=" so the result is Unit; dispose()'s Boolean is discarded
    def close() {
      disposable.dispose()
    }
  }
}

val disposable = new Disposable()
manage(disposable) {
  //do something
}


I hope I've shown some of the power and flexibility of Scala. Many of you already know this, and it is why you're reading posts about Scala. To the others - this is just scratching the surface of Scala's potential, I believe. Scala is malleable enough to let you do things like creating DSLs or to write simple code that is virtually indistinguishable from Java. All while not having to give up on static typing or the JVM (or CLR for that matter).

I look forward to writing more code in Scala. I don't know if it will be the "Next Big Thing(TM)", but it is certainly refreshing - and that doesn't hurt when it comes to success.

Update: See the follow-up.

Monday, February 11, 2008

Welcome

It's not much to look at as of yet, but welcome to my blog anyway. My name is Chris W. Hansen, and I'm a computer programmer by trade. Others would demand the title "Software Developer" or "Software Engineer", and they are fine titles, but I feel that programmer is a term that sufficiently describes what it is I actually do. Besides, whenever I tell someone I'm a Software Engineer, they inevitably say something like, "Oh... so what is it exactly that you do?"

In future posts, I plan to discuss various aspects of computers, technology and programming, including programming languages, techniques and computer science theory (you know, the "fun stuff"). Right now, I'm in the process of learning an exciting (to me) language called Scala, so I imagine many of my posts (at least initially) will be on that topic.

I'm relatively new to blogging in general, so any tips or advice as I go along are welcome.

Before moving on to my next post (when it arrives), you may want to brush up on Scala. Here are a few links to get started (in no particular order).