Every now and then we find ourselves working with a relational database. This means working with transactions, and specifying transaction boundaries: where transactions should start and commit.
There are a number of approaches to taming transactions, from very manual to fully automatic. The most popular one in the Java world is the @Transactional annotation, available in different flavours in both JavaEE and Spring.
By annotating a method with @Transactional, we specify that it should execute within the scope of a running transaction. Looks simple and attractive!
The annotation approach is declarative: we are not concerned with boring details of transaction implementation, there’s no transaction-related logic in our code. Transactions are started and committed by the framework. Finally, the transactional context is automatically propagated through bean invocations. This might seem convenient: one thing less to worry about! But is that really so?
Reality quickly verifies the above. If you’ve worked on a JavaEE/Spring system, chances are you’ve hit problems with a missing transactional context, or surprising transaction boundaries. Without going into too much detail, here are some of the main problems of the @Transactional annotation:
- Lack of precision. Without looking at the code in its entirety, we can’t be sure exactly where transactions begin and commit. An @Transactional method might define the transaction boundary, or might take part in a broader transaction. Looking at the method alone, it’s not possible to differentiate between these two scenarios.
- A framework is needed to decorate our code with the transaction-handling aspect (AOP). This will only work if we annotate non-private methods and invoke them through framework-managed proxies (invoking a method on an unmanaged bean instance won’t interact with the transaction context; the same goes for invoking a public method from another method in the same managed bean).
- Propagating the transaction context across thread boundaries (when the business logic is asynchronous) is tricky at best
- and more
The three problems that we have identified:
- no local reasoning;
- the need for a framework;
- working with multi-threaded code
are general pain points of many Java-based systems. They are often solved by using functional programming, or more precisely, libraries which are written using this style. Hence: is there an alternative? And who knows — maybe it offers something better, when it comes to transactions?
The crucial observation is that while transactions might seem like a technical detail, they are in fact usually an important part of the business logic. What is done inside the transaction, that is, which operations are performed atomically, stored durably, in isolation and leaving the system in a consistent state (in other words, ACID), has a very real impact on the business functionality of the system we develop.
Transaction isolation levels also have consequences for the kind of business guarantees the system offers!
Consequently, it might be beneficial to precisely track which operations are grouped into transactions, and when commits/rollbacks happen. Very often, such tracking can be done using types, and thus verified at compile-time. Let’s try that!
In Scala (a type-safe, functional programming language), the two most popular libraries for working with relational databases are Slick and Doobie (see here for a comparison of their features, including other alternatives). Although they vary in detail, they are based on the same concept: separating the description of the problem from its interpretation. Here, the “problem” is querying and updating the database. With both Slick and Doobie, the first thing we need to do is to describe how the database should be queried/updated.
The description of database operations is represented as a value: a data class. In case of Doobie, the type of this value is ConnectionIO[T] (I/O which needs a connection to a database), while in case of Slick, it is DBIOAction[T]. For simplicity, going forward we’ll focus on Doobie alone.
In Scala, data classes are called case classes. This is similar to Lombok’s @Value, or the upcoming Java records.
For example, a description of a query reading a list of Person objects from the persons table would look as follows:
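A minimal sketch of such a description, assuming Doobie's standard imports. The `Person(id, name)` case class and the column names are hypothetical and used only for illustration:

```scala
import doobie._
import doobie.implicits._

object PersonQueries {
  // Hypothetical domain class; its fields are assumed to match the selected columns.
  case class Person(id: Long, name: String)

  // Only a description: nothing is sent to the database when this value is created.
  val findAllPersons: ConnectionIO[List[Person]] =
    sql"select id, name from persons".query[Person].to[List]
}
```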
Creating an instance of the ConnectionIO class has no side effects: it’s just a description of the operations we want to perform. It’s an immutable data structure, which can be shared freely across threads and re-used in different contexts.
What’s next? At some point, we’ll need to interpret the description into side-effects: actually run the queries/updates. For that, we’ll need a connection pool, and a way to start/commit a transaction. In Doobie, this functionality is embedded in a Transactor. Having a Transactor instance, we can interpret a ConnectionIO[T] into an IO[T] side-effect:
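As a rough sketch, the interpretation step could look like the following. The Transactor is taken as a parameter here, so how it is built (driver, URL, connection pool) is deliberately left out; the `Person` class and the query are hypothetical:

```scala
import cats.effect.IO
import doobie._
import doobie.implicits._

object RunInTransaction {
  // Hypothetical domain class, for illustration only.
  case class Person(id: Long, name: String)

  val findAllPersons: ConnectionIO[List[Person]] =
    sql"select id, name from persons".query[Person].to[List]

  // Turns the database description into a description of a side effect:
  // when the resulting IO is eventually run, the query executes in one transaction.
  def allPersons(xa: Transactor[IO]): IO[List[Person]] =
    findAllPersons.transact(xa)
}
```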
In Slick, we have a Database class, which manages the connection pool. It contains a run method, which can interpret a DBIOAction[T] into a Future[T]. There are important differences between a Future[T] and an IO[T], but they both represent side-effecting computations which result in a value of type T.
You’ll say: wait, what’s a value of type IO? Weren’t we supposed to run the query in a transaction, getting back the result, a list of persons? Yes, that’s what is happening; however, we aren’t running the side-effects just yet. The IO datatype is also a description, but this time of arbitrary side-effects. Just like ConnectionIO, it is lazily evaluated: it needs to be interpreted for the side effects to actually happen.
Notice that with IO we’re dealing with side-effects at a different level. A ConnectionIO value is a description of a single operation (or a series of operations) that will be run against a database, as part of a transaction. An IO value is a description of a single side-effect (or a series of side-effects). A “single” side-effect might be running an entire database transaction (consisting of a number of queries/updates), but also other things: sending an email, or making an HTTP service call.
Once again, we are using the type system to keep track of an important notion. First, this is keeping track of creating the query description (ConnectionIO). Second, keeping track of arbitrary side effects (IO).
At some point, we’ll need to run the side effects described by IO. For that, we can invoke the built-in interpreter:
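A small sketch of that last step, using a plain IO instead of a database call so that it runs without any setup. It assumes cats-effect 2, where `unsafeRunSync()` needs no extra imports; with cats-effect 3 an implicit runtime is required as well (`import cats.effect.unsafe.implicits.global`). The `runs` counter is there only to make the laziness visible:

```scala
import cats.effect.IO

object EndOfTheWorld {
  // Counts how many times the side effect has actually been executed.
  var runs = 0

  // Creating the IO value does not increment the counter.
  val program: IO[Int] = IO { runs += 1; 42 }

  def main(args: Array[String]): Unit = {
    // Only here is the description interpreted into real side effects.
    val result: Int = program.unsafeRunSync()
    println(s"result = $result, runs = $runs")
  }
}
```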
It might seem that we are going through a lot of levels of indirection! And indeed, there are two additional stages: (1) interpreting the query description into a transactional side effect, and (2) interpreting the IO into real side effects. But that’s also where we gain all of the power! By getting the possibility to manipulate the descriptions before they are run, we are back in control over our code.
In a real-world system you’ll only encounter the first level of interpretation (ConnectionIO → IO) frequently. The second, IO → side effects, is typically and preferably done only once, at “the end of the world”, e.g. in the main() method.
Notice that thanks to the additional level of indirection, there’s no transaction context to propagate! We compose descriptions of operations first, and pass them to .transact when we want to run them.
So far we’ve run a single query, which is not very impressive, and not all that useful. The main power of ConnectionIO descriptions is that they compose. That is, we can take two such descriptions and create a value which describes running them sequentially. This is done using flatMap:
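A sketch of such a composition; the table, the `Person` class and the method names are hypothetical:

```scala
import doobie._
import doobie.implicits._

object ComposeQueries {
  // Hypothetical domain class, for illustration only.
  case class Person(id: Long, name: String)

  // Describes an insert; the Int is the number of affected rows.
  def insertPerson(name: String): ConnectionIO[Int] =
    sql"insert into persons (name) values ($name)".update.run

  val findAllPersons: ConnectionIO[List[Person]] =
    sql"select id, name from persons".query[Person].to[List]

  // Describes running the insert first and the query second.
  def insertAndList(name: String): ConnectionIO[List[Person]] =
    insertPerson(name).flatMap(_ => findAllPersons)
}
```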
Note that the second query may (but doesn’t have to) rely on the results of the first (here we ignore the result of the insert: the number of rows that have been affected). What happens when we compose two ConnectionIOs and interpret them using .transact(Transactor)? They’ll both be run inside a single transaction!
If you don’t want to use a concrete implementation of the database-related effects (ConnectionIO, DBIOAction) in your code, you can always create your own type alias (e.g. type Transactional[T] = ConnectionIO[T]), and use that instead.
Because ConnectionIOs are values: immutable data classes, they can be freely re-used and shared among multiple threads or invocations. There are no frameworks that need to instrument the code, we can create these query descriptions anywhere. We get a lot of freedom, flexibility, and foremost abstraction possibilities when defining transaction-related code. For example, if there’s a code fragment that yields a ConnectionIO value appearing in the codebase twice, we can simply extract it to a constant, or a parametrised method.
Using the mechanism described above we can be absolutely precise as to when transactions begin and commit. As long as you have a value of type ConnectionIO[T], you can be sure you are dealing with a single, or a sequence of database queries, which have yet to be composed into a transactional unit. When you see .transact(Transactor), that’s the place where the transaction will begin, execute the described queries and commit.
Other operations might be run inside the transaction (even encapsulating calls to .transact— more on that later), but you can be sure that the transaction will happen precisely as is visible in code.
We can reason locally as to where the transaction boundaries are, without having to know where a given code fragment is used or called from.
Combining with other side-effects
When determining the boundaries of a transaction, an important aspect is deciding where other side effects should happen. For example, when registering a user, we might have two side effects:
- inserting a row to the database
- sending a welcome email.
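As email is not a transactional system, the two can’t be run together atomically; both can fail independently. We have to be prepared that one will fail, and the other succeed. There are a few possibilities as to how we can sequence these operations:
- (1) insert the user and then send the email, both inside a single transaction;
- (2) insert the user in a transaction, and send the email after the transaction commits;
- (3) send the email first, then insert the user in a transaction;
- (4) send the email and then insert the user, both inside a single transaction.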
All of these have different characteristics and choosing one is ultimately a business decision. (1) minimises the risk of inserting a user when sending the email fails, but it’s still possible that we send the email, and not store user information; moreover, this is a resource-costly solution: the database connection is blocked for the whole duration of sending the email.
(2) has a risk of storing user information, not sending the welcome email, but is more resource-friendly than (1).
(3) always sends the email, making sure we engage with the user, but has the risk of not storing user information (even in a simple case, if there’s a user with the same information in the database already). (4) is the same as (3) but unnecessarily allocates a database connection.
Whichever route you choose, the good news is, all of the above choices are expressible using our type-safe approach! We’re missing one piece, though. We need a way to “lift” an arbitrary side-effect into the scope of a transaction. That is, we need a way to wrap an IO[T] into a ConnectionIO[T]. Luckily that’s possible using the IO.to[ConnectionIO] method.
This might be a bit puzzling at first, so let’s take a look at a couple of examples.
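Here is a minimal sketch of lifting. It assumes a Doobie version built on cats-effect 2 (such as the 0.8 line), where importing `doobie.implicits._` brings in the instance that `to[ConnectionIO]` needs; newer versions may require a different lifting mechanism. The Transactor is again passed in as a parameter:

```scala
import cats.effect.IO
import doobie._
import doobie.implicits._

object LiftSideEffect {
  // A description of a side effect that has nothing to do with the database.
  val sideEffect: IO[Unit] = IO(println("Hello, world!"))

  // The same side effect, lifted so that it can be sequenced inside a transaction.
  val lifted: ConnectionIO[Unit] = sideEffect.to[ConnectionIO]

  // Running this IO wraps the println in a transaction.
  def result(xa: Transactor[IO]): IO[Unit] = lifted.transact(xa)
}
```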
We first define a side-effect (printing to the console). We then lift it to a ConnectionIO, and interpret it inside a transaction. Notice that the result isn’t equivalent to the original side-effect: when running result, we’ll:
- (a) start a transaction, allocating a connection;
- (b) run the side effect: print to the console;
- (c) commit the transaction
What about our email examples? (1) would look as follows: we create a description of database operations, which is a sequence of running a query, and then a lifted side-effect. This is run in a transaction:
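A sketch of variant (1), under the same cats-effect 2 assumption as for lifting. The `users` table, `insertUser` and `sendWelcomeEmail` are hypothetical, and the email sender is a stand-in that only prints to the console:

```scala
import cats.effect.IO
import doobie._
import doobie.implicits._

object RegisterInOneTransaction {
  def insertUser(email: String): ConnectionIO[Int] =
    sql"insert into users (email) values ($email)".update.run

  // Stand-in for a real mail client.
  def sendWelcomeEmail(email: String): IO[Unit] =
    IO(println(s"Sending welcome email to $email"))

  def register(email: String, xa: Transactor[IO]): IO[Unit] = {
    // Insert first, then the lifted email effect, all as one database description.
    val inTransaction: ConnectionIO[Unit] = for {
      _ <- insertUser(email)
      _ <- sendWelcomeEmail(email).to[ConnectionIO]
    } yield ()

    // Both steps run between the transaction's begin and commit.
    inTransaction.transact(xa)
  }
}
```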
(2) and (3) are similar. We no longer lift the side-effect into a ConnectionIO, as we don’t want to run it inside a transaction. Depending on how we sequence the IO values, we get either (2) or (3).
Note that the order in which we create the descriptions is not important; it’s completely arbitrary, the descriptions can be created anywhere. What’s important is the final sequencing of the operations:
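A sketch of both orderings, with the email kept outside the transaction. As before, the `users` table and the helper names are hypothetical, and the email sender only prints to the console:

```scala
import cats.effect.IO
import doobie._
import doobie.implicits._

object RegisterOutsideTransaction {
  def insertUser(email: String): ConnectionIO[Int] =
    sql"insert into users (email) values ($email)".update.run

  // Stand-in for a real mail client.
  def sendWelcomeEmail(email: String): IO[Unit] =
    IO(println(s"Sending welcome email to $email"))

  // The insert runs and commits in its own transaction, then the email is sent.
  def insertThenEmail(email: String, xa: Transactor[IO]): IO[Unit] =
    insertUser(email).transact(xa).flatMap(_ => sendWelcomeEmail(email))

  // The email is sent first, then the insert runs in its own transaction.
  def emailThenInsert(email: String, xa: Transactor[IO]): IO[Unit] =
    sendWelcomeEmail(email).flatMap(_ => insertUser(email).transact(xa)).map(_ => ())
}
```

In both methods the only thing that differs is the order of the flatMap chain; the two building blocks are shared.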
We did some type gymnastics, but what did we gain? Clarity, precision and full control over how side-effects happen: reading the code, we know exactly what will happen and in what order. How the code will be run, and from what context, does not affect its behaviour.
Going further, we might end up nesting the two types we have mentioned: ConnectionIO and IO. Let’s look at the possibilities.
First, we can have a value of type IO[ConnectionIO[T]]. What does it represent? Well, reading the types: a description of a side-effecting computation, which as a result returns the description of a database query, which results in a value of type T.
For example, this might be an HTTP service call (a side-effect); depending on the results of the call, we construct a query (but only construct, without running it!):
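A sketch of this shape. The HTTP call is replaced by a stand-in IO returning a constant, and `fetchMinimumId`, `Person` and the query are hypothetical:

```scala
import cats.effect.IO
import doobie._
import doobie.implicits._

object QueryFromServiceCall {
  // Hypothetical domain class, for illustration only.
  case class Person(id: Long, name: String)

  // Stand-in for an HTTP service call; a real client would also return an IO.
  val fetchMinimumId: IO[Long] = IO(42L)

  // Running the outer IO performs the call and yields a query description;
  // the query itself is not run.
  val queryDescription: IO[ConnectionIO[List[Person]]] =
    fetchMinimumId.map { minId =>
      sql"select id, name from persons where id >= $minId".query[Person].to[List]
    }
}
```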
We can then compose this result with other queries, and ultimately run them in a transaction, flattening the side effects using flatMap:
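A sketch of that composition, assuming Scala 2.13 and the cats syntax import for `>>` and `flatten`. The `audit_log` table, `logAccess` and the stand-in service call are hypothetical:

```scala
import cats.effect.IO
import cats.implicits._
import doobie._
import doobie.implicits._

object ComposeAndFlatten {
  // Hypothetical domain class, for illustration only.
  case class Person(id: Long, name: String)

  // Stand-in for an HTTP service call.
  val fetchMinimumId: IO[Long] = IO(42L)

  val findPersons: IO[ConnectionIO[List[Person]]] =
    fetchMinimumId.map { minId =>
      sql"select id, name from persons where id >= $minId".query[Person].to[List]
    }

  // Another query to be run in the same transaction.
  val logAccess: ConnectionIO[Int] =
    sql"insert into audit_log (event) values ('persons_read')".update.run

  def run(xa: Transactor[IO]): IO[List[Person]] =
    findPersons
      .map(query => (logAccess >> query).transact(xa)) // IO[IO[List[Person]]]
      .flatten                                         // IO[List[Person]]
}
```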
(a >> b is a shorthand notation for a.flatMap(_ => b), and map+flatten is equivalent to flatMap).
We can also have a type going the other way round: ConnectionIO[IO[T]]. What would that be? Again, reading the types: a description of a query, which results in a description of a side effect, which results in a value of type T.
For example, we might read data from a database, and depending on the result, create a description of a side-effect. Notice, however, that this side-effect doesn’t necessarily have to run inside the transaction; we only create the description inside the transaction. What we do with the result — when, and if we use it — is up to the calling code:
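A sketch of this shape; the threshold, `notifyAdmin` and the query are hypothetical. Here the calling code chooses to run the resulting effect after the transaction has committed:

```scala
import cats.effect.IO
import doobie._
import doobie.implicits._

object EffectFromQuery {
  val countPersons: ConnectionIO[Int] =
    sql"select count(*) from persons".query[Int].unique

  // Stand-in for a real notification mechanism.
  def notifyAdmin(count: Int): IO[Unit] =
    IO(println(s"Person count is high: $count"))

  // Inside the transaction we only decide which effect to describe.
  val decide: ConnectionIO[IO[Unit]] = countPersons.map { count =>
    if (count > 1000) notifyAdmin(count) else IO.unit
  }

  // The transaction runs and commits; only then is the chosen effect run.
  def run(xa: Transactor[IO]): IO[Unit] =
    decide.transact(xa).flatMap(effect => effect)
}
```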
Considering the problem of demarcating transaction boundaries, we first looked at @Transactional—a popular, declarative, but imprecise and non-composable solution. Next, we inspected an alternative: working with ConnectionIO and IO values, which are descriptions of side-effecting operations.
Transactions provide important business properties for the system. Hence, we are not attempting to hide that aspect of our code. Instead of relying on implicit propagation of the transactional context, we make this concept explicit.
Apart from the benefits of working with immutable data structures (which work great in a multi-threaded environment) and of using libraries rather than frameworks, we are also able to understand what a given fragment of code is doing without having to know the context from which it is called. We can clearly communicate our intent as to where transaction boundaries are, and how they should interact with other side-effects.
One might say that using standards such as JavaEE or almost-standards such as Spring is safer, as libraries have a higher risk of becoming abandoned. That’s true, but on the other hand, the cost of swapping a library is small, as we are dealing with loosely coupled dependencies, not an all-encompassing framework.
Additionally, we get quick feedback from the compiler, in case we use a value of an incorrect type, such as incorrect ConnectionIO and IO usage.
The types take a bit of getting used to. But once you come to terms with the idea of separating description and interpretation, by looking at the type of a method or value, you can quite precisely determine what kind of effects it has: should it be part of a transaction, or not? Does the method have side effects?
How does it work in practice? Nothing beats looking at real code! For that purpose, you might want to take a look at the Bootzooka project. It’s a template Scala microservice/web application, which uses the approach described above to access the database.