Union Types in the Practical Type System (PTS)

First Published



Christian Neumanns


Tristano Ajmone


CC BY-ND 4.0


This is part 4 in a series of articles titled How to Design a Practical Type System to Maximize Reliability, Maintainability, and Productivity in Software Development Projects.

It is recommended (but not required for experienced programmers) to read the articles in their order of publication, starting with Part 1: What? Why? How?.

Please be aware that PTS is a new, not-yet-implemented paradigm. As explained in section History of the article Essence and Foundation of the Practical Type System (PTS), PTS has been implemented in a proof-of-concept project, but a public PTS implementation isn't available yet — you can't try out the PTS source code examples shown in this article.

For a quick summary of previous articles you can read Summary of the Practical Type System (PTS) Article Series.

Union type example


Union types (aka sum types, variants, choice types) are a prime feature in a modern type system.

This article explains why union types are essential, how they are supported in PTS, and which benefits they offer.

As you'll see, union types are surprisingly useful and versatile, despite their simplicity. For example, they provide an elegant foundation for two critical, recurring, but often problematic aspects of software development: null- and error-handling.

Why Do We Need Union Types?

Consider a function that reads text stored in a file. The function takes a file path as input and returns one of the following:

  • a string representing the text in the file

  • null if the file is empty

  • an error if the file doesn't exist or if there was any other I/O error


In most software libraries, a function like this would return an empty string if the text file is empty.

However, our example function does not return an empty string if the file is empty; it returns null, for reasons explained in a subsequent article.

The above specification evokes a compelling question: How should the three output alternatives ("text", "no text", and "error") be expressed in the function signature?

Let's see!

Existing Solutions

To get an overview of different approaches used in popular programming languages, let's have a look at the signature of this function in JavaScript, Java, Kotlin, and Rust.


Readers only interested in the PTS solution can skip the following sections.


Here's the JavaScript code of our example function:

function readTextFile ( filePath ) {
    return "dummy";
}


We're not interested in the function body, just its signature — that's why "dummy" is returned.

In JavaScript, the return type of a function is not specified in the function signature.

Every JavaScript function can return anything (including null and undefined).

Section Return value on MDN states:

By default, if a function's execution doesn't end at a return statement, or if the return keyword doesn't have an expression after it, then the return value is undefined. The return statement allows you to return an arbitrary value from the function.

This means that the only reliable way to know what the function returns is to look at its body — if we have access to it. Worse, if the function calls other functions, we might also need to inspect the body of all these other functions involved in its call tree.

If we're lucky, the developer(s) left behind a comment annotating the function and its return type. Furthermore, if the function underwent changes later on, the developer(s) hopefully were kind enough to also update its comment.

If the return type is changed later on, and we (or other developers) forget to update one or more function calls, then the application is at risk of breaking in undefined and unanticipated ways.

Surely, we want a better solution.

Let's move on.


Our example function looks like this in Java:

    public static String readTextFile ( Path filePath ) throws IOException {
        return "dummy";
    }

The method clearly states that it returns a String or throws an IOException. However, we don't know if null might be returned, because reference types in Java are all nullable.


To state that the function might return null (in case of an empty file), we could add a Nullable annotation (metadata added to Java source code):

    public static @Nullable String readTextFile2 ( Path filePath ) throws IOException {
        return "dummy";
    }

However, Nullable is a non-standard Java annotation. We need to create it ourselves, or use a third-party library that provides it. Therefore a Nullable annotation results in non-idiomatic Java code.

Moreover, the Java compiler doesn't take this annotation into consideration, and doesn't check for potential null pointer errors (because Java is not a null-safe language). There are, however, very useful tools and IDE plugins that report potential null pointer errors by leveraging annotations.

Unfortunately, we would be using three different techniques now for the three outcomes:

  • a return type for String

  • an annotation for null

  • an exception for IOException

What's more, the Nullable annotation is just a workaround for an important concept (i.e. the absence of a value) that should be supported natively in the language.

Conclusion: A function signature in idiomatic Java doesn't tell us if null might be returned.


This is the code written in Kotlin, a modern JVM language:

    fun readTextFile(filePath: Path): String? {
        return "dummy"
    }

While reference types in Java are nullable, they are non-null in Kotlin. The ? suffix after a type name must be used to state that a type is nullable. Hence the String? return type clearly states that the function returns a String object or null. Moreover, Kotlin is null-safe (no null-pointer errors in idiomatic Kotlin code), which is possibly the main reason why some people prefer Kotlin over Java.

Java uses checked exceptions for anticipated runtime errors. On the other hand, Kotlin doesn't support checked exceptions — it only supports unchecked exceptions (for a quick explanation of the differences, read What are checked vs. unchecked exceptions in Java?). Therefore, a Kotlin function signature doesn't tell us if an exception might be thrown. To understand why the creators of Kotlin decided against checked exceptions, you can read section Checked exceptions in the official Kotlin documentation.


Kotlin provides a Throws annotation in its standard library to state exceptions that might be thrown:

    @Throws(IOException::class)
    fun readTextFile2(filePath: Path): String? {
        return "dummy"
    }

However, this annotation targets the Java environment and is not used in idiomatic Kotlin code. The official Kotlin documentation states: "This annotation indicates what exceptions should be declared by a function when compiled to a JVM method."

Conclusion: A function signature in idiomatic Kotlin doesn't tell us if a function call might fail.


The same is true in C# and a few other languages: exceptions are not declared in function signatures.


In Rust, our function looks like this:

fn read_text_file(_file_path: String) -> Result<Option<String>, io::Error> {
    Ok(Some(String::from("dummy")))
}

Rust doesn't support null. To handle the absence of a value, Rust uses the Option type. We can think of an Option instance as a container that is either empty or contains a value: it is either an instance of Some, containing a value, or an instance of None, which has no content. A few other languages adopt a similar concept: for example, F# uses an Option monad, and Haskell uses a Maybe monad.

Furthermore, Rust doesn't throw exceptions if functions fail. The Rust Programming Language states:

Rust doesn't have exceptions. Instead, it has the type Result<T, E> for recoverable errors and the panic! macro that stops execution when the program encounters an unrecoverable error.

Thus, the function returns a Result type, which is either an instance of Ok, containing a valid return value, or an instance of Err containing an error object. This construct is similar to the Result monad in F#, or the Either monad in Haskell and other languages.

It's nice to see that:

  • All three outcomes are clearly stated in the function signature: the function returns Some, None, or Err.

  • A single technique is used for all outcomes: a return type (Result<Option<String>, io::Error>) that expresses the three alternatives.


The following table summarizes the outcomes stated in the function signatures:


Language     Text    No text (empty file)    Error
JavaScript   no      no                      no
Java         yes     no                      yes
Kotlin       yes     yes                     no
Rust         yes     yes                     yes






Rust is clearly the winner, because all three possible outcomes are covered in its function signature.

The Rust compiler also ensures that all three outcomes are being handled by the code that calls the function. If we forget to handle a case, the compiler gently reminds us to do so. Even better, if the return type of the function is changed later on, the compiler also checks that all function calls are updated accordingly. The advantages are obvious: more reliable and maintainable code; less time wasted finding and fixing bugs.

Here is an example of calling the function in Rust, using a match expression to handle the three outcomes:

    match read_text_file(String::from("file.txt")) {
        Ok(Some(string)) => println! ( "{}", string ),
        Ok(None) => println! ( "Empty" ),
        Err(_) => println! ( "Error" ),
    }

A Better Solution

Rust Option and Result types are wrappers. Instances of these types contain a value — except None (an instance of Option), which doesn't contain a value. For example, if the function returns a string (the most common case for this function), then the string is wrapped in an Option instance, which is itself wrapped in a Result instance. The wrapping becomes obvious when we look at the dummy body of the function: Ok(Some(String::from("dummy"))). We can't simply write "dummy", or return "dummy", as in other languages.


The need to write String::from("dummy") instead of just "dummy" is irrelevant to the topic at hand (but you can read Rust - String for an explanation).

Looking at these examples in different languages raises the question: Why can't we simply state what we want, i.e. a function that returns a string or null or an error?

Well, we could, if the type system supported union types.

And that's one of the reasons why union types are supported in PTS. Here's a preview of the function in PTS:

fn read_text_file ( file file_path ) -> string or null or file_error
    return "dummy"

The output type string or null or file_error is a key point. Null-handling (or, more generally, handling the absence of a value) and error-handling are both crucial aspects in pretty much all software development projects. And now we have a straightforward and elegant solution that utilizes a single, simple concept (union types) for both aspects. This is a solid foundation that allows us to simplify null- and error-handling, ultimately leading to increased reliability, maintainability, and productivity.

What's more, union types have other interesting use cases, as we'll see soon.


Given the preponderance of null- and error-handling, these topics will be covered extensively in the next two PTS articles.

How Does It Work?

In this section we'll explore PTS union types, and illustrate each point via simple source code examples.


The basic idea of union types (aka sum types, variants, choice types) is roughly the same in different programming languages. However, the implementation and usage of union types vary widely. The following description only applies to union types in PTS.

Basic Idea

In a statically typed programming language, an object reference (variable, input parameter, etc.) is restricted to a single type. For example, an input parameter called name might be declared to be of type string.

The basic idea of a union type is amazingly simple: Instead of restricting an object reference to a single type, any type among a defined set of types is allowed — type_1 or type_2 or type_3 or .... We are all familiar with this concept in daily life: Your birthday present will be a dog, a bicycle, or a violin (i.e. a dog or a bicycle or a violin). Next weekend we'll go to the beach, the mountains, or the city.

Here is an example of a PTS function that uses union types:

fn foo ( item string or character or number ) -> boolean or null or error
    // function body

This function has a single input parameter named item, whose type can be string, character, or number. Hence, the following function calls are all valid: foo ( "abc" ), foo ( 'a' ), and foo ( 123 ).

Moreover, the function can return a boolean, null, or an error. Thus, the following statements in the function body are all valid: return true, return null, and return error.create(...). In a subsequent section we'll see how to handle the returned value in the code that called the function.

The individual types declared in a union type are its member types. Thus, union type string or character or number has three member types: string, character, and number.


A PTS union type is conceptually similar to a sum type, and a PTS record type is similar to a product type. The terms sum and product (predominant in functional programming) are rooted in the cardinality of these types. (Remember: the cardinality of a type is the number of allowed values.)

The cardinality of a union type is the sum of the cardinalities of its member types. For example, type boolean or null has a cardinality of 3, because type boolean has a cardinality of 2 (true, false), and type null has a cardinality of 1 (null). Hence, boolean or null has a cardinality of 2 + 1 = 3.

The cardinality of a record type is the product of the cardinalities of its attribute/field types. For example, a record type with an attribute of type boolean or null and another attribute of type boolean has a cardinality of 3 * 2 = 6.
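The arithmetic can be checked by simply listing all values. The following sketch is written in TypeScript (not PTS), where a union is spelled with | instead of or. The names booleanOrNull, booleans, and records are made up for this illustration.

```typescript
// All values of 'boolean or null' (boolean | null in TypeScript).
const booleanOrNull: Array<boolean | null> = [true, false, null];

// All values of 'boolean'.
const booleans: boolean[] = [true, false];

// A record (product) type combines one value of each attribute type.
const records: Array<{ first: boolean | null; second: boolean }> = [];
for (const first of booleanOrNull) {
    for (const second of booleans) {
        records.push({ first, second });
    }
}

console.log(booleanOrNull.length); // sum:     2 + 1 = 3
console.log(records.length);       // product: 3 * 2 = 6
```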

PTS adopts the term union, borrowed from set theory, a branch of mathematical logic.

Helpful Syntax Constructs

PTS provides three syntax constructs to handle union types in source code:

  • operator is, to check the type of a value

  • a case type of statement, to execute code that depends on the type of a given value

  • a case type of expression, to compute a value that depends on the type of another value

Let's look at examples.

Operator is

In section A Better Solution we introduced the following function which returns a string, null, or a file_error:

fn read_text_file ( file file_path ) -> string or null or file_error
    // function body

After calling read_text_file, we first need to check the type returned by this function, and then execute code that depends on this type.

To check the type of a value, PTS provides the infix operator is. The syntax is:

<expression> "is" <type>

Operator is evaluates to a boolean value which is true if the type of the expression on the left-hand side is equal to the type specified on the right-hand side.

Suppose we call read_text_file, and store its returned value in constant result:

const result string or null or file_error = read_text_file ( file_path.create ( "example.txt" ) )

We can then use the expression result is string to check whether the function returned a value of type string.

The is operator can be used in a classic if then else statement to handle the returned value:

const result = read_text_file ( file_path.create ( "example.txt" ) )
if result is string then
    write_line ( "Content of file:" )
    write_line ( result )
else if result is null then
    write_line ( "The file is empty." )
else
    write_line ( """The following error occurred: {{result.message}}""" )

In this code, we used operator is to check the type of the value stored in constant result. For example, result is string evaluates to true if the function returns a string.

Note how the above code benefits from two helpful features:

  • Type inference

    The type of constant result is inferred by the compiler to be string or null or file_error, since that's the return type of function read_text_file.

  • Flow-sensitive typing

    Within the three if branches, the compiler adapts the type of result as follows:

    • In the first then branch, the compiler deduces the type of result to be string, because this branch is only executed if result is string evaluates to true.

    • In the second branch (result is null), the type of result is deduced to be null.

      Thus a compile-time error would occur if we accidentally used an expression like result.message in this branch.

    • In the final else branch, the type of result is deduced to be file_error, because this is the remaining member type not yet covered in the previous branches.

      And that's the reason why we can write result.message without first casting result to type file_error. The expression result.message is valid because here result is guaranteed to be of type file_error, and message is an attribute defined in type file_error (inherited from type error, as will be explained in a subsequent article).

    Flow-sensitive typing (also called flow typing or occurrence typing) is practical because it allows us to write succinct code that remains type-safe. We'll see more examples in subsequent articles.
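Since PTS code can't be run yet, here is a rough analogy in TypeScript, a language that also offers type inference and flow-sensitive typing for union types (written with | instead of or). This is only a sketch under assumptions: the class TextFileError stands in for file_error, readTextFile is a dummy that looks at made-up file names instead of doing real I/O, and typeof and === replace the PTS operator is.

```typescript
// Hypothetical stand-in for the PTS type 'file_error'.
class TextFileError {
    constructor(readonly message: string) {}
}

// Dummy implementation: the outcome depends only on the (made-up) file name.
function readTextFile(filePath: string): string | null | TextFileError {
    if (filePath === "example.txt") return "Hello";
    if (filePath === "empty.txt") return null;
    return new TextFileError("File not found: " + filePath);
}

function describeFile(filePath: string): string {
    // Type inference: 'result' is string | null | TextFileError.
    const result = readTextFile(filePath);
    if (typeof result === "string") {
        // Here the compiler treats 'result' as a string.
        return "Content of file: " + result;
    } else if (result === null) {
        // Here 'result' is null; 'result.message' would be rejected.
        return "The file is empty.";
    } else {
        // Only TextFileError remains, so no cast is needed.
        return "The following error occurred: " + result.message;
    }
}

console.log(describeFile("example.txt"));
console.log(describeFile("empty.txt"));
console.log(describeFile("missing.txt"));
```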


Type inference should not be overused, because there is a risk of hiding information that would be useful to keep in the source code, especially for developers who didn't write the code but need to understand and maintain it (e.g. in case of open source libraries, large code bases, etc.).

Obviously, a statement like this:

const name string = "Albert"

... can be shortened to:

const name = "Albert"

... without reducing readability.

However, look at this code:

const price = get_product_price ( "123" )

What does the function return? An integer? A decimal? Something else? We simply can't know by just looking at this line of code. Ambiguities like this disappear, and readability increases if the type of price is stated explicitly:

const price money_amount or inexistent_product_id_error = get_product_price ( "123" )

Yes, the code is more verbose — but it's also more expressive. Now it clearly states that get_product_price returns the union type money_amount or inexistent_product_id_error.

case type of Statement

Instead of using the is operator in an if then else statement for conditional execution, there is a much better way to execute type-dependent code: pattern matching.

The idiomatic way to check the type returned by a function is to use pattern matching via a case type of statement, as follows:

case type of read_text_file ( file_path.create ( "example.txt" ) )
    is string as text // the string is stored in constant 'text'
        write_line ( "Content of file:" )
        write_line ( text ) // the previously defined constant 'text' is now used
    is null
        write_line ( "The file is empty." )
    is file_error as error
        write_line ( """The following error occurred: {{error.message}}""" )

While the above code is semantically equivalent to the previous version that uses an if then else statement, it has the following advantages:

  • The code is shorter and easier to read.

  • The compiler ensures that all members of the union type are covered in the branches of the case type of statement: Leaving out any of the three is branches of the above example results in a compile-time error.

    This feature is invaluable also when working with third party libraries. Furthermore, if the members of a union type change later on (e.g. a member is added or removed), the compiler ensures that all case type of statements have been updated in the code — a most welcome aid in complex, multi-developer projects.

  • The compiler can optimize the generated binary code to render it smaller and faster (depending on implementation details not covered here).
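The exhaustiveness check can be approximated in TypeScript, although with more ceremony than in PTS. TypeScript has no case type of statement, so the sketch below uses a chain of checks followed by a call to assertNever, a helper name chosen for this example. TextFileError is a hypothetical stand-in for file_error.

```typescript
// Hypothetical stand-in for the PTS type 'file_error'.
class TextFileError {
    constructor(readonly message: string) {}
}

// Accepts only values of type 'never', i.e. values that cannot exist.
function assertNever(value: never): never {
    throw new Error("Unhandled member type: " + String(value));
}

function handleResult(result: string | null | TextFileError): string {
    if (typeof result === "string") return "Content of file: " + result;
    if (result === null) return "The file is empty.";
    if (result instanceof TextFileError) return "Error: " + result.message;
    // With strict null checks enabled, removing any of the three checks
    // above turns this call into a compile-time error.
    return assertNever(result);
}

console.log(handleResult("Hello"));
console.log(handleResult(null));
console.log(handleResult(new TextFileError("disk failure")));
```

In PTS the same guarantee comes for free, because the compiler checks every case type of statement without an extra helper.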

Instead of handling each member type individually (as shown above), sometimes only a few member types need individual handling, while all remaining member types can be handled in the same way. In such situations, the last branch of a case type of statement can be an otherwise branch, which covers all remaining member types not yet handled in preceding is branches:

case type of read_text_file ( file_path.create ( "example.txt" ) )
    is file_error
        write_line ( "An error occurred!" )
    otherwise
        write_line ( "Ok" )

In the above code, member type file_error is handled individually, and member types string and null are handled the same way in the otherwise branch.

Instead of using an otherwise branch, a better approach is to use a union type in an is branch:

case type of read_text_file ( file_path.create ( "example.txt" ) )
    is file_error
        write_line ( "An error occurred!" )
    is string or null
        write_line ( "Ok" )

An advantage of this style is that the code explicitly and reliably mentions all types possibly returned by read_text_file (which is not the case if an otherwise branch is used).

Plus, if the members of the union type change later on, the compiler reminds us to adapt any case type of statement if we forget to do so. This eliminates the risk of handling a new member type in the otherwise branch, when it actually needs to be handled individually.

As a general rule, the otherwise branch should be used sparingly — we should think twice and anticipate potential maintenance problems.

case type of Expression

Besides a case type of statement there is also a case type of expression available:

const message = case type of read_text_file ( file_path.create ( "example.txt" ) )
    is string: "a string"
    is null: "null"
    is error: "an error"
write_line ( "The result is " + message )

In this code, the value "a string", "null", or "an error" is assigned to constant message (inferred to be of type string), depending on the type returned by function read_text_file.

if type of Expression

PTS also provides an if type of expression that can be used as follows:

const message = if type of read_text_file ( file_path.create ( "example.txt" ) ) is string \
    then "a string" \
    else "null or an error"
write_line ( "The result is " + message )

Behind the Scenes

Hopefully, the previous sections demonstrated that union types are simple to understand and easy to use.

That doesn't mean, however, that the compiler has an easy job too. On the contrary, the compiler needs to ensure type compatibility, infer types, deduce types in the branches of control flow statements, take into account type inheritance and type parameters, etc.

The compiler must prevent any misuses of types, and display helpful error messages whenever rules are violated.


Readers not interested in this excursion can skip this section.

In the previous section we already saw how the compiler ensures that all members of a union type are covered in pattern matching statements and expressions.

Now let's have a quick look at a few additional compiler tasks related to union types.

In the following examples we assume that types fruit and vegetable are child-types of product.

Union Type Declaration

The compiler checks the coherence of members declared in a union type. Suppose we declare union type product or fruit or null. This declaration is invalid and reported by a comprehensible error message like this:

Union type 'product or fruit or null' is invalid.
Reason: 'fruit' is a child-type of 'product', and therefore 'fruit' is already covered by member 'product' in the union type 'product or fruit or null'.
Possible solution: Change to 'product or null'.

Type Compatibility Checks

Type string is compatible with type string or null.

But the inverse is not true — string or null is not compatible with string, because null is valid for type string or null, but not for type string.

More generally:

  • T is always compatible with T or null.

  • T or null is never compatible with T.
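Both rules can be observed in TypeScript, assuming its strict null checks are enabled. The functions findNickname, shout, and shoutNickname below are invented for this sketch, and the rejected call is left commented out so that the example still compiles.

```typescript
// Hypothetical lookup that may find nothing.
function findNickname(userId: number): string | null {
    return userId === 1 ? "Al" : null;
}

// Accepts a string only; null is not allowed here.
function shout(text: string): string {
    return text.toUpperCase() + "!";
}

// T is compatible with T or null:
// a plain string can be stored where string | null is expected.
const nickname: string | null = "Bob";
console.log(nickname);

// T or null is not compatible with T:
// shout ( findNickname ( 2 ) )    <- compile-time error

// After a null check the value is known to be a string, so the call is fine.
function shoutNickname(userId: number): string {
    const found = findNickname(userId);
    return found === null ? "(no nickname)" : shout(found);
}

console.log(shoutNickname(1));
console.log(shoutNickname(2));
```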

Type compatibility checks can get complex when several factors need to be taken into account.

For example, fruit or vegetable or null is compatible with product or null. However, product or null is compatible with fruit or vegetable or null only if the following two conditions are fulfilled:

  • product is an abstract type, which means that no instances of product can be created — i.e. only instances of child-types are allowed.

  • fruit and vegetable are the only direct child-types of product.


The above two conditions would be defined as follows in PTS code:

type product \
    factories: none \
    child_types: fruit, vegetable

    // more code

Type Inference

Suppose that:

  • Function foo returns string.

  • Function bar returns number or null.

Now consider the following code that uses an if then else expression:

const c = if condition then foo else bar

In this case the compiler infers the type of constant c to be string or number or null — the compiler merges the possible return types of foo and bar.
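TypeScript merges branch types in a comparable way, which makes the rule easy to try out. In the sketch below, foo and bar mirror the functions assumed above, while pick and its dummy return values are made up for the illustration.

```typescript
function foo(): string {
    return "abc";
}

function bar(): number | null {
    return null;
}

// No return type is declared: the compiler infers string | number | null
// by merging the types of both branches.
function pick(condition: boolean) {
    return condition ? foo() : bar();
}

// The annotation spells out the merged union.
const c: string | number | null = pick(true);
console.log(c);

// A narrower annotation is rejected under strict null checks:
// const d: string | number = pick(true)    <- compile-time error
```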

Now suppose that:

  • foo returns product or null.

  • bar returns fruit or vegetable.

The expression if condition then foo else bar is then inferred to be of type product or null, because fruit and vegetable are covered already by product. The compiler first merges the output types of foo and bar to product or null or fruit or vegetable, and then normalizes the result to product or null.

Further Examples and Benefits

Besides being essential for null- and error-handling, union types have other interesting use cases. For instance, they can help to simplify APIs, provide type safety that couldn't be achieved without union types, and simplify eager/lazy evaluation.

Let's look at a few examples.

Simpler APIs

Let's say we have a function that checks text. It should be possible to provide the text directly as a string or indirectly via a file path or a URL pointing to text content. Here is the function signature:

fn check_text ( source string or file_path or URL ) -> text_error or null
    // function body

If union types weren't supported for input parameters, we would need three functions to cover the three types for input parameter source:

fn check_text ( text string ) -> text_error or null
    // function body

fn check_text_file ( file_path file_path ) -> text_error or null
    // function body

fn check_text_URL ( URL URL ) -> text_error or null
    // function body


Various languages support function overloading (e.g. C++, C#, and Java), which allows all three functions to have the same name, differing only in their parameter type.
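To make the single-function variant more tangible, here is how it might look in TypeScript. Everything in this sketch is an assumption made for illustration: the classes FilePath, WebUrl, and TextError, the loader stubs (which return placeholder text instead of reading anything), and the rule that empty text counts as an error.

```typescript
// Hypothetical stand-ins for the PTS types file_path, URL, and text_error.
class FilePath { constructor(readonly path: string) {} }
class WebUrl { constructor(readonly address: string) {} }
class TextError { constructor(readonly message: string) {} }

// Loader stubs: a real version would read the file or fetch the URL.
function loadFile(file: FilePath): string { return "text from " + file.path; }
function loadUrl(url: WebUrl): string { return "text from " + url.address; }

// One function covers all three kinds of input.
function checkText(source: string | FilePath | WebUrl): TextError | null {
    let text: string;
    if (typeof source === "string") {
        text = source;
    } else if (source instanceof FilePath) {
        text = loadFile(source);
    } else {
        text = loadUrl(source); // only WebUrl remains
    }
    // Made-up rule for the example: empty text is reported as an error.
    return text.trim() === "" ? new TextError("Text is empty") : null;
}

console.log(checkText("hello"));
console.log(checkText(new FilePath("notes.txt")));
console.log(checkText("   "));
```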

Union Types in Record Types

Since union types can be used wherever single (non-union) types can be used, they can also be used for attributes in record types. Here is a record type with two attributes using union types:

record type text_source
    att name string or null default:null
    att source string or file_path or URL

Now we could improve function check_text to also accept a text_source record as input:

fn check_text ( source string or file_path or URL or text_source ) -> text_error or null


Union types should not be overused, because they require type-dependent code (e.g. case type of statements) in function bodies. Besides increasing cyclomatic complexity, there is also a tiny performance penalty involved, which could be an issue in performance-critical parts of an application.

Hence, instead of having a single function check_text that accepts four types as its input parameter (string or file_path or URL or text_source), it might be better (depending on the context) to use four individual functions to cover each case individually.


Suppose we need a type-safe list that contains only strings and characters, such as: ["abc", 'a', "hi", '!']. In this context, type-safe means that only objects of type string or character can be added to the list and retrieved from it. A compile-time error occurs whenever some code violates this rule.

Without union types we typically have two options:

  • We use a heterogeneous list (e.g. List<Object> in Java).

  • We create a special class with at least:

    • a method to add a string

    • another method to add a character

    • an iterator that returns the lowest common parent type for string and character (if such a type exists), otherwise the root type in the type hierarchy (e.g. Object in Java)

The first solution is simple, but not type-safe, since objects of any type can be added — neither a compile- nor a run-time error is generated if we accidentally add a number or a pink elephant.

The second solution requires boilerplate code to be written, tested, and maintained. Moreover, compile-time type safety is only guaranteed when elements are added, but not when they are retrieved (e.g. looped over), because they must be cast to string or character, and the compiler doesn't report an error if we accidentally cast to the wrong type (e.g. number).

In practice, most developers (including myself) would therefore opt for the first solution, sacrificing type-safety at the altar of convenience.

A union type removes the quandary. We can use the standard syntax to declare the element type of the list, while keeping the list type-safe: list<string or character>.

A type-safe list literal looks like this:

[list<string or character> "abc" 'a' "hi" '!']

Here is an example of how we could create a list programmatically and then iterate over its elements:

const list = mutable_list<string or character>.create
list.add ( "abc" )       // OK
list.add ( 'a' )         // OK
// list.add ( 123 )      <- compile-time error !!!

// Loop without type check
repeat for each element in list
    write_line ( element.to_string )

// Loop with type check
repeat for each element in list
    case type of element
        is string
            write_line ( """String: {{element}}""" )
        is character
            write_line ( """Char: {{element.to_string}}""" )


Output:

abc
a
String: abc
Char: a
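A comparable list can be built in TypeScript. Since TypeScript has no character type, the sketch assumes a small class named Character as a stand-in. The names items and describeItem are likewise invented for this example.

```typescript
// TypeScript has no character type, so this hypothetical class stands in for it.
class Character {
    constructor(readonly value: string) {}
}

const items: Array<string | Character> = [];
items.push("abc");                // OK
items.push(new Character("a"));   // OK
// items.push ( 123 )             <- compile-time error

// The type check tells the compiler which member type is being handled.
function describeItem(element: string | Character): string {
    if (typeof element === "string") {
        return "String: " + element;
    }
    return "Char: " + element.value;
}

for (const element of items) {
    console.log(describeItem(element));
}
```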

Eager vs Lazy Evaluation

Union types allow either eager or lazy (i.e. immediate or delayed) evaluation of input parameters.

Consider the error_message input parameter in the following function, which defines a specific error message to be used whenever the function fails and returns a file_read_error object:

fn read_text_file (
    file file_path
    error_message string ) -> string or null or file_read_error

The point is that input parameter error_message is used only when execution of the function fails.

Now consider an application serving an international audience that retrieves locale-dependent error messages from a database. The application executes function calls like this:

const result = read_text_file (
    file = "example.txt"
    error_message = get_error_message_from_DB ( error_id = "123" ) )

Obviously, function calls like this can cause serious performance penalties, because each time read_text_file is called, the error message is retrieved from the database, although it is needed only if something goes wrong.

A simple solution is to use a union type for input parameter error_message, as follows:

fn read_text_file (
    file file_path
    error_message string or string_supplier ) -> string or null or file_read_error

Type string_supplier is defined as follows:

type string_supplier
    fn get -> string

As we can see, type string_supplier has a single function, named get, which returns a string.

Here's a simplified excerpt of the read_text_file body:

fn read_text_file (
    file file_path
    error_message string or string_supplier ) -> string or null or file_read_error

    if something_went_wrong then
        const message string = if error_message is string then error_message else error_message.get
        // create and return file_read_error

The advantage is that error_message.get is now evaluated only if something goes wrong.

To benefit from lazy evaluation, the application now provides a string_supplier instead of a string. This can easily be done with a closure:

const result = read_text_file (
    file = "example.txt"
    error_message = { get_error_message_from_DB ( error_id = "123" ) } )

Explaining PTS closures is beyond the scope of this article. However, as shown in the above code, eager evaluation can now easily be turned into lazy evaluation by just embedding an expression in a pair of curly braces ({...}). For general information about closures you can read the Wikipedia article Closure (computer programming).

The performance bottleneck has been eliminated, since the error message is now retrieved from the database only if a file read error occurs.
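The effect can be measured with a small TypeScript sketch. It is only an analogy under assumptions: StringSupplier is modeled as a function type, getErrorMessageFromDb is a fake lookup that counts its own calls in databaseCalls, and reading any file other than "example.txt" is treated as a failure.

```typescript
// A supplier is modeled here as a function that returns a string.
type StringSupplier = () => string;

// Counts how often the (fake) database lookup runs.
let databaseCalls = 0;

// Hypothetical, expensive lookup.
function getErrorMessageFromDb(errorId: string): string {
    databaseCalls++;
    return "Localized message for error " + errorId;
}

// Hypothetical stand-in for the PTS type 'file_read_error'.
class FileReadError {
    constructor(readonly message: string) {}
}

function readTextFile(
    file: string,
    errorMessage: string | StringSupplier
): string | null | FileReadError {
    const somethingWentWrong = file !== "example.txt"; // made-up failure rule
    if (somethingWentWrong) {
        // The supplier is called only on this path.
        const message = typeof errorMessage === "string" ? errorMessage : errorMessage();
        return new FileReadError(message);
    }
    return "dummy";
}

// Eager: the lookup runs before the call, although no error occurs.
readTextFile("example.txt", getErrorMessageFromDb("123"));
console.log(databaseCalls); // 1

// Lazy: the arrow function is never invoked, because reading succeeds.
readTextFile("example.txt", () => getErrorMessageFromDb("123"));
console.log(databaseCalls); // still 1
```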


Union types are simple to understand, easy to use, and they provide an elegant solution for frequent programming tasks.

They provide a sound foundation for uniform null- and error-handling — two critical aspects of a practical type system.

Moreover, union types help to simplify APIs, increase type-safety (by minimizing cardinality and thus supporting the PTS Coding Rule), facilitate eager/lazy evaluation, and provide additional benefits.

What's Next?

The next two PTS articles will be dedicated to null-handling and error-handling.


Many thanks to Tristano Ajmone for his useful feedback to improve this article.