Representing optional data in computer programs has always been a problem. The concept of optional data is very simple in everyday life. Representing the absence of something when this something is contained in a container is easy—whatever it is, it can be represented by an empty container. An absence of apples can be represented by an empty apple basket. The absence of gasoline in a car can be visualized as an empty gas tank. Representing the absence of data in computer programs is more difficult. Most data is represented as a reference pointing to it, so the most obvious way to represent the absence of data is to use a pointer to nothing. This is what a null pointer is.
In Java, a variable of a reference type is a pointer to a value. Such variables may be created null (static and instance fields of reference types are null by default), and they may then be changed to point to values. They can even be changed again to point to null if data is removed. To handle optional data, Java 8 introduced the Optional type. However, in this chapter, you’ll develop your own type, which you’ll call Option. The goal is to learn how this kind of structure works. After completing this chapter, you should feel free to use the standard Java 8 Optional, but you’ll see in the upcoming chapters that it’s much less powerful than the type you’ll create in this chapter.
Problems with the null pointer
One of the most frequent bugs in imperative programs is the NullPointerException. This error is raised when an identifier is dereferenced and found to be pointing to nothing. In other words, some data is expected but is found missing. Such an identifier is said to be pointing to null. The null reference was invented in 1965 by Tony Hoare while he was designing the ALGOL W language. Forty-four years later, in 2009, he called it his “billion-dollar mistake.”
Although it should be well known nowadays that null references should be avoided, that’s far from being the case. The Java standard library contains methods and constructors taking optional parameters that must be set to null if they’re unused. Take, for example, the java.net.Socket class. This class defines the constructor Socket(String host, int port, InetAddress localAddr, int localPort). According to its documentation, a null localAddr is equivalent to specifying the wildcard (any local) address, so null is a valid argument here. A null used this way is sometimes called a business null.
The absence of data isn’t signaled only through object references. The local port may also be absent, but because it’s a primitive it can’t be null; instead, a localPort of 0 lets the system pick a free port. This kind of value is sometimes called a sentinel value. It’s not used for the value itself (it doesn’t mean port 0) but to specify the absence of a port value. There are many other examples of handling the absence of data in the Java library. The null local address is really dangerous because it could be unintentional and due to a previous error. But this won’t cause an exception. The program will continue working, although not as intended.
There are other cases of business nulls. If you try to retrieve a value from a HashMap using a key that’s not in the map, you’ll get a null. Is this an error? You don’t know. It might be that the key is valid but has not been registered in the map; or it might be that the key is supposedly valid and should be in the map, but there was a previous error while computing the key. For example, the key could be null, whether intentionally or due to an error, and this wouldn’t raise an exception. It could even return a non-null value because the null key is allowed in a HashMap. This situation is a complete mess.
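The ambiguity described above can be reproduced with a few lines against java.util.HashMap. This is only an illustrative sketch: the class name, keys, and email address are hypothetical and chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapNullDemo {
    // Builds a small map with hypothetical sample data.
    static Map<String, String> sample() {
        Map<String, String> emails = new HashMap<>();
        emails.put("Mickey", "mickey@example.com");
        emails.put("Minnie", null);   // key present, but mapped to null
        emails.put(null, "orphan");   // HashMap accepts a null key
        return emails;
    }

    public static void main(String[] args) {
        Map<String, String> emails = sample();
        // Both lookups return null, for two different reasons.
        System.out.println(emails.get("Minnie"));         // key present, value null
        System.out.println(emails.get("Goofy"));          // key absent
        // Only containsKey tells the two situations apart.
        System.out.println(emails.containsKey("Minnie")); // true
        System.out.println(emails.containsKey("Goofy"));  // false
        // A null key, possibly the result of an earlier bug, still finds a value.
        System.out.println(emails.get(null));             // orphan
    }
}
```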
Of course, you know what to do about this. You know that you should never use a reference without checking whether it’s null or not. (You do this for each object parameter received by a method, don’t you?) And you know that you should never get a value from a map without first testing whether the map contains the corresponding key. And you know that you should never try to get an element from a list without verifying first that the list is not empty and that it has enough elements if you’re accessing the element through its index. And you do this all the time, so you never get a NullPointerException or an IndexOutOfBoundsException.
If you’re this kind of perfect programmer, you can live with null references. But for the rest of us, an easier and safer way of dealing with the absence of a value, whether intentional or resulting from an error, is necessary. In this chapter, you’ll learn how to deal with absent values that aren’t the result of an error. This kind of data is called optional data.
Tricks for dealing with optional data have always been around. One of the best known and most often used is the list. When a method is supposed to return either a value or nothing, some programmers use a list as the return value. The list may contain zero or one element. Although this works, it has several important drawbacks. Most obviously, nothing guarantees that the list contains at most one element, and nothing in the type distinguishes such a list from an ordinary one.
Alternatives to null references
It looks like our goal is to avoid the NullPointerException, but this isn’t exactly the case. The NullPointerException should always indicate a bug. As such, you should apply the “fail fast” principle: if there’s an error, the program should fail as fast as possible. Totally removing business nulls won’t allow you to get rid of the NullPointerException. It will just ensure that null references will only be caused by bugs in the program and not by optional data.
The following code is an example of a method returning optional data:
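As a hypothetical illustration, consider a function computing the mean of a list of numbers: an empty list has no mean. The sketch below uses java.util.List as a stand-in for the list type of the previous chapter, and returns Double.NaN as a sentinel value in the empty case. The class and method names are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MeanSentinel {
    // Returns the mean, or the sentinel Double.NaN when the list is empty.
    static double mean(List<Double> xs) {
        if (xs.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.size();
    }

    public static void main(String[] args) {
        System.out.println(mean(Arrays.asList(1.0, 2.0, 3.0)));
        // The caller must remember to test the result with Double.isNaN.
        System.out.println(Double.isNaN(mean(Collections.<Double>emptyList())));
    }
}
```

The sentinel keeps the function total, but nothing forces the caller to check for it, and NaN silently propagates through later arithmetic.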
Another solution is to throw an exception:
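A minimal sketch of the exception-based variant, again using a hypothetical mean function over java.util.List. The choice of IllegalArgumentException is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MeanThrowing {
    // Throws when there is no data, so the caller needs a try/catch.
    static double mean(List<Double> xs) {
        if (xs.isEmpty()) {
            throw new IllegalArgumentException("mean of an empty list");
        }
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.size();
    }

    public static void main(String[] args) {
        System.out.println(mean(Arrays.asList(1.0, 2.0, 3.0)));
        try {
            mean(Collections.<Double>emptyList());
        } catch (IllegalArgumentException e) {
            System.out.println("No mean: " + e.getMessage());
        }
    }
}
```

Here the absence of data is treated as an error, although an empty list may be a perfectly legitimate input, and the function can no longer be composed freely with others.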
You could also return null and let the caller deal with it:
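The null-returning variant might look like the following sketch (hypothetical names, java.util.List as a stand-in). The return type has to be the boxed Double so that null is a legal result.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MeanNullable {
    // Returns null when the list is empty; the caller must check before use.
    static Double mean(List<Double> xs) {
        if (xs.isEmpty()) {
            return null;
        }
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.size();
    }

    public static void main(String[] args) {
        Double result = mean(Collections.<Double>emptyList());
        if (result == null) {
            System.out.println("No mean");
        } else {
            System.out.println(result);
        }
        System.out.println(mean(Arrays.asList(1.0, 2.0, 3.0)));
    }
}
```

If the caller forgets the check and writes `double m = mean(xs);`, unboxing the null result throws a NullPointerException.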
A better solution would be to ask the user to provide a special value that will be returned if no data is available. For example, this function computes the maximum value of a list:
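One possible shape for such a function is sketched below, with the default value passed as the first argument. The signature, the class name, and the use of java.util.List are assumptions made for this example, not necessarily the form used elsewhere in the book.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MaxWithDefault {
    // Returns the largest element, or defaultValue when the list is empty.
    static <A extends Comparable<A>> A max(A defaultValue, List<A> xs) {
        if (xs.isEmpty()) {
            return defaultValue;
        }
        A best = xs.get(0);
        for (A x : xs) {
            if (x.compareTo(best) > 0) {
                best = x;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(0, Arrays.asList(3, 7, 5)));
        System.out.println(max(0, Collections.<Integer>emptyList()));
    }
}
```

This always returns a usable value, but the caller can no longer tell whether the default was used or the list really contained that value.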
The Option data type
The Option data type you’ll create in this chapter will be very similar to the List data type. Using an Option type for optional data allows you to compose functions even when the data is absent (see figure 6.1). It will be implemented as an abstract class, Option, containing two private subclasses representing the presence and the absence of data. The subclass representing the absence of data will be called None, and the subclass representing the presence of data will be called Some. A Some will contain the corresponding data value.
Figure 6.1. Without the Option type, composing functions wouldn’t produce a function because the resulting program would potentially throw a NullPointerException.
The following listing shows the code for these three classes.
- Listing 6.1. The Option data type
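The description above (an abstract Option class, two private subclasses None and Some, and a getOrThrow method) could be sketched as follows. The static factory methods some and none, the shared NONE instance, and the exception message are assumptions of this sketch.

```java
abstract class Option<A> {
    // Shared instance representing absence; it holds no value, so sharing is safe.
    private static final Option<?> NONE = new None<Object>();

    // Returns the value, or throws if there is none.
    public abstract A getOrThrow();

    private static final class None<A> extends Option<A> {
        private None() {}

        @Override
        public A getOrThrow() {
            throw new IllegalStateException("getOrThrow called on None");
        }

        @Override
        public String toString() {
            return "None";
        }
    }

    private static final class Some<A> extends Option<A> {
        private final A value;

        private Some(A value) {
            this.value = value;
        }

        @Override
        public A getOrThrow() {
            return value;
        }

        @Override
        public String toString() {
            return "Some(" + value + ")";
        }
    }

    // Factory methods: the subclasses are private, so callers only see Option.
    public static <A> Option<A> some(A a) {
        return new Some<>(a);
    }

    @SuppressWarnings("unchecked")
    public static <A> Option<A> none() {
        return (Option<A>) NONE;
    }
}

class OptionListingDemo {
    public static void main(String[] args) {
        System.out.println(Option.some(42));   // Some(42)
        System.out.println(Option.none());     // None
        System.out.println(Option.some(42).getOrThrow());
    }
}
```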
But as it is, the Option class isn’t very useful. The only way to use an Option would be to test the actual class to see if it’s a Some or a None, and call the getOrThrow method to obtain the value in the former case. And this method will throw an exception if there’s no data, which isn’t very functional. To make it a powerful tool, you’ll need to add some methods, in the same way you did for List.
Getting a value from an Option
Many methods that you created for List will also be useful for Option. In fact, only methods related to multiple values, such as folds, may be useless here. But before you create these methods, let’s start with some Option-specific usage. To avoid testing for the subclass of an Option, you need to define methods that, unlike getOrThrow, may be useful in both subclasses, so you can call them from the Option parent class. The first thing you’ll need is a way to retrieve the value in an Option. One frequent use case when data is missing is to use a default value.
Let's implement a getOrElse method that will return either the contained value if it exists, or a provided default one otherwise. Here’s the method signature (Exercise 6.1):
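A minimal sketch of such a method, declared abstract in Option and implemented in each subclass, is shown below on a stripped-down Option. The demo class and its counter are hypothetical and only serve to show when the default value is computed.

```java
abstract class Option<A> {
    // Returns the contained value if present, otherwise defaultValue.
    public abstract A getOrElse(A defaultValue);

    public static <A> Option<A> some(A a) {
        return new Some<>(a);
    }

    public static <A> Option<A> none() {
        return new None<>();
    }

    private static final class Some<A> extends Option<A> {
        private final A value;

        private Some(A value) {
            this.value = value;
        }

        @Override
        public A getOrElse(A defaultValue) {
            return value;
        }
    }

    private static final class None<A> extends Option<A> {
        @Override
        public A getOrElse(A defaultValue) {
            return defaultValue;
        }
    }
}

class GetOrElseDemo {
    static int defaultCalls = 0;

    // Stands in for a default value that is costly to compute.
    static int expensiveDefault() {
        defaultCalls++;
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(Option.some(42).getOrElse(expensiveDefault()));
        System.out.println(Option.<Integer>none().getOrElse(expensiveDefault()));
        // The default was computed for both calls, even though Some ignored it.
        System.out.println(defaultCalls);
    }
}
```

Because Java evaluates arguments before the call, the default is computed whether or not it ends up being used.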
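The getOrElse method above has a problem: its argument is evaluated before the method is called, so the default value is computed even when the Option is a Some and the default is never used. This is wasteful if the computation is expensive, and wrong if it has side effects or throws an exception. Let's fix this by using lazy evaluation for the getOrElse method parameter (Exercise 6.2). Use the Supplier class you defined in chapter 3 (exercise 3.2). The signature of the method will be changed to: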
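A sketch of the lazy version follows. It uses java.util.function.Supplier as a stand-in for the Supplier class from chapter 3, which is an assumption of this example; the demo class and counter are hypothetical.

```java
import java.util.function.Supplier;

abstract class Option<A> {
    // The default is wrapped in a Supplier and only evaluated when needed.
    public abstract A getOrElse(Supplier<A> defaultValue);

    public static <A> Option<A> some(A a) {
        return new Some<>(a);
    }

    public static <A> Option<A> none() {
        return new None<>();
    }

    private static final class Some<A> extends Option<A> {
        private final A value;

        private Some(A value) {
            this.value = value;
        }

        @Override
        public A getOrElse(Supplier<A> defaultValue) {
            return value; // the supplier is never called
        }
    }

    private static final class None<A> extends Option<A> {
        @Override
        public A getOrElse(Supplier<A> defaultValue) {
            return defaultValue.get();
        }
    }
}

class LazyGetOrElseDemo {
    static int defaultCalls = 0;

    static Integer expensiveDefault() {
        defaultCalls++;
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(Option.some(42).getOrElse(LazyGetOrElseDemo::expensiveDefault));
        System.out.println(defaultCalls); // still 0: Some did not need the default
        System.out.println(Option.<Integer>none().getOrElse(LazyGetOrElseDemo::expensiveDefault));
        System.out.println(defaultCalls); // 1: evaluated only for None
    }
}
```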
Applying functions to optional values
One very important method in List is the map method, which allows you to apply a function from A to B to each element of a list of A, producing a list of B. Considering that an Option is like a list containing at most one element, you can apply the same principle. Let's create a map method to change an Option<A> into an Option<B> by applying a function from A to B (Exercise 6.3).
Define an abstract method in the Option class with one implementation in each subclass. The method signature in Option will be <B> Option<B> map(Function<A, B> f). The None implementation is trivial: there’s no value to transform, so it returns a None.
The Some implementation isn’t much more complex. All you need to do is get the value, apply the function to it, and wrap the result in a new Some:
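Both implementations can be sketched as follows on a stripped-down Option. The sketch uses java.util.function.Function in place of the Function type defined earlier in the book, and includes an eager getOrElse only so the result can be inspected; both choices are assumptions of this example.

```java
import java.util.function.Function;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);

    public abstract A getOrElse(A defaultValue);

    public static <A> Option<A> some(A a) {
        return new Some<>(a);
    }

    public static <A> Option<A> none() {
        return new None<>();
    }

    private static final class Some<A> extends Option<A> {
        private final A value;

        private Some(A value) {
            this.value = value;
        }

        @Override
        public <B> Option<B> map(Function<A, B> f) {
            // Get the value, apply the function, wrap the result in a new Some.
            return new Some<>(f.apply(value));
        }

        @Override
        public A getOrElse(A defaultValue) {
            return value;
        }
    }

    private static final class None<A> extends Option<A> {
        @Override
        public <B> Option<B> map(Function<A, B> f) {
            // Nothing to transform: absence stays absence.
            return new None<>();
        }

        @Override
        public A getOrElse(A defaultValue) {
            return defaultValue;
        }
    }
}

class MapDemo {
    public static void main(String[] args) {
        System.out.println(Option.some("hello").map(String::length).getOrElse(-1));
        System.out.println(Option.<String>none().map(String::length).getOrElse(-1));
    }
}
```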
As you’ll soon realize, functions from A to B aren’t the most common ones in functional programming. At first you may have trouble getting acquainted with functions returning optional values. After all, it seems to involve extra work to wrap values in Some instances and later retrieve these values. But with further practice, you’ll see that these operations occur only rarely. When chaining functions to build a complex computation, you’ll often start with a value that’s returned by some previous computation and pass the result to a new function without seeing the intermediate result. In other words, you’ll more often use functions from A to Option<B> than functions from A to B.
Think about the List class. Does this ring a bell? Yes, it leads to the flatMap method. Let's create a flatMap instance method that takes as an argument a function from A to Option<B> and returns an Option<B> (Exercise 6.4). You can define different implementations in both subclasses; but you should try to devise a unique implementation that works for both subclasses and put it in the Option class. Its signature will be:
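One way to write a single implementation in the Option class is to map the function, which yields an Option<Option<B>>, and then unwrap the outer layer with getOrElse. The sketch below does this with java.util.function.Function and an eager getOrElse, both assumptions of this example; the parse and reciprocal functions are hypothetical.

```java
import java.util.function.Function;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);

    public abstract A getOrElse(A defaultValue);

    // Single implementation shared by both subclasses.
    public <B> Option<B> flatMap(Function<A, Option<B>> f) {
        return map(f).getOrElse(Option.<B>none());
    }

    public static <A> Option<A> some(A a) {
        return new Some<>(a);
    }

    public static <A> Option<A> none() {
        return new None<>();
    }

    private static final class Some<A> extends Option<A> {
        private final A value;

        private Some(A value) {
            this.value = value;
        }

        @Override
        public <B> Option<B> map(Function<A, B> f) {
            return new Some<>(f.apply(value));
        }

        @Override
        public A getOrElse(A defaultValue) {
            return value;
        }
    }

    private static final class None<A> extends Option<A> {
        @Override
        public <B> Option<B> map(Function<A, B> f) {
            return new None<>();
        }

        @Override
        public A getOrElse(A defaultValue) {
            return defaultValue;
        }
    }
}

class FlatMapDemo {
    // Hypothetical functions from A to Option<B>.
    static Option<Integer> parse(String s) {
        try {
            return Option.some(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Option.none();
        }
    }

    static Option<Double> reciprocal(Integer n) {
        return n == 0 ? Option.<Double>none() : Option.some(1.0 / n);
    }

    public static void main(String[] args) {
        // The chain yields a value only if every step succeeds.
        System.out.println(Option.some("4").flatMap(FlatMapDemo::parse)
                .flatMap(FlatMapDemo::reciprocal).getOrElse(-1.0));
        System.out.println(Option.some("0").flatMap(FlatMapDemo::parse)
                .flatMap(FlatMapDemo::reciprocal).getOrElse(-1.0));
    }
}
```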
If you already know about the Java 8 Optional class, you may have noticed that Optional contains an isPresent() method allowing you to test whether the Optional contains a value or not. (Optional has a different implementation that’s not based on two different subclasses.) You can easily implement such a method, although you’ll call it isSome() because it will test whether the object is a Some or a None. You could also call it isNone(), which might seem more logical because it would be the equivalent of the List.isEmpty() method.
Although the isSome() method is sometimes useful, it’s not the best way to use the Option class. If you were to test an Option through the isSome() method before calling getOrThrow() to get the value, it wouldn’t be much different from testing a reference for null before dereferencing it. The only difference would be in the case where you forget to test first: you’d risk seeing an IllegalStateException instead of a NullPointerException.
The best way to use Option is through composition. To do this, you must create all the necessary methods for all use cases. These use cases correspond to what you’d do with the value after testing that it’s not null. You could use the value as the input to another function, apply an effect to the value, or use the value if it’s present and a default value otherwise.
The first and third use cases have already been made possible through the methods you’ve already created. Applying an effect can be done in different ways that you’ll learn about in chapter 13. As an example, look at how the Option class can be used to change the way you use a map. Listing 6.2 shows the implementation of a functional Map. This is not a functional implementation, but only a wrapper around a legacy ConcurrentHashMap to give it a functional interface.
- Listing 6.2. Using Option in a functional Map
- Listing 6.3. Putting Option to work
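Listings 6.2 and 6.3 could be sketched together as follows. The wrapper is named FunctionalMap here to avoid clashing with java.util.Map; that name, the Toon class layout, and the email address are hypothetical. Note that ConcurrentHashMap rejects null keys and values, so a null result from its get method reliably means the key is absent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);
    public abstract A getOrElse(A defaultValue);

    public <B> Option<B> flatMap(Function<A, Option<B>> f) {
        return map(f).getOrElse(Option.<B>none());
    }

    public static <A> Option<A> some(A a) { return new Some<>(a); }
    public static <A> Option<A> none() { return new None<>(); }

    private static final class Some<A> extends Option<A> {
        private final A value;
        private Some(A value) { this.value = value; }
        @Override public <B> Option<B> map(Function<A, B> f) { return new Some<>(f.apply(value)); }
        @Override public A getOrElse(A defaultValue) { return value; }
    }

    private static final class None<A> extends Option<A> {
        @Override public <B> Option<B> map(Function<A, B> f) { return new None<>(); }
        @Override public A getOrElse(A defaultValue) { return defaultValue; }
    }
}

// A thin wrapper giving a legacy map a functional interface.
final class FunctionalMap<K, V> {
    private final ConcurrentMap<K, V> map = new ConcurrentHashMap<>();

    // Returns an Option instead of null when the key is absent.
    public Option<V> get(K key) {
        V value = map.get(key);
        return value == null ? Option.<V>none() : Option.some(value);
    }

    public FunctionalMap<K, V> put(K key, V value) {
        map.put(key, value);
        return this;
    }
}

final class Toon {
    private final String name;
    private final Option<String> email;

    Toon(String name) { this(name, Option.<String>none()); }
    Toon(String name, String email) { this(name, Option.some(email)); }

    private Toon(String name, Option<String> email) {
        this.name = name;
        this.email = email;
    }

    public String getName() { return name; }
    public Option<String> getEmail() { return email; }
}

final class ToonDemo {
    static final FunctionalMap<String, Toon> TOONS = new FunctionalMap<String, Toon>()
            .put("Mickey", new Toon("Mickey", "mickey@example.com"))
            .put("Minnie", new Toon("Minnie"));

    // Composes the map lookup with the optional email, with no null check.
    static String emailOf(String name) {
        return TOONS.get(name).flatMap(Toon::getEmail).getOrElse("No data");
    }

    public static void main(String[] args) {
        System.out.println(emailOf("Mickey"));
        System.out.println(emailOf("Minnie"));
        System.out.println(emailOf("Goofy"));
    }
}
```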
The first line is Mickey’s email. The second line says “No data” because Minnie has no email. The third line says “No data” because Goofy isn’t in the map. Clearly, you’d need a way to distinguish these two cases. The Option class doesn’t allow you to distinguish the two. You’ll see in the next chapter how you can solve this problem.
Implement the variance function in terms of flatMap. The variance of a series of values represents how those values are distributed around the mean. If all values are very near to the mean, the variance is low. A variance of 0 is obtained when all values are equal to the mean. The variance of a series is the mean of Math.pow(x - m, 2) for each element x in the series, m being the mean of the series. Here’s the signature of the function (Exercise 6.7):
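A possible solution is sketched below. It assumes a mean function returning an Option<Double> and, for brevity, uses static methods over java.util.List rather than function values over the book's List type; those choices and the class names are assumptions of this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);
    public abstract A getOrElse(A defaultValue);

    public <B> Option<B> flatMap(Function<A, Option<B>> f) {
        return map(f).getOrElse(Option.<B>none());
    }

    public static <A> Option<A> some(A a) { return new Some<>(a); }
    public static <A> Option<A> none() { return new None<>(); }

    private static final class Some<A> extends Option<A> {
        private final A value;
        private Some(A value) { this.value = value; }
        @Override public <B> Option<B> map(Function<A, B> f) { return new Some<>(f.apply(value)); }
        @Override public A getOrElse(A defaultValue) { return value; }
    }

    private static final class None<A> extends Option<A> {
        @Override public <B> Option<B> map(Function<A, B> f) { return new None<>(); }
        @Override public A getOrElse(A defaultValue) { return defaultValue; }
    }
}

class VarianceDemo {
    // The mean of an empty series is absent rather than NaN, null, or an exception.
    static Option<Double> mean(List<Double> xs) {
        if (xs.isEmpty()) {
            return Option.none();
        }
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return Option.some(sum / xs.size());
    }

    // variance = mean of (x - m)^2, where m is the mean of the series.
    static Option<Double> variance(List<Double> xs) {
        return mean(xs).flatMap(m ->
                mean(xs.stream()
                        .map(x -> Math.pow(x - m, 2))
                        .collect(Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(variance(Arrays.asList(1.0, 2.0, 3.0, 4.0)).getOrElse(-1.0));
        System.out.println(variance(Collections.<Double>emptyList()).getOrElse(-1.0));
    }
}
```

If the first call to mean yields no value, flatMap skips the second computation entirely.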
you can create an equivalent function by writing this:
Other ways to combine options
Deciding to use Option may seem to have huge consequences. In particular, some developers may believe that their legacy code will be made obsolete. What can you do now that you need a function from Option<A> to Option<B>, and you only have an API with methods for converting an A into a B? Do you need to rewrite all your libraries? Not at all. You can easily adapt them.
Define a lift method that takes a function from A to B as its argument and returns a function from Option<A> to Option<B>. As usual, use the methods you’ve defined already (Exercise 6.8). Figure 6.2 shows how the lift method works.
Figure 6.2. Lifting a function
Use the map method to create a static method in the Option class. The solution is pretty simple:
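A minimal sketch of lift as a static method of Option, built on map. It uses java.util.function.Function as a stand-in for the book's Function type, and String::toUpperCase as a hypothetical legacy method to lift.

```java
import java.util.function.Function;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);
    public abstract A getOrElse(A defaultValue);

    // Turns a function from A to B into a function from Option<A> to Option<B>.
    public static <A, B> Function<Option<A>, Option<B>> lift(Function<A, B> f) {
        return x -> x.map(f);
    }

    public static <A> Option<A> some(A a) { return new Some<>(a); }
    public static <A> Option<A> none() { return new None<>(); }

    private static final class Some<A> extends Option<A> {
        private final A value;
        private Some(A value) { this.value = value; }
        @Override public <B> Option<B> map(Function<A, B> f) { return new Some<>(f.apply(value)); }
        @Override public A getOrElse(A defaultValue) { return value; }
    }

    private static final class None<A> extends Option<A> {
        @Override public <B> Option<B> map(Function<A, B> f) { return new None<>(); }
        @Override public A getOrElse(A defaultValue) { return defaultValue; }
    }
}

class LiftDemo {
    public static void main(String[] args) {
        // An existing method on String, reused unchanged on optional values.
        Function<Option<String>, Option<String>> upperOption = Option.lift(String::toUpperCase);
        System.out.println(upperOption.apply(Option.some("abc")).getOrElse("No data"));
        System.out.println(upperOption.apply(Option.<String>none()).getOrElse("No data"));
    }
}
```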
What if you want to use a legacy method taking two arguments? Let’s say you want to use the Integer.parseInt(String s, int radix) with an Option<String> and an Option<Integer>.
Let's write a method map2 taking as its arguments an Option<A>, an Option<B>, and a function from (A, B) to C in curried form, and returning an Option<C>.
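One way to write map2 is to combine flatMap on the first Option with map on the second, as in the sketch below. It uses java.util.function.Function; the demo class and the PARSE_WITH_RADIX constant are hypothetical names. The sketch does not handle the NumberFormatException that Integer.parseInt can throw on malformed input.

```java
import java.util.function.Function;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);
    public abstract A getOrElse(A defaultValue);

    public <B> Option<B> flatMap(Function<A, Option<B>> f) {
        return map(f).getOrElse(Option.<B>none());
    }

    public static <A> Option<A> some(A a) { return new Some<>(a); }
    public static <A> Option<A> none() { return new None<>(); }

    private static final class Some<A> extends Option<A> {
        private final A value;
        private Some(A value) { this.value = value; }
        @Override public <B> Option<B> map(Function<A, B> f) { return new Some<>(f.apply(value)); }
        @Override public A getOrElse(A defaultValue) { return value; }
    }

    private static final class None<A> extends Option<A> {
        @Override public <B> Option<B> map(Function<A, B> f) { return new None<>(); }
        @Override public A getOrElse(A defaultValue) { return defaultValue; }
    }
}

class Map2Demo {
    // The result is a Some only if both arguments are Some.
    static <A, B, C> Option<C> map2(Option<A> a, Option<B> b,
                                    Function<A, Function<B, C>> f) {
        return a.flatMap(ax -> b.map(bx -> f.apply(ax).apply(bx)));
    }

    // Curried form of the two-argument legacy method Integer.parseInt(String, int).
    static final Function<String, Function<Integer, Integer>> PARSE_WITH_RADIX =
            s -> radix -> Integer.parseInt(s, radix);

    public static void main(String[] args) {
        System.out.println(map2(Option.some("ff"), Option.some(16), PARSE_WITH_RADIX).getOrElse(-1));
        System.out.println(map2(Option.<String>none(), Option.some(16), PARSE_WITH_RADIX).getOrElse(-1));
    }
}
```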
Composing List with Option
Composing Option instances is not all you need. Each new type you define must be, at some point, composable with any other. In the previous chapter, you defined the List type. To write useful programs, you need to be able to compose List and Option. The most common operation is converting a List<Option<A>> into an Option<List<A>>. A List<Option<A>> is what you get when mapping a List<B> with a function from B to Option<A>. Usually, what you’ll need for the result is a Some<List<A>> if all elements are Some<A>, and a None if at least one element is a None.
Write a function sequence that combines a List<Option<A>> into an Option<List<A>>. The result is a Some<List<A>> if all values in the original list are Some instances, and a None otherwise.
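The behavior of sequence can be sketched as follows. This version uses java.util.List and a plain loop as stand-ins for the List type and folds of the previous chapter, so it illustrates the semantics rather than the fold-based solution the exercise has in mind; all names are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);
    public abstract A getOrElse(A defaultValue);

    public <B> Option<B> flatMap(Function<A, Option<B>> f) {
        return map(f).getOrElse(Option.<B>none());
    }

    public static <A> Option<A> some(A a) { return new Some<>(a); }
    public static <A> Option<A> none() { return new None<>(); }

    private static final class Some<A> extends Option<A> {
        private final A value;
        private Some(A value) { this.value = value; }
        @Override public <B> Option<B> map(Function<A, B> f) { return new Some<>(f.apply(value)); }
        @Override public A getOrElse(A defaultValue) { return value; }
    }

    private static final class None<A> extends Option<A> {
        @Override public <B> Option<B> map(Function<A, B> f) { return new None<>(); }
        @Override public A getOrElse(A defaultValue) { return defaultValue; }
    }
}

class SequenceDemo {
    // Returns a new list with x appended, leaving xs untouched.
    static <A> List<A> append(List<A> xs, A x) {
        List<A> copy = new ArrayList<>(xs);
        copy.add(x);
        return copy;
    }

    // Some(list of values) if every element is a Some, None otherwise.
    static <A> Option<List<A>> sequence(List<Option<A>> list) {
        Option<List<A>> result = Option.some(Collections.<A>emptyList());
        for (Option<A> element : list) {
            result = result.flatMap(xs -> element.map(x -> append(xs, x)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sequence(Arrays.asList(
                Option.some(1), Option.some(2), Option.some(3))).getOrElse(null));
        System.out.println(sequence(Arrays.asList(
                Option.some(1), Option.<Integer>none())).getOrElse(null));
    }
}
```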
Let's define a traverse method that produces the same result but invokes foldRight only once. Here’s its signature (Exercise 6.12):
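A sketch of traverse follows, applying the function and accumulating the results in a single pass. A plain loop over java.util.List stands in for the single foldRight over the book's List type, which is an assumption of this example, as are the class and helper names. sequence then becomes traverse with the identity function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

abstract class Option<A> {
    public abstract <B> Option<B> map(Function<A, B> f);
    public abstract A getOrElse(A defaultValue);

    public <B> Option<B> flatMap(Function<A, Option<B>> f) {
        return map(f).getOrElse(Option.<B>none());
    }

    public static <A> Option<A> some(A a) { return new Some<>(a); }
    public static <A> Option<A> none() { return new None<>(); }

    private static final class Some<A> extends Option<A> {
        private final A value;
        private Some(A value) { this.value = value; }
        @Override public <B> Option<B> map(Function<A, B> f) { return new Some<>(f.apply(value)); }
        @Override public A getOrElse(A defaultValue) { return value; }
    }

    private static final class None<A> extends Option<A> {
        @Override public <B> Option<B> map(Function<A, B> f) { return new None<>(); }
        @Override public A getOrElse(A defaultValue) { return defaultValue; }
    }
}

class TraverseDemo {
    static <A> List<A> append(List<A> xs, A x) {
        List<A> copy = new ArrayList<>(xs);
        copy.add(x);
        return copy;
    }

    // Applies f to each element and collects the results in one pass.
    // Once the accumulator is a None, f is no longer applied.
    static <A, B> Option<List<B>> traverse(List<A> list, Function<A, Option<B>> f) {
        Option<List<B>> result = Option.some(Collections.<B>emptyList());
        for (A a : list) {
            result = result.flatMap(bs -> f.apply(a).map(b -> append(bs, b)));
        }
        return result;
    }

    // sequence expressed in terms of traverse.
    static <A> Option<List<A>> sequence(List<Option<A>> list) {
        return traverse(list, x -> x);
    }

    // Hypothetical function from String to Option<Integer>.
    static Option<Integer> parse(String s) {
        try {
            return Option.some(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Option.none();
        }
    }

    public static void main(String[] args) {
        System.out.println(traverse(Arrays.asList("1", "2", "3"), TraverseDemo::parse).getOrElse(null));
        System.out.println(traverse(Arrays.asList("1", "x", "3"), TraverseDemo::parse).getOrElse(null));
    }
}
```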