This chapter covers
Some languages are said to be lazy, while others are not. Does this mean that some languages work harder than others? Not at all. Laziness is opposed to strictness. It has nothing to do with how hard a language can work, although you could sometimes think of lazy languages as languages that don’t require the programmer to work as hard as they must with strict ones.
Laziness, as you’ll see, has many advantages for some specific problems, such as composing infinite data structures and evaluating error conditions.
Understanding strictness and laziness
When applied to method arguments, strictness means that arguments are evaluated as soon as they’re received by the method. Laziness means that arguments are evaluated only when they’re needed. Of course, strictness and laziness apply not only to method arguments, but to everything. For example, consider the following declaration:
- int x = 2 + 3;
- int x = getValue();
- public static void main(String... args) {
- int x = getValue();
- }
- public static int getValue() {
- System.out.println("Returning 5");
- return 5;
- }
This program will print Returning 5 on the console because the getValue method will be called, although the returned value will never be used. In a lazy language, nothing would be evaluated, so nothing would be printed on the console.
Java is a strict language
Java, in principle, has no option concerning laziness. Java is strict. Everything is evaluated immediately. Method arguments are said to be passed by value, which means first they’re evaluated, and then the evaluated value is passed. On the other hand, in lazy languages, arguments are said to be passed by name, which means unevaluated. Don’t be confused by the fact that method arguments in Java are often references. References are addresses, and these addresses are passed by value.
Some languages are strict (like Java); others are lazy; some are strict by default and are optionally lazy; and others are lazy by default and are optionally strict. Java, however, isn’t always strict. These are some lazy constructs in Java:
If you think about it, you’ll soon realize that not much could be done if Java weren’t sometimes lazy. Can you imagine an if ... else structure where both branches were systematically evaluated? Or can you imagine a loop from which it was impossible to escape? All languages have to be lazy sometimes. This said, standard Java is often not lazy enough for functional programming.
The problem with strictness
Strictness is so fundamental in languages like Java that it’s seen by many programmers as the only possibility for evaluating expressions, even if, in reality, nothing would be possible with a totally strict language. Moreover, Java’s documentation doesn’t use the words non-strict or lazy when describing lazy constructs. For example, the Boolean operators || and && aren’t called lazy, but short-circuiting. But the simple reality is that these operators are non-strict regarding their arguments. We can easily show how this is different from a “strict” evaluation of method arguments.
Imagine that you wanted to simulate Boolean operators with a function. The following listing shows what you could do.
- Listing 9.1. The and and or logical methods
- public class BooleanMethods {
- public static void main(String[] args) {
- System.out.println(or(true, true));
- System.out.println(or(true, false));
- System.out.println(or(false, true));
- System.out.println(or(false, false));
- System.out.println(and(true, true));
- System.out.println(and(true, false));
- System.out.println(and(false, true));
- System.out.println(and(false, false));
- }
- public static boolean or(boolean a, boolean b) {
- return a ? true : b ? true : false;
- }
- public static boolean and(boolean a, boolean b) {
- return a ? b ? true : false : false;
- }
- }
So far, so good. But now try running the following program.
- Listing 9.2. The problem with strictness
- public class BooleanMethods {
- public static void main(String[] args) {
- System.out.println(getFirst() || getSecond());
- System.out.println(or(getFirst(), getSecond()));
- }
- public static boolean getFirst() {
- return true;
- }
- public static boolean getSecond() {
- throw new IllegalStateException();
- }
- public static boolean or(boolean a, boolean b) {
- return a ? true : b ? true : false;
- }
- public static boolean and(boolean a, boolean b) {
- return a ? b ? true : false : false;
- }
- }
Obviously, the or method isn’t equivalent to the || operator. The difference is that || evaluates its operand lazily, which means the second operand isn’t evaluated if the first one is true, because it’s not necessary for computing the result. But the or method evaluates its arguments strictly, which means that the second argument is evaluated even if its value isn’t needed, so the IllegalStateException is always thrown. In chapters 6 and 7 you encountered this problem with the getOrElse method because its argument was always evaluated, even if the computation was successful.
Implementing laziness
Laziness is necessary on many occasions. Java does in fact use laziness for constructs like if ... else, loops, and try ... catch blocks. Without laziness, a catch block, for example, would be evaluated even in the absence of an exception. Implementing laziness is a must when it comes to providing behavior for errors, as well as when you need to manipulate infinite data structures. Implementing laziness in Java isn’t fully possible, but you can produce a good approximation using the Supplier class you used in previous chapters:
- public interface Supplier<T> {
- T get();
- }
- Listing 9.3. Using laziness to emulate Boolean operators
- public class BooleanMethods {
- public static void main(String[] args) {
- System.out.println(getFirst() || getSecond());
- System.out.println(or(() -> getFirst(), () -> getSecond()));
- }
- public static boolean getFirst() {
- return true;
- }
- public static boolean getSecond() {
- throw new IllegalStateException();
- }
- public static boolean or(Supplier<Boolean> a, Supplier<Boolean> b) {
- return a.get() ? true : b.get() ? true : false;
- }
- public static boolean and(Supplier<Boolean> a, Supplier<Boolean> b) {
- return a.get() ? b.get() ? true : false : false;
- }
- }
This programs prints out the following:
The problem of laziness is nearly solved, although you’ve been forced to change the signature of your method. This is a low price to pay for using laziness. Of course, it might be overkill if the parameters are very quick to evaluate, or if they’re already evaluated, such as when using literal values. But it may save a great deal of time when evaluation requires a long computation. And if that evaluation isn’t free of side effects, it may completely change the outcome of the program.
Things you can’t do without laziness
So far, it may seem that the absence of laziness in evaluating expressions in Java isn’t a big deal. After all, why should you bother rewriting Boolean methods when you can use Boolean operators? There are, however, other cases where laziness would be useful. There are even several algorithms that can’t be implemented without resorting to laziness. I’ve already talked about how useless a strict version of if ... else would be. Think about the following algorithm:
This is an algorithm for finding the first ten primes, but this algorithm can’t be implemented without laziness. If you don’t believe me, just try it. Start with the first line. If you’re strict, you’ll first evaluate the list of positive integers. You’ll never have the opportunity to go to the second line, because the list of integers is infinite, and you’ll exhaust available memory before reaching the (nonexistent) end. Clearly, this algorithm can’t be implemented without laziness, but you know how to replace it with a different algorithm. The preceding algorithm was functional. If you want to find the result without resorting to laziness, you’ll have to replace it with an imperative algorithm, like this:
Sure, it works. But what a mess! First, it’s a bad recipe. Shouldn’t you increment the tested integer by 2 rather than by 1, in order to not test even numbers? And why test multiples of 3, 5, and so on? But more importantly, it doesn’t express the nature of the problem. It’s only a recipe for computing the result. This isn’t to say that the implementation details (such as not testing even numbers) aren’t important for getting good performance. But these implementation details should be clearly separated from the problem definition. The imperative description isn’t a description of the problem—it’s a description of another problem giving the same result.
In functional programming, you generally solve this kind of problem with a special structure: the lazy list, called Stream.
Why not use the Java 8 Stream?
Java 8 introduced a new structure called Stream. Can you use it for this type of computation? Well, you could, but there are several reasons not to do this:
For all these reasons, you’ll define your own streams in this chapter. After you’ve finished this chapter, you may prefer to use the Java 8 streams, but you’ll do so fully understanding what’s missing in the Java 8 implementation.
Creating a lazy list data structure
Now that you know how to represent non-evaluated data as instances of Supplier, you can easily define a lazy list data structure. It will be called Stream and will be very similar to the singly linked list you developed in chapter 5, with some subtle but very important differences. The following listing shows the starting point of your Stream data type.
- Listing 9.4. The Stream data type
- package fp.stream;
- import fp.compose.ch3.Supplier;
- public abstract class Stream<A> {
- @SuppressWarnings("rawtypes")
- private static Stream EMPTY = new Empty();
- public abstract A head();
- public abstract Stream<A> tail();
- public abstract Boolean isEmpty();
- private Stream(){}
- private static class Empty<A> extends Stream<A>{
- @Override
- public Stream<A> tail(){ throw new IllegalStateException("tail called on empty"); }
- @Override
- public A head(){ throw new IllegalStateException("head called on empty"); }
- @Override
- public Boolean isEmpty(){ return true; }
- }
- private static class Cons<A> extends Stream<A>{
- private final Supplier<A> head;
- private final Supplier<Stream<A>> tail;
- private Cons(Supplier<A> h, Supplier<Stream<A>> t)
- {
- head = h; tail = t;
- }
- @Override
- public A head(){ return head.get(); }
- @Override
- public Stream<A> tail(){ return tail.get(); }
- @Override
- public Boolean isEmpty(){ return false; }
- }
- static <A> Stream<A> cons(Supplier<A> hd, Supplier<Stream<A>> tl){ return new Cons<>(hd, tl); }
- @SuppressWarnings("unchecked")
- public static <A> Stream<A> empty(){ return EMPTY; }
- public static Stream<Integer> from(int i){ return cons(()->i, ()->from(i + 1));}
- }
- Stream<Integer> stream = Stream.from(1);
- System.out.println(stream.head());
- System.out.println(stream.tail().head());
- System.out.println(stream.tail().tail().head());
Memoizing evaluated values
The idea behind laziness is that you can save time by evaluating data only when it’s needed. This implies that you must evaluate data when it’s first accessed. But reevaluating it on subsequent accesses is a waste of time. Because you’re writing functional programs, multiple evaluation won’t harm anything, but it will slow the program. One solution is to memoize the evaluated value.
To do this, you’ll have to add fields for evaluated values in the Cons class:
- private final Supplier<A> head;
- private A h;
- private final Supplier<Stream<A>> tail;
- private Stream<A> t;
- public A head() {
- if (h == null) {
- h = head.get();
- }
- return h;
- }
- public Stream<A> tail() {
- if (t == null) {
- t = tail.get();
- }
- return t;
- }
Some languages offer lazy evaluation as a standard feature, whether by default or optionally. With such languages, you don’t need to resort to null references and mutable fields. Unfortunately, Java isn’t one of these languages. In Java, the most frequent approach when a value is to be initialized later is to first assign it the null reference if it’s an object type, or a sentinel value if it’s a primitive. This is risky because there’s no guarantee that the value will indeed be initialized to a significant value when needed. A null reference will probably cause a NullPointerException to be thrown, which at least will be noticed if exception handling has been implemented correctly, but a zero value could be an acceptable business value, leading to a program silently using this acceptable but incorrect value.
Alternatively, you could use a Result<A> to represent the value. This would avoid the use of the null reference, but you’d still have to use mutable fields. Because all this stuff is private, it’s acceptable to use null. But if you prefer, you can use a Result (or an Option) to represent the h and t fields.
Note that although the h and t fields must be mutable, they don’t need synchronization. The worst thing that may happen is that one thread will test the field and find it null, and then a second thread might also test the field before it has been initialized by the first one. The end result is that the field will have been initialized twice with potentially different (although equal) values. By itself, this isn’t a big problem; writing references is atomic, so the data can’t be corrupted. However, this could cause two instances of the corresponding object to coexist in memory. This won’t be a problem if you only test objects for equality, but it could be if you test them for identity (which, of course, you never do).
Also note that it’s possible to completely avoid null references and mutable fields at the cost of slight modifications in other places. Try to figure out how to do this. If you don’t know how, keep this idea in mind. We’ll come back to it near the end of this chapter. Now let's write a headOption method that returns the evaluated head of the stream. This method will be declared in the Stream parent class with the following signature (Exercise 9.1):
- public abstract Result<A> headOption();
- @Override
- public Result<A> headOption() {
- return Result.empty();
- }
- @Override
- public Result<A> headOption() {
- return Result.success(head());
- }
In the remainder of this chapter, you’ll learn how to compose streams while making the most of the fact that the data is unevaluated. But in order to look at the streams, you’ll need a method to evaluate them. Evaluating all the elements of a stream can be done by converting it to a List. Or you can process a stream by evaluating only the first n elements, or by evaluating elements as long as a condition is met.
Firstly, let's create a toList method to convert a Stream into a List (Exercise 9.2). A recursive version will simply cons the head of the stream to the result of the toList method applied to the tail. Of course, you’ll need to make this process tail recursive in order to use TailCall to get a stack-safe implementation:
- public List<A> toList() {
- return toList(this, List.list()).eval().reverse();
- }
- private TailCall<List<A>> toList(Stream<A> s, List<A> acc) {
- return s.isEmpty()
- ? ret(acc)
- : sus(() -> toList(s.tail(), acc.cons(s.head())));
- }
To continue, let's write a take(n) method that returns the first n elements of a stream, and a drop(n) method that returns the remaining stream after removing the first n elements. Note that you have to ensure that no evaluation occurs while calling these methods. Here are the signatures in the Stream parent class (Exercise 9.3):
- public abstract Stream<A> take(int n);
- public abstract Stream<A> drop(int n);
- @Override
- public Stream<A> take(int n){
- return n <= 0
- ? empty()
- : cons(head, () -> tail().take(n - 1));
- }
- List<Integer> list = Stream.from(1).take(10).toList();
- List<Integer> list = Stream.from(1).toList().takeAtMost(10);
- @Override
- public Stream<A> drop(int n) {
- return drop(this, n).eval();
- }
- public TailCall<Stream<A>> drop(Stream<A> acc, int n) {
- return n <= 0
- ? ret(acc)
- : sus(() -> drop(acc.tail(), n - 1));
- }
- public abstract Stream<A> takeWhile(Function<A, Boolean> p)
- public Stream<A> takeWhile(Function<A, Boolean> f) {
- return f.apply(head())
- ? cons(head, () -> tail().takeWhile(f))
- : empty();
- }
Write a dropWhile method that returns a stream with the front elements removed as long as they satisfy a condition. Here’s the signature in the Stream parent class (Exercise 9.5):
- public Stream<A> dropWhile(Function<A, Boolean> p);
- public Stream<A> dropWhile(Function<A, Boolean> p) {
- return dropWhile(this, p).eval();
- }
- private TailCall<Stream<A>> dropWhile(Stream<A> acc, Function<A, Boolean> p) {
- return acc.isEmpty()
- ? ret(acc)
- : p.apply(acc.head())
- ? sus(() -> dropWhile(acc.tail(), p))
- : ret(acc);
- }
沒有留言:
張貼留言