For the time being, assume that you can work on a large string. You have numerous ways of splitting this string into words. But how do you count and store the word frequencies? You cannot have a distinct variable for each possible word you encounter. Finding a way of storing frequencies in a list is possible but inconvenient, and is more suitable for a brain teaser than for good code. Maps come to the rescue.
Some pseudocode to solve the problem could look like this:
The specification of maps is analogous to the list specification that you saw in the previous section. Just like lists, maps use the subscript operator to retrieve and assign values. The difference is that maps accept a key of arbitrary type as the argument to the subscript operator, whereas lists are bound to integer indexes. Lists are aware of the sequence of their entries; maps generally are not. Specialized maps such as java.util.TreeMap do keep their keys in sorted order, though.
Simple maps are specified with square brackets around a sequence of items, delimited with commas. The key feature of maps is that the items are key-value pairs that are delimited by colons, as in [a:1, b:2, c:3].
The character sequence [:] declares an empty map. In current Groovy versions, map literals are of type java.util.LinkedHashMap by default. A map can also be created explicitly by calling the constructor of the desired map type. The resulting map can still be used with the subscript operator. In fact, this works with any type of map, as you see in listing 4.11 with type java.util.TreeMap.
- Listing 4.11 Specifying maps
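The following is a minimal sketch of how maps can be specified, not a reproduction of the original listing. The variable names myMap, emptyMap, and explicitMap are chosen only for illustration.

```groovy
// Literal map: items separated by commas, key and value by a colon.
def myMap = [a: 1, b: 2, c: 3]

assert myMap instanceof Map
assert myMap.size() == 3
assert myMap['a'] == 1        // subscript operator with a String key

// [:] declares an empty map.
def emptyMap = [:]
assert emptyMap.size() == 0

// Explicit construction of a specific map type;
// the subscript operator works the same way.
def explicitMap = new TreeMap()
explicitMap.putAll(myMap)
assert explicitMap['a'] == 1
```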
The shorthand of writing string keys without quotes can also get in the way when, for example, the content of a local variable is used as a key. Suppose you have a local variable x with content 'a'. Because [x:1] is equal to ['x':1], how can you make it equal to ['a':1]? The trick is that you can force Groovy to recognize a symbol as an expression by putting it inside parentheses, as in [(x):1].
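A short sketch of the difference between a bare key and a parenthesized key, using the local variable x from the text:

```groovy
def x = 'a'

// Without parentheses, x is taken as the literal string key 'x'.
assert [x: 1] == ['x': 1]

// With parentheses, x is evaluated as an expression, giving the key 'a'.
assert [(x): 1] == ['a': 1]
```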
Using map operators:
The simplest operations with maps are storing objects in the map with a key and retrieving them back using that key. Listing 4.12 demonstrates how to do that. One option for retrieving is using the subscript operator. As you have probably guessed, this is implemented with the map's getAt method. A second option is to use the key like a property with a simple dot-syntax. You will learn more about properties in chapter 7. A third option is the get method, which optionally takes a default value to return when the key is not yet in the map. Without a default, get returns null for a missing key. When get(key,default) is called with a missing key, the default is returned and the key:default pair is also added to the map.
- Listing 4.12 Accessing maps (GDK map methods)
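The three retrieval options and the side effect of get(key,default) can be sketched as follows. This is an illustrative example rather than the original listing, and the map content is an assumption.

```groovy
def myMap = [a: 1, b: 2, c: 3]

// Three ways to retrieve a value for an existing key
assert myMap['a'] == 1           // subscript operator (getAt)
assert myMap.a == 1              // property-style dot syntax
assert myMap.get('a') == 1       // plain get method
assert myMap.get('a', 0) == 1    // key exists, so the default is ignored

// A missing key yields null
assert myMap['d'] == null
assert myMap.d == null
assert myMap.get('d') == null

// get with a default returns the default and stores it in the map
assert myMap.get('d', 0) == 0
assert myMap.d == 0

// Assignment with the subscript operator or the dot syntax
myMap['d'] = 1
assert myMap.d == 1
myMap.d = 2
assert myMap.d == 2
```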
- Listing 4.13 Query methods on maps
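A minimal sketch of common query methods, mixing methods from java.util.Map with GDK additions such as any and every. The sample maps are assumptions for illustration.

```groovy
def myMap = [a: 1, b: 2, c: 3]
def other = [b: 2, c: 3, a: 1]

// Methods inherited from java.util.Map
assert myMap == other                 // equality does not depend on order
assert !myMap.isEmpty()
assert myMap.size() == 3
assert myMap.containsKey('a')
assert myMap.containsValue(1)
assert myMap.keySet() == ['a', 'b', 'c'] as Set
assert myMap.values().toList() == [1, 2, 3]
assert myMap.entrySet() instanceof Collection

// GDK additions that take a closure
assert myMap.any { entry -> entry.value > 2 }
assert myMap.every { entry -> entry.key < 'd' }
```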
- Listing 4.14 Iterating over maps (GDK)
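Iteration over a map can be sketched as below. The closure passed to each may take either a single entry or a key and a value, and the classic for loop works on the key set or the value collection. The variable store is a hypothetical helper used only to record what was visited.

```groovy
def myMap = [a: 1, b: 2, c: 3]

// each with a single parameter receives a Map.Entry
def store = ''
myMap.each { entry ->
    store += entry.key
    store += entry.value
}
assert store.contains('a1')
assert store.contains('b2')
assert store.contains('c3')

// each with two parameters receives key and value separately
store = ''
myMap.each { key, value ->
    store += key
    store += value
}
assert store.contains('a1')

// for loop over the keys
store = ''
for (key in myMap.keySet()) {
    store += key
}
assert store.contains('a')

// for loop over the values
store = ''
for (value in myMap.values()) {
    store += value
}
assert store.contains('1')
```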
- Listing 4.15 Changing map content and building new objects from it
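A sketch of methods that change a map or build new objects from it. It is an illustration under assumed sample data, not the original listing. The names abMap, found, and doubled are hypothetical.

```groovy
def myMap = [a: 1, b: 2, c: 3]
myMap.clear()
assert myMap.isEmpty()

myMap = [a: 1, b: 2, c: 3]
myMap.remove('a')
assert myMap.size() == 2

myMap = [a: 1, b: 2, c: 3]

// subMap builds a new map restricted to the given keys
def abMap = myMap.subMap(['a', 'b'])
assert abMap.size() == 2

// findAll returns a new map with all matching entries
abMap = myMap.findAll { entry -> entry.value < 3 }
assert abMap.size() == 2
assert abMap.a == 1

// find returns the first matching entry
def found = myMap.find { entry -> entry.value < 2 }
assert found.key == 'a'
assert found.value == 1

// collect builds a list from the closure results
def doubled = myMap.collect { entry -> entry.value * 2 }
assert doubled instanceof List
assert doubled.every { item -> item % 2 == 0 }
```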
Maps in action:
Let's revisit our initial example of counting word frequencies in a text corpus. The strategy is to use a map with each distinct word serving as a key. The mapped value of that word is its frequency in the text corpus. We go through all words in the text and increase the frequency value of the respective word in the map. We need to make sure that we can increase the value when a word is encountered for the first time and there is no entry yet in the map. Luckily, the get(key,default) method does the job.
- Listing 4.16 Counting word frequency with maps
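The strategy described above can be sketched as follows. The sample text and the names textCorpus, wordFrequency, and wordList are assumptions for illustration and do not come from the original listing.

```groovy
def textCorpus = '''
the quick brown fox jumps over the lazy dog
the dog barks and the fox runs away
'''

// Split the text into words on whitespace.
def words = textCorpus.tokenize()

// Count each word; get(word, 0) supplies 0 the first time a word is seen.
def wordFrequency = [:]
words.each { word ->
    wordFrequency[word] = wordFrequency.get(word, 0) + 1
}

assert wordFrequency['the'] == 4
assert wordFrequency['fox'] == 2
assert wordFrequency['quick'] == 1

// Sort the distinct words by ascending frequency;
// the most frequent word ends up last.
def wordList = wordFrequency.keySet().toList()
wordList.sort { wordFrequency[it] }
assert wordList[-1] == 'the'
```

The get(key,default) call removes the need for an explicit check on whether the word already has an entry in the map.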