程式扎記: [ In Action ] The Simple Groovy datatypes - Working with strings

Monday, December 30, 2013

Preface:
Groovy strings come in two flavors: plain strings and GStrings. Plain strings are instances of java.lang.String, and GStrings are instances of groovy.lang.GString. GStrings allow placeholder expressions to be resolved and evaluated at runtime. Many scripting languages have a similar feature, usually called string interpolation, but it is more primitive than Groovy's GString feature. Let's start by looking at each flavor of string and how it appears in code.

Varieties of string literals:
Java allows only one way of specifying string literals: placing text in quotes “like this.” If you want to embed dynamic values within the string, you have to either call a formatting method (made easier but still far from simple in Java 1.5) or concatenate each constituent part. If you specify a string with a lot of backslashes in it (such as a Windows file name or a regular expression), your code becomes hard to read, because you have to double the backslashes. If you want a lot of text spanning several lines in the source code, you have to make each line contain a complete string (or several complete strings).

Groovy recognizes that not every use of string literals is the same, so it offers a variety of options. These are summarized in table 3.5.



The aim of each form is to specify the text data you want with the minimum of fuss. Each of the forms has a single feature that distinguishes it from the others:
- The single-quoted form
The single-quoted form is never treated as a GString, whatever its contents. This is closely equivalent to Java string literals.

- The double-quoted form
The double-quoted form is the equivalent of the single-quoted form, except that if the text contains unescaped dollar signs, it is treated as a GString instead of a plain string. GStrings are covered in more detail in the next section.

- The triple-quoted form 
The triple-quoted form (or multiline string literal) allows the literal to span several lines. Line breaks are always treated as '\n' regardless of the platform, but all other whitespace is preserved as it appears in the text file. Multiline string literals may also be GStrings, depending on whether single quotes or double quotes are used. Multiline string literals act similarly to here-documents in Ruby or Perl.

- The slashy form 
The slashy form of string literal allows strings with backslashes to be specified simply, without having to escape all the backslashes. This is particularly useful with regular expressions, as you'll see later. A backslash needs to be escaped only when it is followed by a u. At that point life is slightly harder, because specifying \u involves using a GString or specifying the Unicode escape sequence for a backslash.
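The four forms described above can be compared side by side. The following is a minimal sketch rather than a reproduction of the book's table 3.5; the variable names are made up for illustration.

```groovy
def name = 'Groovy'   // hypothetical value used only for this illustration

// Single-quoted: always a plain java.lang.String, the dollar sign stays literal
def single = 'Hello $name'
assert single instanceof String
assert single.contains('$')

// Double-quoted without a placeholder: still a plain String
def plain = "Hello"
assert plain instanceof String

// Double-quoted with a placeholder: a GString
def gstr = "Hello $name"
assert gstr instanceof GString
assert gstr == 'Hello Groovy'

// Triple-quoted: may span several lines, the line break becomes '\n'
def multi = '''first
second'''
assert multi == 'first\nsecond'

// Slashy: backslashes do not need to be doubled
def slashy = /\d+\.\d+/
assert slashy == '\\d+\\.\\d+'
```

The slashy literal and its quoted counterpart hold the same characters; only the slashy version reads like the regular expression it represents.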

Groovy uses a mechanism similar to Java's for specifying special characters, such as linefeeds and tabs. In addition to the Java escapes, dollar signs can be escaped in Groovy so that they can be specified easily without the compiler treating the literal as a GString. The full set of escaped characters is specified in table 3.6.


Note that in a double-quoted string, single quotes don’t need to be escaped, and vice versa. In other words, 'I said, "Hi."' and "don't" both do what you hope they will. For the sake of consistency, both still can be escaped in each case. Likewise, dollar signs can be escaped in single-quoted strings, even though they don’t need to be. This makes it easier to switch between the forms.
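As a small sketch of these quoting rules (the variable names are illustrative only), each literal below is checked against the same text written with the other kind of quote:

```groovy
// Double quotes inside a single-quoted literal need no escaping ...
def quoted = 'I said, "Hi."'
assert quoted == "I said, \"Hi.\""

// ... and a single quote inside a double-quoted literal needs none either
def contraction = "don't"
assert contraction == 'don\'t'

// An escaped dollar sign keeps a double-quoted literal a plain String
def price = "my 0.02\$"
assert price instanceof String
assert price == 'my 0.02$'
```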

Note that Java uses single quotes for character literals, but as you have seen, Groovy cannot do so because single quotes are already used to specify strings. However, you can achieve the same as in Java when providing the type explicitly:
    char a = 'x'  // or Character a = 'x'
The java.lang.String 'x' is coerced into a java.lang.Character. If you want to coerce a string into a character at other times, you can do so in either of the following ways:
    'x' as char
    'x'.toCharacter()
Whichever literal form is used, unless the compiler decides it is a GString, it ends up as an instance of java.lang.String, just like Java string literals. So far, we have only teased you with allusions to what GStrings are capable of. Now it's time to spill the beans.

Working with GStrings:
GStrings are like strings with additional capabilities. They are literally declared in double quotes. What makes a double-quoted string literal a GString is the appearance of placeholders. Placeholders may appear in a full ${expression} syntax or an abbreviated $reference syntax. See the examples in listing 3.2.
- Listing 3.2 Working with GStrings
    // Pin the default time zone so the Date(0) assertions hold on any machine
    TimeZone.default = TimeZone.getTimeZone('GMT')

    me      = 'Tarzan'
    you     = 'Jane'
    line    = "me $me - you $you"    // Abbreviated dollar syntax
    assert  line == 'me Tarzan - you Jane'

    date = new Date(0)
    out  = "Year $date.year Month $date.month Day $date.date"  // Extended abbreviation
    assert out == 'Year 70 Month 0 Day 1'
    out = "Date is ${date.toGMTString()} !"  // Full syntax with curly braces
    assert out == 'Date is 1 Jan 1970 00:00:00 GMT !'

    // Multiple-line GString
    sql = """
    SELECT FROM MyTable
           WHERE Year = $date.year
    """

    assert sql == """
    SELECT FROM MyTable
           WHERE Year = 70
    """

    // Literal dollar sign
    out = "my 0.02\$"
    assert out == 'my 0.02$'
Although GStrings behave like java.lang.String objects for all operations that a programmer is usually concerned with, they are implemented differently to capture the fixed and the dynamic parts (the so-called values) separately. This is revealed by the following code:
    me      = 'Tarzan'
    you     = 'Jane'
    line    = "me $me - you $you"
    assert line == 'me Tarzan - you Jane'
    assert line instanceof GString
    assert line.strings[0] == 'me '
    assert line.strings[1] == ' - you '
    assert line.values[0]  == 'Tarzan'
    assert line.values[1]  == 'Jane'
By the time the GString is converted into a java.lang.String (its toString method is called explicitly or implicitly), each value gets written to the string. Because the logic of how to write a value can be elaborate for certain value types, this behavior can be used in advanced ways.
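To make the timing concrete, here is a hedged sketch. The first half follows directly from the text: a plain placeholder captures its value when the GString is created, and conversion to java.lang.String happens on demand. The second half uses a closure placeholder, `${-> ...}`, which is not discussed in this section and is shown only as one example of an "advanced" value type that is evaluated when the string is written. All variable names are illustrative.

```groovy
def me = 'Tarzan'
def line = "me $me"
assert line instanceof GString
assert line.values[0] == 'Tarzan'       // the value was captured at creation

me = 'Cheetah'                          // reassigning the variable later ...
assert line.toString() == 'me Tarzan'   // ... does not affect the captured value

String text = line                      // implicit conversion via toString
assert text instanceof String

// Closure placeholder: evaluated each time the GString is written out
def who = 'Tarzan'
def lazy = "me ${-> who}"
who = 'Jane'
assert lazy.toString() == 'me Jane'
```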

From Java to Groovy:
Now that you have your strings easily declared, you can have some fun with them. Because they are objects of type java.lang.String, you can call String's methods on them or pass them as parameters wherever a string is expected, such as for easy console output:
    System.out.print("Hello Groovy!");
This line is equally valid Java and Groovy. You can also pass a literal Groovy string in single quotes:
    System.out.print('Hello Groovy!');
Because this is such a common task, the GDK provides a shortened syntax:
    print('Hello Groovy!');
You can drop parentheses and semicolons, because they are optional and do not help readability in this case. The resulting Groovy style boils down to:
    print 'Hello Groovy!'
Looking at this last line only, you cannot tell whether this is Groovy, Ruby, Perl, or one of several other line-oriented scripting languages. It may not look sophisticated, but in a way it is. It shows expressiveness—the art of revealing intent in the simplest possible way.

Listing 3.3 presents more of the mix-and-match between core Java and additional GDK capabilities. How would you judge the expressiveness of each line?
- Listing 3.3 What to do with strings
    greeting = 'Hello Groovy!'
    assert greeting.startsWith('Hello')
    assert greeting.getAt(0) == 'H'
    assert greeting[0]       == 'H'
    assert greeting.indexOf('Groovy') >= 0
    assert greeting.contains('Groovy')
    assert greeting[6..11]  == 'Groovy'
    assert 'Hi' + greeting - 'Hello' + '?' - '!' == 'Hi Groovy?'
    assert greeting.count('o') == 3
    assert 'x'.padLeft(3)      == '  x'
    assert 'x'.padRight(3,'_') == 'x__'
    assert 'x'.center(3)       == ' x '
    assert 'x' * 3             == 'xxx'
These self-explanatory examples give an impression of what is possible with strings in Groovy. If you have ever worked with other scripting languages, you may notice that a useful piece of functionality is missing from listing 3.3: changing a string in place. Groovy cannot do so because it works on instances of java.lang.String and obeys Java’s invariant of strings being immutable.
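A short sketch of what immutability means in practice: every operation that appears to modify a string actually returns a new object and leaves the original untouched. The variable names are illustrative.

```groovy
def greeting = 'Hello Groovy!'

// Each operation yields a new String
def shouted = greeting.toUpperCase()
def trimmed = greeting - '!'

assert shouted == 'HELLO GROOVY!'
assert trimmed == 'Hello Groovy'

// The original string is unchanged, and the results are different objects
assert greeting == 'Hello Groovy!'
assert !shouted.is(greeting)
assert !trimmed.is(greeting)
```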

Before you say “What a lame excuse!” here is Groovy's answer to changing strings: although you cannot work on String, you can still work on StringBuffer! On a StringBuffer, you can use the << left shift operator for appending and the subscript operator for in-place assignments. Using the left shift operator on a String returns a StringBuffer. Here is the StringBuffer equivalent to listing 3.3:
    greeting = 'Hello'
    assert greeting instanceof java.lang.String
    greeting <<= ' Groovy'  // 1) Leftshift and assign at once
    assert greeting instanceof java.lang.StringBuffer
    greeting << '!'  // 2) Leftshift on StringBuffer
    assert greeting.toString() == 'Hello Groovy!'
    greeting[1..4] = 'i'  // Substring 'ello' becomes 'i'
    assert greeting.toString() == 'Hi Groovy!'
Note:
Although the expression stringRef << string returns a StringBuffer, that StringBuffer is not automatically assigned to stringRef (see (1)). When << is used on a String, it needs an explicit assignment; on a StringBuffer it doesn't. With a StringBuffer, the data in the existing object is changed (see (2)). With a String we can't change the existing data, so a new object has to be returned instead.
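The difference can be checked directly. In this sketch (variable names are illustrative) the left shift on a String leaves the String alone and hands back a new StringBuffer, while the left shift on a StringBuffer modifies and returns the same object:

```groovy
// Left shift on a String: the original is untouched, the result is a new StringBuffer
def text = 'Hello'
def shifted = text << ' Groovy'
assert text == 'Hello'
assert shifted instanceof StringBuffer
assert shifted.toString() == 'Hello Groovy'

// Left shift on a StringBuffer: the existing object is changed in place
def buffer = new StringBuffer('Hello')
def returned = buffer << ' Groovy'
assert returned.is(buffer)
assert buffer.toString() == 'Hello Groovy'
```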

