程式扎記: [ In Action ] Working with the GDK - Working with files and I/O

Wednesday, August 20, 2014

Preface:
Hardly any script (let alone whole applications) can do without file access and other input/output-related issues. The JDK addresses this need with its java.io and java.net packages. It provides elaborate support with the File and URL classes and numerous versions of streams, readers, and writers.

However, the programmer is left with the repetitive, tedious, and error-prone task of managing I/O resources, such as properly closing an opened file even if exceptions occur while processing. This is where the GDK steps in and provides numerous methods that let you focus on the task at hand rather than thinking about I/O boilerplate code. This results in faster development, better readability of your code, and more stable solutions, because resource leaks are less likely with centralized error-handling. Having read chapter 5, you may correctly surmise that this is a job for closures.
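
To make the resource-handling point concrete, here is a minimal sketch that is not part of the original text. It assumes a hypothetical TrackingReader helper class, introduced only to observe whether close() was called, and checks that the GDK's withReader closes the reader even when the closure throws.

```groovy
// Hypothetical helper (not part of the GDK): a StringReader that records whether close() was called
class TrackingReader extends StringReader {
    boolean closed = false

    TrackingReader(String text) { super(text) }

    @Override
    void close() {
        closed = true
        super.close()
    }
}

def reader = new TrackingReader('line one\nline two')
def failed = false
try {
    reader.withReader { r ->
        // Simulate a failure in the middle of processing
        throw new IllegalStateException('simulated processing failure')
    }
} catch (IllegalStateException expected) {
    failed = true
}

assert failed          // the exception still reaches the caller
assert reader.closed   // but the reader was closed on the way out
```

No try/finally appears in the calling code; the cleanup lives inside withReader.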

In table 9.3 (if the target is a File, iterate over its lines), you saw that File objects work with Object’s iteration methods. Listing 9.4 uses this approach to print itself to the console, so the output is exactly what you see as listing 9.4. Assertions show the use of any, findAll, and grep. Note that file.grep{it} returns only non-empty lines, because empty strings evaluate to false.
- Listing 9.4 File’s object iteration method examples
  file = new File('Listing_9_4_File_Iteration.groovy')
  file.each{println it}
  assert file.any {it =~ /File/}
  assert 3 == file.findAll{it =~ /File/}.size()

  assert 5 ==  file.grep{it}.size()
Additionally, the GDK defines numerous methods with overloaded variants on File, URL, Reader, Writer, InputStream, and OutputStream. We will present detailed explanations and examples for at least one variant of every important or commonly used method. The usage of the remaining methods/variants is analogous.
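
Because the assertions in listing 9.4 depend on the listing's own source text, a self-contained variant may be easier to experiment with. The sketch below is an illustration rather than the book's code: it writes a temporary file with known content (the file name prefix and the content are made up) and calls readLines explicitly, so it does not rely on File's object-iteration behavior.

```groovy
// Temporary file with known content, so the expected counts are easy to verify
def sample = File.createTempFile('gdk-iteration', '.txt')
sample.text = 'File one\n\nplain two\nFile three\n'

def lines = sample.readLines()                  // four lines, the second one empty
def hasFile = lines.any { it =~ /File/ }        // true if at least one line matches
def fileLines = lines.findAll { it =~ /File/ }  // only the matching lines
def nonEmpty = lines.grep { it }                // empty strings evaluate to false

assert hasFile
assert fileLines == ['File one', 'File three']
assert nonEmpty.size() == 3

sample.delete()
```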

Traversing the filesystem:
Groovy follows the Java approach of using the File class for both files and directories, where a File object represents a location (not content, contrary to a common misconception). Using a File object from Groovy often includes calling its JDK methods in a property-style manner. For example, to display information about the current directory, you can use:
  file = new File('.')
  println file.name
  println file.absolutePath
  println file.canonicalPath
  println file.directory
Listing 9.5 shows this in conjunction with the GDK methods eachDir, eachFile, eachFileMatch, and eachFileRecurse. They all work with a closure that gets a File object passed into it, disregarding the filesystem entries that represent the current and parent dir (“.” and “..”). Whereas eachFile yields File objects that may represent files or directories, eachDir yields only the latter.
- Listing 9.5 File methods for traversing the filesystem
  homedir = new File('/java/groovy')
  dirs = []
  homedir.eachDir{dirs << it.name } // Closure recording directory names
  assert ['bin','conf','docs','embeddable','lib'] == dirs

  cvsdir = new File('/cygwin/home/dierk/groovy')
  files = []
  cvsdir.eachFile{files << it.name} // Closure recording filenames
  assert files.contains('.cvsignore')
  assert files.contains('CVS')

  files = []
  cvsdir.eachFileMatch(~/groovy.*/){files << it.name} // Closure recording filenames matching a pattern
  assert ['groovy-core', 'groovy-native'] == files

  docsdir = new File('/java/groovy/docs')
  count = 0
  docsdir.eachFileRecurse{if (it.directory) count++}  // Closure counting directories recursively
  assert 104 == count
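
Listing 9.5 depends on directories that exist only on the author's machine. As a rough, self-contained sketch of the same four methods, the following builds a small throwaway tree first; the directory and file names are invented for illustration, and the temporary directory comes from java.nio.file.Files rather than anything in the original listing.

```groovy
import java.nio.file.Files

// Throwaway tree: root/{docs/api/, lib/, groovy-core.txt, groovy-native.txt, README}
def root = Files.createTempDirectory('gdk-traverse').toFile()
new File(root, 'docs/api').mkdirs()
new File(root, 'lib').mkdirs()
['groovy-core.txt', 'groovy-native.txt', 'README'].each { new File(root, it).text = '' }

def dirs = []
root.eachDir { dirs << it.name }                        // directories only

def entries = []
root.eachFile { entries << it.name }                    // files and directories

def matched = []
root.eachFileMatch(~/groovy.*/) { matched << it.name }  // names matching the pattern

def dirCount = 0
root.eachFileRecurse { if (it.directory) dirCount++ }   // docs, docs/api, lib

// Listing order can depend on the filesystem, so compare sorted lists
assert dirs.sort() == ['docs', 'lib']
assert entries.sort() == ['README', 'docs', 'groovy-core.txt', 'groovy-native.txt', 'lib']
assert matched.sort() == ['groovy-core.txt', 'groovy-native.txt']
assert dirCount == 3

root.deleteDir()   // GDK method: removes the directory and everything below it
```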
Reading from input sources:
Suppose we have a file example.txt in the data directory below our current one. It contains:
  line one
  line two
  line three
One of the most common operations with such small text files is to read them at once into a single string. Doing so and printing the contents to the console is as easy as calling the file’s text property (similar to the getText method):
  println new File('data/example.txt').text
What’s particularly nice about the text property is that it is available not only on File, but also on Reader, InputStream, and even URL. Where applicable, you can pass a Charset to the getText method. See the API documentation of java.nio.charset.Charset for details of how to obtain a reference to a Charset.
BY THE WAY
Groovy comes with a class groovy.util.CharsetToolkit that can be used to guess the encoding. See its API documentation for details.
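
As a small illustration of reading text from the different input types, the sketch below writes a temporary file in a known encoding and reads it back four ways. It is not from the original text; it assumes the getText overloads that take the charset name as a String, and the file name prefix and sample string are made up.

```groovy
// Write a file in a known encoding, then read it back through several input types
def file = File.createTempFile('gdk-text', '.txt')
def original = 'Gr\u00fc\u00dfe, Groovy'      // non-ASCII characters written as escapes
file.write(original, 'UTF-8')                 // write(text, charsetName)

def fromFile   = file.getText('UTF-8')
def fromStream = new ByteArrayInputStream(file.bytes).getText('UTF-8')
def fromReader = new StringReader(original).text
def fromUrl    = file.toURI().toURL().getText('UTF-8')

assert [fromFile, fromStream, fromReader, fromUrl].every { it == original }

file.delete()
```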

Listing 9.6 goes through some examples of file reading with more fine-grained control. The readLines method returns a list of strings, each representing one line in the input source with newline characters chopped.
- Listing 9.6 File-reading examples
  example = new File('data/example.txt')
  lines = ['line one','line two','line three']
  // readLines returns the lines without their newline characters
  assert lines == example.readLines()
  example.eachLine {
      assert it.startsWith('line')
  }
  // eachByte passes every byte of the file to the closure
  hex = []
  example.eachByte { hex << it }
  assert hex.size() == example.length()
  // Split each line on whitespace; the first field is always 'line'
  example.splitEachLine(/\s/){
      assert 'line' == it[0]
  }
  // The with<Resource> methods close the resource when the closure returns
  example.withReader { reader ->
      assert 'line one' == reader.readLine()
  }
  example.withInputStream { is ->
      assert 'line one' == is.readLines()[0]
  }
The eachLine method works on files exactly like the iteration method each does. The method is also available on Reader, InputStream, and URL. Input sources can be read a byte at a time with eachByte, where an object of type java.lang.Byte gets passed into the closure.

When the input source is made of formatted lines, splitEachLine can be handy. For every line, it splits the line with the given regular expression and passes the resulting list of items to its closure. Generally, the with<Resource> methods pass the <Resource> into the closure and handle resource management appropriately; withReader and withInputStream are two examples. The readLine method can then be used on the given Reader, while the listing calls readLines on the InputStream.
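
A short sketch of splitEachLine on comma-separated records, with withReader used for its return value, might look like this. The file content and variable names are invented for illustration.

```groovy
def data = File.createTempFile('gdk-split', '.csv')
data.text = 'alice,30\nbob,25\n'

// splitEachLine hands each line to the closure as a list of fields
def ages = [:]
data.splitEachLine(',') { fields ->
    ages[fields[0]] = fields[1].toInteger()
}

// withReader closes the reader for us and returns the closure's result
def firstLine = data.withReader { reader -> reader.readLine() }

assert ages == [alice: 30, bob: 25]
assert firstLine == 'alice,30'

data.delete()
```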

Writing to output destinations:
Listing 9.7 uses the corresponding methods for writing to an output destination. Writing a whole file at once can be achieved with File’s write method; appending is done with append. The with<Resource> methods work exactly as you would expect. The use of withWriter and withWriterAppend is shown in the listing; withPrintWriter and withOutputStream are analogous. The leftshift operator on File has the meaning of append.
- Listing 9.7 File-writing examples
  def outFile = new File('data/out.txt')
  def lines = ['line one','line two','line three']

  // Writing/appending with simple method calls
  outFile.write(lines[0..1].join("\n"))
  outFile.append("\n"+lines[2])
  assert lines == outFile.readLines()

  // Writing/appending with closures
  outFile.withWriter { writer ->
      writer.writeLine(lines[0])
  }
  outFile.withWriterAppend('ISO8859-1') { writer ->
      writer << lines[1] << "\n"
  }
  outFile << lines[2] // Appending with the leftshift operator

  assert lines == outFile.readLines()
The example file in listing 9.7 has been opened and closed seven times: five times for writing, two times for reading. You see no error-handling code for properly closing the file in case of exceptions. File’s GDK methods handle that on our behalf. Note the use of the writeLine and << leftshift methods. Other classes that are enhanced by the GDK with the leftshift operator with the exact same meaning are Process and Socket.
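
Since the listing leaves withPrintWriter and withOutputStream as "analogous", here is a minimal sketch of both on a temporary file. The file name prefix and contents are made up for the example.

```groovy
def out = File.createTempFile('gdk-write', '.txt')

// withPrintWriter: println appends the platform line separator
out.withPrintWriter { pw ->
    pw.println('line one')
    pw.println('line two')
}
def afterPrintWriter = out.readLines()

// withOutputStream: raw bytes; opening the stream truncates the existing content
out.withOutputStream { os ->
    os.write('line three'.getBytes('UTF-8'))
}
def afterOutputStream = out.text

assert afterPrintWriter == ['line one', 'line two']
assert afterOutputStream == 'line three'

out.delete()
```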

The leftshift operator on Writer objects is a clever beast. It relays to Writer’s write method, which in the GDK makes a best effort to write the argument. The idea is to write a string representation with special support for arrays, maps, and collections. For general objects, toString is used.

If the argument is of type InputStream or Reader, its content is pumped into the writer. Listing 9.8 shows this in action.
- Listing 9.8 Using Writer’s smart leftshift operator
  reader = new StringReader('abc')
  writer = new StringWriter()
  writer << "\nsome String"   << "\n"
  writer << [a:1, b:2]        << "\n"
  writer << [3,4]             << "\n"
  writer << new Date(0)       << "\n"
  writer << reader            << "\n"
  // The Date line depends on the default time zone (UTC+8 here)
  assert writer.toString() == '''
some String
[a:1, b:2]
[3, 4]
Thu Jan 01 08:00:00 CST 1970
abc
'''
Note that connecting a reader with a writer is as simple as:
  writer << reader
It may seem like magic, but it is a straightforward application of operator overriding done by the GDK.

Finally, the leftshift operator on Writer objects has special support for arguments of type Writable. In general, a Writable is an object with a writeTo method: It knows how to write something. This makes a Writable applicable to:
  writer << writable
The Writable interface is newly introduced by the GDK and used with Groovy’s template engines, as you will see in section 9.4. It is also used with filtering, as shown in the next section.
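
To show what implementing Writable can look like, here is a minimal sketch with a hypothetical Greeting class (the class and its name property are invented for illustration). The only requirement is a writeTo method that writes to the given Writer and returns it.

```groovy
// Hypothetical example class: anything implementing groovy.lang.Writable
// can be shifted into a Writer
class Greeting implements Writable {
    String name

    @Override
    Writer writeTo(Writer out) throws IOException {
        out.write('Hello, ' + name + '!')
        return out
    }
}

def writer = new StringWriter()
writer << new Greeting(name: 'Groovy') << '\n'

assert writer.toString() == 'Hello, Groovy!\n'
```

The object decides how to render itself only when it meets a writer, which is what makes the interface useful for templates and filters.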

Filters and conversions:
There are times when ready-made resource handling as implemented by the with<Resource> methods is not what you want. This is when you can use the methods newReader, newInputStream, newOutputStream, newWriter, and newPrintWriter to convert from a File object to the type of resource you need.
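
A rough sketch of the manual style follows, using newWriter and newReader on a temporary file (file name and content invented for the example). With these methods the caller owns the resource, so the try/finally that the with<Resource> methods hide comes back into view.

```groovy
def file = File.createTempFile('gdk-convert', '.txt')

// newWriter hands back a BufferedWriter; closing it is now our job
def writer = file.newWriter()
try {
    writer.write('first')
    writer.newLine()
    writer.write('second')
} finally {
    writer.close()
}

// Same for newReader: keep it open across several calls, then close it
def reader = file.newReader()
def first, second
try {
    first = reader.readLine()
    second = reader.readLine()
} finally {
    reader.close()
}

assert [first, second] == ['first', 'second']
file.delete()
```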

Two other conversions of this kind are from String and StringBuffer to their respective Writers via:
  StringWriter writer = myString.createStringWriter()
  StringBufferWriter sbw = myStringBuffer.createStringBufferWriter()
A second kind of conversion is transformation of the content, either character by character or line by line. Listing 9.9 shows how you can use transformChar and transformLine for this task. They both take a closure argument that determines the transformation result. Whatever that closure returns gets written to the writer argument.
- Listing 9.9 Transforming and filtering examples
  reader = new StringReader('abc')
  writer = new StringWriter()

  // Transform 'abc' to 'bcd'
  reader.transformChar(writer) { it.next() }
  assert 'bcd' == writer.toString()

  // The expected strings below assume Windows line endings (\r\n)

  // Chop 'line' from each line of the example file
  reader = new File('data/example.txt').newReader()
  writer = new StringWriter()
  reader.transformLine(writer) { it - 'line' }
  assert " one\r\n two\r\n three\r\n" == writer.toString()

  // Read only lines containing "one"
  input  = new File('data/example.txt')
  writer = new StringWriter()
  input.filterLine(writer) { it =~ /one/ }
  assert "line one\r\n" == writer.toString()

  // 1) Read only long lines
  writer = new StringWriter()
  writer << input.filterLine { it.size() > 8 }
  assert "line three\r\n"  == writer.toString()
Note that the last example of filterLine at (1) doesn’t take a writer argument but returns a Writable that is then written to the writer with the leftshift operator.
NOTE.
The *Line methods use the newLine method of the corresponding writer, thus producing system-dependent line feeds. They also produce a line feed after the last line, even if the source stream did not end with one.
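
The note above can be checked in a platform-independent way by building the expected strings from the line.separator system property instead of hard-coding "\r\n". The following sketch does so with in-memory readers; the sample text is made up.

```groovy
def nl = System.getProperty('line.separator')

// The source uses "\n" and has no trailing line break
def upper = new StringWriter()
new StringReader('alpha\nbeta').transformLine(upper) { it.toUpperCase() }

def filtered = new StringWriter()
new StringReader('alpha\nbeta').filterLine(filtered) { it.startsWith('b') }

// Output lines end with the platform separator, including the last one
assert upper.toString() == 'ALPHA' + nl + 'BETA' + nl
assert filtered.toString() == 'beta' + nl
```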

Finally, a frequently used conversion is from binary data to strings with base-64 encoding, where binary data is represented only in printable characters, as specified in RFC 2045. This can be useful for sending binary coded data in an email, for example. The name of this codec comes from it having 64 symbols in its “alphabet”, just as the decimal system is base 10 (10 symbols: 0–9) and binary is base 2 (2 symbols: 0 and 1):
  byte[] data = new byte[256]
  for (i in 0..255) { data[i] = i }
  store = data.encodeBase64().toString()
  assert store.startsWith('AAECAwQFBg')
  assert store.endsWith  ('r7/P3+/w==')
  restored = store.decodeBase64()
  assert data.toList() == restored.toList()
An interesting feature of the encodeBase64 method is that it returns a Writable and can thus be used with writers, while the returned object also conveniently implements toString. This has saved us the work of pushing the Writable into a StringWriter.
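
Both uses of the returned object can be seen side by side in a small sketch like the following; the input string and variable names are chosen only for illustration.

```groovy
def encoded = 'Groovy'.getBytes('UTF-8').encodeBase64()

// The result is a Writable, so it can be shifted straight into a writer
def writer = new StringWriter()
writer << encoded

// Its toString gives the same text
def asString = encoded.toString()
def decoded = new String(asString.decodeBase64(), 'UTF-8')

assert writer.toString() == 'R3Jvb3Z5'
assert asString == 'R3Jvb3Z5'
assert decoded == 'Groovy'
```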

Base-64 encoding works with arbitrary binary data with no meaning attached to it. In order to encode objects instead, we need to venture into the world of serialization, which is the topic of the next section.

Streaming serialized objects:
Java comes with a serialization protocol that allows objects of type Serializable to be stored in a format so that they can be restored in VM instances that are disconnected in either space or time. Serialized objects can be written to ObjectOutputStreams and read from ObjectInputStreams. These streams allow making deep copies of objects, sending objects across networks, and storing objects in files or databases.
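
The deep-copy use mentioned above can be sketched with in-memory streams. The deepCopy closure below is a hypothetical helper written for this illustration, not a GDK method; it uses only the standard java.io stream classes.

```groovy
// Hypothetical helper: deep-copy a Serializable object graph via in-memory streams
def deepCopy = { Serializable source ->
    def buffer = new ByteArrayOutputStream()
    def oos = new ObjectOutputStream(buffer)
    oos.writeObject(source)
    oos.close()
    def ois = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))
    try {
        return ois.readObject()
    } finally {
        ois.close()
    }
}

def original = [1, 'two', [3, 4]]
def copy = deepCopy(original)
copy[2] << 5    // change the nested list of the copy

assert original == [1, 'two', [3, 4]]   // the original is untouched
assert copy == [1, 'two', [3, 4, 5]]
assert !copy.is(original)
```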

Listing 9.10 shows the special GDK support for reading serialized objects from a file. First, an Integer, a String, and a Date are written to a file. They are then restored with File’s new eachObject method. A final assertion checks whether the restored objects are equal to the original.
- Listing 9.10 Reading serialized objects from files
  file = new File('data/objects.dta')
  out  = file.newOutputStream()
  oos  = new ObjectOutputStream(out)
  objects = [1, "Hello Groovy!", new Date()]
  // Serialize each object in the list in turn
  objects.each {
      oos.writeObject(it)
  }
  oos.close()

  // Deserialize each object in turn
  retrieved = []
  file.eachObject { retrieved << it }
  assert retrieved == objects
As a variant,
  file.eachObject
can be written as:
  file.newObjectInputStream().eachObject
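
The two variants can be compared directly in a self-contained sketch. It writes to a temporary file and uses withObjectOutputStream for the writing side, which is an addition not shown in listing 9.10; the file name prefix and sample objects are made up.

```groovy
def file = File.createTempFile('gdk-objects', '.dta')
def objects = [42, 'Hello Groovy!', new Date(0)]

// withObjectOutputStream closes the stream after the closure returns
file.withObjectOutputStream { oos ->
    objects.each { oos.writeObject(it) }
}

// Variant 1: eachObject directly on the File
def viaFile = []
file.eachObject { viaFile << it }

// Variant 2: eachObject on an explicitly created ObjectInputStream
def viaStream = []
file.newObjectInputStream().eachObject { viaStream << it }

assert viaFile == objects
assert viaStream == objects

file.delete()
```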
Supplement:
Documentation > User Guide > Input Output
Stackoverflow > Groovy write to file with customized newline
You can always get the correct new line character through System.getProperty("line.separator") for example.

