程式扎記: [ In Action ] Everyday Groovy - Tips and tricks : Writing automation scripts

Preface
A software developer’s range of responsibilities includes many activities that require monitoring either constantly or on a repetitive schedule. Is the web server still running? Is the latest state on the build server OK? Is there so much data in the spam folder that it needs to be cleaned up? Did some prospect download an evaluation copy of our product? Could all of those tasks be handled by automation scripts without our constant attention?

Groovy is well suited to writing those little “house-elf” scripts that automate our daily work. We will go through some issues that are special to command-line scripts, explore the support provided by Groovy, and visit a series of examples. In particular, we examine the simple processing of command-line options, starting Java programs with the minimum of fuss, and scheduling tasks for delayed or repeated execution.

Supporting command-line options consistently
Helper scripts are often started automatically from a scheduler such as crontab or at, or as a service. Therefore, they have no graphical user interface but receive all necessary configuration on the command line. Starting a script generally looks like this:

> groovy MyScript -o value

where -o value stands for assigning value to the o option.

The standard option handling
An option can have a short name and a long name, where the short name consists of only one character. Short options are tagged on the command line with a single dash, such as -h; long names use two dashes, such as --help. Most options are optional, but certain options may be required.

Options may have zero, one, or multiple trailing arguments such as filename in -f filename. Multiple arguments may be separated by a character. When the separation character is a comma, this looks like --lines 1,2,3.

When the user enters an invalid command, it is good practice to give an error indication and print a usage statement. Options may be given in any sequence, but when multiple arguments are supplied with an option, they are sequence dependent. If you had to re-implement the option-parsing logic for every script, you would probably shy away from the work. Luckily, there’s an easy way to achieve the standard behavior.

Declaring command-line options
Groovy provides special support for dealing with command-line options. The Groovy distribution comes with the Jakarta Commons command-line interface (CLI). Groovy provides a specialized wrapper around it.

The strategy is to specify what options should be supported by the current script and let the CLI do the work of parsing, validating, error handling, and capturing the option values for later access in the script. The specification is done with CliBuilder. With this builder, you specify an option by calling its short name as a method on the builder, provide a map of additional properties, and provide a help message. You specify a help option, for example, via

def cli = new CliBuilder()  
cli.h(longOpt: 'help', 'usage information')  

Table 13.4 contains the properties that you can use to specify an option with CliBuilder.

When the options are specified to the builder, the Groovy command-line support has all the information it needs to achieve the standard behavior. CliBuilder exposes two special methods:

■ parse(args) to parse the command line
■ usage() to print the usage statement

We will explain each of these before embarking on a full example.

Working with options
Letting CliBuilder parse the command-line arguments is easy. Just use its parse method, and pass it the arguments the script was called with. Groovy puts the list of command-line arguments in the binding of the script under the name args. Therefore, the call reads

def options = cli.parse(args)   

with options being an OptionAccessor that encapsulates what options the user requested on the command line. When parsing fails, it prints the usage statement and returns null. If parsing succeeds, you can ask options whether a certain option was given on the command line (for example, whether -h was requested) and print the usage statement if requested:

if (options.h) cli.usage()  

The options object is a clever beast. For any option x, the property options.x returns the argument that was given with -x somearg. If no argument was supplied with -x, it returns true. If -x was not on the command line at all, it returns false.

If an argName such as myArgName was specified for the x option, then options.x and options.myArgName return the same value. If the x option is specified to have multiple arguments, the list of values can be obtained by appending an s character to the property name, for example options.xs or options.myArgNames.

Finally, options has a method arguments to return a list of all arguments that were trailing after all options on the command line. Let’s go through an example to see how all this fits together.
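
Before the full example, the access rules described above can be tried on a fixed command line. This is only a sketch under a few assumptions: it targets Groovy 2.5 or later, where CliBuilder is imported from groovy.cli.commons (older versions provide it without an import), and the script name, option names, and sample arguments are made up for illustration.

```groovy
// Assumption: Groovy 2.5+, where CliBuilder lives in groovy.cli.commons.
import groovy.cli.commons.CliBuilder

def cli = new CliBuilder(usage: 'groovy Demo [options] [files]')   // hypothetical script name
cli.h(longOpt: 'help',    'usage information')
cli.v(longOpt: 'verbose', 'print more output')
cli.f(longOpt: 'file',  args: 1, argName: 'filename', 'file to process')
cli.l(longOpt: 'lines', args: 3, valueSeparator: ',', argName: 'n,n,n', 'three line numbers')

// A made-up command line, in the form a script receives it in args
def sample  = ['-v', '-f', 'notes.txt', '-l', '1,2,3', 'extra1', 'extra2'] as String[]
def options = cli.parse(sample)

def fileArg   = options.f            // the argument given with -f
def verbose   = options.v            // -v was given without an argument
def helpAsked = options.h            // -h is not on the command line
def lineArgs  = options.ls           // trailing s: all arguments of -l as a list
def rest      = options.arguments()  // whatever follows the options

println "file=$fileArg verbose=$verbose help=$helpAsked lines=$lineArgs rest=$rest"
```

The sketch reads every value through the short option name, which is the form used throughout this section.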

The Mailman example
Assume we set out to provide a Groovy command-line script that sends a message via email on our behalf. Our Mailman script should be reusable, and therefore it cannot hard-wire all the details. On the command line, it expects to get information about the mail server, the mail addresses it should use, the text to send, and optionally the mail subject. Here is how a casual user can request the information about the script and its options:

The user will also see this output whenever they pass options and arguments that are incomplete or otherwise insufficient.

Listing 13.10 implements the script starting with a specification of its command-line options. It proceeds with parsing the given arguments and using them for instrumenting the Ant task that finally delivers the mail.
- Listing 13.10 Mailman.groovy script using CliBuilder

def cli = new CliBuilder( usage: 'groovy Mailman -sft[mh] "text"' )
// declare each option: short name as method, properties as a map, then the help text
cli.h(longOpt: 'help', 'usage information')
cli.s(argName:'host',    longOpt:'smtp',   args: 1, required: true,
    'smtp host name')
cli.f(argName:'address', longOpt:'from',   args: 1, required: true,
    'from mail address (like me@home.com)')
cli.t(argName:'address', longOpt:'to',     args: 1, required: true,
    'to address (like you@home.com)')
cli.m(argName:'matter', longOpt:'subject', args: 1,
    'subject matter (default: no subject)')
def opt = cli.parse(args)
if (!opt)  return            // parsing failed; the usage statement was already printed
if (opt.h) cli.usage()
def ant = new AntBuilder()
// read the values by short option name, as for -f and -t below
def subj = (opt.m) ? opt.m : 'no subject'
ant.mail(mailhost: opt.s, subject: subj) {
    from(address: opt.f)
    to  (address: opt.t)
    message( opt.arguments().join(' '))   // everything after the options is the mail text
}

There are multiple aspects to consider about listing 13.10. It shows how the compact declarative style of CliBuilder not only simplifies the code, but also improves the documentation as well: better for the user because of the instant availability of the usage statement, and better for the programmer because of the inherent self-documentation. The multiple uses for documentation, parsing, and validation pay off after the initial investment in the specification. With this support in place, you are likely to produce professional command-line interfaces more often.
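
The validation part of the specification can be exercised without sending any mail. The following sketch repeats the option declarations of the Mailman script and parses two made-up command lines, one incomplete and one complete. It assumes Groovy 2.5 or later (hence the explicit import); the host name mail.example.org is a placeholder, not a real server.

```groovy
// Assumption: Groovy 2.5+, where CliBuilder lives in groovy.cli.commons.
import groovy.cli.commons.CliBuilder

def cli = new CliBuilder(usage: 'groovy Mailman -sft[mh] "text"')
cli.h(longOpt: 'help', 'usage information')
cli.s(argName: 'host',    longOpt: 'smtp',    args: 1, required: true, 'smtp host name')
cli.f(argName: 'address', longOpt: 'from',    args: 1, required: true, 'from mail address')
cli.t(argName: 'address', longOpt: 'to',      args: 1, required: true, 'to address')
cli.m(argName: 'matter',  longOpt: 'subject', args: 1, 'subject matter (default: no subject)')

// Required options -f and -t are missing: parse reports the problem and returns null
def incomplete = cli.parse(['-s', 'mail.example.org', 'hello'] as String[])

// All required options are present; -m is left out on purpose
def complete = cli.parse(['-s', 'mail.example.org', '-f', 'me@home.com',
                          '-t', 'you@home.com', 'hello', 'world'] as String[])

def subj = complete.m ?: 'no subject'          // falls back because -m was not given
def text = complete.arguments().join(' ')      // trailing arguments form the message

println "incomplete parsed: ${incomplete != null}, subject: $subj, text: $text"
```

A script would normally stop right after a failed parse, as listing 13.10 does with its early return.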

Expanding the classpath with RootLoader
Suppose you’d like to start a script using groovy MyScript but your script code depends on libraries that are not on the default classpath (/lib/*.jar and /.groovy/lib/*.jar). In this case, you’d need to set the classpath before calling the script, just like you need to do for any Java program.

Starting Java is considered tricky
When starting a Java program, you have to either make sure your CLASSPATH environment variable is set up correctly for specifically this program or you have to pass the classpath command-line option to the java executable.

Either way is cumbersome, requires a lot of typing, and is hard to remember correctly. The common solution to this problem is to write a shell script for the startup. This works but requires knowledge about yet another language: your shell script language (Windows command script or bash).

Java is platform independent, but this value is lost if you cannot start your program on all platforms. When trying to provide startup scripts for all popular systems (Windows in its various versions, Cygwin, Linux, Solaris), things get complex. For examples, look at Ant’s various starter scripts in /bin.

All the work is required only because a Java program cannot easily expand the classpath programmatically to locate the classes it needs. But Groovy can.

Groovy starters
Groovy comes with a so-called RootLoader, which is available as a property on the current classloader whenever the Groovy program was started by the groovy starter. It is not guaranteed to be available for Groovy code that is evaluated from Java code.

That means the RootLoader can be accessed as

def loader = this.class.classLoader.rootLoader  

It has an addURL(url) method that allows you to add a URL at runtime that points to the classpath entry to add, for example, the URL of a jar file:

loader.addURL(new File('lib/mylib.jar').toURI().toURL())

Sometimes it is also useful to know what URLs are currently contained in the RootLoader, such as for debugging classloading problems:

loader.URLs.each{ println it }  

With this, you can easily write a platform-independent starter script in Groovy. Let’s go through a small example. We need a Groovy script that depends on an external library. For the fun of it, we shall use JFugue, an open-source Java library that allows us to play music as defined in strings. Download jfugue.jar and copy it into a subdirectory named libs.

Listing 13.11 contains an example that uses the JFugue library to play a theme from Star Wars. Save it to file StarWars.groovy.
- Listing 13.11 StarWars.groovy uses the JFugue external library

import org.jfugue.*  
def darthVaderTheme = new Pattern('T160 I[Cello] '+  
     'G3q G3q G3q Eb3q Bb3i G3qi Eb3q Bb3i G3hi')  
       
new Player().play(darthVaderTheme)  

To start this script, we would normally need to set the classpath from the outside to contain libs/jfugue.jar. Listing 13.12 calls the StarWars script after assembling the classpath itself. It adds all jar files from the libs subdirectory to the RootLoader before evaluating StarWars.groovy.
- Listing 13.12 Starting JFugue by adding all *.jar files from libs to RootLoader

def loader = this.class.classLoader.rootLoader  
def dir = new File('libs')  
dir.eachFileMatch(~/.*\.jar$/) {  
    loader.addURL(it.toURI().toURL())  
}  
evaluate(new File('StarWars.groovy'))  

With this functionality in place, you can easily distribute your automated player together with the libraries it depends on. There is no need for the user to install libraries in their /.groovy/lib directory or change any environment variables.

Also, everything is self-contained, and the user is less likely to run into version conflicts with the external libraries. If you use dependency resolution packages such as Maven or Ivy, you can directly refer to their downloaded artifacts. Groovy may provide even more sophisticated support for this scenario in the future.
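
Because the RootLoader is not guaranteed to be available, a starter script may want to guard against its absence. The following is a minimal sketch of such a guard, not part of the original listings: the helper name addJars is made up, and it accepts any loader object that offers an addURL(url) method.

```groovy
// Hypothetical helper: adds every *.jar in dir to the given loader and
// returns the URLs that were added. Returns an empty list when there is
// no loader (for example, no RootLoader) or the directory does not exist.
def addJars(loader, File dir) {
    if (loader == null || !dir.isDirectory()) return []
    def added = []
    dir.eachFileMatch(~/.*\.jar$/) { File jar ->
        def url = jar.toURI().toURL()
        loader.addURL(url)
        added << url
    }
    added
}

// rootLoader is null when the code was not started by the groovy starter
def rootLoader = this.class.classLoader.rootLoader
def addedUrls  = addJars(rootLoader, new File('libs'))
println "RootLoader available: ${rootLoader != null}, jars added: ${addedUrls.size()}"
```

Returning the list of added URLs keeps the helper easy to inspect when debugging classloading problems.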

We’ve been trying to lower the difficulty level of starting Groovy programs, and we’ve made it simple to start them from the command line. The next obvious step is to make programs so simple to run that the user doesn’t even need to use the command line.

Scheduling scripts for execution
Automation scripts really shine when running unattended on a background schedule. As the saying goes, “They claim it’s automatic, but actually you have to press this button.” There are numerous ways to schedule your automation scripts:
- Your operating system may provide tools for scheduled execution.

The standard mechanisms are the cron scheduler for UNIX/Linux/Solaris systems and the at service on Windows platforms. The downsides with these solutions are that you might not be authorized to use the system tools and that you cannot ship a system-independent scheduling mechanism with your application.

- Java Built-in Approach

The Java platform supports scheduling with the Timer class. It uses an implementation based on Java threads and their synchronization features. Although this cannot give any real-time guarantees, it is good enough for many scenarios and scales well.

- Third-party Java libraries

There are also several third-party scheduler libraries for Java, both open-source and commercial. The Quartz scheduler is a well-known example, and one that is supported in Spring. It’s available from http://www.opensymphony.com/quartz/. Of course, the cost of using advanced features tends to be higher complexity.

- Roll your own scheduler with the simplest possible means.

In a lot of scenarios, it is sufficient to schedule an execution like so:

while(true) {  
    println "execution called at ${new Date().toGMTString()}"  
    // call execution here  
    sleep 1000  
}  

Listing 13.13 extends this simple scheduling to a real-life scenario. A task should run on all working days (Monday through Friday) during office hours (8:00 a.m. to 6:00 p.m.). Within this timeframe, the task is to be started every 10 minutes.
- Listing 13.13 Scheduling a task for every 10 minutes during office hours

def workDays    = Calendar.MONDAY..Calendar.FRIDAY
def officeHours = 8..18
while(true) {
    def now = new Date()
    // take a fresh Calendar on every pass so the check sees the current time
    def cdr = Calendar.getInstance()
    cdr.time = now
    if (
        workDays.contains(cdr.get(Calendar.DAY_OF_WEEK))      &&
        officeHours.contains(cdr.get(Calendar.HOUR_OF_DAY)) &&
        0 == cdr.get(Calendar.MINUTE) % 10
    ) {
        println "execution called at ${now.toGMTString()}"
        // call execution here
        sleep 31 * 1000
    }
    sleep 31 * 1000
}

The purpose of sleeping 31 seconds is to make sure the check is performed at least once per minute. The extra sleep after execution is needed to avoid a second execution within the same minute.
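
The time check of listing 13.13 can also be pulled out into a small function, which makes it possible to try it against fixed dates instead of waiting for the clock. This is a sketch only; the names shouldRun and at are made up for illustration.

```groovy
// Hypothetical helper: true when cal falls on a working day, within
// office hours, and on a full 10-minute mark.
boolean shouldRun(Calendar cal) {
    def workDays    = Calendar.MONDAY..Calendar.FRIDAY
    def officeHours = 8..18
    workDays.contains(cal.get(Calendar.DAY_OF_WEEK)) &&
        officeHours.contains(cal.get(Calendar.HOUR_OF_DAY)) &&
        cal.get(Calendar.MINUTE) % 10 == 0
}

// Builds a Calendar for a fixed point in time (month uses the Calendar constants)
def at = { int year, int month, int day, int hour, int minute ->
    def cal = Calendar.getInstance()
    cal.clear()
    cal.set(year, month, day, hour, minute)
    cal
}

def fridayMorning   = shouldRun(at(2014, Calendar.AUGUST, 22, 9, 30))   // a Friday
def fridayOffMark   = shouldRun(at(2014, Calendar.AUGUST, 22, 9, 35))   // not a 10-minute mark
def fridayNight     = shouldRun(at(2014, Calendar.AUGUST, 22, 22, 0))   // outside office hours
def saturdayMorning = shouldRun(at(2014, Calendar.AUGUST, 23, 9, 30))   // a Saturday

println "fri 09:30=$fridayMorning fri 09:35=$fridayOffMark " +
        "fri 22:00=$fridayNight sat 09:30=$saturdayMorning"
```

Note that the range 8..18 includes hour 18, so the check still passes between 6:00 p.m. and 6:50 p.m., exactly as in the listing.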

The solution in listing 13.13 is certainly not suited for scheduling at the granularity of milliseconds. It is also not perfect, because it uses deprecated Date methods. However, it is sufficient for the majority of scheduling tasks, such as checking the source code repository for changes every 10 minutes, generating a revenue report every night, or cleaning the database every Sunday at 4:00 a.m.
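
For comparison, the Timer class mentioned in the list of scheduling options can take over the waiting loop. The sketch below is one possible way to use it from Groovy, not the book's solution. The period of 100 milliseconds and the limit of three runs are demo values chosen so that the script ends quickly; a real automation script would use a period of minutes and keep running.

```groovy
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

// Demo values only (assumptions for this sketch)
long periodMillis = 100
int  wantedRuns   = 3

def runs  = Collections.synchronizedList([])   // records when the task fired
def latch = new CountDownLatch(wantedRuns)

def timer = new Timer(true)                    // daemon thread: does not keep the JVM alive
timer.schedule(new TimerTask() {
    void run() {
        runs << new Date()                     // call execution here
        latch.countDown()
    }
}, 0L, periodMillis)

// wait until the task has fired the wanted number of times, then stop the timer
boolean finished = latch.await(5, TimeUnit.SECONDS)
timer.cancel()
println "task ran ${runs.size()} time(s), finished in time: ${finished}"
```

The timer thread does the sleeping, so the script itself stays free to do other work or simply to wait.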

Supplement:
* 符碼記憶 - Java Timer: scheduling, timed, and periodic execution of tasks
* [ Java Package ] Quartz Scheduler 2.2 - Getting started

Friday, August 22, 2014
