Tuesday, March 2, 2021

[Gradle in action] Ch5. Dependency management - Part1

 

 (https://www.manning.com/books/gradle-in-action) 


Preface
In chapter 3, you learned how to declare a dependency on the Servlet API to implement web components for the To Do application. Gradle’s DSL configuration closures make it easy to declare dependencies and the repositories to retrieve them from. First, you define what libraries your build depends on with the dependencies script. Second, you tell your build the origin of these dependencies using the repositories closure. With this information in place, Gradle automatically resolves the dependencies, downloads them to your machine if needed, stores them in a local cache, and uses them for the build.
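As a refresher, a minimal build script wiring these two blocks together might look like the following sketch (the Servlet API coordinates are illustrative, and `providedCompile` assumes the War plugin from chapter 3):

```groovy
// build.gradle -- a minimal sketch of the two configuration blocks
repositories {
    mavenCentral()   // where Gradle should resolve dependencies from
}

dependencies {
    // what the build depends on, in group:name:version notation
    providedCompile 'javax.servlet:servlet-api:2.5'
}
```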

This chapter covers Gradle’s powerful support for dependency management. We’ll take a close look at key DSL configuration elements for grouping dependencies and targeting different types of repositories.

Dependency management sounds like an easy nut to crack, but can become difficult when it comes to dependency resolution conflicts. Transitive dependencies, the dependencies a declared dependency relies on, can be a blessing and a curse. Complex dependency graphs can cause a mix-up of dependencies with multiple versions resulting in unreliable, nondeterministic builds. Gradle provides dependency reports for analyzing the dependency tree. You’ll learn how to find answers to questions like “Where does a specific dependency come from?” and “Why was this specific version picked?” to resolve version conflicts.
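As a preview, these questions are typically answered from the command line. A hedged sketch (the dependency name and configuration are illustrative):

```shell
# render the dependency tree, grouped by configuration
gradle dependencies

# explain where a dependency comes from and why a version was picked
gradle dependencyInsight --dependency slf4j-api --configuration compile
```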

Gradle rolls its own dependency management implementation. Having learned from the shortcomings of other dependency managers like Ivy and Maven, Gradle’s special concern is performance, build reliability, and reproducibility.

5.1 A quick overview of dependency management
Almost all JVM-based software projects depend on external libraries to reuse existing functionality. For example, if you’re working on a web-based project, there’s a high likelihood that you rely on one of the popular open source frameworks like Spring MVC or Play to improve developer productivity. Libraries in Java get distributed in the form of a JAR file. The JAR file specification doesn’t require you to indicate the version of the library. However, it’s common practice to attach a version number to the JAR filename to identify a specific release (for example, spring-web-3.1.3.RELEASE.jar). You’ve seen small projects grow big very quickly, along with the number of third-party libraries and modules your project depends on. Organizing and managing your JAR files is critical.

Imperfect dependency management techniques
Because the Java language doesn’t provide or propose any tooling for managing versioned dependencies, teams will have to come up with their own strategies to store and retrieve them. You may have encountered the following common practices:
* Manually copying JAR files to the developer machine.

This is the most primitive, nonautomated, and error-prone approach to handle dependencies.


* Using a shared storage for JAR files (for example, a folder on a shared network drive), which gets mounted on the developer’s machine, or retrieving binaries over FTP.

This approach requires the developer to initially establish the connection to the binary repository. New dependencies will need to be added manually, which potentially requires write permissions or access credentials.


* Checking JAR files that get downloaded with the project source code into the VCS.

This approach doesn’t require any additional setup and bundles source code and all dependencies as one consistent unit. Your team can retrieve changes whenever they update their local copy of the repository. On the downside, binary files unnecessarily use up space in the repository. Changing working copies of a library requires frequent check-ins whenever there’s a change to the source code. This is especially true if you’re working with projects that depend on each other.



Importance of automated dependency management
While all of these approaches work, they’re far from being sufficient solutions, because they don’t provide a standardized way to name and organize the JAR files. At the very least, you’ll need to know the exact version of the library and the dependencies it depends on, the transitive dependencies. Why is this so important?

KNOWING THE EXACT VERSION OF A DEPENDENCY
Working with a project that doesn’t clearly state the versions of its dependencies quickly becomes a maintenance nightmare. If not documented meticulously, you can never be sure which features are actually supported by the library version in your project. Upgrading a library to a newer version becomes a guessing game, because you don’t know exactly what version you’re upgrading from. In fact, you may actually be downgrading without knowing it.

MANAGING TRANSITIVE DEPENDENCIES
Transitive dependencies are of concern even at an early stage of development. These are the libraries your first-level dependencies require in order to work correctly. Popular Java development stacks like the combination of Spring and Hibernate can easily bring in more than 20 additional libraries from the start. A single library may require many other libraries in order to work correctly. Figure 5.1 shows the dependency graph for Hibernate’s core library.


Trying to manually determine all transitive dependencies for a specific library can be a real time-sink. Many times this information is nowhere to be found in the library’s documentation and you end up on a wild-goose chase to get your dependencies right.

As a result, you can experience unexpected behavior like compilation errors and runtime class-loading issues.

I think we can agree that a more sophisticated solution is needed to manage dependencies. Optimally, you’ll want to be able to declare your dependencies and their respective versions as project metadata. As part of an automated process, they can be retrieved from a central location and installed for your project. Let’s look at existing open source solutions that support these features.

Using automated dependency management
The Java space is mostly dominated by two projects that support declarative and automated dependency management: Apache Ivy, a pure dependency manager that’s mostly used with Ant projects, and Maven, which contains a dependency manager as part of its build infrastructure. I’m not going to go into deep detail on either of these solutions. Instead, the purpose of this section is to explain the concepts and mechanics of automated dependency management.

In Ivy and Maven, dependency configuration is expressed through an XML descriptor file. The configuration consists of two parts: the dependency identifiers plus their respective versions, and the location of the binary repositories (for example, an HTTP address you want to retrieve them from). The dependency manager evaluates this information and automatically targets those repositories to download the dependencies onto your local machine. Libraries can define transitive dependencies as part of their metadata. The dependency manager is smart enough to analyze this information and resolve those dependencies as part of the retrieval process. If a dependency version conflict is recognized, as demonstrated by the example of Hibernate core, the dependency manager will try to resolve it. Once downloaded, the libraries are stored in a local cache. Now that the configured libraries are available on your developer machine, they can be used for your build. Subsequent builds will first check the local cache for a library to avoid unnecessary requests to a repository. Figure 5.2 illustrates the key elements of automated dependency management.
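For comparison, a Maven descriptor expresses such a declaration in XML. A minimal sketch (coordinates illustrative):

```xml
<!-- pom.xml (excerpt): a dependency identifier plus its version -->
<dependencies>
  <dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-core</artifactId>
    <version>3.6.6.Final</version>
  </dependency>
</dependencies>
```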


Using a dependency manager frees you from the burden of manually having to copy or organize JAR files. Gradle provides a powerful out-of-the-box dependency management implementation that fits into the architecture just described. It describes the dependency configuration as part of Gradle’s expressive DSL, has support for transitive dependency management, and plays well with existing repository infrastructures. Before we dive into the details, let’s look at some of the challenges you may face with dependency management and how to cope with them.

Challenges of automated dependency management
Even though dependency management significantly simplifies the handling of external libraries, at some point you’ll find yourself dealing with certain shortcomings that may compromise the reliability and reproducibility of your build.

POTENTIAL UNAVAILABILITY OF CENTRALLY HOSTED REPOSITORIES
It’s not uncommon for enterprise software to rely on open source libraries. Many of these projects publish their releases to a centrally hosted repository. One of the most widely used repositories is Maven Central. If Maven Central is the only repository your build relies on, you’ve automatically created a single point of failure for your system. In case the repository is down, you’ve stripped yourself of the ability to build your project if a dependency is required that isn’t available in your local cache.

You can avoid this situation by configuring your build to use your own custom in-house repository, which gives you full control over server availability. If you’re eager to learn about it, feel free to directly jump to chapter 14, which talks about how to set up and use open source and commercial repository managers like Sonatype Nexus and JFrog’s Artifactory.
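In Gradle, pointing the build at an in-house repository is a small configuration change. A sketch, assuming a hypothetical internal repository URL:

```groovy
repositories {
    // hypothetical in-house repository manager (e.g., Nexus or Artifactory)
    maven {
        url 'http://repo.mycompany.com/releases'
    }
}
```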

BAD METADATA AND MISSING DEPENDENCIES
Earlier you learned that metadata is used to declare transitive dependencies for a library. A dependency manager analyzes this information, builds a dependency graph from it, and resolves all nested dependencies for you. Using transitive dependency management is a huge timesaver and enables traceability for your dependency graph.

Unfortunately, neither the metadata nor the repository guarantees that any of the artifacts declared in the metadata actually exist, are defined correctly, or are even needed. You may encounter problems like missing dependencies, especially on repositories that don’t enforce any quality control, which is a known issue on Maven Central. Figure 5.3 demonstrates the artifact production and consumption lifecycle for a Maven repository.


Gradle allows for excluding transitive dependencies on any level of the dependency graph. Alternatively, you can omit the provided metadata and instate your own transitive dependency definition.
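Both techniques are expressed directly in the dependency declaration. A hedged sketch (the artifact coordinates are illustrative, and `some.group:some-library` is a placeholder):

```groovy
dependencies {
    // exclude a single transitive dependency from this declaration
    compile('org.hibernate:hibernate-core:3.6.6.Final') {
        exclude group: 'javax.transaction', module: 'jta'
    }

    // or ignore the provided metadata entirely and manage the graph yourself
    compile('some.group:some-library:1.0') {
        transitive = false
    }
}
```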

You’ll find that popular libraries will appear in your transitive dependency graph with different versions. This is often the case for commonly used functionality like logging frameworks. The dependency manager tries to find a smart solution for this problem by picking one of these versions based on a certain resolution strategy to avoid version conflicts. Sometimes you’ll need to tweak those choices. To do so, you’ll first want to find out which dependencies bring in what version of a transitive dependency. Gradle provides meaningful dependency reports to answer these questions. Later, we’ll see these reports in action. Now let’s see how Gradle implements these ideas with the help of a full-fledged example.

5.2 Learning dependency management by example
In chapter 3, you saw how to use the Jetty plugin to deploy a To Do application to an embedded Jetty Servlet container. Jetty is a handy container for use during development. With its lightweight container implementation, it provides fast startup times. Many enterprises use other web application container implementations in their production environments. Let’s assume you want to build support for deploying your web application to a different container product, such as Apache Tomcat.

The open source project Cargo (http://cargo.codehaus.org/) provides versatile support for web application deployment to a variety of Servlet containers and application servers. Cargo supports two implementations you can use in your project. On the one hand, you can utilize a Java API, which gives you fine-grained access to each and every aspect of configuring Cargo. On the other hand, you can choose to execute a set of preconfigured Ant tasks that wrap the Java API. Because Gradle provides excellent integration with Ant, our examples will be based on the Cargo Ant tasks.

Let’s revisit figure 5.2 and see how the components change in the context of a Gradle use case. In chapter 3 you learned that dependency management for a project is configured with the help of two DSL configuration blocks: dependencies and repositories. The names of the configuration blocks directly map to methods of the interface Project. For your use case, you’re going to use Maven Central because it doesn’t require any additional setup. Figure 5.4 shows that dependency definitions are provided through Gradle’s DSL in the build.gradle file. The dependency manager will evaluate this configuration at runtime, download the required artifacts from a central repository, and store them in your local cache. You’re not using a local repository, so it’s not shown in the figure.


The following sections of this chapter discuss each of the Gradle build script configuration elements one by one. Not only will you learn how to apply them to the Cargo example, you’ll also learn how to apply dependency management to implement the requirements of your own project. Let’s first look at a concept that will become more important in the context of our example: dependency configurations.

5.3 Dependency configurations
In chapter 3, you saw that plugins can introduce configurations to define the scope for a dependency. The Java plugin brings in a variety of standard configurations to define which bucket of the Java build lifecycle a dependency should apply to. For example, dependencies required for compiling production source code are added with the compile configuration. In the build of your web application, you used the compile configuration to declare a dependency on the Apache Commons Lang library. To get a better understanding of how configurations are stored, configured, and accessed, let’s look at the responsible interfaces in Gradle’s API.

Understanding the configuration API representation
Configurations can be directly added and accessed at the root level of a project; you can decide to use one of the configurations provided by a plugin or declare your own. Every project owns a container of class ConfigurationContainer that manages the corresponding configurations. Configurations are very flexible in their behavior. You can control whether transitive dependencies should be part of the dependency resolution, define the resolution strategy (for example, how to respond to conflicting artifact versions), and even make configurations extend to each other. Figure 5.5 shows the relevant Gradle API interfaces and their methods.
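In a build script, the container is reachable through the project property configurations. A short sketch of the kind of API shown in figure 5.5 (method names as found in the current Gradle API):

```groovy
// declare a configuration through the container...
configurations.create('cargo')

// ...and look it up again by name
def cargo = configurations.getByName('cargo')

// behavioral knobs are plain properties on Configuration
cargo.transitive = true    // include transitive dependencies in resolution
cargo.visible = false      // keep the configuration private to this project
```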


Another way of thinking of configurations is in terms of a logical grouping. Grouping dependencies by configuration is a similar concept to organizing Java classes into packages. Packages provide unique namespaces for classes they contain. The same is true for configurations. They group dependencies that serve a specific responsibility.

The Java plugin already provides six configurations out of the box: compile, runtime, testCompile, testRuntime, archives, and default. Couldn’t you just use one of those configurations to declare a dependency on the Cargo libraries? Generally, you could, but you’d mix up dependencies that are relevant to your application code and the infrastructure code you’re writing for deploying the application. Adding unnecessary libraries to your distribution can lead to unforeseen side effects at runtime and should be avoided at all costs. For example, using the compile configuration will result in a WAR file that contains the Cargo libraries. Next, I’ll show how to define a custom configuration for the Cargo libraries.

Defining a custom configuration
To clearly identify the dependencies needed for Cargo, you’ll need to declare a new configuration with the unique name cargo, as demonstrated in the following listing.
- Listing 5.1 Defining a configuration for Cargo libraries

  configurations {
      cargo {
          description = 'Classpath for Cargo Ant tasks.'
          visible = false
      }
  }

For now, you’re only dealing with a single Gradle project. Limiting the visibility of this configuration to this project is a conscious choice in preparation for a multiproject setup. If you want to learn more about builds consisting of multiple projects, check out chapter 6. You don’t want to let configurations spill into other projects if they’re not needed. The description that was set for the configuration is directly reflected when you list the dependencies of the project:

# gradle dependencies
...
------------------------------------------------------------
Root project
------------------------------------------------------------

archives - Configuration for archive artifacts.
No dependencies

cargo - Classpath for Cargo Ant tasks.
No dependencies


After adding a configuration to the configuration container of a project, it can be accessed by its name. Next, you’ll use the cargo configuration to make the third-party Cargo Ant task available to the build script.

Accessing a configuration
Essentially, Ant tasks are Java classes that adhere to Ant’s extension point for defining custom logic. To add a nonstandard Ant task like the Cargo deployment task to your project, you’ll need to declare it using the Taskdef Ant task. To resolve the Ant task implementation classes, the Cargo JAR files containing them need to be assigned to a classpath. The next listing shows how easy it is to access the configuration by name. The task uses the resolved dependencies and assigns them to the classpath required for the Cargo Ant task.
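A hedged sketch of the shape such a task takes, assuming the Cargo Ant tasks' standard `cargo.tasks` taskdef resource; the container id, host, port, and WAR filename are illustrative, not the book's exact listing:

```groovy
task deployToLocalTomcat {
    doLast {
        // resolve the cargo configuration and hand it to Ant as a classpath
        ant.taskdef(resource: 'cargo.tasks',
                    classpath: configurations.cargo.asPath)

        // illustrative deployment to a locally running Tomcat instance
        ant.cargo(containerId: 'tomcat7x', action: 'redeploy', type: 'remote') {
            configuration(type: 'runtime') {
                property(name: 'cargo.hostname', value: 'localhost')
                property(name: 'cargo.servlet.port', value: 8080)
            }
            deployable(type: 'war', file: 'todo.war')
        }
    }
}
```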


Don’t worry if you don’t understand everything in the code example. The important part is that you recognize the Gradle API methods that allow you access to a configuration. The rest of the code is mostly Ant-specific configuration expressed through Gradle’s DSL. Chapter 9 will give you the inside scoop on using Ant tasks from Gradle. With the deployment task set up, it’s time to assign the Cargo dependencies to the cargo configuration.

