The Java platform, both its base language features and library extensions, provides an excellent base for writing secure applications. In this tutorial, the first of two parts on Java security, Brad Rubin guides you through the basics of cryptography and how it is implemented in the Java programming language, using plenty of code examples to illustrate the concepts.
About this tutorial
What is this tutorial about?
There is perhaps no software engineering topic of more timely importance than application security. Attacks are costly, whether the attack comes from inside or out, and some attacks can expose a software company to liability for damages. As computer (and especially Internet) technologies evolve, security attacks are becoming more sophisticated and frequent. Staying on top of the most up-to-date techniques and tools is one key to application security; the other is a solid foundation in proven technologies such as data encryption, authentication, and authorization.
The Java platform, both the basic language and library extensions, provides an excellent foundation for writing secure applications. This tutorial covers the basics of cryptography and how it is implemented in the Java programming language, and it offers example code to illustrate the concepts.
In this first installment of a two-part tutorial, we cover material in the library extensions -- now part of the JDK 1.4 base -- known as Java Cryptography Extension (JCE) and Java Secure Sockets Extension (JSSE). In addition, this tutorial introduces the CertPath API, which is new for JDK 1.4. In Part 2 (see Resources), we'll expand the discussion to encompass access control, which is managed in the Java platform by the Java Authentication and Authorization Service (JAAS).
Should I take this tutorial?
This is an intermediate-level tutorial; it assumes you know how to read and write basic Java programs, both applications and applets. If you are already a Java programmer and have been curious about cryptography (topics such as private and public key encryption, RSA, SSL, certificates) and the Java libraries that support them (JCE, JSSE), this tutorial is for you. It does not assume any previous background in cryptography, JCE, or JSSE.
This tutorial introduces the basic cryptographic building block concepts. Each concept is followed by the Java implementation considerations, a code example, and the results of the example execution.
Tools, code samples, and installation requirements
You'll need the following items to complete the programming exercises in this tutorial:
A note on the code examples
The code examples dump encrypted data directly to the screen. In most cases, this will result in strange-looking control characters, some of which may occasionally cause screen-formatting problems. This is not good programming practice (it would be better to convert them to displayable ASCII characters or decimal representations), but has been done here to keep the code examples and their output brief.
In most cases in the example execution sections, the actual strings have been modified to be compatible with the character set requirements of this tutorial. Also, in most examples, we look up and display the actual security provider library used for a given algorithm. Because most installations have several providers installed, this gives you a better feel for which library is called for which function.
Java security programming concepts
How the Java platform facilitates secure programming
The Java programming language and environment has many features that facilitate secure programming:
* No pointers
* A bytecode verifier
* Fine-grained control over resource access for both applets and applications
* A large number of library functions
What are secure programming techniques?
Simply put, there are a number of programming styles and techniques available to help ensure a more secure application. Consider the following as two general examples:
* Storing/deleting passwords.
* Smart serialization.
We'll be discussing these and other techniques in more detail when we encounter a need for them throughout the tutorial.
Security is integrated in JDK 1.4
Now, relaxed U.S. export regulations open the door to tighter integration of security features and the base language. The following packages -- used as extensions prior to the 1.4 release -- are now integrated into JDK 1.4:
JDK 1.4 also introduces two new functions:
JCE, JSSE, and the CertPath API are the subject of this tutorial. We'll focus on JAAS in the next tutorial in this series. Neither tutorial covers the JGSS (which provides a generic framework to securely exchange messages between applications).
Security is enriched with third-party libraries
We can enhance an already rich set of functions in the current Java language with third-party libraries, also called providers. Providers add additional security algorithms. As an example of a library, we'll be working with the Bouncy Castle provider (see Resources). The Bouncy Castle library provides other cryptographic algorithms, including the popular RSA algorithm discussed in the "What is public key cryptography?" and "What are digital signatures?" sections of this tutorial.
While your directory names and java.security files might be a bit different, here is the template for installing the Bouncy Castle provider. To install this library, download the bcprov-jdkxx-xxx.jar file and place it in the j2sdk1.x.x\jre\lib\ext and the Program Files\Java\J2re1.x.x\lib\ext directories. In both java.security files, which are in the same directories as above but use "security" instead of "ext", add the following line:
In this section, we've introduced the features the Java language provides, either fully integrated or extension-based, that help to ensure that programming remains secure. We've offered some general examples of secure programming techniques to help you become familiar with the concept. We've covered security technologies that used to be extensions but are now integrated into the version 1.4 release; we've also noted two new security technologies. And we've demonstrated that third-party libraries can enhance security programs by offering new technologies.
In the remainder of this tutorial, we will familiarize you with these concepts designed to provide secure messaging (as they apply to Java programming):
As we discuss each of these topics, we'll serve up examples and sample code.
Ensuring the integrity of a message
In this section, we will learn about message digests, which take the data in a message and generate a block of bits designed to represent the "fingerprint" of the message. We will also cover the JDK 1.4-supported algorithms, classes, and methods related to message digests, and offer a code example and a sample execution for both the message digest and message authentication features.
What is a message digest?
A message digest is a function that ensures the integrity of a message. Message digests take a message as input and generate a block of bits, usually several hundred bits long, that represents the fingerprint of the message. A small change in the message (say, by an interloper or eavesdropper) creates a noticeable change in the fingerprint.
The message-digest function is a one-way function. It is a simple matter to generate the fingerprint from the message, but quite difficult to generate a message that matches a given fingerprint.
Message digests can be weak or strong. A checksum -- which is the XOR of all the bytes of a message -- is an example of a weak message-digest function. It is easy to modify one byte to generate any desired checksum fingerprint. Most strong functions use hashing. A 1-bit change in the message leads to a massive change in the fingerprint (ideally, 50 percent of the fingerprint bits change).
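The difference between a weak checksum and a strong digest can be sketched in a few lines. The following is a minimal illustration, not the tutorial's own example: the class and helper names (DigestAvalanche, sha1, xorChecksum, differingBits) and the sample message are assumptions made for this sketch. It flips one bit of a message and counts how many of the 160 SHA-1 output bits change, then shows that swapping two bytes leaves the XOR checksum untouched.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestAvalanche {

    // Compute the SHA-1 digest (160 bits) of a byte array.
    static byte[] sha1(byte[] message) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        md.update(message);
        return md.digest();
    }

    // The weak checksum described in the text: XOR of all message bytes.
    static byte xorChecksum(byte[] message) {
        byte sum = 0;
        for (byte b : message) {
            sum ^= b;
        }
        return sum;
    }

    // Count the bit positions in which two equal-length arrays differ.
    static int differingBits(byte[] a, byte[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample message, chosen only for illustration.
        byte[] original = "Transfer $100 to Bob".getBytes(StandardCharsets.UTF_8);

        byte[] tampered = original.clone();
        tampered[0] ^= 0x01; // flip a single bit of the message

        int changed = differingBits(sha1(original), sha1(tampered));
        System.out.println("SHA-1 bits changed: " + changed + " of 160");

        // Swapping two bytes changes the message but not its XOR checksum.
        byte[] swapped = original.clone();
        byte tmp = swapped[0];
        swapped[0] = swapped[1];
        swapped[1] = tmp;
        System.out.println("Checksum unchanged after swap: "
                + (xorChecksum(original) == xorChecksum(swapped)));
    }
}
```

SHA-1 is used here only because the tutorial names it; the same code works with any digest name the installed providers support.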
Algorithms, classes, and methods
JDK 1.4 supports the following message-digest algorithms:
MD5 and SHA-1 are the most used algorithms. The MessageDigest class manipulates message digests. The following methods are used in the Message digest code example:
If a key is used as part of the message-digest generation, the algorithm is known as a message-authentication code. JDK 1.4 supports the HMAC/SHA-1 and HMAC/MD5 message-authentication code algorithms. The Mac class manipulates message-authentication codes using a key produced by the KeyGenerator class. The following methods are used in the Message authentication code example:
- Message digest code example (Groovy)
- Message authentication code example (Groovy)
Note that the key generation takes a long time because the code is generating excellent quality pseudo-random numbers using the timing of thread behavior. Once the first number is generated, the others take much less time. Also, notice that unlike the message digest, the message-authentication code uses a cryptographic provider. (For more on providers, see Security is enriched with third-party libraries.)
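As a rough sketch of the message-authentication flow described above (a key from KeyGenerator, a tag from Mac), consider the following. The class name MacDemo, the helper names tag and verify, and the sample message are assumptions for illustration; only the Mac, KeyGenerator, and MessageDigest calls come from the standard library.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.KeyGenerator;
import javax.crypto.Mac;
import javax.crypto.SecretKey;

public class MacDemo {

    // Compute an HMAC/SHA-1 tag over the message with the given key.
    static byte[] tag(SecretKey key, byte[] message) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(key);
        mac.update(message);
        return mac.doFinal();
    }

    // Recompute the tag and compare it with the one that was received.
    static boolean verify(SecretKey key, byte[] message, byte[] receivedTag)
            throws GeneralSecurityException {
        return MessageDigest.isEqual(tag(key, message), receivedTag);
    }

    public static void main(String[] args) throws Exception {
        // Generate a fresh key suitable for HMAC/SHA-1.
        SecretKey key = KeyGenerator.getInstance("HmacSHA1").generateKey();

        // Hypothetical sample message.
        byte[] message = "Meet at noon".getBytes(StandardCharsets.UTF_8);
        byte[] t = tag(key, message);

        System.out.println("Provider: "
                + Mac.getInstance("HmacSHA1").getProvider().getName());
        System.out.println("Tag length in bytes: " + t.length);
        System.out.println("Tag verifies: " + verify(key, message, t));
    }
}
```

Only a party holding the same key can recompute the tag, which is what distinguishes a message-authentication code from a plain digest.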
Keeping a message confidential
In this section, we'll examine the uses of private key encryption and focus on such concepts as cipher blocks, padding, stream ciphers, and cipher modes. We'll quickly detail cipher algorithms, classes, and methods and illustrate this concept with a code example and sample executions.
What is private key cryptography?
Message digests may ensure integrity of a message, but they can't be used to ensure the confidentiality of a message. For that, we need to use private key cryptography to exchange private messages.
Consider this scenario: Alice and Bob share a key that only they know, and they agree to use a common cryptographic algorithm, or cipher. In other words, they keep their key private. When Alice wants to send a message to Bob, she encrypts the original message, known as plaintext, to create ciphertext and then sends the ciphertext to Bob. Bob receives the ciphertext from Alice and decrypts it with the same shared key to re-create the original plaintext message. If Eve the eavesdropper is listening in on the communication, she hears only the ciphertext, so the confidentiality of the message is preserved.
You can encrypt single bits or chunks of bits, called blocks. The blocks, called cipher blocks, are typically 64 bits in size. If the message is not a multiple of 64 bits, then the short block must be padded (more on padding in "What is padding?"). Single-bit encryption is more common in hardware implementations. Single-bit ciphers are called stream ciphers.
The strength of the private key encryption is determined by the cryptography algorithm and the length of the key. If the algorithm is sound, then the only way to attack it is with a brute-force approach of trying every possible key, which will take an average of (1/2)*2^n attempts, where n is the number of bits in the key.
When the U.S. export regulations were restrictive, only 40-bit keys were allowed for export. This key length is fairly weak. The official U.S. standard, the DES algorithm, used 56-bit keys and this is becoming progressively weaker as processor speeds accelerate. Generally, 128-bit keys are preferred today. With them, if one million keys could be tried every second, it would take an average of many times the age of the universe to find a key!
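The arithmetic behind these key-length claims can be checked directly. The sketch below assumes the figures given in the text (an average of (1/2)*2^n attempts and one million keys tried per second); the class and method names are hypothetical, and a year is taken as 365.25 days.

```java
import java.math.BigInteger;

public class BruteForceEstimate {

    // Seconds in a year of 365.25 days (an assumption for this estimate).
    static final BigInteger SECONDS_PER_YEAR = BigInteger.valueOf(31_557_600L);

    // Average number of guesses for an n-bit key: (1/2) * 2^n = 2^(n-1).
    static BigInteger averageAttempts(int keyBits) {
        return BigInteger.ONE.shiftLeft(keyBits - 1);
    }

    // Average search time in whole years at the given guessing rate.
    static BigInteger averageYears(int keyBits, long keysPerSecond) {
        return averageAttempts(keyBits)
                .divide(BigInteger.valueOf(keysPerSecond))
                .divide(SECONDS_PER_YEAR);
    }

    public static void main(String[] args) {
        long rate = 1_000_000L; // one million keys per second, as in the text
        for (int bits : new int[] {40, 56, 128}) {
            System.out.println(bits + "-bit key: about "
                    + averageYears(bits, rate) + " years on average");
        }
    }
}
```

At this rate a 40-bit key falls in well under a year, while a 128-bit key needs on the order of 10^24 years, far longer than the roughly 10^10-year age of the universe.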
What is padding?
As we mentioned in the previous section, if a block cipher is used and the message length is not a multiple of the block length, the last block must be padded with bytes to yield a full block size. There are many ways to pad a block, such as using all zeroes or ones. In this tutorial, we'll be using PKCS5 padding for private key encryption and PKCS1 for public key encryption. With PKCS5, a short block is padded with a repeating byte whose value represents the number of remaining bytes. We won't be discussing padding algorithms further in this tutorial, but for your information, JDK 1.4 supports the following padding techniques:
The Bouncy Castle library (see "Security is enriched with third-party libraries" and Resources) supports additional padding techniques.
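To make the PKCS5 rule concrete, here is a hand-written sketch of the padding step only. In practice the Cipher class applies the padding for you when you request it in the transformation string; the class and method names below are hypothetical and exist only to show the byte layout.

```java
import java.util.Arrays;

public class Pkcs5PaddingSketch {

    // Pad data to a multiple of blockSize. Every pad byte holds the pad length,
    // so an input that already fills its last block gains one whole extra block.
    static byte[] pad(byte[] data, int blockSize) {
        int padLen = blockSize - (data.length % blockSize); // 1..blockSize
        byte[] out = Arrays.copyOf(data, data.length + padLen);
        Arrays.fill(out, data.length, out.length, (byte) padLen);
        return out;
    }

    // Remove the padding, rejecting input whose pad bytes are inconsistent.
    static byte[] unpad(byte[] padded) {
        if (padded.length == 0) {
            throw new IllegalArgumentException("Empty input");
        }
        int padLen = padded[padded.length - 1] & 0xFF;
        if (padLen < 1 || padLen > padded.length) {
            throw new IllegalArgumentException("Invalid padding length");
        }
        for (int i = padded.length - padLen; i < padded.length; i++) {
            if ((padded[i] & 0xFF) != padLen) {
                throw new IllegalArgumentException("Invalid padding bytes");
            }
        }
        return Arrays.copyOf(padded, padded.length - padLen);
    }

    public static void main(String[] args) {
        byte[] message = {10, 20, 30, 40, 50};   // 5 bytes of sample data
        byte[] padded = pad(message, 8);         // 64-bit (8-byte) blocks
        System.out.println("Padded:   " + Arrays.toString(padded));
        System.out.println("Unpadded: " + Arrays.toString(unpad(padded)));
    }
}
```

Because the last byte always states the pad length, the receiver can strip the padding without knowing the original message length.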
Modes: Specifying how encryption works
A given cipher can be used in a variety of modes. Modes allow you to specify how encryption will work. For example, you can allow the encryption of one block to be dependent on the encryption of the previous block, or you can make the encryption of one block independent of any other blocks.
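The difference between independent and dependent block encryption can be observed directly. The following sketch is an illustration under stated assumptions: it uses AES (128-bit blocks rather than the 64-bit blocks mentioned earlier), the ECB and CBC mode names accepted by the standard Cipher class, and hypothetical class and helper names. It encrypts two identical plaintext blocks in each mode and checks whether the two ciphertext blocks match.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class ModeComparison {

    static final int BLOCK = 16; // AES block size in bytes

    // ECB: every block is encrypted independently of the others.
    static byte[] encryptEcb(SecretKey key, byte[] plaintext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher.doFinal(plaintext);
    }

    // CBC: each block is mixed with the previous ciphertext block (or the IV).
    static byte[] encryptCbc(SecretKey key, byte[] iv, byte[] plaintext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
    }

    // True if the first and second ciphertext blocks are byte-for-byte equal.
    static boolean firstTwoBlocksEqual(byte[] ciphertext) {
        return Arrays.equals(
                Arrays.copyOfRange(ciphertext, 0, BLOCK),
                Arrays.copyOfRange(ciphertext, BLOCK, 2 * BLOCK));
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();

        byte[] plaintext = new byte[2 * BLOCK];
        Arrays.fill(plaintext, (byte) 'A'); // two identical plaintext blocks

        byte[] iv = new byte[BLOCK];
        new SecureRandom().nextBytes(iv);

        System.out.println("ECB repeats blocks: "
                + firstTwoBlocksEqual(encryptEcb(key, plaintext)));
        System.out.println("CBC repeats blocks: "
                + firstTwoBlocksEqual(encryptCbc(key, iv, plaintext)));
    }
}
```

Repeated ciphertext blocks leak structure in the plaintext, which is one of the security trade-offs to weigh when choosing a mode.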
The mode you choose depends on your needs and you must consider the trade-offs (security, ability to parallel process, and tolerance to errors in both the plaintext and the ciphertext). Selection of modes is beyond the scope of this tutorial (see Resources for further reading), but again, for your information, the Java platform supports the following modes:
Algorithms, classes, and methods
JDK 1.4 supports the following private key algorithms:
The Cipher class manipulates private key algorithms using a key produced by the KeyGenerator class. The following methods are used in the Private key cryptography code example:
- Private key cryptography code example (Groovy)
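For readers who want a plain Java view of the same flow, here is a minimal round-trip sketch. It is not the tutorial's listing: the AES algorithm, the CBC mode, the 128-bit key size, and all class and method names are choices made for this illustration, while PKCS5 padding follows the text above.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class PrivateKeyRoundTrip {

    static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    // Generate the shared private key that Alice and Bob must both hold.
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        return keyGen.generateKey();
    }

    // Encrypt plaintext into ciphertext with the shared key and an IV.
    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
    }

    // Decrypt ciphertext back into plaintext with the same key and IV.
    static byte[] decrypt(SecretKey key, byte[] iv, byte[] ciphertext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(ciphertext);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv); // the IV is not secret but must be fresh

        // Hypothetical sample message.
        byte[] plaintext = "Hello, Bob!".getBytes(StandardCharsets.UTF_8);
        byte[] ciphertext = encrypt(key, iv, plaintext);
        byte[] recovered = decrypt(key, iv, ciphertext);

        System.out.println("Ciphertext length: " + ciphertext.length);
        System.out.println("Recovered: " + new String(recovered, StandardCharsets.UTF_8));
    }
}
```

The same Cipher object pattern applies to other private key algorithms; only the transformation string and key generator name change.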
Secret messages with public keys
In this section, we'll look at public key cryptography, a feature that solves the problem of encrypting messages between parties without prior arrangement on the keys. We'll take a short walk through the algorithms, classes, and methods that support the public key function, and offer a code sample and execution to illustrate the concept.
What is public key cryptography?
Private key cryptography suffers from one major drawback: how does the private key get to Alice and Bob in the first place? If Alice generates it, she has to send it to Bob, but it is sensitive information so it should be encrypted. However, keys have not been exchanged to perform the encryption. Public key cryptography, invented in the 1970s, solves the problem of encrypting messages between two parties without prior agreement on the key.
In public key cryptography, Alice and Bob not only have different keys, they each have two keys. One key is private and must not be shared with anyone. The other key is public and can be shared with anyone.
When Alice wants to send a secure message to Bob, she encrypts the message using Bob's public key and sends the result to Bob. Bob uses his private key to decrypt the message. When Bob wants to send a secure message to Alice, he encrypts the message using Alice's public key and sends the result to Alice. Alice uses her private key to decrypt the message. Eve can eavesdrop on both public keys and the encrypted messages, but she cannot decrypt the messages because she does not have either of the private keys.
The public and private keys are generated as a pair and need longer lengths than the equivalent-strength private key encryption keys. Typical key lengths for the RSA algorithm are 1,024 bits. It is not feasible to derive one member of the key pair from the other. Public key encryption is slow (100 to 1,000 times slower than private key encryption), so a hybrid technique is usually used in practice. Public key encryption is used to distribute a private key, known as a session key, to another party, and then private key encryption using that private session key is used for the bulk of the message encryption.
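The hybrid technique can be sketched with the Cipher class's wrap and unwrap modes. This is an illustration under assumptions: the 2,048-bit RSA key, the AES session key, and every class and method name are choices made for this sketch, while the PKCS1 padding follows the tutorial's stated choice for public key encryption.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class HybridSketch {

    // Bob's public/private key pair.
    static KeyPair newRsaKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // The private session key used for the bulk of the message encryption.
    static SecretKey newSessionKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    // Alice encrypts (wraps) the session key with Bob's public key.
    static byte[] wrap(PublicKey recipientPublic, SecretKey sessionKey)
            throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.WRAP_MODE, recipientPublic);
        return rsa.wrap(sessionKey);
    }

    // Bob recovers the session key with his private key.
    static SecretKey unwrap(PrivateKey recipientPrivate, byte[] wrappedKey)
            throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.UNWRAP_MODE, recipientPrivate);
        return (SecretKey) rsa.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY);
    }

    public static void main(String[] args) throws Exception {
        KeyPair bob = newRsaKeyPair();
        SecretKey sessionKey = newSessionKey();

        byte[] wrapped = wrap(bob.getPublic(), sessionKey);
        SecretKey recovered = unwrap(bob.getPrivate(), wrapped);

        System.out.println("Wrapped key length: " + wrapped.length + " bytes");
        System.out.println("Session keys match: "
                + Arrays.equals(sessionKey.getEncoded(), recovered.getEncoded()));
    }
}
```

Once both parties hold the session key, the rest of the conversation can use the much faster private key cipher.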
Algorithms, classes, and methods
The following two algorithms are used in public key encryption:
The Cipher class manipulates public key algorithms using keys produced by the KeyPairGenerator class. The following methods are used in the Public key cryptography code example:
- Public key cryptography code example (Groovy)
Signatures without paper
In this section, we'll examine digital signatures, the first level of determining the identification of parties that exchange messages. We'll illustrate both difficult and easy ways to identify the message source through code samples. We'll also list the digital signature algorithms that JDK 1.4 supports, and look at the classes and methods involved.
What are digital signatures?
Did you notice the flaw in the public key message exchange described in "What is public key cryptography?" How can Bob prove that the message really came from Alice? Eve could have substituted her public key for Alice's, and then Bob would be exchanging messages with Eve thinking she was Alice. This is known as a man-in-the-middle attack. We can solve this problem by using a digital signature -- a bit pattern that proves that a message came from a given party.
One way of implementing a digital signature is using the reverse of the public key process described in "What is public key cryptography?" Instead of encrypting with a public key and decrypting with a private key, the private key is used by a sender to sign a message and the recipient uses the sender's public key to decrypt the message. Because only the sender knows the private key, the recipient can be sure that the message really came from the sender.
In actuality, the message digest (What is a message digest?), not the entire message, is the bit stream that is signed by the private key. So, if Alice wants to send Bob a signed message, she generates the message digest of the message and signs it with her private key. She sends the message (in the clear) and the signed message digest to Bob. Bob decrypts the signed message digest with Alice's public key and computes the message digest from the cleartext message and checks that the two digests match. If they do, Bob can be sure the message came from Alice.
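The steps Alice and Bob follow can be sketched from the primitives already covered. This is a teaching sketch, not an interoperable signature format: the SHA-256 digest, the 2,048-bit RSA key, and all class and method names are assumptions made for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.BadPaddingException;
import javax.crypto.Cipher;

public class HardWaySignature {

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    static byte[] digest(byte[] message) throws GeneralSecurityException {
        return MessageDigest.getInstance("SHA-256").digest(message);
    }

    // Alice: digest the message, then encrypt the digest with her private key.
    static byte[] sign(PrivateKey signerPrivate, byte[] message)
            throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.ENCRYPT_MODE, signerPrivate);
        return rsa.doFinal(digest(message));
    }

    // Bob: decrypt the signed digest with Alice's public key and compare it
    // with a digest he computes himself from the cleartext message.
    static boolean verify(PublicKey signerPublic, byte[] message, byte[] signedDigest)
            throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.DECRYPT_MODE, signerPublic);
        try {
            byte[] recovered = rsa.doFinal(signedDigest);
            return MessageDigest.isEqual(recovered, digest(message));
        } catch (BadPaddingException e) {
            return false; // the signed digest itself was corrupted
        }
    }

    public static void main(String[] args) throws Exception {
        KeyPair alice = newKeyPair();
        // Hypothetical sample message, sent in the clear.
        byte[] message = "I owe Bob $10".getBytes(StandardCharsets.UTF_8);

        byte[] signedDigest = sign(alice.getPrivate(), message);
        System.out.println("Signature verifies: "
                + verify(alice.getPublic(), message, signedDigest));
    }
}
```

Signing only the digest keeps the slow public key operation small no matter how long the message is.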
Note that digital signatures do not provide encryption of the message, so encryption techniques must be used in conjunction with signatures if you also need confidentiality. You can use the RSA algorithm for both digital signatures and encryption. A U.S. standard called DSA (Digital Signature Algorithm) can be used for digital signatures, but not for encryption.
We'll examine two examples in this section. The first, the hard way (see Digital signature code example: The hard way), uses the primitives already discussed for message digests and public key cryptography to implement digital signatures. The second, the easy way (see Digital signature code example: The easy way), uses the Java language's direct support for signatures.
- Digital signature code example: The hard way (Groovy)
Digital signature code example: The easy way
The Signature class manipulates digital signatures using a key produced by the KeyPairGenerator class. The following methods are used in the example below:
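A minimal sketch of the Signature-based approach might look like the following. The "SHA256withRSA" algorithm name, the 2,048-bit key size, and the class and method names are assumptions for this illustration; the Signature and KeyPairGenerator calls are standard library APIs.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class EasyWaySignature {

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // Digest and sign the message in one step with the sender's private key.
    static byte[] sign(PrivateKey signerPrivate, byte[] message)
            throws GeneralSecurityException {
        Signature signature = Signature.getInstance("SHA256withRSA");
        signature.initSign(signerPrivate);
        signature.update(message);
        return signature.sign();
    }

    // Check the signature against the message with the sender's public key.
    static boolean verify(PublicKey signerPublic, byte[] message, byte[] signatureBytes)
            throws GeneralSecurityException {
        Signature signature = Signature.getInstance("SHA256withRSA");
        signature.initVerify(signerPublic);
        signature.update(message);
        return signature.verify(signatureBytes);
    }

    public static void main(String[] args) throws Exception {
        KeyPair alice = newKeyPair();
        // Hypothetical sample message.
        byte[] message = "I owe Bob $10".getBytes(StandardCharsets.UTF_8);

        byte[] signatureBytes = sign(alice.getPrivate(), message);
        System.out.println("Provider: "
                + Signature.getInstance("SHA256withRSA").getProvider().getName());
        System.out.println("Signature verifies: "
                + verify(alice.getPublic(), message, signatureBytes));
    }
}
```

The Signature class hides the separate digest and public key steps, so the calling code no longer handles the digest directly.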
Proving you are who you are
In this section, we'll discuss digital certificates, the second level to determining the identity of a message originator. We'll look at certificate authorities and the role they play. We'll examine key and certificate repositories and management tools (keytool and keystore) and discuss the CertPath API, a set of functions designed for building and validating certification paths.
What are digital certificates?
As you likely noticed, there is a problem with the digital signature scheme described in "What are digital signatures?" It proves that a message was sent by a given party, but how do we know for sure that the sender really is who she says she is? What if someone claims to be Alice and signs a message, but is actually Amanda? We can improve our security by using digital certificates, which package an identity along with a public key and are digitally signed by a third party called a certificate authority or CA. JDK 1.4 supports the X.509 Digital Certificate Standard.
Understanding keytool and keystore
The Java platform uses a keystore as a repository for keys and certificates. Physically, the keystore is a file (there is an option to make it an encrypted one) with a default name of .keystore. Keys and certificates can have names, called aliases, and each alias can be protected by a unique password. The keystore itself is also protected by a password; you can choose to have each alias password match the master keystore password.
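The alias and password model can also be exercised from code through the KeyStore class. The sketch below is only an illustration under assumptions: it keeps the keystore in memory instead of a .keystore file, stores a secret key rather than a key pair with a certificate (creating certificates is the job of keytool), uses the JCEKS keystore type, and uses hypothetical class names, method names, alias, and passwords.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.Key;
import java.security.KeyStore;
import java.util.Arrays;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class KeystoreSketch {

    // Create a keystore holding one key under the given alias and serialize it.
    static byte[] createStore(String alias, SecretKey key,
                              char[] aliasPassword, char[] storePassword) throws Exception {
        KeyStore keyStore = KeyStore.getInstance("JCEKS");
        keyStore.load(null, null); // start from an empty keystore
        keyStore.setKeyEntry(alias, key, aliasPassword, null);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        keyStore.store(out, storePassword); // protected by the keystore password
        return out.toByteArray();
    }

    // Reload the keystore and fetch the key stored under the alias.
    static Key readKey(byte[] storeBytes, String alias,
                       char[] aliasPassword, char[] storePassword) throws Exception {
        KeyStore keyStore = KeyStore.getInstance("JCEKS");
        keyStore.load(new ByteArrayInputStream(storeBytes), storePassword);
        return keyStore.getKey(alias, aliasPassword);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();

        // Hypothetical alias and passwords, for illustration only.
        char[] storePassword = "storepass".toCharArray();
        char[] aliasPassword = "keypass".toCharArray();

        byte[] store = createStore("sessionkey", key, aliasPassword, storePassword);
        Key loaded = readKey(store, "sessionkey", aliasPassword, storePassword);

        System.out.println("Key recovered from keystore: "
                + Arrays.equals(key.getEncoded(), loaded.getEncoded()));
    }
}
```

Writing the serialized bytes to a file instead of a byte array would give a keystore file that the command-line tools can also read, provided they are told the keystore type.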
The Java platform uses the keytool to manipulate the keystore. This tool offers many options; the following example (keytool example) shows the basics of generating a public key pair and corresponding certificate, and viewing the result by querying the keystore. The keytool can be used to export a key into a file, in X.509 format, that can be signed by a certificate authority and then re-imported into the keystore.
There is also a special keystore that is used to hold the certificate authority (or any other trusted) certificates, which in turn contain the public keys for verifying the validity of other certificates. This keystore is called the truststore. The Java language comes with a default truststore in a file called cacerts. If you search for this filename, you will find at least two of these files. You can display the contents with the following command:
In this example, using the default keystore of .keystore, we generate a self-signed certificate using the RSA algorithm with an alias of JohnUserKey and then view the created certificate. We will use this certificate in The concept of code signing to sign a JAR file.
Then you can check your created self-signed certificate:
The Certification Path API is new for JDK 1.4. It is a set of functions for building and validating certification paths or chains. This is done implicitly in protocols like SSL/TLS (see What is Secure Sockets Layer/Transport Layer Security?) and JAR file signature verification, but can now be done explicitly in applications with this support.
As mentioned in What are digital certificates?, a CA can sign a certificate with its private key, and if the recipient holds the CA certificate that has the public key needed for signature verification, it can verify the validity of the signed certificate. In this case, the chain of certificates is of length two -- the anchor of trust (the CA certificate) and the signed certificate. A self-signed certificate is of length one -- the anchor of trust is the signed certificate itself.
Chains can be of arbitrary length, so in a chain of three, a CA anchor of trust certificate can sign an intermediate certificate; the owner of this certificate can use its private key to sign another certificate. The CertPath API can be used to walk the chain of certificates to verify validity, as well as to construct these chains of trust.
Certificates have expiration dates, but can be compromised before they expire, so Certificate Revocation Lists (CRL) must be checked to really ensure the integrity of a signed certificate. These lists are available on the CA Web sites, and can also be programmatically manipulated with the CertPath API. The specific API and code examples are beyond the scope of this tutorial, but Sun has several code examples available in addition to the API documentation.
Trusting the code
In this section, we'll review the concept of code signing, focusing on jarsigner, the tool that manages the certification of a JAR file.
The concept of code signing
JAR files are the Java platform equivalent of ZIP files, allowing multiple Java class files to be packaged into one file with a .jar extension. This JAR file can then be digitally signed, proving the origin and the integrity of the class file code inside. A recipient of the JAR file can decide whether or not to trust the code based on the signature of the sender and can be confident that the contents have not been tampered with before receipt. The JDK comes with a jarsigner tool that provides this function.
In deployment, access to machine resources can be based on the signer's identity by putting access control statements in the policy file.
The jarsigner tool takes a JAR file and a private key and corresponding certificate as input, then generates a signed version of the JAR file as output. It calculates the message digests for each class in the JAR file and then signs these digests to ensure the integrity of the file and to identify the file owner.
In an applet environment, an HTML page references the class file contained in a signed JAR file. When this JAR file is received by the browser, the signature is checked against any installed certificates or against a certificate authority public signature to verify validity. If no existing certificates are found, the user is prompted with a screen giving the certificate details and asking if the user wants to trust the code.
Code signing example
In this example, we first create a JAR file from a .class file and then sign it by specifying the alias for the certificate in the keystore that is used for the signing. We then run a verification check on the signed JAR file.
If you open the JAR file, you will notice that two files have been added to it:
SSL/TLS: Securing C/S communication
In this section, we'll examine the building blocks of the Secure Sockets Layer (and its replacement, Transport Layer Security), the protocol used to authenticate the server to the client. We'll offer a few code examples as illustrations.
What is Secure Sockets Layer/Transport Layer Security?
Secure Sockets Layer (SSL) and its replacement, Transport Layer Security (TLS), are protocols for establishing a secure communications channel between a client and a server. They are also used to authenticate the server to the client and, less commonly, to authenticate the client to the server. SSL/TLS is usually seen in a browser application, where the lock at the bottom of the browser window indicates that it is in effect.
TLS 1.0 is the same as SSL 3.1. SSL/TLS uses a hybrid of three of the cryptographic building blocks already discussed in this tutorial, but all of this is transparent to the user. Here is a simplified version of the protocol:
(SSL Handshake and HTTPS Bindings on IIS)
SSL/TLS code sample
In this example, we write an HTTPS daemon process using an SSL server socket that returns an HTML stream when a browser connects to it. This example also shows how to generate a machine certificate in a special keystore to support the SSL deployment. In Java programming, the only thing that needs to be done is to use an SSL Server Socket Factory instead of a Socket Factory, using lines like the following:
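A hedged sketch of such a daemon is shown here. It is not the tutorial's listing: the class name HttpsHello, the helper names, and the command-line port argument are assumptions for this illustration. It relies on the default SSLServerSocketFactory, which picks up its keystore from the standard javax.net.ssl.keyStore and javax.net.ssl.keyStorePassword system properties.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLServerSocketFactory;

public class HttpsHello {

    // Build a minimal HTTP response carrying the given HTML body.
    static byte[] httpResponse(String body) {
        byte[] content = body.getBytes(StandardCharsets.UTF_8);
        byte[] head = ("HTTP/1.0 200 OK\r\n"
                + "Content-Type: text/html\r\n"
                + "Content-Length: " + content.length + "\r\n\r\n")
                .getBytes(StandardCharsets.US_ASCII);
        byte[] response = new byte[head.length + content.length];
        System.arraycopy(head, 0, response, 0, head.length);
        System.arraycopy(content, 0, response, head.length, content.length);
        return response;
    }

    // Accept connections forever, answering each request with the same page.
    static void serve(int port) throws IOException {
        // The only SSL-specific step: an SSL server socket factory
        // instead of a plain server socket.
        SSLServerSocketFactory factory =
                (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
        try (ServerSocket server = factory.createServerSocket(port)) {
            while (true) {
                try (Socket client = server.accept()) {
                    BufferedReader in = new BufferedReader(new InputStreamReader(
                            client.getInputStream(), StandardCharsets.US_ASCII));
                    String line;
                    // Skip the request line and headers up to the blank line.
                    while ((line = in.readLine()) != null && !line.isEmpty()) {
                        // request content is ignored in this sketch
                    }
                    OutputStream out = client.getOutputStream();
                    out.write(httpResponse("Hello, World!"));
                    out.flush();
                } catch (IOException e) {
                    System.err.println("Connection failed: " + e.getMessage());
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java HttpsHello <port>");
            return;
        }
        serve(Integer.parseInt(args[0]));
    }
}
```

Without a keystore containing a machine certificate, the socket can still be created, but the handshake with the browser will fail, which is why the keystore setup described in this section matters.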
HTTPS server sample execution
In this example, we create an HTTPS server daemon that waits for a client browser connection and returns "Hello, World!". The browser connects to this daemon via https://localhost:1234. We first create a machine certificate. The name must match the machine name of the computer where the daemon runs; in this case, l2parser. In addition, we cannot use the same .keystore we have used in the past. We must create a separate keystore just for the machine certificate. In this case, it has the name sslKeyStore.
Then, we start the server daemon process specifying the special keystore and its password:
After waiting a few seconds, fire up a browser and point it to https://localhost:1234. You should be prompted on whether or not to trust the certificate. Selecting "yes" should display "Hello, World!", and clicking on the lock in Internet Explorer will give the certificate details.
This tutorial introduced the major cryptographic building blocks that can be used to provide a vast array of application security solutions. You've become familiar with such Java security topics as:
* Built-in features that facilitate secure programming
* Secure programming techniques
* Features newly integrated in JDK 1.4 (JCE, JSSE, JAAS, JGSS, and CertPath API)
* Enriching third-party security offerings
And the following concepts:
You should be well poised to explore Java security in more detail (see the Resources section) and to take the next tutorial, Java security, Part 2: Authentication and authorization.