The Java Programming Language

The Core Language

Implementations of the Java programming language provide strong memory safety, even in the presence of data races in concurrent code. This prevents a large range of security vulnerabilities from occurring, unless certain low-level features are used; see Low-level Features of the Virtual Machine.

Increasing Robustness when Reading Arrays

External data formats often include arrays, stored as an integer giving the number of array elements, followed by that many elements in the file or protocol data unit. The specified length can be much larger than what is actually available in the data source.

To avoid allocating extremely large amounts of data, you can allocate a small array initially and grow it as you read more data, implementing an exponential growth policy. See the readBytes(InputStream, int) function in Incrementally reading a byte array.

Example 1. Incrementally reading a byte array
static byte[] readBytes(InputStream in, int length) throws IOException {
    final int startSize = 65536;
    byte[] b = new byte[Math.min(length, startSize)];
    int filled = 0;
    while (true) {
        int remaining = b.length - filled;
        readFully(in, b, filled, remaining);
        if (b.length == length) {
            break;
        }
        filled = b.length;
        if (length - b.length <= b.length) {
            // Allocate final length.  Condition avoids overflow.
            b = Arrays.copyOf(b, length);
        } else {
            b = Arrays.copyOf(b, b.length * 2);
        }
    }
    return b;
}

static void readFully(InputStream in, byte[] b, int off, int len)
        throws IOException {
    while (len > 0) {
        int count = in.read(b, off, len);
        if (count < 0) {
            throw new EOFException();
        }
        off += count;
        len -= count;
    }
}

When reading data into array lists, hash maps or hash sets, use the default constructor and do not specify a size hint taken from the input. You can simply add the elements to the collection as you read them.
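As a minimal sketch of this advice, the following hypothetical helper reads a length-prefixed list of integers. The class and method names (IntListReader, readIntList) and the wire format (a 32-bit count followed by 32-bit elements) are assumptions made for illustration only.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class IntListReader {
    // Reads a 32-bit element count followed by that many 32-bit integers.
    // The count comes from the input and is untrusted, so it is only used
    // as a loop bound, never as a capacity hint.
    static List<Integer> readIntList(DataInputStream in) throws IOException {
        int count = in.readInt();
        if (count < 0) {
            throw new IOException("negative element count: " + count);
        }
        // Default constructor: the list grows as elements actually arrive.
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // readInt() throws EOFException if the stream ends early.
            result.add(in.readInt());
        }
        return result;
    }
}
```

If the stream claims a huge element count but contains only a few elements, the loop ends with an EOFException after a small allocation instead of attempting to reserve memory for the claimed size up front.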

Resource Management

Unlike C++, Java does not offer destructors which can deallocate resources in a predictable fashion. All resource management has to be manual, at the usage site. (Finalizers are generally not usable for resource management, especially in high-performance code; see Finalizers.)

The first option is the try-finally construct, as shown in Resource management with a try-finally block. The code in the finally block should be as short as possible and should not throw any exceptions.

Example 2. Resource management with a try-finally block
InputStream in = new BufferedInputStream(new FileInputStream(path));
try {
    readFile(in);
} finally {
    in.close();
}

Note that the resource allocation happens outside the try block, and that there is no null check in the finally block. (Both are common artifacts stemming from IDE code templates.)

If the resource object is created freshly and implements the java.lang.AutoCloseable interface, the code in Resource management using the try-with-resource construct can be used instead. The Java compiler will automatically insert the close() method call in a synthetic finally block.

Example 3. Resource management using the try-with-resource construct
try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
    readFile(in);
}

To be compatible with the try-with-resource construct, new classes should name the resource deallocation method close(), and implement the AutoCloseable interface (the latter breaking backwards compatibility with Java 6). However, using the try-with-resource construct with objects that are not freshly allocated is at best awkward, and an explicit finally block is usually the better approach.

In general, it is best to design the programming interface in such a way that resource deallocation methods like close() cannot throw any (checked or unchecked) exceptions, but this should not be a reason to ignore any actual error conditions.
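A class designed along these lines might look like the following sketch. ScratchBuffer and its methods are hypothetical names used only for illustration; the point is that close() is narrowed so that it declares no checked exception and can safely be called more than once.

```java
final class ScratchBuffer implements AutoCloseable {
    private byte[] data = new byte[4096];
    private boolean closed;

    // Returns the backing array, or fails if the buffer was already closed.
    byte[] data() {
        if (closed) {
            throw new IllegalStateException("buffer is closed");
        }
        return data;
    }

    boolean isClosed() {
        return closed;
    }

    // Overrides AutoCloseable.close() without a throws clause, so callers
    // using try-with-resources need no catch block. Calling it twice is
    // harmless.
    @Override
    public void close() {
        closed = true;
        data = null;
    }
}
```

With this design, a caller can write try (ScratchBuffer b = new ScratchBuffer()) { ... } and the compiler-generated finally block releases the buffer on every exit path.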

Finalizers

Finalizers can be used as a last-resort approach to free resources which would otherwise leak. Finalization is unpredictable and costly, and there can be a considerable delay between the last reference to an object going away and the execution of the finalizer. Generally, manual resource management is required; see Resource Management.

Finalizers should be very short and should only deallocate native or other external resources held directly by the object being finalized. In general, they must use synchronization: Finalization necessarily happens on a separate thread because it is inherently concurrent. There can be multiple finalization threads, and despite each object being finalized at most once, the finalizer must not assume that it has exclusive access to the object being finalized (in the this pointer).

Finalizers should not deallocate resources held by other objects, especially if those objects have finalizers of their own. In particular, it is a very bad idea to define a finalizer just to invoke the resource deallocation method of another object, or to overwrite some pointer fields.

Finalizers are not guaranteed to run at all. For instance, the virtual machine (or the machine underneath) might crash, preventing their execution.

Objects with finalizers are garbage-collected much later than objects without them, so using finalizers to zero out key material (to reduce its undecrypted lifetime in memory) may have the opposite effect, keeping objects around for much longer and preventing them from being overwritten in the normal course of program execution.

For the same reason, code which allocates objects with finalizers at a high rate will eventually fail (likely with a java.lang.OutOfMemoryError exception) because the virtual machine has finite resources for keeping track of objects pending finalization. To deal with that, it may be necessary to recycle objects with finalizers.

The remarks in this section apply to finalizers which are implemented by overriding the finalize() method, and to custom finalization using reference queues.

Recovering from Exceptions and Errors

Java exceptions come in three kinds, all ultimately deriving from java.lang.Throwable:

  • Run-time exceptions do not have to be declared explicitly and can be explicitly thrown from any code, by calling code which throws them, or by triggering an error condition at run time, like division by zero, or an attempt at an out-of-bounds array access. These exceptions derive from the java.lang.RuntimeException class (perhaps indirectly).

  • Checked exceptions have to be declared explicitly by functions that throw or propagate them. They are similar to run-time exceptions in other regards, except that there is no language construct to throw them (except the throw statement itself). Checked exceptions are only present at the Java language level and are only enforced at compile time. At run time, the virtual machine does not know about them and permits throwing exceptions from any code. Checked exceptions must derive (perhaps indirectly) from the java.lang.Exception class, but not from java.lang.RuntimeException.

  • Errors are exceptions which typically reflect serious error conditions. They can be thrown at any point in the program, and do not have to be declared (unlike checked exceptions). In general, it is not possible to recover from such errors; more on that below, in The Difficulty of Catching Errors. Error classes derive (perhaps indirectly) from java.lang.Error, or from java.lang.Throwable, but not from java.lang.Exception.

The general expectation is that run-time exceptions are avoided by careful programming (e.g., not dividing by zero). Checked exceptions are expected to be caught as they happen (e.g., when an input file is unexpectedly missing). Errors are impossible to predict, can happen at any point, and reflect that something went wrong beyond all expectations.
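The three kinds described above can be told apart by their position in the class hierarchy. The following sketch mirrors the taxonomy used in this section; ThrowableKinds, Kind and classify are hypothetical names introduced for illustration.

```java
final class ThrowableKinds {
    enum Kind { RUNTIME_EXCEPTION, CHECKED_EXCEPTION, ERROR }

    // Classifies a throwable according to the three kinds described in
    // this section. The order of the tests matters because
    // RuntimeException is itself a subclass of Exception.
    static Kind classify(Throwable t) {
        if (t instanceof RuntimeException) {
            return Kind.RUNTIME_EXCEPTION;
        }
        if (t instanceof Exception) {
            return Kind.CHECKED_EXCEPTION;
        }
        // Everything else: java.lang.Error and its subclasses, and classes
        // deriving from Throwable without going through Exception.
        return Kind.ERROR;
    }
}
```

For instance, classify(new ArithmeticException()) yields RUNTIME_EXCEPTION, classify(new java.io.IOException()) yields CHECKED_EXCEPTION, and classify(new OutOfMemoryError()) yields ERROR.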

The Difficulty of Catching Errors

Errors (that is, exceptions which do not derive, directly or indirectly, from java.lang.Exception) have the peculiar property that catching them is problematic. There are several reasons for this:

  • The error reflects a failed consistency check, for example, java.lang.AssertionError.

  • The error can happen at any point, resulting in inconsistencies due to half-updated objects. Examples are java.lang.ThreadDeath, java.lang.OutOfMemoryError and java.lang.StackOverflowError.

  • The error indicates that the virtual machine failed to provide some semantic guarantees made by the Java programming language. java.lang.ExceptionInInitializerError is an example: it can leave behind a half-initialized class.

In general, if an error is thrown, the virtual machine should be restarted as soon as possible because it is in an inconsistent state. Continuing running as before can have unexpected consequences. However, there are legitimate reasons for catching errors because not doing so leads to even greater problems.

Code should be written in a way that avoids triggering errors. See Increasing Robustness when Reading Arrays for an example.

It is usually necessary to log errors. Otherwise, no trace of the problem might be left anywhere, making it very difficult to diagnose related failures. Consequently, if you catch java.lang.Exception to log and suppress all unexpected exceptions (for example, in a request dispatching loop), you should consider switching to java.lang.Throwable instead, to also cover errors.

The other reason mainly applies to such request dispatching loops: If you do not catch errors, the loop stops looping, resulting in a denial of service.

However, if possible, catching errors should be coupled with a way to signal the requirement of a virtual machine restart.
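A request dispatching loop following this advice could be sketched as below. The Dispatcher class, its restartRequested flag and the use of System.err for logging are assumptions made for illustration; a real service would use its own logging facility and its own way of asking a supervisor for a restart.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class Dispatcher {
    // Set once an Error has been caught. An external supervisor is
    // expected to poll this flag and restart the virtual machine.
    private final AtomicBoolean restartRequested = new AtomicBoolean();
    private int completed;

    void dispatchAll(List<Runnable> requests) {
        for (Runnable request : requests) {
            try {
                request.run();
                completed++;
            } catch (Exception e) {
                // Ordinary failure of one request: log it and continue.
                System.err.println("request failed: " + e);
            } catch (Throwable t) {
                // An Error: log it so that a trace remains, keep the loop
                // alive, and signal that the VM state is suspect.
                System.err.println("serious error: " + t);
                restartRequested.set(true);
            }
        }
    }

    boolean restartRequested() {
        return restartRequested.get();
    }

    int completed() {
        return completed;
    }
}
```

The loop keeps serving requests after an error, which avoids the denial of service, while the flag records that the virtual machine should be restarted at the next opportunity.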

Low-level Features of the Virtual Machine

Reflection and Private Parts

The setAccessible(boolean) method of the java.lang.reflect.AccessibleObject class allows a program to disable language-defined access rules for specific constructors, methods, or fields. Once the access checks are disabled, any code can use the java.lang.reflect.Constructor, java.lang.reflect.Method, or java.lang.reflect.Field object to access the underlying Java entity, without further permission checks. This breaks encapsulation and can undermine the stability of the virtual machine. (In contrast, without using the setAccessible(boolean) method, this should not happen because all the language-defined checks still apply.)

This feature should be avoided if possible.
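To illustrate why, the sketch below shows how setAccessible(true) lets unrelated code overwrite a private field and bypass whatever invariants the class maintains. Account and ReflectionDemo are hypothetical classes; the example assumes both live in the same module, where the call is permitted.

```java
import java.lang.reflect.Field;

final class Account {
    // Intended invariant: only Account's own methods change the balance.
    private int balance = 100;

    int balance() {
        return balance;
    }
}

final class ReflectionDemo {
    // Overwrites the private field from outside the class.
    static void overwriteBalance(Account account, int value)
            throws ReflectiveOperationException {
        Field field = Account.class.getDeclaredField("balance");
        // Disables the language-level access check for this Field object.
        field.setAccessible(true);
        field.setInt(account, value);
    }
}
```

After ReflectionDemo.overwriteBalance(account, -1), account.balance() returns -1, a value the class itself would never have produced.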

Java Native Interface (JNI)

The Java Native Interface allows Java code to call functions written specifically for this purpose, usually in C or C++.

The transition between the Java world and the C world is not fully type-checked, and the C code can easily break the Java virtual machine semantics. Therefore, extra care is needed when using this functionality.

To provide a moderate amount of type safety, it is recommended to recreate the class-specific header file using javah (or javac -h on JDK 10 and later, where javah is no longer available) during the build process, include it in the implementation, and use the -Wmissing-declarations option.

Ideally, the required data is directly passed to static JNI methods and returned from them, and the code on the C side does not have to deal with accessing Java fields (or even methods).

When using GetPrimitiveArrayCritical or GetStringCritical, make sure that you only perform very little processing between the get and release operations. Do not access the file system or the network, and do not perform locking, because that might introduce blocking. When processing large strings or arrays, consider splitting the computation into multiple sub-chunks, so that you do not prevent the JVM from reaching a safepoint for extended periods of time.

If necessary, you can use the Java long type to store a C pointer in a field of a Java class. On the C side, when casting between the jlong value and the pointer on the C side,

You should not try to perform pointer arithmetic on the Java side (that is, you should treat pointer-carrying long values as opaque). When passing a slice of an array to the native code, follow the Java convention and pass it as the base array, the integer offset of the start of the slice, and the integer length of the slice. On the native side, check the offset/length combination against the actual array length, and use the offset to compute the pointer to the beginning of the slice.

Example 4. Array length checking in JNI code
JNIEXPORT jint JNICALL Java_sum
  (JNIEnv *jEnv, jclass clazz, jbyteArray buffer, jint offset, jint length)
{
  assert(sizeof(jint) == sizeof(unsigned));
  if (offset < 0 || length < 0) {
    (*jEnv)->ThrowNew(jEnv, arrayIndexOutOfBoundsExceptionClass,
                      "negative offset/length");
    return 0;
  }
  unsigned uoffset = offset;
  unsigned ulength = length;
  // This cannot overflow because of the check above.
  unsigned totallength = uoffset + ulength;
  unsigned actuallength = (*jEnv)->GetArrayLength(jEnv, buffer);
  if (totallength > actuallength) {
    (*jEnv)->ThrowNew(jEnv, arrayIndexOutOfBoundsExceptionClass,
                      "offset + length too large");
    return 0;
  }
  unsigned char *ptr = (*jEnv)->GetPrimitiveArrayCritical(jEnv, buffer, 0);
  if (ptr == NULL) {
    return 0;
  }
  unsigned long long sum = 0;
  for (unsigned char *p = ptr + uoffset, *end = p + ulength; p != end; ++p) {
    sum += *p;
  }
  (*jEnv)->ReleasePrimitiveArrayCritical(jEnv, buffer, ptr, 0);
  return sum;
}
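The same offset/length validation can be expressed in Java, for example in a wrapper that runs before the native call. The sketch below is illustrative only: SliceCheck, checkSlice and the pure-Java sum method (a stand-in for the native function) are hypothetical, and a Java-side check would not replace the check in the native code.

```java
final class SliceCheck {
    // Throws if the range [offset, offset + length) does not lie within
    // the array.
    static void checkSlice(byte[] buffer, int offset, int length) {
        // Written as a subtraction so that no addition can overflow:
        // once offset is known to be non-negative, buffer.length - offset
        // cannot wrap around.
        if (offset < 0 || length < 0 || length > buffer.length - offset) {
            throw new ArrayIndexOutOfBoundsException(
                "offset=" + offset + ", length=" + length
                + ", array length=" + buffer.length);
        }
    }

    // Pure-Java stand-in for the native method, for illustration only.
    static int sum(byte[] buffer, int offset, int length) {
        checkSlice(buffer, offset, length);
        int sum = 0;
        for (int i = offset; i < offset + length; i++) {
            // Mask to treat each byte as unsigned, like unsigned char in C.
            sum += buffer[i] & 0xff;
        }
        return sum;
    }
}
```

A call such as SliceCheck.sum(data, 1, Integer.MAX_VALUE) is rejected by the check rather than wrapping around to a small, seemingly valid end index.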

In any case, classes referring to native resources must be declared final, and must not be serializable or cloneable. Initialization and mutation of the state used by the native side must be controlled carefully. Otherwise, it might be possible to create an object with inconsistent native state which results in a crash (or worse) when used (or perhaps only finalized) later. If you need both Java inheritance and native resources, you should consider moving the native state to a separate class, and only keep a reference to objects of that class. This way, cloning and serialization issues can be avoided in most cases.

If there are native resources associated with an object, the class should have an explicit resource deallocation method (Resource Management) and a finalizer (Finalizers) as a last resort. The need for finalization means that a minimum amount of synchronization is needed. Code on the native side should check that the object is not in a closed/freed state.
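The Java side of such a class might be structured as in the following sketch. NativeHandle is a hypothetical name, and the native allocation and deallocation calls are only indicated by comments, so the example shows the closed-state bookkeeping and synchronization rather than real JNI code. The last-resort finalizer is omitted.

```java
final class NativeHandle implements AutoCloseable {
    // Opaque pointer value obtained from native code; 0 means closed or
    // freed. The Java side never performs arithmetic on it.
    private long handle;

    NativeHandle(long handle) {
        if (handle == 0) {
            throw new IllegalArgumentException("null handle");
        }
        this.handle = handle;
    }

    // Synchronized so that a concurrent close() cannot free the resource
    // while it is being used.
    synchronized long use() {
        if (handle == 0) {
            throw new IllegalStateException("already closed");
        }
        // A real class would pass the handle to a native method here.
        return handle;
    }

    // Explicit deallocation method; safe to call more than once.
    @Override
    public synchronized void close() {
        if (handle != 0) {
            // A real class would call the native free function here.
            handle = 0;
        }
    }
}
```

The class is final and neither serializable nor cloneable, so the only way to obtain an instance is through the constructor, and the handle value is cleared exactly once.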

Many JNI functions create local references. By default, these persist until the JNI-implemented method returns. If you create many such references (e.g., in a loop), you may have to free them using DeleteLocalRef, or start using PushLocalFrame and PopLocalFrame. Global references must be deallocated with DeleteGlobalRef, otherwise there will be a memory leak, just as with malloc and free.

When throwing exceptions using Throw or ThrowNew, be aware that these functions return normally to the caller; the exception is only pending. You have to return control manually to the JVM.

Technically, the JNIEnv pointer is not necessarily constant during the lifetime of your JNI module. Storing it in a global variable is therefore incorrect. Particularly if you are dealing with callbacks, you may have to store the pointer in a thread-local variable (defined with __thread). It is, however, best to avoid the complexity of calling back into Java code.

Keep in mind that C/C++ and Java are different languages, despite very similar syntax for expressions. The Java memory model is much more strict than the C or C++ memory models, and native code needs more synchronization, usually using JVM facilities or POSIX threads mutexes. Integer overflow in Java is defined, but in C/C++ it is not (for the jint and jlong types).

sun.misc.Unsafe

The sun.misc.Unsafe class is unportable and contains many functions explicitly designed to break Java memory safety (for performance and debugging). If possible, avoid using this class.

Interacting with the Security Manager

The Java platform is largely implemented in the Java language itself. Therefore, within the same JVM, code runs which is part of the Java installation and which is trusted, but there might also be code which comes from untrusted sources and is restricted by the Java sandbox (to varying degrees). The security manager draws a line between fully trusted, partially trusted and untrusted code.

The type safety and accessibility checks provided by the Java language and JVM would be sufficient to implement a sandbox. However, only some Java APIs employ such a capabilities-based approach. (The Java SE library contains many public classes with public constructors which can break any security policy, such as java.io.FileOutputStream.) Instead, critical functionality is protected by stack inspection: At a security check, the stack is walked from top (most-nested) to bottom. The security check fails if a stack frame for a method is encountered whose class lacks the permission which the security check requires.

This simple approach would not allow untrusted code (which lacks certain permissions) to call into trusted code while the latter retains trust. Such trust transitions are desirable because they enable Java as an implementation language for most parts of the Java platform, including security-relevant code. Therefore, there is a mechanism to mark certain stack frames as trusted (Re-gaining Privileges).

In theory, it is possible to run a Java virtual machine with a security manager that acts very differently from this approach, but a lot of code expects behavior very close to the platform default (including many classes which are part of the OpenJDK implementation).

Security Manager Compatibility

A lot of code can run without any additional permissions at all, with few changes. The following guidelines should help to increase compatibility with a restrictive security manager.

  • When retrieving system properties using System.getProperty(String) or similar methods, catch SecurityException exceptions and treat the property as unset.

  • Avoid unnecessary file system or network access.

  • Avoid explicit class loading. Access to a suitable class loader might not be available when executing as untrusted code.
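The first guideline can be captured in a small helper, sketched below. SafeProperties and its get method are hypothetical names used for illustration.

```java
final class SafeProperties {
    // Returns the value of the system property, or null if the property
    // is unset or the caller is not permitted to read it.
    static String get(String name) {
        try {
            return System.getProperty(name);
        } catch (SecurityException e) {
            // A restrictive security manager denied access; treat the
            // property as unset instead of failing.
            return null;
        }
    }
}
```

Callers then handle a null result the same way in both cases, for example by falling back to a built-in default.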

If the functionality you are implementing absolutely requires privileged access and this functionality has to be used from untrusted code (hopefully in a restricted and secure manner), see Re-gaining Privileges.

Activating the Security Manager

The usual command to launch a Java application, java, does not activate the security manager. Therefore, the virtual machine does not enforce any sandboxing restrictions, even if explicitly requested by the code (for example, as described in Reducing Trust in Code).

The -Djava.security.manager option activates the security manager, with the fairly restrictive default policy. With a very permissive policy, most Java code will run unchanged. Assuming the policy in Most permissive OpenJDK policy file has been saved in a file grant-all.policy, this policy can be activated using the option -Djava.security.policy=grant-all.policy (in addition to the -Djava.security.manager option).

Example 5. Most permissive OpenJDK policy file
grant {
    permission java.security.AllPermission;
};

With this most permissive policy, the security manager is still active, and explicit requests to drop privileges will be honored.

Reducing Trust in Code

The Using the security manager to run code with reduced privileges example shows how to run a piece of code with reduced privileges.

Example 6. Using the security manager to run code with reduced privileges
Permissions permissions = new Permissions();
ProtectionDomain protectionDomain =
    new ProtectionDomain(null, permissions);
AccessControlContext context = new AccessControlContext(
    new ProtectionDomain[] { protectionDomain });

// This is expected to succeed.
try (FileInputStream in = new FileInputStream(path)) {
    System.out.format("FileInputStream: %s%n", in);
}

AccessController.doPrivileged(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
        // This code runs with reduced privileges and is
        // expected to fail.
        try (FileInputStream in = new FileInputStream(path)) {
            System.out.format("FileInputStream: %s%n", in);
        }
        return null;
    }
}, context);

The example above does not add any additional permissions to the permissions object. If such permissions are necessary, code like the following (which grants read permission on all files in the current directory) can be used:

permissions.add(new FilePermission(
            System.getProperty("user.dir") + "/-", "read"));

Calls to the java.security.AccessController.doPrivileged() methods do not enforce any additional restriction if no security manager has been set. Except for a few special exceptions, the restrictions no longer apply once doPrivileged() has returned, even to objects created by the code which ran with reduced privileges. (This applies to object finalization in particular.)

The example code above does not prevent the called code from calling the java.security.AccessController.doPrivileged() methods. This mechanism should be considered an additional safety net, but it still can be used to prevent unexpected behavior of trusted code. As long as the executed code is not dynamic and came with the original application or library, the sandbox is fairly effective.

The context argument in Using the security manager to run code with reduced privileges is extremely important—otherwise, this code would increase privileges instead of reducing them.

For activating the security manager, see Activating the Security Manager. Unfortunately, this affects the virtual machine as a whole, so it is not possible to do this from a library.

Re-gaining Privileges

Ordinarily, when trusted code is called from untrusted code, it loses its privileges (because of the untrusted stack frames visible to stack inspection). The java.security.AccessController.doPrivileged() family of methods provides a controlled backdoor from untrusted to trusted code.

By design, this feature can undermine the Java security model and the sandbox. It has to be used very carefully. Most sandbox vulnerabilities can be traced back to its misuse.

In essence, the doPrivileged() methods cause the stack inspection to end at their call site. Untrusted code further down the call stack becomes invisible to security checks.

The following operations are common and safe to perform with elevated privileges.

  • Reading custom system properties with fixed names, especially if the value is not propagated to untrusted code. (File system paths including installation paths, host names and user names are sometimes considered private information and need to be protected.)

  • Reading from the file system at fixed paths, either determined at compile time or by a system property. Again, leaking the file contents to the caller can be problematic.

  • Accessing network resources under a fixed address, name or URL, derived from a system property or configuration file, information leaks notwithstanding.

The Using the security manager to run code with increased privileges example shows how to request additional privileges.

Example 7. Using the security manager to run code with increased privileges
// This is expected to fail.
try {
    System.out.println(System.getProperty("user.home"));
} catch (SecurityException e) {
    e.printStackTrace(System.err);
}
AccessController.doPrivileged(new PrivilegedAction<Void>() {
        public Void run() {
            // This should work.
            System.out.println(System.getProperty("user.home"));
            return null;
        }
    });

Obviously, this only works if the class containing the call to doPrivileged() is marked trusted (usually because it is loaded from a trusted class loader).

When writing code that runs with elevated privileges, make sure that you follow the rules below.

  • Make the privileged code as small as possible. Perform as many computations as possible before and after the privileged code section, even if it means that you have to define a new class to pass the data around.

  • Make sure that you either control the inputs to the privileged code, or that the inputs are harmless and cannot affect security properties of the privileged code.

  • Data that is returned from or written by the privileged code must either be restricted (that is, it cannot be accessed by untrusted code), or must be harmless. Otherwise, privacy leaks or information disclosures which affect security properties can be the result.

If the code calls back into untrusted code at a later stage (or performs other actions under control from the untrusted caller), you must obtain the original security context and restore it before performing the callback, as in Restoring privileges when invoking callbacks. (In this example, it would be much better to move the callback invocation out of the privileged code section, of course.)

Example 8. Restoring privileges when invoking callbacks
interface Callback<T> {
    T call(boolean flag);
}

class CallbackInvoker<T> {
    private final AccessControlContext context;
    Callback<T> callback;

    CallbackInvoker(Callback<T> callback) {
        context = AccessController.getContext();
        this.callback = callback;
    }

    public T invoke() {
        // Obtain increased privileges.
        return AccessController.doPrivileged(new PrivilegedAction<T>() {
            @Override
            public T run() {
                // This operation would fail without
                // additional privileges.
                final boolean flag = Boolean.getBoolean("some.property");

                // Restore the original privileges.
                return AccessController.doPrivileged(
                    new PrivilegedAction<T>() {
                        @Override
                        public T run() {
                            return callback.call(flag);
                        }
                    }, context);
            }
        });
    }
}