Edition 1
The pointers inend and outend point one past the end of the input and output buffers, and inp and outp are the respective current positions. The number of remaining input bytes is checked using the expression len > (size_t)(inend - inp). The cast silences a compiler warning; inend is never smaller than inp, so the difference is non-negative.
ssize_t
extract_strings(const char *in, size_t inlen, char **out, size_t outlen)
{
    const char *inp = in;
    const char *inend = in + inlen;
    char **outp = out;
    char **outend = out + outlen;

    while (inp != inend) {
        size_t len;
        char *s;
        if (outp == outend) {
            errno = ENOSPC;
            goto err;
        }
        len = (unsigned char)*inp;
        ++inp;
        if (len > (size_t)(inend - inp)) {
            errno = EINVAL;
            goto err;
        }
        s = malloc(len + 1);
        if (s == NULL) {
            goto err;
        }
        memcpy(s, inp, len);
        inp += len;
        s[len] = '\0';
        *outp = s;
        ++outp;
    }
    return outp - out;
err:
    {
        int errno_old = errno;
        while (out != outp) {
            free(*out);
            ++out;
        }
        errno = errno_old;
    }
    return -1;
}
The remaining-length check has the form len > (size_t)(inend - inp), where len is a variable of type size_t which denotes the total number of bytes which are about to be read or written next. In general, it is not safe to fold multiple such checks into one, as in len1 + len2 > (size_t)(inend - inp), because the expression on the left can overflow or wrap around (see Section 1.1.3, “Recommendations for integer arithmetic”), and then it no longer reflects the number of bytes to be processed.
void report_overflow(void);

int
add(int a, int b)
{
    int result = a + b;
    if (a < 0 || b < 0) {
        return -1;
    }
    // Because signed overflow is undefined, the compiler can
    // optimize away the following if statement.
    if (result < 0) {
        report_overflow();
    }
    return result;
}
void report_overflow(void);

unsigned
add_unsigned(unsigned a, unsigned b)
{
    unsigned sum = a + b;
    if (sum < a) { // or sum < b
        report_overflow();
    }
    return sum;
}

unsigned
mul(unsigned a, unsigned b)
{
    if (b && a > ((unsigned)-1) / b) {
        report_overflow();
    }
    return a * b;
}
Compilers are usually able to optimize the overflow check for a multiplication a * b involving a constant a, where the expression is reduced to b > C for some constant C determined at compile time. The general expression, b && a > ((unsigned)-1) / b, is more difficult to optimize at compile time.
Alternatively, code can be compiled with the -fwrapv GCC option. As a result, GCC provides 2's complement semantics for signed integer arithmetic, including defined behavior on integer overflow.
Constant data should be declared static, so that access is restricted to a single translation unit. The second const in a declaration such as the one below is needed to make the array itself constant, and not just the strings it points to. It must be placed after the *, and not before it.
static const char *const string_list[] = {
    "first",
    "second",
    "third",
    NULL
};
Using the static keyword in such cases leads to undefined behavior, so if the static keyword is used, it can usually just be dropped, unless the object is very large (larger than 128 kilobytes on 32-bit platforms). In the latter case, it is recommended to allocate the object using malloc, to obtain proper array checking, for the same reasons outlined in Section 1.3.2, “alloca and other forms of stack-based allocation”.
Several of the replacement functions below return a buffer allocated with malloc which your code must deallocate explicitly using free.
gets ⟶ fgets
getwd ⟶ getcwd or get_current_dir_name
readdir_r ⟶ readdir
realpath (with a non-NULL second parameter) ⟶ realpath with NULL as the second parameter, or canonicalize_file_name
NAME_MAX (limit not actually enforced by the kernel)
PATH_MAX (limit not actually enforced by the kernel)
_PC_NAME_MAX (This limit, returned by the pathconf function, is not enforced by the kernel.)
_PC_PATH_MAX (This limit, returned by the pathconf function, is not enforced by the kernel.)
f_namemax in struct statvfs (limit not actually enforced by the kernel, see _PC_NAME_MAX above)
sprintf, strcat, strcpy, vsprintf ⟶ asprintf or vasprintf. (For non-GNU targets, these functions are available from Gnulib.) In some cases, the snprintf function might be a suitable replacement, see Section 1.2.3, “String Functions With Explicit Length Arguments”.
putenv ⟶ explicit envp argument in process creation (see Section 12.1.3, “Specifying the process environment”)
setenv ⟶ explicit envp argument in process creation (see Section 12.1.3, “Specifying the process environment”)
unsetenv ⟶ explicit envp argument in process creation (see Section 12.1.3, “Specifying the process environment”)
snprintf
The snprintf function provides a way to construct a string in a statically-sized buffer. (If the buffer is allocated on the heap, consider using asprintf instead.)
char fraction[30];
snprintf(fraction, sizeof(fraction), "%d/%d", numerator, denominator);
The second argument to the snprintf call should always be the size of the buffer in the first argument (which should be a character array). Elaborate pointer and length arithmetic can introduce errors and nullify the security benefits of snprintf.
snprintf is not well-suited to constructing a string iteratively, by appending to an existing buffer. snprintf returns one of two values: -1 on errors, or the number of characters which would have been written to the buffer if the buffer were large enough. This means that adding the result of snprintf to the buffer pointer to skip over the characters just written is incorrect and risky. However, as long as the length argument is not zero, the buffer remains NUL-terminated. Example 1.6, “Repeatedly writing to a buffer using snprintf”, works because end - current > 0 is a loop invariant. After the loop, the result string is in the buf variable.
Example 1.6. Repeatedly writing to a buffer using snprintf

char buf[512];
char *current = buf;
const char *const end = buf + sizeof(buf);
for (struct item *it = data; it->key; ++it) {
    snprintf(current, end - current, "%s%s=%d",
             current == buf ? "" : ", ", it->key, it->value);
    current += strlen(current);
}
If you want to avoid the call to strlen for performance reasons, you have to check for a negative return value from snprintf and also check whether the return value is equal to or larger than the specified buffer length. Only if neither condition applies may you advance the buffer pointer by the number returned by snprintf. However, this optimization is rarely worthwhile.
vsnprintf and format strings
If you use vsnprintf (or vasprintf or even snprintf) with a format string which is not a constant, but a function argument, it is important to annotate the function with a format function attribute, so that GCC can warn about misuse of your function (see Example 1.7, “The format function attribute”).
Example 1.7. The format function attribute

void log_format(const char *format, ...) __attribute__((format(printf, 1, 2)));

void
log_format(const char *format, ...)
{
    char buf[1000];
    va_list ap;
    va_start(ap, format);
    vsnprintf(buf, sizeof(buf), format, ap);
    va_end(ap);
    log_string(buf);
}
strncpy
The strncpy function does not ensure that the target buffer is NUL-terminated. A common idiom for ensuring NUL termination is:
char buf[10];
strncpy(buf, data, sizeof(buf));
buf[sizeof(buf) - 1] = '\0';
Another approach uses the strncat function for this purpose:
buf[0] = '\0';
strncat(buf, data, sizeof(buf) - 1);
strncat
The strncat function specifies the maximum number of characters copied from the source buffer, excluding the terminating NUL character. This means that the required number of bytes in the destination buffer is the length of the original string, plus the length argument in the strncat call, plus one. Consequently, this function is rarely appropriate for performing a length-checked string operation, with the notable exception of the strcpy emulation described in Section 1.2.3.3, “strncpy”.
A length-checked string concatenation can be implemented with the snprintf function (see “snprintf” above):
char buf[10];
snprintf(buf, sizeof(buf), "%s", prefix);
snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), "%s", data);
In many cases, the concatenation can be combined into a single snprintf call:

snprintf(buf, sizeof(buf), "%s%s", prefix, data);
Note that you cannot use a format string such as "%s%s" to append to a buffer in place, unless you use separate buffers: snprintf does not support overlapping source and target strings.
strlcpy and strlcat
Some systems provide strlcpy and strlcat functions which behave this way, but these functions are not part of GNU libc. strlcpy is often replaced with snprintf with a "%s" format string; see Section 1.2.3.3, “strncpy” for a caveat related to the snprintf return value. To emulate strlcat, use the approach described in Section 1.2.3.4, “strncat”.
_s functions
strn* and stpn* functions
malloc and related functions
The C run-time library interface to the heap allocator consists of malloc, free and realloc, and the calloc function. In addition to these generic functions, there are derived functions such as strdup which perform allocation using malloc internally, but do not return untyped heap memory (which could be used for any object).
The C compiler knows about these functions and can use their expected behavior for optimizations; for instance, it assumes that an existing pointer cannot point into the memory area newly returned by malloc.
On failure, realloc does not free the old pointer. Therefore, the idiom ptr = realloc(ptr, size); is wrong because the memory pointed to by ptr leaks in case of an error.
After a call to free, the pointer is invalid. Further pointer dereferences are not allowed (and are usually detected by valgrind). Less obvious is that any use of the old pointer value is not allowed, either. In particular, comparisons with any other pointer (or the null pointer) are undefined according to the C standard.
The same applies to the old pointer passed to realloc if the memory area cannot be enlarged in place. For instance, the compiler may assume that a comparison between the old and new pointer will always return false, so it is impossible to detect movement this way.
On error, malloc and other allocation functions return a null pointer. Dereferencing this pointer leads to a crash. Such dereferences can even be exploitable for code execution if the dereference is combined with an array subscript.
alloca and other forms of stack-based allocation
If the stack overflows, a SIGSEGV signal is generated and the program typically terminates. Stack-based allocation happens implicitly for variable-length arrays, and explicitly with alloca and related functions such as strdupa. These functions should be avoided because of the lack of error checking. (They can be used safely if the allocated size is less than the page size (typically, 4096 bytes), but this case is relatively rare.) Additionally, relying on alloca makes it more difficult to reorganize the code because it is not allowed to use the pointer after the function calling alloca has returned, even if this function has been inlined into its caller.
If you want to use alloca or VLAs for performance reasons, consider using a small on-stack array instead (less than the page size, large enough to fulfill most requests). If the requested size is small enough, use the on-stack array; otherwise, call malloc. When exiting the function, check whether malloc had been called, and free the buffer as needed.
When allocating arrays, the multiplication computing the total size can overflow; the calloc function performs such overflow checks. If malloc or realloc is used, the size check must be written manually. For instance, to allocate an array of n elements of type T, check that the requested size is not greater than ((size_t) -1) / sizeof(T). See Section 1.1.3, “Recommendations for integer arithmetic”.
Some libraries provide wrappers around malloc, and completely different interfaces for memory management. Both approaches can reduce the effectiveness of valgrind and similar tools, and of the heap corruption detection provided by GNU libc, so they should be avoided.
A conservative garbage collector can be used as a replacement for malloc and free. The Boehm-Demers-Weiser allocator can be used from C programs, with minimal type annotations. Performance is competitive with malloc on 64-bit architectures, especially for multi-threaded programs. The stop-the-world pauses may be problematic for some real-time applications, though.
Some projects use wrapper functions such as xmalloc which abort the process on allocation failure (instead of returning a NULL pointer), or provide alternatives to relatively recent library additions such as snprintf (along with implementations for systems which lack them).
Wrappers do not inherit the compiler's built-in knowledge of the standard functions. Adding __attribute__ annotations to the wrapper declarations can remedy this to some extent, but these annotations have to be maintained carefully for feature parity with the standard implementation.
At the minimum, if the wrapper is a printf-style function used for logging, you should add a suitable format attribute, as in Example 1.7, “The format function attribute”.
If the wrapped function carries the warn_unused_result attribute and you propagate its return value, your wrapper should be declared with warn_unused_result as well.
Support for the __builtin_object_size GCC builtin is desirable if the wrapper processes arrays. (This functionality is used by the -D_FORTIFY_SOURCE=2 checks to guard against static buffer overflows.) However, designing appropriate interfaces and implementing the checks may not be entirely straightforward.
For other attributes (such as malloc), careful analysis and comparison with the compiler documentation is required to check whether propagating the attribute is appropriate. Incorrectly applied attributes can result in undesired behavioral changes in the compiled code.
operator new[]
For very large values of n, an expression like new T[n] can return a pointer to a heap region which is too small. In other words, not all array elements are actually backed with heap memory reserved to the array. Current GCC versions generate code that performs a computation of the form sizeof(T) * size_t(n) + cookie_size, where cookie_size is currently at most 8. This computation can overflow, and GCC versions prior to 4.8 generated code which did not detect this. (Fedora 18 was the first release which fixed this in GCC.)
The std::vector template can be used instead of an explicit array allocation. (The GCC implementation detects overflow internally.)
If there is no alternative to operator new[] and the sources will be compiled with older GCC versions, code which allocates arrays with a variable length must check for overflow manually. For the new T[n] example, the size check could be n < 0 || size_t(n) > (size_t(-1) - 8) / sizeof(T). (See Section 1.1.3, “Recommendations for integer arithmetic”.) If there are additional dimensions (which must be constants according to the C++ standard), these should be included as factors in the divisor.
Avoid writing a function such as strcat which works on std::string arguments. Similarly, do not name methods after such functions.
-std=c++98 for the original 1998 C++ standard
-std=c++03 for the 1998 standard with the changes from the TR1 technical report
-std=c++11 for the 2011 C++ standard. This option should not be used.
-std=c++0x for several different versions of C++11 support in development, depending on the GCC version. This option should not be used.
-std=gnu++98, -std=gnu++03, -std=gnu++11. Again, -std=gnu++11 should not be used.
If C++11 mode is enabled, the ABI of libstdc++ will change in subtle ways. Currently, no C++ libraries are compiled in C++11 mode, so if you compile your code in C++11 mode, it will be incompatible with the rest of the system. Unfortunately, this is also the case if you do not use any C++11 features. Currently, there is no safe way to enable C++11 mode (except for freestanding applications).
Some C++11 features are available with -std=c++03 or -std=gnu++03 and in the <tr1/*> header files. This includes std::tr1::shared_ptr (from <tr1/memory>) and std::tr1::function (from <tr1/functional>). For other C++11 features, the Boost C++ library contains replacements.
The following algorithms write to an output range described only by its starting iterator and do not check that the target is large enough:

std::copy
std::copy_backward
std::copy_if
std::move (three-argument variant)
std::move_backward
std::partition_copy_if
std::remove_copy
std::remove_copy_if
std::replace_copy
std::replace_copy_if
std::swap_ranges
std::transform
std::copy_n, std::fill_n and std::generate_n do not perform iterator checking, either, but there is an explicit count which has to be supplied by the caller, as opposed to an implicit length indicator in the form of a pair of forward iterators.
To avoid unbounded writes, the target can be an inserting iterator obtained with the std::back_inserter function.
The following comparison algorithms read from two ranges, but take the length of the first range only:

std::equal
std::is_permutation
std::mismatch
std::string
The std::string class provides a convenient way to handle strings. Unlike C strings, std::string objects have an explicit length (and can contain embedded NUL characters), and storage for their characters is managed automatically. This section discusses std::string, but these observations also apply to other instances of the std::basic_string template.
The pointer returned by the data() member function does not necessarily point to a NUL-terminated string. To obtain a C-compatible string pointer, use c_str() instead, which adds the NUL terminator.
The pointers returned by the data() and c_str() functions, and iterators, are only valid until certain events happen. It is required that the exact std::string object still exists (even if it was initially created as a copy of another string object). Pointers and iterators are also invalidated when non-const member functions are called, or when functions with a non-const reference parameter are called. The behavior of the GCC implementation deviates from that required by the C++ standard if multiple threads are present. In general, only the first call to a non-const member function after a structural modification of the string (such as appending a character) is invalidating, but this also applies to member functions such as the non-const version of begin(), in violation of the C++ standard.
Be careful when invoking the c_str() member function on a temporary object. This is convenient for calling C functions, but the pointer turns invalid as soon as the temporary object is destroyed, which generally happens when the outermost expression enclosing the expression on which c_str() is called completes evaluation. Passing the result of c_str() to a function which does not store or otherwise leak that pointer is safe, though.
As with std::vector and std::array, subscripting with operator[] does not perform bounds checks. Use the at(size_type) member function instead. See Section 2.2.3, “Containers and operator[]”. Furthermore, accessing the terminating NUL character using operator[] is not possible. (In some implementations, the c_str() member function writes the NUL character on demand.)
Do not write to the character array returned by data() or c_str() after casting away const. If you need a C-style writable string, use a std::vector<char> object and its data() member function. In this case, you have to explicitly add the terminating NUL character.
The GCC implementation of std::string is currently based on reference counting. It is expected that a future version will remove the reference counting, due to performance and conformance issues. As a result, code that implicitly assumes sharing by holding on to pointers or iterators for too long will break, resulting in run-time crashes or worse. On the other hand, non-const iterator-returning functions will no longer give other threads an opportunity for invalidating existing iterators and pointers, because iterator invalidation does not depend on sharing of the internal character array object anymore.
Containers and operator[]
Containers similar to std::vector provide both operator[](size_type) and a member function at(size_type). This applies to std::vector itself, std::array, std::string and other instances of std::basic_string.
operator[](size_type) is not required by the standard to perform bounds checking (and the implementation in GCC does not). In contrast, at(size_type) must perform such a check. Therefore, in code which is not performance-critical, you should prefer at(size_type) over operator[](size_type), even though it is slightly more verbose.
The results of the front() and back() member functions are undefined if a vector object is empty. You can use vec.at(0) and vec.at(vec.size() - 1) as checked replacements. For an empty vector, data() is defined; it returns an arbitrary pointer, but not necessarily the null pointer.
One way to avoid a large up-front allocation based on an untrusted length field is shown in the readBytes(InputStream, int) function in Example 3.1, “Incrementally reading a byte array”.
Example 3.1. Incrementally reading a byte array

static byte[] readBytes(InputStream in, int length) throws IOException {
    final int startSize = 65536;
    byte[] b = new byte[Math.min(length, startSize)];
    int filled = 0;
    while (true) {
        int remaining = b.length - filled;
        readFully(in, b, filled, remaining);
        if (b.length == length) {
            break;
        }
        filled = b.length;
        if (length - b.length <= b.length) {
            // Allocate final length. Condition avoids overflow.
            b = Arrays.copyOf(b, length);
        } else {
            b = Arrays.copyOf(b, b.length * 2);
        }
    }
    return b;
}
static void readFully(InputStream in, byte[] b, int off, int len)
        throws IOException {
    while (len > 0) {
        int count = in.read(b, off, len);
        if (count < 0) {
            throw new EOFException();
        }
        off += count;
        len -= count;
    }
}
Resources must be released with a try-finally construct, as shown in Example 3.2, “Resource management with a try-finally block”. The code in the finally block should be as short as possible and should not throw any exceptions.
Example 3.2. Resource management with a try-finally block

InputStream in = new BufferedInputStream(new FileInputStream(path));
try {
    readFile(in);
} finally {
    in.close();
}
Note that the resource allocation happens outside the try block, and that there is no null check in the finally block. (Both are common artifacts stemming from IDE code templates.)
If the resource object implements the java.lang.AutoCloseable interface, the code in Example 3.3, “Resource management using the try-with-resource construct” can be used instead. The Java compiler will automatically insert the close() method call in a synthetic finally block.
Example 3.3. Resource management using the try-with-resource construct

try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
    readFile(in);
}
To benefit from the try-with-resource construct, new classes should name the resource deallocation method close(), and implement the AutoCloseable interface (the latter breaking backwards compatibility with Java 6). However, using the try-with-resource construct with objects that are not freshly allocated is at best awkward, and an explicit finally block is usually the better approach.
Error handling is simpler if close() cannot throw any (checked or unchecked) exceptions, but this should not be a reason to ignore actual error conditions.
Finalizers run concurrently with other code and can resurrect objects (by storing the this pointer). Objects with finalizers are also reclaimed more slowly (which can trigger a java.lang.OutOfMemoryError exception) because the virtual machine has finite resources for keeping track of objects pending finalization. To deal with that, it may be necessary to recycle objects with finalizers. These caveats apply both to the finalize() method, and to custom finalization using reference queues.
The Java language distinguishes several kinds of throwable classes, all derived from java.lang.Throwable:
Unchecked exceptions, which derive from the java.lang.RuntimeException class (perhaps indirectly).
Checked exceptions, which must be declared by the throwing method (or caught near the throw statement itself). Checked exceptions are only present at the Java language level and are only enforced at compile time. At run time, the virtual machine does not know about them and permits throwing exceptions from any code. Checked exceptions must derive (perhaps indirectly) from the java.lang.Exception class, but not from java.lang.RuntimeException.
Errors, which derive from java.lang.Error, or from java.lang.Throwable, but not from java.lang.Exception.
Errors (that is, throwables not derived from java.lang.Exception) have the peculiar property that catching them is problematic. There are several reasons for this:
Assertion failures are reported as errors, through java.lang.AssertionError.
Some errors can occur at essentially any point during program execution, such as java.lang.ThreadDeath, java.lang.OutOfMemoryError and java.lang.StackOverflowError.
Some errors leave the virtual machine in an inconsistent state; java.lang.ExceptionInInitializerError is an example: it can leave behind a half-initialized class.
If you catch java.lang.Exception to log and suppress all unexpected exceptions (for example, in a request dispatching loop), you should consider switching to java.lang.Throwable instead, to also cover errors.
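A minimal sketch of such a dispatching loop (the class and interface names are illustrative): catching Throwable covers both an unchecked exception and an AssertionError, so the loop keeps running.

```java
public class DispatchLoop {
    // Illustrative request interface.
    interface Request {
        void process() throws Exception;
    }

    static void dispatch(Request r) {
        try {
            r.process();
        } catch (Throwable t) {  // also covers Error subclasses
            System.out.println("suppressed: "
                    + t.getClass().getSimpleName());
        }
    }

    public static void main(String[] args) {
        dispatch(() -> { throw new IllegalStateException(); });
        dispatch(() -> { throw new AssertionError(); });  // an Error
        System.out.println("loop continues");
    }
}
```

Catching java.lang.Exception instead would let the AssertionError escape and terminate the loop.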
Reflection and private parts
The setAccessible(boolean) method of the java.lang.reflect.AccessibleObject class allows a program to disable language-defined access rules for specific constructors, methods, or fields. Once the access checks are disabled, any code can use the java.lang.reflect.Constructor, java.lang.reflect.Method, or java.lang.reflect.Field object to access the underlying Java entity, without further permission checks. This breaks encapsulation and can undermine the stability of the virtual machine. (In contrast, without using the setAccessible(boolean) method, this should not happen because all the language-defined checks still apply.)
Mismatches between the Java and C declarations can be detected by compiling the C implementation together with the generated header files, using the -Wmissing-declarations option.
If you use GetPrimitiveArrayCritical or GetStringCritical, make sure that you only perform very little processing between the get and release operations. Do not access the file system or the network, and do not perform locking, because that might introduce blocking. When processing large strings or arrays, consider splitting the computation into multiple sub-chunks, so that you do not prevent the JVM from reaching a safepoint for extended periods of time.
Use the long type to store a C pointer in a field of a Java class, and use explicit casts on the C side when converting between the jlong value and the pointer (the Java side should treat such long values as opaque). When passing a slice of an array to the native code, follow the Java convention and pass it as the base array, the integer offset of the start of the slice, and the integer length of the slice. On the native side, check the offset/length combination against the actual array length, and use the offset to compute the pointer to the beginning of the array.
JNIEXPORT jint JNICALL Java_sum
(JNIEnv *jEnv, jclass clazz, jbyteArray buffer, jint offset, jint length)
{
    assert(sizeof(jint) == sizeof(unsigned));
    if (offset < 0 || length < 0) {
        (*jEnv)->ThrowNew(jEnv, arrayIndexOutOfBoundsExceptionClass,
                          "negative offset/length");
        return 0;
    }
    unsigned uoffset = offset;
    unsigned ulength = length;
    // This cannot overflow because of the check above.
    unsigned totallength = uoffset + ulength;
    unsigned actuallength = (*jEnv)->GetArrayLength(jEnv, buffer);
    if (totallength > actuallength) {
        (*jEnv)->ThrowNew(jEnv, arrayIndexOutOfBoundsExceptionClass,
                          "offset + length too large");
        return 0;
    }
    unsigned char *ptr = (*jEnv)->GetPrimitiveArrayCritical(jEnv, buffer, 0);
    if (ptr == NULL) {
        return 0;
    }
    unsigned long long sum = 0;
    for (unsigned char *p = ptr + uoffset, *end = p + ulength; p != end; ++p) {
        sum += *p;
    }
    (*jEnv)->ReleasePrimitiveArrayCritical(jEnv, buffer, ptr, 0);
    return sum;
}
Classes referring to native state must be final, and must not be serializable or cloneable. Initialization and mutation of the state used by the native side must be controlled carefully. Otherwise, it might be possible to create an object with inconsistent native state which results in a crash (or worse) when used (or perhaps only finalized) later. If you need both Java inheritance and native resources, you should consider moving the native state to a separate class, and only keep a reference to objects of that class. This way, cloning and serialization issues can be avoided in most cases.
If you create many local references, free them with DeleteLocalRef, or start using PushLocalFrame and PopLocalFrame. Global references must be deallocated with DeleteGlobalRef, otherwise there will be a memory leak, just as with malloc and free.
When throwing exceptions with Throw or ThrowNew, be aware that these functions return regularly. You have to return control manually to the JVM.
The JNIEnv pointer is not necessarily constant during the lifetime of your JNI module. Storing it in a global variable is therefore incorrect. Particularly if you are dealing with callbacks, you may have to store the pointer in a thread-local variable (defined with __thread). It is, however, best to avoid the complexity of calling back into Java code.
Keep in mind that the sizes of C types such as int and long do not necessarily match the Java primitive types (use the jint and jlong types).
sun.misc.Unsafe
The sun.misc.Unsafe class is unportable and contains many functions explicitly designed to break Java memory safety (for performance and debugging). If possible, avoid using this class.
(An example is java.io.FileOutputStream.) Instead, critical functionality is protected by stack inspection: at a security check, the stack is walked from top (most-nested) to bottom. The security check fails if a stack frame for a method is encountered whose class lacks the permission which the security check requires.
When calling System.getProperty(String) or similar methods, catch SecurityException exceptions and treat the property as unset.
The default mode of the virtual machine launcher, java, does not activate the security manager. Therefore, the virtual machine does not enforce any sandboxing restrictions, even if explicitly requested by the code (for example, as described in Section 3.3.3, “Reducing trust in code”).
The -Djava.security.manager option activates the security manager, with the fairly restrictive default policy. With a very permissive policy, most Java code will run unchanged. Assuming the policy in Example 3.5, “Most permissive OpenJDK policy file” has been saved in a file grant-all.policy, this policy can be activated using the option -Djava.security.policy=grant-all.policy (in addition to the -Djava.security.manager option).
Example 3.5. Most permissive OpenJDK policy file

grant {
    permission java.security.AllPermission;
};
Example 3.6. Using the security manager to run code with reduced privileges

Permissions permissions = new Permissions();
ProtectionDomain protectionDomain =
    new ProtectionDomain(null, permissions);
AccessControlContext context = new AccessControlContext(
    new ProtectionDomain[] { protectionDomain });

// This is expected to succeed.
try (FileInputStream in = new FileInputStream(path)) {
    System.out.format("FileInputStream: %s%n", in);
}

AccessController.doPrivileged(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
        // This code runs with reduced privileges and is
        // expected to fail.
        try (FileInputStream in = new FileInputStream(path)) {
            System.out.format("FileInputStream: %s%n", in);
        }
        return null;
    }
}, context);
The example above adds no permissions at all to the permissions object. If such permissions are necessary, code like the following (which grants read permission on all files in the current directory) can be used:
permissions.add(new FilePermission(
    System.getProperty("user.dir") + "/-", "read"));
Note that the java.security.AccessController.doPrivileged() methods do not enforce any additional restriction if no security manager has been set. Except for a few special cases, the restrictions no longer apply once doPrivileged() has returned, even to objects created by the code which ran with reduced privileges. (This applies to object finalization in particular.)
Untrusted code can try to regain privileges through the java.security.AccessController.doPrivileged() methods, so this mechanism should be considered an additional safety net; it still can be used to prevent unexpected behavior of trusted code. As long as the executed code is not dynamic and came with the original application or library, the sandbox is fairly effective.
Obtaining and passing the context argument in Example 3.6, “Using the security manager to run code with reduced privileges” is extremely important; otherwise, this code would increase privileges instead of reducing them.
The java.security.AccessController.doPrivileged() family of methods provides a controlled backdoor from untrusted to trusted code. The doPrivileged() methods cause the stack inspection to end at their call site: untrusted code further down the call stack becomes invisible to security checks.
// This is expected to fail.
try {
    System.out.println(System.getProperty("user.home"));
} catch (SecurityException e) {
    e.printStackTrace(System.err);
}

AccessController.doPrivileged(new PrivilegedAction<Void>() {
    public Void run() {
        // This should work.
        System.out.println(System.getProperty("user.home"));
        return null;
    }
});
This elevation of privileges is only possible if the code calling doPrivileged() is marked trusted (usually because it is loaded from a trusted class loader).
interface Callback<T> {
    T call(boolean flag);
}

class CallbackInvoker<T> {
    private final AccessControlContext context;
    Callback<T> callback;

    CallbackInvoker(Callback<T> callback) {
        context = AccessController.getContext();
        this.callback = callback;
    }

    public T invoke() {
        // Obtain increased privileges.
        return AccessController.doPrivileged(new PrivilegedAction<T>() {
            @Override
            public T run() {
                // This operation would fail without
                // additional privileges.
                final boolean flag = Boolean.getBoolean("some.property");

                // Restore the original privileges.
                return AccessController.doPrivileged(
                    new PrivilegedAction<T>() {
                        @Override
                        public T run() {
                            return callback.call(flag);
                        }
                    }, context);
            }
        });
    }
}
Native-code extensions and foreign function interfaces such as the ctypes module do not provide memory safety guarantees comparable to the rest of Python. If such functionality is used, the advice in Section 1.1, “The core language” should be followed.
The following functions related to code execution should be avoided for untrusted input:

compile
eval
exec
execfile
If you need to parse integer or floating-point values, use the int and float functions instead of eval. Sandboxing untrusted Python code does not work reliably.
The rexec Python module cannot safely sandbox untrusted code and should not be used. The standard CPython implementation is not suitable for sandboxing.
The subprocess module can be used to write scripts which are almost as concise as shell scripts when it comes to invoking external programs, and Python offers richer data structures, with less arcane syntax and more consistent behavior.
Variable expansions, “$variable” or “${variable}”, should always be enclosed in double quotes:

external-program "$arg1" "$arg2"

Otherwise, the values undergo word splitting according to the IFS variable. This may allow the injection of additional options which are then processed by external-program.
Shell commands can be constructed dynamically and executed with the eval built-in command, or by invoking a subshell with “bash -c”. These constructs should not be used.
The expression in “$((expression))” is evaluated. This construct is called arithmetic expansion. “$[expression]” is a deprecated syntax with the same effect.
The arguments to the let shell built-in are evaluated. “((expression))” is an alternative syntax for “let expression”.
The conditional expression “[[…]]” can trigger arithmetic evaluation if certain operators such as -eq are used. (The test built-in does not perform arithmetic evaluation, even with integer operators such as -eq.) The conditional expression “[[ $variable =~ regexp ]]” can be used for input validation, assuming that regexp is a constant regular expression. See Section 5.5, “Performing input validation”.
Certain parameter expansions, “${variable[expression]}” (array indexing) or “${variable:expression}” (string slicing), trigger arithmetic evaluation of expression.
In array element assignments, “[subscript]=expression” triggers evaluation of subscript, but not expression.
The three expressions in the arithmetic for command, “for ((expression1; expression2; expression3)); do commands; done”, are evaluated. This does not apply to the regular for command, “for variable in list; do commands; done”.
declare -i integer_variable
declare -a array_variable
declare -A assoc_array_variable
typeset -i integer_variable
typeset -a array_variable
typeset -A assoc_array_variable
local -i integer_variable
local -a array_variable
local -A assoc_array_variable
readonly -i integer_variable
readonly -a array_variable
readonly -A assoc_array_variable
array_variable=(1 2 3 4)
mapfile) can implicitly create array variables.
export -f or declare -f).
module::function”.
IFS variable to tokenize strings.
-”, it may be interpreted by the external command as an option. Depending on the external program, a “--” argument stops option processing and treats all following arguments as positional parameters. (Double quotes are completely invisible to the command being invoked, so they do not prevent variable values from being interpreted as options.)
env -i” command with an additional parameter which indicates the environment has been cleared and suppresses a further self-execution. Alternatively, individual commands can be executed with “env -i”.
set -e” may be sufficient. This causes the script to stop on the first failed command. However, failures in pipes (“command1 | command2”) are only detected for the last command in the pipe; errors in previous commands are ignored. This can be changed by invoking “set -o pipefail”. Due to architectural limitations, only the process that spawned the entire pipe can check for failures in individual commands; it is not possible for a process to tell if the process feeding data (or the process consuming data) exited normally or with an error.
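The interaction of these two settings can be sketched as follows (a minimal bash example; the echoed messages are illustrative only):

```shell
#!/bin/bash
# Stop on the first failed command.
set -e
# Also report failures of any command in a pipe, not just the last one.
set -o pipefail

# Without pipefail, this pipe would appear to succeed because
# "cat" (the last command in the pipe) exits with status 0.
if false | cat; then
    echo "pipe reported success"
else
    echo "pipe reported failure"
fi
# prints "pipe reported failure"
```

Note that commands used as `if` conditions are exempt from `set -e` termination, which is why the failed pipe can be examined here.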
mktemp command, and temporary directories with “mktemp -d”.
tmpfile="$(mktemp)"
cleanup () {
    rm -f -- "$tmpfile"
}
trap cleanup 0
$value” is an integer. This construct is specific to bash and not portable to POSIX shells.
if [[ $value =~ ^-?[0-9]+$ ]] ; then
echo value is an integer
else
echo "value is not an integer" 1>&2
exit 1
fi
case statements for input validation is also possible and supported by other (POSIX) shells, but the pattern language is more restrictive, and it can be difficult to write suitable patterns.
expr external command can give misleading results (e.g., if the value being checked contains operators itself) and should not be used.
main function, and invoking the main function at the end of the script, using this syntax:
main "$@" ; exit $?
main function, instead of opening the script file and trying to read more commands.
GOMAXPROCS is not larger than 1). The reason is that interface values and slices consist of multiple words and are not updated atomically. Another thread of execution can observe an inconsistent pairing between type information and stored value (for interfaces) or pointer and length (for slices), and such inconsistency can lead to a memory safety violation.
unsafe package (or other packages which expose unsafe constructs) is memory-safe. For example, invalid casts and out-of-range subscripting cause panics at run time.
error to signal error.
io.Writer).
nil value, handling any encountered error. See Example 6.1, “Regular error handling in Go” for details.
type Processor interface {
Process(buf []byte) (message string, err error)
}
type ErrorHandler interface {
Handle(err error)
}
func RegularError(buf []byte, processor Processor,
handler ErrorHandler) (message string, err error) {
message, err = processor.Process(buf)
if err != nil {
handler.Handle(err)
return "", err
}
return
}
io.Reader, io.ReaderAt and related interfaces, it is necessary to check for a non-zero number of read bytes first, as shown in Example 6.2, “Read error handling in Go”. If this pattern is not followed, data loss may occur. This is due to the fact that the io.Reader interface permits returning both data and an error at the same time.
func IOError(r io.Reader, buf []byte, processor Processor,
handler ErrorHandler) (message string, err error) {
n, err := r.Read(buf)
// First check for available data.
if n > 0 {
message, err = processor.Process(buf[0:n])
// Regular error handling.
if err != nil {
handler.Handle(err)
return "", err
}
}
// Then handle any error.
if err != nil {
handler.Handle(err)
return "", err
}
return
}
encoding hierarchy provide support for serialization and deserialization. The usual caveats apply (see Chapter 13, Serialization and Deserialization).
Unmarshal and Decode functions should only be used with fresh values in the interface{} argument. This is due to the way defaults for missing values are implemented: during deserialization, missing values do not result in an error, but the original value is preserved. Using a fresh value (with suitable default values if necessary) ensures that data from a previous deserialization operation does not leak into the current one. This is especially relevant when structs are deserialized.
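The effect can be demonstrated with encoding/json (the Config struct and field names below are a hypothetical example):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical struct used for deserialization.
type Config struct {
	Host string
	Port int
}

// decode unmarshals data into dst, which may be reused or fresh.
func decode(data string, dst *Config) {
	json.Unmarshal([]byte(data), dst)
}

func main() {
	reused := &Config{}
	decode(`{"Host":"a.example","Port":443}`, reused)
	// "Port" is missing in the second document, so the stale
	// value 443 from the first decode is silently preserved.
	decode(`{"Host":"b.example"}`, reused)
	fmt.Println(reused.Port) // prints 443: data leaked from the first decode

	// A fresh value avoids the leak.
	fresh := &Config{}
	decode(`{"Host":"b.example"}`, fresh)
	fmt.Println(fresh.Port) // prints 0
}
```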
substring method on strings (the string class in the glib-2.0 package) are not range-checked. It is the responsibility of the calling code to ensure that the arguments being passed are valid. This applies even to cases (like substring) where the implementation would have range information to check the validity of indexes. See Section 1.1.2, “Recommendations for pointers and array handling”.
GObject values. For plain C pointers (such as strings), the programmer has to ensure that storage is deallocated once it is no longer needed (to avoid memory leaks), and that storage is not being deallocated while it is still being used (see Section 1.3.1.1, “Use-after-free errors”).
Table of Contents
pthread_mutex_lock and pthread_mutex_unlock functions without linking against -lpthread because the system provides stubs for non-threaded processes.
fork, these locks should be acquired and released in helpers registered with pthread_atfork. This function is not available without -lpthread, so you need to use dlsym or a weak symbol to obtain its address.
fork protection for other reasons, you should store the process ID and compare it to the value returned by getpid each time you access the global state. (getpid is not implemented as a system call and is fast.) If the value changes, you know that you have to re-create the state object. (This needs to be combined with locking, of course.)
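A sketch of this PID comparison, with a hypothetical state struct (in a multi-threaded program, callers would additionally have to hold a lock around this function):

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical global state which must not be shared across fork. */
struct state {
    pid_t owner;  /* process that created this state */
    /* ... actual state members would follow ... */
};

static struct state *global_state;

/* Returns the per-process state, re-creating it after a fork. */
static struct state *
get_state(void)
{
    pid_t pid = getpid();
    if (global_state != NULL && global_state->owner != pid) {
        /* We are in a forked child; the state belongs to the parent. */
        free(global_state);
        global_state = NULL;
    }
    if (global_state == NULL) {
        global_state = calloc(1, sizeof(*global_state));
        if (global_state != NULL)
            global_state->owner = pid;
    }
    return global_state;
}
```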
struct. In C++, the handle can be a pointer to an abstract base class, or it can be hidden using the pointer-to-implementation idiom.
final classes in Java). Classes which are not designed for inheritance and are used as base classes nevertheless create potential maintenance hazards because it is difficult to predict how client code will react when calls to virtual methods are added, reordered or removed.
std::function objects should be used for callbacks.
void *, the value of which can be specified by client code. If possible, the value of the closure parameter should be provided by client code at the same time a specific callback is registered (or specified as a function argument). If a single closure parameter is shared by multiple callbacks, flexibility is greatly reduced, and conflicts between different pieces of client code using the same library object could be unresolvable. In some cases, it makes sense to provide a de-registration callback which can be used to destroy the closure parameter when the callback is no longer used.
longjmp. If possible, all library objects should remain in a valid state. (All further operations on them can fail, but it should be possible to deallocate them without causing resource leaks.)
fcntl locks behave in surprising ways, not just in a multi-threaded environment)
FILE *-based functions found in <stdio.h>, and all the file and network communication facilities provided by the Python and Java environments are eventually implemented in terms of them.
select limit”), and the kernel resources are not freed. Therefore, it is important to close all descriptors at the earliest possible point in time, but not earlier.
close system call is always successful in the sense that the passed file descriptor is never valid after the function has been called. However, close still can return an error, for example if there was a file system failure. But this error is not very useful because the absence of an error does not mean that all caches have been emptied and previous writes have been made durable. Programs which need such guarantees must open files with O_SYNC or use fsync or fdatasync, and may also have to fsync the directory containing the file.
socketpair, close one of the descriptors, and call shutdown(fd, SHUT_RDWR) on the other.
close on the descriptor. Instead, the program uses dup2 to replace the descriptor to be closed with the dummy descriptor created earlier. This way, the kernel will not reuse the descriptor, but it will carry out all other steps associated with closing a descriptor (for instance, if the descriptor refers to a stream socket, the peer will be notified).
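A sketch of this replacement step; dummy_fd is assumed to be the descriptor prepared as described above:

```c
#include <unistd.h>

/* Replace fd with dummy_fd instead of closing it outright.  The
   kernel carries out the usual steps associated with closing fd
   (for a stream socket, the peer is notified), but the descriptor
   number stays allocated and cannot be reused concurrently. */
int
close_without_reuse(int fd, int dummy_fd)
{
    if (dup2(dummy_fd, fd) < 0)
        return -1;
    return 0;
}
```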
SO_LINGER socket option alters the behavior of close, so that it will return only after the lingering data has been processed, either by sending it to the peer successfully, or by discarding it after the configured timeout. However, there is no interface which could perform this operation in the background, so a separate userspace thread is needed for each close call, causing scalability issues.
connlimit match type in particular) and specialized filtering devices for denial-of-service network traffic.
TIME_WAIT state commonly seen in netstat output. The kernel automatically expires such sockets if necessary.
fork share the initial set of file descriptors with their parent process. By default, file descriptors are also preserved if a new process image is created with execve (or any of the other functions such as system or posix_spawn).
FD_CLOEXEC flag, using F_GETFD and F_SETFD operations of the fcntl function.
FD_CLOEXEC was set. Therefore, many system calls which create descriptors (such as open and openat) now accept the O_CLOEXEC flag (SOCK_CLOEXEC for socket and socketpair), which causes the FD_CLOEXEC flag to be set for the file descriptor in an atomic fashion. In addition, a few new system calls were introduced, such as pipe2 and dup3.
fork, but before creating a new process image with execve, all file descriptors which the child process will not need are closed.
3 to 255 and later 1023. But this is only an approximation because it is possible to create file descriptors outside this range easily (see Section 9.3, “Dealing with the select limit”). Another approach reads /proc/self/fd and closes the unexpected descriptors listed there, but this approach is much slower.
select limit. The select function only supports a maximum of FD_SETSIZE file descriptors (that is, the maximum permitted value for a file descriptor is FD_SETSIZE - 1, usually 1023). If a process opens many files, descriptors may exceed such limits. It is impossible to query such descriptors using select.
select, at least one of them needs to be changed. Calls to select can be replaced with calls to poll or another event handling mechanism. Replacing the select function is the recommended approach.
FD_SETSIZE limit using the following procedure.
fd as usual, preferably with the O_CLOEXEC flag.
fd, invoke:
int newfd = fcntl(fd, F_DUPFD_CLOEXEC, (long)FD_SETSIZE);
newfd result is non-negative, otherwise close fd and report an error, and return.
fd and continue to use newfd.
FD_SETSIZE. Even though this algorithm is racy in the sense that the FD_SETSIZE first descriptors could fill up, a very high degree of physical parallelism is required before this becomes a problem.
readdir.
O_NONBLOCK flag is specified.
O_NOFOLLOW and AT_SYMLINK_NOFOLLOW variants of system calls only affect the final path name component.
O_CREAT and O_EXCL flags, so that creating the file will fail if it already exists. This guards against the unexpected appearance of file names, either due to creation of a new file, or hard-linking of an existing file. In multi-threaded programs, rather than manipulating the umask, create the files with mode 000 if possible, and adjust it afterwards with fchmod.
at” variants of system calls have to be used (that is, functions like openat, fchownat, fchmodat, and unlinkat, together with O_NOFOLLOW or AT_SYMLINK_NOFOLLOW). Path names passed to these functions must have just a single component (that is, without a slash). When descending, the descriptors of parent directories must be kept open. The missing opendirat function can be emulated with openat (with an O_DIRECTORY flag, to avoid opening special files with side effects), followed by fdopendir.
at” functions are not available, it is possible to emulate them by changing the current directory. (Obviously, this only works if the process is not multi-threaded.) fchdir has to be used to change the current directory, and the descriptors of the parent directories have to be kept open, just as with the “at”-based approach. chdir("..") is unsafe because it might ascend outside the intended directory tree.
at” function emulation is currently required when manipulating extended attributes. In this case, the lsetxattr function can be used, with a relative path name consisting of a single component. This also applies to SELinux contexts and the lsetfilecon function.
fchmodat and fchownat affect files whose link count is greater than one. But opening the files, checking that the link count is one with fstat, and using fchmod and fchown on the file descriptor may have unwanted side effects, due to item 2 above. When creating directories, it is therefore important to change the ownership and permissions only after it has been fully created. Until that point, file names are stable, and no files with unexpected hard links can be introduced.
readdir. Concurrent modification of the directory can result in a list of files being returned which never actually existed on disk.
unlinkat without further checks because deletion only affects the name within the directory tree being processed.
stat system call.
setfsuid and setfsgid. (These functions are preferred over seteuid and setegid because they do not allow the impersonated user to send signals to the process.) These functions are not thread safe. In multi-threaded processes, these operations need to be performed in a single-threaded child process. Unexpected blocking may occur as well.
PATH_MAX, NAME_MAX. However, on most systems, the length of canonical path names (absolute path names with all symbolic links resolved, as returned by realpath or canonicalize_file_name) can exceed PATH_MAX bytes, and individual file name components can be longer than NAME_MAX. This is also true of the _PC_PATH_MAX and _PC_NAME_MAX values returned by pathconf, and the f_namemax member of struct statvfs. Therefore, these constants should not be used. This is also the reason why the readdir_r function should never be used (instead, use readdir).
readdir might still result in a different byte sequence.
ioctl support (even fairly generic functionality such as FIEMAP for discovering physical file layout and holes) is file-system-specific.
rename can fail (even when stat indicates that the source and target directories are located on the same file system). This system call should work if the old and new paths are located in the same directory, though.
stat/fstat interface, even if stored by the file system.
statvfs and fstatvfs functions allow programs to examine the number of available blocks and inodes, through the members f_bfree, f_bavail, f_ffree, and f_favail of struct statvfs. Some file systems return fictional values in the f_ffree and f_favail fields, so the only reliable way to discover if the file system still has space for a file is to try to create it. The f_bfree field should be reasonably accurate, though.
/tmp race condition). This is tricky because traditionally, system-wide temporary directories shared by all users are used.
tmpfs for storing temporary files, to increase performance and decrease wear on Flash storage. As a result, spooling data to temporary files does not result in any memory savings, and the related complexity can be avoided if the data is kept in process memory.
secure_getenv to obtain the value of the TMPDIR environment variable. If it is set, convert the path to a fully-resolved absolute path, using realpath(path, NULL). Check if the new path refers to a directory and is writeable. In this case, use it as the temporary directory.
/tmp.
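A sketch of this lookup procedure (glibc-specific because of secure_getenv; the function name is an assumption):

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns a malloc'ed temporary directory path: the fully
   resolved $TMPDIR if it is trusted, a directory, and writable;
   otherwise the /tmp fallback. */
char *
temporary_directory(void)
{
    const char *tmpdir = secure_getenv("TMPDIR");
    if (tmpdir != NULL) {
        /* Convert to a fully-resolved absolute path. */
        char *resolved = realpath(tmpdir, NULL);
        if (resolved != NULL) {
            struct stat st;
            if (stat(resolved, &st) == 0 && S_ISDIR(st.st_mode)
                && access(resolved, W_OK) == 0)
                return resolved;
            free(resolved);
        }
    }
    return strdup("/tmp");
}
```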
tempfile.tempdir variable.
java.lang.System.getenv(String) method to obtain the value of the TMPDIR environment variable, and follow the two steps described above. (Java's default directory selection does not honor TMPDIR.)
mkostemp function creates a named temporary file. You should specify the O_CLOEXEC flag to avoid file descriptor leaks to subprocesses. (Applications which do not use multiple threads can also use mkstemp, but libraries should use mkostemp.) For determining the directory part of the file name pattern, see Section 11.1, “Obtaining the location of temporary directory”.
mkostemp multiple times. Do not create additional file names derived from the name provided by a previous mkostemp call. However, it is safe to close the descriptor returned by mkostemp and reopen the file using the generated name.
tempfile.NamedTemporaryFile provides similar functionality, except that the file is deleted automatically by default. Note that you may have to use the file attribute to obtain the actual file object because some programming interfaces cannot deal with file-like objects. The C function mkostemp is also available as tempfile.mkstemp.
java.io.File.createTempFile(String, String, File) function, using the temporary file location determined according to Section 11.1, “Obtaining the location of temporary directory”. Do not use java.io.File.deleteOnExit() to delete temporary files, and do not register a shutdown hook for each temporary file you create. In both cases, the deletion hint cannot be removed from the system if you delete the temporary file prior to termination of the VM, causing a memory leak.
tmpfile function creates a temporary file and immediately deletes it, while keeping the file open. As a result, the file lacks a name and its space is deallocated as soon as the file descriptor is closed (including the implicit close when the process terminates). This avoids cluttering the temporary directory with orphaned files.
fmemopen function can be used to create a FILE * object which is backed by memory.
tempfile.TemporaryFile class, and the tempfile.SpooledTemporaryFile class provides a way to avoid creation of small temporary files.
mkdtemp function can be used to create a temporary directory. (For determining the directory part of the file name pattern, see Section 11.1, “Obtaining the location of temporary directory”.) The directory is not automatically removed. In Python, this function is available as tempfile.mkdtemp. In Java 7, temporary directories can be created using the java.nio.file.Files.createTempDirectory(Path, String, FileAttribute...) function.
-rf and -- options.
PATH must be obtained in a secure manner (see Section 12.3.1, “Accessing environment variables”). If the PATH variable is not set or untrusted, the safe default /bin:/usr/bin must be used.
system should not be used. The posix_spawn function can be used instead, or a combination of fork and execve. (In some cases, it may be preferable to use vfork or the Linux-specific clone system call instead of fork.)
subprocess module bypasses the shell by default (when the shell keyword argument is not set to true). os.system should not be used.
envp argument to posix_spawn or execve. The functions setenv, unsetenv and putenv should not be used. They are not thread-safe and suffer from memory leaks.
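A sketch using posix_spawn with an explicitly constructed environment (the /bin/true invocation and the minimal PATH value are illustrative assumptions):

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn /bin/true with a minimal, explicitly constructed
   environment, instead of using system() or mutating the
   process environment with setenv/putenv. */
int
run_with_clean_env(void)
{
    char *argv[] = { "true", NULL };
    char *envp[] = { "PATH=/bin:/usr/bin", NULL };
    pid_t pid;
    int status;

    if (posix_spawn(&pid, "/bin/true", NULL, NULL, argv, envp) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```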
dict for the env argument of the subprocess.Popen constructor. The Java class java.lang.ProcessBuilder provides an environment() method, which returns a map that can be manipulated.
PATH should be initialized to /bin:/usr/bin.
USER and HOME can be inherited from the parent process environment, or they can be initialized from the pwent structure for the user.
DISPLAY and XAUTHORITY variables should be passed to the subprocess if it is an X program. Note that this will typically not work across trust boundaries because XAUTHORITY refers to a file with 0600 permissions.
LANG, LANGUAGE, LC_ADDRESS, LC_ALL, LC_COLLATE, LC_CTYPE, LC_IDENTIFICATION, LC_MEASUREMENT, LC_MESSAGES, LC_MONETARY, LC_NAME, LC_NUMERIC, LC_PAPER, LC_TELEPHONE and LC_TIME can be passed to the subprocess if present.
NUL characters because the system APIs will silently truncate argument strings at the first NUL character.
getopt_long. This convention is widely used, but it is just a convention, and individual programs might interpret a command line in a different way.
--option-name=VALUE syntax, placing the option and its value into the same command line argument. This avoids any potential confusion if the data starts with -.
-- marker after the last option, and include the data at the right position. The -- marker terminates option processing, and the data will not be treated as an option even if it starts with a dash.
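For example, with grep (the data value is hypothetical):

```shell
# "$data" is untrusted and happens to start with a dash.  Without
# the "--" marker, grep would parse it as its -v option; with the
# marker, it is treated as a fixed-string search pattern.
data="-v"
printf '%s\n' "-v" "other" | grep -F -- "$data"
# prints "-v"
```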
<defunct> by ps) is kept around until the status information is collected (reaped) by the parent process. Over the years, several interfaces for this have been invented:
wait, waitpid, waitid, wait3 or wait4, without specifying a process ID. This will deliver any matching process ID. This approach is typically used from within event loops.
waitpid, waitid, or wait4, with a specific process ID. Only data for the specific process ID is returned. This is typically used in code which spawns a single subprocess in a synchronous manner.
SIGCHLD signal, using sigaction, and specifies the SA_NOCLDWAIT flag. This approach could be used by event loops as well.
waitpid or waitid, and hope that the status is not collected by an event loop first.
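The synchronous variant with a specific process ID can be sketched as follows, retrying on EINTR (the function name is an assumption):

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Collect the exit status of a specific child process, restarting
   waitpid if it is interrupted by a signal.  Returns the raw
   status value, or -1 on error. */
int
reap_child(pid_t pid)
{
    int status;
    for (;;) {
        if (waitpid(pid, &status, 0) == pid)
            return status;
        if (errno != EINTR)
            return -1;
    }
}
```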
SUID/SGID processes. The SUID file permission bit indicates that an executable should run with the effective user ID equal to the owner of the executable file. Similarly, with the SGID bit, the effective group ID is set to the group of the executable file.
-D_GNU_SOURCE. The Autoconf macro AC_GNU_SOURCE ensures this.
secure_getenv and __secure_getenv functions. The Autoconf directive AC_CHECK_FUNCS([__secure_getenv secure_getenv]) performs these checks.
secure_getenv function. See Example 12.1, “Obtaining a definition for secure_getenv”.
secure_getenv instead of getenv to obtain the value of critical environment variables. secure_getenv will pretend the variable has not been set if the process environment is not trusted.
secure_getenv or __secure_getenv function is available from GNU libc.
secure_getenv
#include <stdlib.h>
#ifndef HAVE_SECURE_GETENV
# ifdef HAVE__SECURE_GETENV
# define secure_getenv __secure_getenv
# else
# error neither secure_getenv nor __secure_getenv are available
# endif
#endif
setsid. The parent process can simply exit (using _exit, to avoid running clean-up actions twice).
/dev/null. Logging should be redirected to syslog.
umask(0). This is risky because it often leads to world-writable files and directories, resulting in security vulnerabilities such as arbitrary process termination by untrusted local users, or log file truncation. If the umask needs setting, a restrictive value such as 027 or 077 is recommended.
-c option).
fork as a primitive for parallelism. A call to fork which is not immediately followed by a call to execve (perhaps after rearranging and closing file descriptors) is typically unsafe, especially from a library which does not control the state of the entire process. Such use of fork should be replaced with proper child processes or threads.
unserialize)
eval function to parse JSON objects in JavaScript; even with the regular expression filter from RFC 4627, there are still information leaks remaining. JSON-based formats can still turn out risky if they serve as an encoding form for any of the serialization frameworks listed above.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<!ENTITY sys SYSTEM "http://www.example.com/ent.xml">
<!ENTITY pub PUBLIC "-//Example//Public Entity//EN" "http://www.example.com/pub-ent.xml">
<!NOTATION not SYSTEM "../not.xml">
xsd: prefix) directly support textual regular expressions which are not required to be deterministic.
// Stop the parser when an entity declaration is encountered.
static void
EntityDeclHandler(void *userData,
const XML_Char *entityName, int is_parameter_entity,
const XML_Char *value, int value_length,
const XML_Char *base, const XML_Char *systemId,
const XML_Char *publicId, const XML_Char *notationName)
{
XML_StopParser((XML_Parser)userData, XML_FALSE);
}
XML_Parser object is created (Example 13.2, “Creating an Expat XML parser”).
XML_Parser parser = XML_ParserCreate("UTF-8");
if (parser == NULL) {
fprintf(stderr, "XML_ParserCreate failed\n");
close(fd);
exit(1);
}
// EntityDeclHandler needs a reference to the parser to stop
// parsing.
XML_SetUserData(parser, parser);
// Disable entity processing, to inhibit entity expansion.
XML_SetEntityDeclHandler(parser, EntityDeclHandler);
XML_StartDoctypeDeclHandler handler installed with XML_SetDoctypeDeclHandler.
QXmlDeclHandler and QXmlSimpleReader subclasses are needed. It is not possible to use the QDomDocument::setContent(const QByteArray &) convenience methods.
class NoEntityHandler : public QXmlDeclHandler {
public:
bool attributeDecl(const QString&, const QString&, const QString&,
const QString&, const QString&);
bool internalEntityDecl(const QString&, const QString&);
bool externalEntityDecl(const QString&, const QString&,
const QString&);
QString errorString() const;
};
bool
NoEntityHandler::attributeDecl
(const QString&, const QString&, const QString&, const QString&,
const QString&)
{
return false;
}
bool
NoEntityHandler::internalEntityDecl(const QString&, const QString&)
{
return false;
}
bool
NoEntityHandler::externalEntityDecl(const QString&, const QString&, const
QString&)
{
return false;
}
QString
NoEntityHandler::errorString() const
{
return "XML declaration not permitted";
}
QXmlReader subclass in Example 13.4, “A QtXml XML reader which blocks entity processing”. Some parts of QtXml will call the setDeclHandler(QXmlDeclHandler *) method. Consequently, we prevent overriding our custom handler by providing a definition of this method which does nothing. In the constructor, we activate namespace processing; this part may need adjusting.
class NoEntityReader : public QXmlSimpleReader {
NoEntityHandler handler;
public:
NoEntityReader();
void setDeclHandler(QXmlDeclHandler *);
};
NoEntityReader::NoEntityReader()
{
QXmlSimpleReader::setDeclHandler(&handler);
setFeature("http://xml.org/sax/features/namespaces", true);
setFeature("http://xml.org/sax/features/namespace-prefixes", false);
}
void
NoEntityReader::setDeclHandler(QXmlDeclHandler *)
{
// Ignore the handler which was passed in.
}
NoEntityReader class can be used with one of the overloaded QDomDocument::setContent methods. Example 13.5, “Parsing an XML document with QDomDocument, without entity expansion” shows how the buffer object (of type QByteArray) is wrapped as a QXmlInputSource. After calling the setContent method, you should check the return value and report any error.
NoEntityReader reader;
QBuffer buffer(&data);
buffer.open(QIODevice::ReadOnly);
QXmlInputSource source(&buffer);
QDomDocument doc;
QString errorMsg;
int errorLine;
int errorColumn;
bool okay = doc.setContent
(&source, &reader, &errorMsg, &errorLine, &errorColumn);
javax.xml.XMLConstants.FEATURE_SECURE_PROCESSING, which enforces heuristic restrictions on the number of entity expansions. Note that this flag alone does not prevent resolution of external references (system IDs or public IDs), so it is slightly misnamed.
class NoEntityResolver implements EntityResolver {
@Override
public InputSource resolveEntity(String publicId, String systemId)
throws SAXException, IOException {
// Throwing an exception stops validation.
throw new IOException(String.format(
"attempt to resolve \"%s\" \"%s\"", publicId, systemId));
}
}
class NoResourceResolver implements LSResourceResolver {
@Override
public LSInput resolveResource(String type, String namespaceURI,
String publicId, String systemId, String baseURI) {
// Throwing an exception stops validation.
throw new RuntimeException(String.format(
"resolution attempt: type=%s namespace=%s " +
"publicId=%s systemId=%s baseURI=%s",
type, namespaceURI, publicId, systemId, baseURI));
}
}
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
org.w3c.dom.Document object from an input stream. Example 13.9, “DOM-based XML parsing in OpenJDK” uses the data from the java.io.InputStream instance in the inputStream variable.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// Impose restrictions on the complexity of the DTD.
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
// Turn on validation.
// This step can be omitted if validation is not desired.
factory.setValidating(true);
// Parse the document.
DocumentBuilder builder = factory.newDocumentBuilder();
builder.setEntityResolver(new NoEntityResolver());
builder.setErrorHandler(new Errors());
Document document = builder.parse(inputStream);
NoEntityResolver class in Example 13.6, “Helper class to prevent DTD external entity resolution in OpenJDK”. Because external DTD references are prohibited, DTD validation (if enabled) will only happen against the internal DTD subset embedded in the XML document.
javax.xml.transform.Transformer class to add the DTD reference to the document, and an entity resolver which whitelists this external reference.
java.io.InputStream in the inputStream variable.
SchemaFactory factory = SchemaFactory.newInstance(
XMLConstants.W3C_XML_SCHEMA_NS_URI);
// This enables restrictions on the schema and document
// complexity.
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
// This prevents resource resolution by the schema itself.
// If the schema is trusted and references additional files,
// this line must be omitted, otherwise loading these files
// will fail.
factory.setResourceResolver(new NoResourceResolver());
Schema schema = factory.newSchema(schemaFile);
Validator validator = schema.newValidator();
// This prevents external resource resolution.
validator.setResourceResolver(new NoResourceResolver());
validator.validate(new SAXSource(new InputSource(inputStream)));
NoResourceResolver class is defined in Example 13.7, “Helper class to prevent schema resolution in OpenJDK”.
org.w3c.dom.Document instance document.
SchemaFactory factory = SchemaFactory.newInstance(
XMLConstants.W3C_XML_SCHEMA_NS_URI);
// This enables restrictions on schema complexity.
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
// The following line prevents resource resolution
// by the schema itself.
factory.setResourceResolver(new NoResourceResolver());
Schema schema = factory.newSchema(schemaFile);
Validator validator = schema.newValidator();
// This prevents external resource resolution.
validator.setResourceResolver(new NoResourceResolver());
validator.validate(new DOMSource(document));
struct.
PK11_GenerateRandom in the NSS library (usable for high data rates)
RAND_bytes in the OpenSSL library (usable for high data rates)
gnutls_rnd in GNUTLS, with GNUTLS_RND_RANDOM as the first argument (usable for high data rates)
os.urandom in Python
/dev/urandom character device
/dev/urandom are suitable for high data rates because they do not deplete the system-wide entropy pool.
RAND_bytes and PK11_GenerateRandom have three-state return values (with conflicting meanings). Careful error checking is required. Please review the documentation when using these functions.
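The /dev/urandom approach can be sketched in C as follows. The helper name get_random_bytes is an assumption for illustration, not part of any library; the loop handles short reads and EINTR, which a robust reader must do.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: fills buf with len random bytes from
// /dev/urandom. Returns 0 on success, -1 on error (errno is set).
int
get_random_bytes(unsigned char *buf, size_t len)
{
	int fd = open("/dev/urandom", O_RDONLY);
	if (fd < 0) {
		return -1;
	}
	size_t filled = 0;
	while (filled < len) {
		ssize_t ret = read(fd, buf + filled, len - filled);
		if (ret < 0) {
			if (errno == EINTR) {
				continue; // Interrupted by a signal; retry.
			}
			int errno_old = errno;
			close(fd);
			errno = errno_old;
			return -1;
		}
		if (ret == 0) {
			// Unexpected end of file; treat as an error.
			close(fd);
			errno = EIO;
			return -1;
		}
		filled += (size_t)ret;
	}
	close(fd);
	return 0;
}
```

For high data rates, keeping the file descriptor open across calls avoids the repeated open overhead.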
openssl genrsa uses, but does not require, entropy from a physical source of randomness, among other things.) Such keys should be stored in a hardware security module if possible, and generated from random bits reserved for this purpose and derived from a non-deterministic physical source.
# Name of the user owning the file with the private key
%define tlsuser %{name}
# Name of the directory which contains the key and certificate files
%define tlsdir %{_sysconfdir}/%{name}
%define tlskey %{tlsdir}/%{name}.key
%define tlscert %{tlsdir}/%{name}.crt
%{tlsuser} (not root). To avoid races, if the directory %{tlsdir} is owned by the service user, you should use the code in Example 15.1, “Creating a key pair in a user-owned directory”. The invocation of su with the -s /bin/bash argument is necessary in case the login shell for the user has been disabled.
%post
if [ $1 -eq 1 ] ; then
    if ! test -e %{tlskey} ; then
        su -s /bin/bash \
            -c "umask 077 && openssl genrsa -out %{tlskey} 2048 2>/dev/null" \
            %{tlsuser}
    fi
    if ! test -e %{tlscert} ; then
        cn="Automatically generated certificate for the %{tlsuser} service"
        req_args="-key %{tlskey} -out %{tlscert} -days 7305 -subj \"/CN=$cn/\""
        su -s /bin/bash \
            -c "openssl req -new -x509 -extensions usr_cert $req_args" \
            %{tlsuser}
    fi
fi

%files
%dir %attr(0755,%{tlsuser},%{tlsuser}) %{tlsdir}
%ghost %attr(0600,%{tlsuser},%{tlsuser}) %config(noreplace) %{tlskey}
%ghost %attr(0644,%{tlsuser},%{tlsuser}) %config(noreplace) %{tlscert}
%ghost), or delete the files when the package is uninstalled (the %config(noreplace) part).
%{tlsdir} is owned by root, use the code in Example 15.2, “Creating a key pair in a root-owned directory”.
%post
if [ $1 -eq 1 ] ; then
    if ! test -e %{tlskey} ; then
        (umask 077 && openssl genrsa -out %{tlskey} 2048 2>/dev/null)
        chown %{tlsuser} %{tlskey}
    fi
    if ! test -e %{tlscert} ; then
        cn="Automatically generated certificate for the %{tlsuser} service"
        openssl req -new -x509 -extensions usr_cert \
            -key %{tlskey} -out %{tlscert} -days 7305 -subj "/CN=$cn/"
    fi
fi

%files
%dir %attr(0755,root,root) %{tlsdir}
%ghost %attr(0600,%{tlsuser},%{tlsuser}) %config(noreplace) %{tlskey}
%ghost %attr(0644,root,root) %config(noreplace) %{tlscert}
Requires(pre): on the package which creates the user. This ensures that the user account will exist when it is needed for the su or chown invocation.
random: nonblocking pool is initialized. In theory, it is also possible to read from /dev/random while generating the key material (instead of /dev/urandom), but this can block not just during the boot process, but also much later at run time, and generally results in a poor user experience.
Table of Contents
PF_UNIX protocol family, sometimes called PF_LOCAL) are restricted by file system permissions. If the server socket path is not world-writable, the server identity cannot be spoofed by local users.
root, so if a UDP or TCP server is running on the local host and it uses a trusted port, its identity is assured. (Not all operating systems enforce the trusted ports concept, and the network might not be trusted, so it is only useful on the local system.)
/etc/hosts). IP-based ACLs often use prefix notation to extend access to entire subnets. Name-based ACLs sometimes use wildcards for adding groups of hosts (from entire DNS subtrees). (In the SSH context, host-based authentication means something completely different and is not covered in this section.)
gethostbyaddr and getnameinfo functions cannot be trusted. (DNS PTR records can be set to arbitrary values, not just names belonging to the address owner.) If these names are used for ACL matching, a forward lookup using gethostbyname or getaddrinfo has to be performed. The name is only valid if the original address is found among the results of the forward lookup (double-reverse lookup).
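The double-reverse lookup can be sketched with the POSIX getnameinfo and getaddrinfo functions. The helper names verify_peer_name and same_address are assumptions for illustration; note that only the address bytes are compared, because the forward lookup results carry no meaningful port.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

// Compares only the address portion of two socket addresses,
// ignoring the port.
static bool
same_address(const struct sockaddr *a, const struct sockaddr *b)
{
	if (a->sa_family != b->sa_family) {
		return false;
	}
	switch (a->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
		const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;
		return memcmp(&a4->sin_addr, &b4->sin_addr,
			      sizeof(a4->sin_addr)) == 0;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
		const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;
		return memcmp(&a6->sin6_addr, &b6->sin6_addr,
			      sizeof(a6->sin6_addr)) == 0;
	}
	default:
		return false;
	}
}

// Hypothetical helper: obtains a name for the peer address and
// accepts it only if a forward lookup of that name yields the
// original address again (double-reverse lookup). On success, the
// name is stored in the name buffer.
static bool
verify_peer_name(const struct sockaddr *sa, socklen_t salen,
		 char *name, size_t namelen)
{
	// Reverse lookup; NI_NAMEREQD fails if there is no PTR record.
	if (getnameinfo(sa, salen, name, namelen, NULL, 0, NI_NAMEREQD) != 0) {
		return false;
	}
	// Forward lookup of the name just obtained.
	struct addrinfo hints;
	memset(&hints, 0, sizeof(hints));
	hints.ai_family = sa->sa_family;
	struct addrinfo *result;
	if (getaddrinfo(name, NULL, &hints, &result) != 0) {
		return false;
	}
	// Accept the name only if the original address is among the
	// forward lookup results.
	bool found = false;
	for (const struct addrinfo *ai = result; ai != NULL; ai = ai->ai_next) {
		if (same_address(ai->ai_addr, sa)) {
			found = true;
			break;
		}
	}
	freeaddrinfo(result);
	return found;
}
```

Only a name that passes this check may be fed into name-based ACL matching.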
AF_UNIX or AF_LOCAL) are restricted to the local host and offer a special authentication mechanism: credentials passing.
SO_PEERCRED (Linux) or LOCAL_PEERCRED (FreeBSD) socket options, or the getpeereid function (other BSDs, Mac OS X). These interfaces provide direct access to the (effective) user ID on the other end of a domain socket connection, without cooperation from the other end.
sendmsg and recvmsg functions. On some systems, only credentials data that the peer has explicitly sent can be received, and the kernel checks the data for correctness on the sending side. This means that both peers need to deal with ancillary data. Compared to that, the modern interfaces are easier to use. Both sets of interfaces vary considerably among UNIX-like systems, unfortunately.
getpwuid (or getpwuid_r) and getgrouplist. Using the PID and information from /proc/PID/status is prone to race conditions and insecure.
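A minimal sketch of the modern interface on Linux, using the SO_PEERCRED socket option; get_peer_uid is a hypothetical helper name. The group list would then be derived from the returned user ID via getpwuid_r and getgrouplist, as described above.

```c
#define _GNU_SOURCE // for struct ucred
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns the effective user ID of the peer on
// a connected UNIX domain socket, using the Linux-specific
// SO_PEERCRED option. Returns 0 on success, -1 on error.
static int
get_peer_uid(int fd, uid_t *uid)
{
	struct ucred cred;
	socklen_t len = sizeof(cred);
	if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) != 0) {
		return -1;
	}
	*uid = cred.uid;
	return 0;
}
```

The credentials reflect the state of the peer process at the time it called connect (or socketpair), without any cooperation from the peer.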
AF_NETLINK messages originate from the kernel only if the nl_pid field in the sockaddr_nl structure is 0. (This structure can be obtained using recvfrom or recvmsg; it is different from the nlmsghdr structure.) The kernel does not prevent other processes from sending unicast Netlink messages, but the nl_pid field in the sender's socket address will be non-zero in such cases.
AF_NETLINK sockets as an IPC mechanism among processes, but prefer UNIX domain sockets for such tasks.
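The origin check described above can be sketched as follows, assuming the Linux Netlink headers; recv_from_kernel is a hypothetical helper that rejects any message whose sender nl_pid is non-zero.

```c
#include <linux/netlink.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: receives one Netlink message and checks that
// it originates from the kernel (nl_pid in the sender address must
// be 0). Returns the number of bytes received, or -1 on error or if
// the sender is not the kernel.
static ssize_t
recv_from_kernel(int fd, void *buf, size_t buflen)
{
	struct sockaddr_nl sender;
	socklen_t senderlen = sizeof(sender);
	ssize_t ret = recvfrom(fd, buf, buflen, 0,
			       (struct sockaddr *)&sender, &senderlen);
	if (ret < 0) {
		return -1;
	}
	// Unicast messages from other processes carry a non-zero
	// nl_pid and must be rejected.
	if (senderlen != sizeof(sender) || sender.nl_pid != 0) {
		return -1;
	}
	return ret;
}
```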
TCP_NODELAY socket option (at least for the duration of the handshake), or use the Linux-specific TCP_CORK option.
const int val = 1;
int ret = setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &val, sizeof(val));
if (ret < 0) {
perror("setsockopt(TCP_NODELAY)");
exit(1);
}
close_notify alerts and respond to them. This is especially important if the upper-layer protocol does not provide means to detect connection truncation (like some uses of HTTP).
accept would not. Otherwise, a client which fails to complete the TLS handshake for some reason will prevent the server from handling input from other clients.
fork function calls (see Section 12.6, “fork as a primitive for parallelism”).
int values with the following meaning:
1 indicates success (for example, a successful signature verification).
0 indicates semantic failure (for example, a signature verification which was unsuccessful because the signing certificate was self-signed).
-1 indicates a low-level error in the system, such as failure to allocate memory using malloc.
SSL object has failed. However, there are still cases where no detailed error information is available (e.g., if SSL_shutdown fails due to a connection teardown by the other end).
static void __attribute__((noreturn))
ssl_print_error_and_exit(SSL *ssl, const char *op, int ret)
{
int subcode = SSL_get_error(ssl, ret);
switch (subcode) {
case SSL_ERROR_NONE:
fprintf(stderr, "error: %s: no error to report\n", op);
break;
case SSL_ERROR_WANT_READ:
case SSL_ERROR_WANT_WRITE:
case SSL_ERROR_WANT_X509_LOOKUP:
case SSL_ERROR_WANT_CONNECT:
case SSL_ERROR_WANT_ACCEPT:
fprintf(stderr, "error: %s: invalid blocking state %d\n", op, subcode);
break;
case SSL_ERROR_SSL:
fprintf(stderr, "error: %s: TLS layer problem\n", op);
break;
case SSL_ERROR_SYSCALL:
fprintf(stderr, "error: %s: system call failed: %s\n", op, strerror(errno));
break;
case SSL_ERROR_ZERO_RETURN:
fprintf(stderr, "error: %s: zero return\n", op);
}
exit(1);
}
OPENSSL_config function is documented to never fail. In reality, it can terminate the entire process if there is a failure accessing the configuration file. An error message is written to standard error, but it might not be visible if the function is called from a daemon process.
d2i_ and end in _fp or _bio (e.g., d2i_X509_fp or d2i_X509_bio). These decoders must not be used for parsing data from untrusted sources; instead, the variants without the _fp and _bio (e.g., d2i_X509) shall be used. The BIO variants have received considerably less testing and are not very robust.
openssl x509) are generally less robust than the actual library code. They use the BIO functions internally, and not the more robust variants.
openssl verify result in an exit status of zero.
openssl genrsa, do not ensure that physical entropy is used for key generation—they obtain entropy from /dev/urandom and other sources, but not from /dev/random. This can result in weak keys if the system lacks a proper entropy source (e.g., a virtual machine with solid state storage). Depending on local policies, keys generated by these OpenSSL tools should not be used in high-value, critical functions.
openssl s_client and openssl s_server) are debugging tools and should never be used as generic clients. For instance, the s_client tool reacts in a surprising way to lines starting with R and Q.
libgnutls.so.26 links to libpthread.so.0. Loading the threading library too late causes problems, so the main program should be linked with -lpthread as well. As a result, it can be difficult to use GNUTLS in a plugin which is loaded with the dlopen function. Another side effect is that applications which merely link against GNUTLS (even without actually using it) may incur a substantial overhead because other libraries automatically switch to thread-safe algorithms.
gnutls_global_init function must be called before using any functionality provided by the library. This function is not thread-safe, so external locking is required, but it is not clear which lock should be used. Omitting the synchronization does not just lead to a memory leak, as suggested in the GNUTLS documentation, but to undefined behavior because there is no barrier that would enforce memory ordering.
gnutls_global_deinit function does not actually deallocate all resources allocated by gnutls_global_init. It is currently not thread-safe. Therefore, it is best to avoid calling it altogether.
/dev/random as the randomness source for nonces and other random data which is needed for TLS operation, but does not actually require physical randomness. As a result, TLS applications can block, waiting for more bits to become available in /dev/random.
SSL_ForceHandshake function can succeed, but no TLS handshake takes place, the peer is not authenticated, and subsequent data is exchanged in the clear.
fork after the library has been initialized. This behavior is required by the PKCS#11 API specification.
/etc/ssl/certs or files derived from it.
// The following call prints an error message and calls exit() if
// the OpenSSL configuration file is unreadable.
OPENSSL_config(NULL);
// Provide human-readable error messages.
SSL_load_error_strings();
// Register ciphers.
SSL_library_init();
// Configure a client connection context. Send a handshake for the
// highest supported TLS version, and disable compression.
const SSL_METHOD *const req_method = SSLv23_client_method();
SSL_CTX *const ctx = SSL_CTX_new(req_method);
if (ctx == NULL) {
ERR_print_errors(bio_err);
exit(1);
}
SSL_CTX_set_options(ctx, SSL_OP_NO_SSLv2 | SSL_OP_NO_COMPRESSION);
// Adjust the ciphers list based on a whitelist. First enable all
// ciphers of at least medium strength, to get the list which is
// compiled into OpenSSL.
if (SSL_CTX_set_cipher_list(ctx, "HIGH:MEDIUM") != 1) {
ERR_print_errors(bio_err);
exit(1);
}
{
// Create a dummy SSL session to obtain the cipher list.
SSL *ssl = SSL_new(ctx);
if (ssl == NULL) {
ERR_print_errors(bio_err);
exit(1);
}
STACK_OF(SSL_CIPHER) *active_ciphers = SSL_get_ciphers(ssl);
if (active_ciphers == NULL) {
ERR_print_errors(bio_err);
exit(1);
}
// Whitelist of candidate ciphers.
static const char *const candidates[] = {
"AES128-GCM-SHA256", "AES128-SHA256", "AES256-SHA256", // strong ciphers
"AES128-SHA", "AES256-SHA", // strong ciphers, also in older versions
"RC4-SHA", "RC4-MD5", // backwards compatibility, supposed to be weak
"DES-CBC3-SHA", "DES-CBC3-MD5", // more backwards compatibility
NULL
};
// Actually selected ciphers.
char ciphers[300];
ciphers[0] = '\0';
for (const char *const *c = candidates; *c; ++c) {
for (int i = 0; i < sk_SSL_CIPHER_num(active_ciphers); ++i) {
if (strcmp(SSL_CIPHER_get_name(sk_SSL_CIPHER_value(active_ciphers, i)),
*c) == 0) {
if (*ciphers) {
strcat(ciphers, ":");
}
strcat(ciphers, *c);
break;
}
}
}
SSL_free(ssl);
// Apply final cipher list.
if (SSL_CTX_set_cipher_list(ctx, ciphers) != 1) {
ERR_print_errors(bio_err);
exit(1);
}
}
// Load the set of trusted root certificates.
if (!SSL_CTX_set_default_verify_paths(ctx)) {
ERR_print_errors(bio_err);
exit(1);
}
SSL_CTX object for creating connections concurrently from multiple threads, provided that the SSL_CTX object is not modified (e.g., callbacks must not be changed).
SSL_connect fails, the ssl_print_error_and_exit function from Example 17.2, “Obtaining OpenSSL error codes” is called.
certificate_validity_override function provides an opportunity to override the validity of the certificate in case the OpenSSL check fails. If such functionality is not required, the call can be removed; otherwise, the application developer has to implement it.
SSL_set_tlsext_host_name and X509_check_host must be the name that was passed to getaddrinfo or a similar name resolution function; no host name canonicalization may be performed. The X509_check_host function used in the final step for host name matching is currently only implemented in OpenSSL 1.1, which has not been released yet. If host name matching fails, the function certificate_host_name_override is called. This function should check a user-specific certificate store, to allow a connection even if the host name does not match the certificate. It has to be provided by the application developer. Note that the override must be keyed by both the certificate and the host name.
// Create the connection object.
SSL *ssl = SSL_new(ctx);
if (ssl == NULL) {
ERR_print_errors(bio_err);
exit(1);
}
SSL_set_fd(ssl, sockfd);
// Enable the ServerNameIndication extension
if (!SSL_set_tlsext_host_name(ssl, host)) {
ERR_print_errors(bio_err);
exit(1);
}
// Perform the TLS handshake with the server.
ret = SSL_connect(ssl);
if (ret != 1) {
// Error status can be 0 or negative.
ssl_print_error_and_exit(ssl, "SSL_connect", ret);
}
// Obtain the server certificate.
X509 *peercert = SSL_get_peer_certificate(ssl);
if (peercert == NULL) {
fprintf(stderr, "peer certificate missing\n");
exit(1);
}
// Check the certificate verification result. Allow an explicit
// certificate validation override in case verification fails.
int verifystatus = SSL_get_verify_result(ssl);
if (verifystatus != X509_V_OK && !certificate_validity_override(peercert)) {
fprintf(stderr, "SSL_connect: verify result: %s\n",
X509_verify_cert_error_string(verifystatus));
exit(1);
}
// Check if the server certificate matches the host name used to
// establish the connection.
// FIXME: Currently needs OpenSSL 1.1.
if (X509_check_host(peercert, (const unsigned char *)host, strlen(host),
0) != 1
&& !certificate_host_name_override(peercert, host)) {
fprintf(stderr, "SSL certificate does not match host name\n");
exit(1);
}
X509_free(peercert);
BIO object and use the SSL object as the underlying transport, using BIO_set_ssl.
const char *const req = "GET / HTTP/1.0\r\n\r\n";
ret = SSL_write(ssl, req, strlen(req));
if (ret < 0) {
ssl_print_error_and_exit(ssl, "SSL_write", ret);
}
char buf[4096];
ret = SSL_read(ssl, buf, sizeof(buf));
if (ret < 0) {
ssl_print_error_and_exit(ssl, "SSL_read", ret);
}
SSL_shutdown function needs to be called twice for an orderly, synchronous connection termination (Example 17.7, “Closing an OpenSSL connection in an orderly fashion”). This exchanges close_notify alerts with the server. The additional logic is required to deal with an unexpected close_notify from the server. Note that it is necessary to explicitly close the underlying socket after the connection object has been freed.
// Send the close_notify alert.
ret = SSL_shutdown(ssl);
switch (ret) {
case 1:
// A close_notify alert has already been received.
break;
case 0:
// Wait for the close_notify alert from the peer.
ret = SSL_shutdown(ssl);
switch (ret) {
case 0:
fprintf(stderr, "info: second SSL_shutdown returned zero\n");
break;
case 1:
break;
default:
ssl_print_error_and_exit(ssl, "SSL_shutdown 2", ret);
}
break;
default:
ssl_print_error_and_exit(ssl, "SSL_shutdown 1", ret);
}
SSL_free(ssl);
close(sockfd);
SSL_CTX_free(ctx);
gnutls_global_init();
// Load the trusted CA certificates.
gnutls_certificate_credentials_t cred = NULL;
int ret = gnutls_certificate_allocate_credentials (&cred);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_certificate_allocate_credentials: %s\n",
gnutls_strerror(ret));
exit(1);
}
// gnutls_certificate_set_x509_system_trust needs GNUTLS version 3.0
// or newer, so we hard-code the path to the certificate store
// instead.
static const char ca_bundle[] = "/etc/ssl/certs/ca-bundle.crt";
ret = gnutls_certificate_set_x509_trust_file
(cred, ca_bundle, GNUTLS_X509_FMT_PEM);
if (ret == 0) {
fprintf(stderr, "error: no certificates found in: %s\n", ca_bundle);
exit(1);
}
if (ret < 0) {
fprintf(stderr, "error: gnutls_certificate_set_x509_trust_files(%s): %s\n",
ca_bundle, gnutls_strerror(ret));
exit(1);
}
gnutls_certificate_free_credentials(cred);
NORMAL set of cipher suites and protocols provides a reasonable default. Then the TLS handshake must be initiated. This is shown in Example 17.10, “Establishing a TLS client connection using GNUTLS”.
// Create the session object.
gnutls_session_t session;
ret = gnutls_init(&session, GNUTLS_CLIENT);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_init: %s\n",
gnutls_strerror(ret));
exit(1);
}
// Configure the cipher preferences.
const char *errptr = NULL;
ret = gnutls_priority_set_direct(session, "NORMAL", &errptr);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_priority_set_direct: %s\n"
"error: at: \"%s\"\n", gnutls_strerror(ret), errptr);
exit(1);
}
// Install the trusted certificates.
ret = gnutls_credentials_set(session, GNUTLS_CRD_CERTIFICATE, cred);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_credentials_set: %s\n",
gnutls_strerror(ret));
exit(1);
}
// Associate the socket with the session object and set the server
// name.
gnutls_transport_set_ptr(session, (gnutls_transport_ptr_t)(uintptr_t)sockfd);
ret = gnutls_server_name_set(session, GNUTLS_NAME_DNS,
host, strlen(host));
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_server_name_set: %s\n",
gnutls_strerror(ret));
exit(1);
}
// Establish the session.
ret = gnutls_handshake(session);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_handshake: %s\n",
gnutls_strerror(ret));
exit(1);
}
certificate_validity_override function is called if the verification fails, so that a separate, user-specific trust store can be checked. This function call can be omitted if the functionality is not needed.
// Obtain the server certificate chain. The server certificate
// itself is stored in the first element of the array.
unsigned certslen = 0;
const gnutls_datum_t *const certs =
gnutls_certificate_get_peers(session, &certslen);
if (certs == NULL || certslen == 0) {
fprintf(stderr, "error: could not obtain peer certificate\n");
exit(1);
}
// Validate the certificate chain.
unsigned status = (unsigned)-1;
ret = gnutls_certificate_verify_peers2(session, &status);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_certificate_verify_peers2: %s\n",
gnutls_strerror(ret));
exit(1);
}
if (status != 0 && !certificate_validity_override(certs[0])) {
gnutls_datum_t msg;
#if GNUTLS_VERSION_AT_LEAST_3_1_4
int type = gnutls_certificate_type_get (session);
ret = gnutls_certificate_verification_status_print(status, type, &msg, 0);
#else
ret = -1;
#endif
if (ret == 0) {
fprintf(stderr, "error: %s\n", msg.data);
gnutls_free(msg.data);
exit(1);
} else {
fprintf(stderr, "error: certificate validation failed with code 0x%x\n",
status);
exit(1);
}
}
gnutls_x509_crt_check_hostname). Again, an override function certificate_host_name_override is called. Note that the override must be keyed to the certificate and the host name. The function call can be omitted if the override is not needed.
// Match the peer certificate against the host name.
// We can only obtain a set of DER-encoded certificates from the
// session object, so we have to re-parse the peer certificate into
// a certificate object.
gnutls_x509_crt_t cert;
ret = gnutls_x509_crt_init(&cert);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_x509_crt_init: %s\n",
gnutls_strerror(ret));
exit(1);
}
// The peer certificate is the first certificate in the list.
ret = gnutls_x509_crt_import(cert, certs, GNUTLS_X509_FMT_DER);
if (ret != GNUTLS_E_SUCCESS) {
fprintf(stderr, "error: gnutls_x509_crt_import: %s\n",
gnutls_strerror(ret));
exit(1);
}
ret = gnutls_x509_crt_check_hostname(cert, host);
if (ret == 0 && !certificate_host_name_override(certs[0], host)) {
fprintf(stderr, "error: host name does not match certificate\n");
exit(1);
}
gnutls_x509_crt_deinit(cert);
gnutls_certificate_verify_peers3 function.
char buf[4096];
snprintf(buf, sizeof(buf), "GET / HTTP/1.0\r\nHost: %s\r\n\r\n", host);
ret = gnutls_record_send(session, buf, strlen(buf));
if (ret < 0) {
fprintf(stderr, "error: gnutls_record_send: %s\n", gnutls_strerror(ret));
exit(1);
}
ret = gnutls_record_recv(session, buf, sizeof(buf));
if (ret < 0) {
fprintf(stderr, "error: gnutls_record_recv: %s\n", gnutls_strerror(ret));
exit(1);
}
gnutls_bye function. Finally, the session object can be deallocated using gnutls_deinit (see Example 17.14, “Using a GNUTLS session”).
// Initiate an orderly connection shutdown.
ret = gnutls_bye(session, GNUTLS_SHUT_RDWR);
if (ret < 0) {
fprintf(stderr, "error: gnutls_bye: %s\n", gnutls_strerror(ret));
exit(1);
}
// Free the session object.
gnutls_deinit(session);
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.cert.CertificateEncodingException;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import sun.security.util.HostnameChecker;
sun.security.util.HostnameChecker. (The public OpenJDK API does not provide any support for dissecting the subject distinguished name of an X.509 certificate, so a custom-written DER parser is needed—or we have to use an internal class, which we do below.) In OpenJDK 7, the setEndpointIdentificationAlgorithm method was added to the javax.net.ssl.SSLParameters class, providing an official way to implement host name checking.
SSLContext instance. With a properly configured OpenJDK installation, the SunJSSE provider uses the system-wide set of trusted root certificate authorities, so no further configuration is necessary. For backwards compatibility with OpenJDK 6, the TLSv1 provider has to be supported as a fall-back option. This is shown in Example 17.15, “Setting up an SSLContext for OpenJDK TLS clients”.
SSLContext for OpenJDK TLS clients
// Create the context. Specify the SunJSSE provider to avoid
// picking up third-party providers. Try the TLS 1.2 provider
// first, then fall back to TLS 1.0.
SSLContext ctx;
try {
ctx = SSLContext.getInstance("TLSv1.2", "SunJSSE");
} catch (NoSuchAlgorithmException e) {
try {
ctx = SSLContext.getInstance("TLSv1", "SunJSSE");
} catch (NoSuchAlgorithmException e1) {
// The TLS 1.0 provider should always be available.
throw new AssertionError(e1);
} catch (NoSuchProviderException e1) {
throw new AssertionError(e1);
}
} catch (NoSuchProviderException e) {
// The SunJSSE provider should always be available.
throw new AssertionError(e);
}
ctx.init(null, null, null);
SSLParameters for TLS use with OpenJDK”). Like the context, these parameters can be reused for multiple TLS connections.
SSLParameters for TLS use with OpenJDK
// Prepare TLS parameters. These have to be applied to every TLS
// socket before the handshake is triggered.
SSLParameters params = ctx.getDefaultSSLParameters();
// Do not send an SSL-2.0-compatible Client Hello.
ArrayList<String> protocols = new ArrayList<String>(
Arrays.asList(params.getProtocols()));
protocols.remove("SSLv2Hello");
params.setProtocols(protocols.toArray(new String[protocols.size()]));
// Adjust the supported ciphers.
ArrayList<String> ciphers = new ArrayList<String>(
Arrays.asList(params.getCipherSuites()));
ciphers.retainAll(Arrays.asList(
"TLS_RSA_WITH_AES_128_CBC_SHA256",
"TLS_RSA_WITH_AES_256_CBC_SHA256",
"TLS_RSA_WITH_AES_256_CBC_SHA",
"TLS_RSA_WITH_AES_128_CBC_SHA",
"SSL_RSA_WITH_3DES_EDE_CBC_SHA",
"SSL_RSA_WITH_RC4_128_SHA",
"SSL_RSA_WITH_RC4_128_MD5",
"TLS_EMPTY_RENEGOTIATION_INFO_SCSV"));
params.setCipherSuites(ciphers.toArray(new String[ciphers.size()]));
params.setEndpointIdentificationAlgorithm("HTTPS");
"HTTPS" algorithm. (The algorithms have minor differences with regard to wildcard handling, which should not matter in practice.)
params. (After this point, changes to params will not affect this TLS socket.) As mentioned initially, host name checking requires using an internal API on OpenJDK 6.
// Create the socket and connect it at the TCP layer.
SSLSocket socket = (SSLSocket) ctx.getSocketFactory()
.createSocket(host, port);
// Disable the Nagle algorithm.
socket.setTcpNoDelay(true);
// Adjust ciphers and protocols.
socket.setSSLParameters(params);
// Perform the handshake.
socket.startHandshake();
// Validate the host name. The match() method throws
// CertificateException on failure.
X509Certificate peer = (X509Certificate)
socket.getSession().getPeerCertificates()[0];
// This is the only way to perform host name checking on OpenJDK 6.
HostnameChecker.getInstance(HostnameChecker.TYPE_TLS).match(
host, peer);
setEndpointIdentificationAlgorithm method on the params object (before it was applied to the socket).
socket.getOutputStream().write("GET / HTTP/1.0\r\n\r\n"
.getBytes(Charset.forName("UTF-8")));
byte[] buffer = new byte[4096];
int count = socket.getInputStream().read(buffer);
System.out.write(buffer, 0, count);
TrustManager and SSLContext objects have to be used for different servers.
public class MyTrustManager implements X509TrustManager {
private final byte[] certHash;
public MyTrustManager(byte[] certHash) throws Exception {
this.certHash = certHash;
}
@Override
public void checkClientTrusted(X509Certificate[] chain, String authType)
throws CertificateException {
throw new UnsupportedOperationException();
}
@Override
public void checkServerTrusted(X509Certificate[] chain,
String authType) throws CertificateException {
byte[] digest = getCertificateDigest(chain[0]);
String digestHex = formatHex(digest);
if (Arrays.equals(digest, certHash)) {
System.err.println("info: accepting certificate: " + digestHex);
} else {
throw new CertificateException("certificate rejected: " +
digestHex);
}
}
@Override
public X509Certificate[] getAcceptedIssuers() {
return new X509Certificate[0];
}
}
init method of the SSLContext object, as shown in Example 17.20, “Using a custom TLS trust manager with OpenJDK”.
SSLContext ctx;
try {
ctx = SSLContext.getInstance("TLSv1.2", "SunJSSE");
} catch (NoSuchAlgorithmException e) {
try {
ctx = SSLContext.getInstance("TLSv1", "SunJSSE");
} catch (NoSuchAlgorithmException e1) {
throw new AssertionError(e1);
} catch (NoSuchProviderException e1) {
throw new AssertionError(e1);
}
} catch (NoSuchProviderException e) {
throw new AssertionError(e);
}
MyTrustManager tm = new MyTrustManager(certHash);
ctx.init(null, new TrustManager[] {tm}, null);
javax.net.ssl.X509ExtendedTrustManager class. The OpenJDK TLS implementation will call the new methods, passing along TLS session information. This can be used to implement certificate overrides as a fallback (if certificate or host name verification fails), and a trust manager object can be used for multiple servers because the server address is available to the trust manager.
// NSPR include files
#include <prerror.h>
#include <prinit.h>
// NSS include files
#include <nss.h>
#include <pk11pub.h>
#include <secmod.h>
#include <ssl.h>
#include <sslproto.h>
// Private API, no other way to turn a POSIX file descriptor into an
// NSPR handle.
NSPR_API(PRFileDesc*) PR_ImportTCPSocket(int);
NSS_SetDomesticPolicy if there are no strong ciphers available, assuming that it has already been called otherwise. This avoids overriding the process-wide cipher suite policy unnecessarily.
libnssckbi.so NSS module with a call to the SECMOD_LoadUserModule function. The root certificates are compiled into this module. (The PEM module for NSS, libnsspem.so, offers a way to load trusted CA certificates from a file.)
PR_Init(PR_USER_THREAD, PR_PRIORITY_NORMAL, 0);
NSSInitContext *const ctx =
NSS_InitContext("sql:/etc/pki/nssdb", "", "", "", NULL,
NSS_INIT_READONLY | NSS_INIT_PK11RELOAD);
if (ctx == NULL) {
const PRErrorCode err = PR_GetError();
fprintf(stderr, "error: NSPR error code %d: %s\n",
err, PR_ErrorToName(err));
exit(1);
}
// Ciphers to enable.
static const PRUint16 good_ciphers[] = {
TLS_RSA_WITH_AES_128_CBC_SHA,
TLS_RSA_WITH_AES_256_CBC_SHA,
SSL_RSA_WITH_3DES_EDE_CBC_SHA,
SSL_NULL_WITH_NULL_NULL // sentinel
};
// Check if the current policy allows any strong ciphers. If it
// doesn't, set the cipher suite policy. This is not thread-safe
// and has global impact. Consequently, we only do it if absolutely
// necessary.
int found_good_cipher = 0;
for (const PRUint16 *p = good_ciphers; *p != SSL_NULL_WITH_NULL_NULL;
++p) {
PRInt32 policy;
if (SSL_CipherPolicyGet(*p, &policy) != SECSuccess) {
const PRErrorCode err = PR_GetError();
fprintf(stderr, "error: policy for cipher %u: error %d: %s\n",
(unsigned)*p, err, PR_ErrorToName(err));
exit(1);
}
if (policy == SSL_ALLOWED) {
fprintf(stderr, "info: found cipher %x\n", (unsigned)*p);
found_good_cipher = 1;
break;
}
}
if (!found_good_cipher) {
if (NSS_SetDomesticPolicy() != SECSuccess) {
const PRErrorCode err = PR_GetError();
fprintf(stderr, "error: NSS_SetDomesticPolicy: error %d: %s\n",
err, PR_ErrorToName(err));
exit(1);
}
}
// Initialize the trusted certificate store.
char module_name[] = "library=libnssckbi.so name=\"Root Certs\"";
SECMODModule *module = SECMOD_LoadUserModule(module_name, NULL, PR_FALSE);
if (module == NULL || !module->loaded) {
const PRErrorCode err = PR_GetError();
fprintf(stderr, "error: NSPR error code %d: %s\n",
err, PR_ErrorToName(err));
exit(1);
}
SECMOD_DestroyModule(module);
NSS_ShutdownContext(ctx);
PR_ImportTCPSocket function is used to turn the POSIX file descriptor sockfd into an NSPR file descriptor. (This function is de-facto part of the NSS public ABI, so it will not go away.) Creating the TLS-capable file descriptor requires a model descriptor, which is configured with the desired set of protocols. The model descriptor is not needed anymore after TLS support has been activated for the existing connection descriptor.
SSL_BadCertHook can be omitted if no mechanism to override certificate verification is needed. The bad_certificate function must check both the host name specified for the connection and the certificate before granting the override.
SSL_ResetHandshake, SSL_SetURL, and SSL_ForceHandshake. (If SSL_ResetHandshake is omitted, SSL_ForceHandshake will succeed, but the data will not be encrypted.) During the handshake, the certificate is verified and matched against the host name.
// Wrap the POSIX file descriptor.  This is an internal NSPR
// function, but it is very unlikely to change.
PRFileDesc *nspr = PR_ImportTCPSocket(sockfd);
sockfd = -1; // Has been taken over by NSPR.
// Add the SSL layer.
{
  PRFileDesc *model = PR_NewTCPSocket();
  PRFileDesc *newfd = SSL_ImportFD(NULL, model);
  if (newfd == NULL) {
    const PRErrorCode err = PR_GetError();
    fprintf(stderr, "error: NSPR error code %d: %s\n",
            err, PR_ErrorToName(err));
    exit(1);
  }
  model = newfd;
  newfd = NULL;
  if (SSL_OptionSet(model, SSL_ENABLE_SSL2, PR_FALSE) != SECSuccess) {
    const PRErrorCode err = PR_GetError();
    fprintf(stderr, "error: set SSL_ENABLE_SSL2 error %d: %s\n",
            err, PR_ErrorToName(err));
    exit(1);
  }
  if (SSL_OptionSet(model, SSL_V2_COMPATIBLE_HELLO, PR_FALSE) != SECSuccess) {
    const PRErrorCode err = PR_GetError();
    fprintf(stderr, "error: set SSL_V2_COMPATIBLE_HELLO error %d: %s\n",
            err, PR_ErrorToName(err));
    exit(1);
  }
  if (SSL_OptionSet(model, SSL_ENABLE_DEFLATE, PR_FALSE) != SECSuccess) {
    const PRErrorCode err = PR_GetError();
    fprintf(stderr, "error: set SSL_ENABLE_DEFLATE error %d: %s\n",
            err, PR_ErrorToName(err));
    exit(1);
  }
  // Allow overriding invalid certificate.
  if (SSL_BadCertHook(model, bad_certificate, (char *)host) != SECSuccess) {
    const PRErrorCode err = PR_GetError();
    fprintf(stderr, "error: SSL_BadCertHook error %d: %s\n",
            err, PR_ErrorToName(err));
    exit(1);
  }
  newfd = SSL_ImportFD(model, nspr);
  if (newfd == NULL) {
    const PRErrorCode err = PR_GetError();
    fprintf(stderr, "error: SSL_ImportFD error %d: %s\n",
            err, PR_ErrorToName(err));
    exit(1);
  }
  nspr = newfd;
  PR_Close(model);
}
// Perform the handshake.
if (SSL_ResetHandshake(nspr, PR_FALSE) != SECSuccess) {
  const PRErrorCode err = PR_GetError();
  fprintf(stderr, "error: SSL_ResetHandshake error %d: %s\n",
          err, PR_ErrorToName(err));
  exit(1);
}
if (SSL_SetURL(nspr, host) != SECSuccess) {
  const PRErrorCode err = PR_GetError();
  fprintf(stderr, "error: SSL_SetURL error %d: %s\n",
          err, PR_ErrorToName(err));
  exit(1);
}
if (SSL_ForceHandshake(nspr) != SECSuccess) {
  const PRErrorCode err = PR_GetError();
  fprintf(stderr, "error: SSL_ForceHandshake error %d: %s\n",
          err, PR_ErrorToName(err));
  exit(1);
}
char buf[4096];
snprintf(buf, sizeof(buf), "GET / HTTP/1.0\r\nHost: %s\r\n\r\n", host);
PRInt32 ret = PR_Write(nspr, buf, strlen(buf));
if (ret < 0) {
  const PRErrorCode err = PR_GetError();
  fprintf(stderr, "error: PR_Write error %d: %s\n",
          err, PR_ErrorToName(err));
  exit(1);
}
ret = PR_Read(nspr, buf, sizeof(buf));
if (ret < 0) {
  const PRErrorCode err = PR_GetError();
  fprintf(stderr, "error: PR_Read error %d: %s\n",
          err, PR_ErrorToName(err));
  exit(1);
}
// Send close_notify alert.
if (PR_Shutdown(nspr, PR_SHUTDOWN_BOTH) != PR_SUCCESS) {
  const PRErrorCode err = PR_GetError();
  fprintf(stderr, "error: PR_Shutdown error %d: %s\n",
          err, PR_ErrorToName(err));
  exit(1);
}
// Closes the underlying POSIX file descriptor, too.
PR_Close(nspr);
Python ships the ssl module (actually a wrapper around OpenSSL). The exported interface is somewhat restricted, so the client code shown below does not fully implement the recommendations in Section 17.1.1, “OpenSSL Pitfalls”.
The built-in HTTP clients that accept https:// URLs or otherwise implement HTTPS support do not perform certificate validation at all. (For example, this is true for the httplib and xmlrpclib modules.) If you use HTTPS, you should not use the built-in HTTP clients. The Curl class in the curl module, as provided by the python-pycurl package, implements proper certificate validation.
The ssl module currently does not perform host name checking on the server certificate. Example 17.26, “Implementing TLS host name checking in Python (without wildcard support)”, shows how to implement certificate matching, using the parsed certificate returned by getpeercert.
def check_host_name(peercert, name):
    """Simple certificate/host name checker.  Returns True if the
    certificate matches, False otherwise.  Does not support
    wildcards."""
    # Check that the peer has supplied a certificate.
    # None/{} is not acceptable.
    if not peercert:
        return False
    if peercert.has_key("subjectAltName"):
        for typ, val in peercert["subjectAltName"]:
            if typ == "DNS" and val == name:
                return True
    else:
        # Only check the subject DN if there is no subject alternative
        # name.
        cn = None
        for rdn in peercert["subject"]:
            for attr, val in rdn:
                # Use most-specific (last) commonName attribute.
                if attr == "commonName":
                    cn = val
        if cn is not None:
            return cn == name
    return False
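The matching logic can be exercised offline against made-up data shaped like the dictionary that getpeercert returns. The sketch below restates the checker in a form that runs under both Python 2 and 3 (dict membership instead of has_key); all certificate contents and host names are illustrative only, not taken from any real certificate:

```python
def check_host_name(peercert, name):
    """Returns True if the parsed certificate matches the host name.
    Same logic as the checker above, but portable to Python 3.
    No wildcard support."""
    # A missing certificate (None or {}) never matches.
    if not peercert:
        return False
    if "subjectAltName" in peercert:
        # When subject alternative names are present, only they count.
        for typ, val in peercert["subjectAltName"]:
            if typ == "DNS" and val == name:
                return True
    else:
        # Fall back to the most-specific (last) commonName attribute.
        cn = None
        for rdn in peercert["subject"]:
            for attr, val in rdn:
                if attr == "commonName":
                    cn = val
        if cn is not None:
            return cn == name
    return False

# Sample data modeled on ssl.getpeercert() output (hypothetical values).
cert_san = {
    "subject": ((("commonName", "example.com"),),),
    "subjectAltName": (("DNS", "example.com"), ("DNS", "www.example.com")),
}
cert_cn_only = {
    "subject": ((("countryName", "US"),), (("commonName", "example.org"),)),
}

assert check_host_name(cert_san, "www.example.com")
assert not check_host_name(cert_san, "other.example.com")
assert check_host_name(cert_cn_only, "example.org")
assert not check_host_name({}, "example.com")  # no certificate at all
```

Note that when subjectAltName is present, the commonName in the subject DN is deliberately ignored, mirroring how TLS clients are expected to behave.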
The TLS connection is established with the ssl.wrap_socket function. The function call in Example 17.27, “Establishing a TLS client connection with Python”, provides additional arguments to override questionable defaults in OpenSSL and in the Python module.
ciphers="HIGH:-aNULL:-eNULL:-PSK:RC4-SHA:RC4-MD5" selects relatively strong cipher suites with certificate-based authentication. (The call to the check_host_name function provides additional protection against anonymous cipher suites.)
ssl_version=ssl.PROTOCOL_TLSv1 disables SSL 2.0 support. By default, the ssl module sends an SSL 2.0 client hello, which is rejected by some servers. Ideally, we would request OpenSSL to negotiate the most recent TLS version supported by the server and the client, but the Python module does not allow this.
cert_reqs=ssl.CERT_REQUIRED turns on certificate validation.
ca_certs='/etc/ssl/certs/ca-bundle.crt' initializes the certificate store with a set of trusted root CAs. Unfortunately, it is necessary to hard-code this path into applications because the default path in OpenSSL is not available through the Python ssl module.
The ssl module (and OpenSSL) perform certificate validation, but the certificate must be compared manually against the host name, by calling the check_host_name function defined above.
sock = ssl.wrap_socket(sock,
                       ciphers="HIGH:-aNULL:-eNULL:-PSK:RC4-SHA:RC4-MD5",
                       ssl_version=ssl.PROTOCOL_TLSv1,
                       cert_reqs=ssl.CERT_REQUIRED,
                       ca_certs='/etc/ssl/certs/ca-bundle.crt')
# getpeercert() triggers the handshake as a side effect.
if not check_host_name(sock.getpeercert(), host):
    raise IOError("peer certificate does not match host name")
sock.write("GET / HTTP/1.1\r\nHost: " + host + "\r\n\r\n")
print sock.read()
sock.close()
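The effect of an OpenSSL cipher string like the one passed to wrap_socket above can be inspected offline. This sketch assumes a modern Python (3.6+, where ssl.SSLContext.get_ciphers and ssl.PROTOCOL_TLS_CLIENT exist) and drops the RC4 suites from the original string, since current OpenSSL builds no longer ship them:

```python
import ssl

# Build a client context and apply the cipher string; no network
# activity is involved.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.set_ciphers("HIGH:-aNULL:-eNULL:-PSK")

names = [c["name"] for c in ctx.get_ciphers()]
# The -aNULL/-eNULL exclusions remove every unauthenticated and
# unencrypted suite from the selection.
assert names
assert all("NULL" not in n for n in names)
```

This is a convenient way to verify that a hardening cipher string actually excludes what it is supposed to exclude before deploying it.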
| Revision History | |
|---|---|
| Revision 1.3-1 | Mon Oct 13 2014 |
| Revision 1.2-1 | Wed Jul 16 2014 |
| Revision 1.1-1 | Tue Aug 27 2013 |
| Revision 1.0-1 | Thu May 09 2013 |
| Revision 0-1 | Thu Mar 7 2013 |