Processes

Creating Safe Processes

This section describes how to create new child processes in a safe manner. In addition to the concerns addressed below, there is the possibility of file descriptor leaks; see Preventing File Descriptor Leaks to Child Processes.

Obtaining the Program Path and the Command-line Template

The name and path of the program being invoked should be hard-coded or controlled by a static configuration file stored at a fixed location (at an absolute file system path). The same applies to the template for generating the command line.

The configured program name should be an absolute path. If it is a relative path, the contents of the PATH must be obtained in a secure manner (see Accessing Environment Variables). If the PATH variable is not set or untrusted, the safe default /bin:/usr/bin must be used.
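The lookup rule above can be sketched in Python as follows. This is a minimal illustration, not a complete policy: the helper name resolve_program and its trusted_path parameter are hypothetical, and the sketch assumes that any PATH value passed in has already been obtained in a secure manner.

```python
import os
import shutil

SAFE_PATH = "/bin:/usr/bin"  # safe default named in the text above


def resolve_program(configured_name, trusted_path=None):
    """Return an absolute path for configured_name, or None if not found.

    Hypothetical helper: trusted_path must come from a trusted source,
    never directly from an untrusted process environment.
    """
    if os.path.isabs(configured_name):
        # Absolute paths are used as-is, provided they name an executable file.
        if os.path.isfile(configured_name) and os.access(configured_name, os.X_OK):
            return configured_name
        return None
    if os.sep in configured_name:
        # Names such as "../tool" would be resolved against the current
        # directory, which this sketch does not treat as trusted.
        return None
    # Bare names are searched only in the trusted PATH or the safe default.
    return shutil.which(configured_name, path=trusted_path or SAFE_PATH)
```

Rejecting relative names that contain a directory separator is a design choice of this sketch; it keeps the current directory out of the lookup entirely.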

If too much flexibility is provided here, it may allow invocation of arbitrary programs without proper authorization.

Bypassing the Shell

Child processes should be created without involving the system shell.

For C/C++, system should not be used. The posix_spawn function can be used instead, or a combination of fork and execve. (In some cases, it may be preferable to use vfork or the Linux-specific clone system call instead of fork.)

In Python, the subprocess module bypasses the shell by default (when the shell keyword argument is not set to True). os.system should not be used.

The Java class java.lang.ProcessBuilder can be used to create subprocesses without interference from the system shell.
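As an illustration of the Python case, the following sketch passes the argument vector as a list, so that shell metacharacters in the data reach the child unchanged. The helper name run_without_shell is hypothetical, and /bin/echo is assumed to exist on the system; it merely stands in for the real program.

```python
import subprocess


def run_without_shell(program, args):
    """Run program with args as a plain argument vector (no shell involved)."""
    argv = [program] + list(args)
    return subprocess.run(
        argv,
        shell=False,                 # the default, spelled out for clarity
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,
    )


# Data that a shell would interpret; here it is passed through literally.
hostile = "hello; echo injected $(id)"
result = run_without_shell("/bin/echo", [hostile])
```

Because no shell parses the string, the child receives it as a single argument and simply echoes it back.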

Portability notice

On Windows, there is no argument vector, only a single argument string. Each application is responsible for parsing this string into an argument vector. There is considerable variance among the quoting styles recognized by applications. Some of them expand shell wildcards, others do not. Extensive application-specific testing is required to make this secure.

Note that some common applications (notably ssh) unconditionally introduce the use of a shell, even if invoked directly without a shell. It is difficult to use these applications in a secure manner. In this case, untrusted data should be supplied by other means. For example, standard input could be used, instead of the command line.

Specifying the Process Environment

Child processes should be created with a minimal set of environment variables. This is absolutely essential if there is a trust transition involved, either when the parent process was created, or during the creation of the child process.

In C/C++, the environment should be constructed as an array of strings and passed as the envp argument to posix_spawn or execve. The functions setenv, unsetenv and putenv should not be used. They are not thread-safe and suffer from memory leaks.

Python programs need to specify a dict for the env argument of the subprocess.Popen constructor. The Java class java.lang.ProcessBuilder provides an environment() method, which returns a map that can be manipulated.

The following list provides guidelines for selecting the set of environment variables passed to the child process.

  • PATH should be initialized to /bin:/usr/bin.

  • USER and HOME can be inherited from the parent process environment, or they can be initialized from the passwd structure for the user.

  • The DISPLAY and XAUTHORITY variables should be passed to the subprocess if it is an X program. Note that this will typically not work across trust boundaries because XAUTHORITY refers to a file with 0600 permissions.

  • The locale-related environment variables LANG, LANGUAGE, LC_ADDRESS, LC_ALL, LC_COLLATE, LC_CTYPE, LC_IDENTIFICATION, LC_MEASUREMENT, LC_MESSAGES, LC_MONETARY, LC_NAME, LC_NUMERIC, LC_PAPER, LC_TELEPHONE and LC_TIME can be passed to the subprocess if present.

  • The called process may need application-specific environment variables, for example for passing passwords. (See Passing Secrets to Subprocesses.)

  • All other environment variables should be dropped. Names for new environment variables should not be accepted from untrusted sources.
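The guidelines above could be applied in Python roughly as follows. This is a sketch under stated assumptions: the function build_child_env and its extra and x11 parameters are hypothetical names, and the names in extra are assumed to come from trusted code, not from untrusted input.

```python
import os

SAFE_PATH = "/bin:/usr/bin"
IDENTITY_VARS = ("USER", "HOME")
X11_VARS = ("DISPLAY", "XAUTHORITY")
LOCALE_VARS = (
    "LANG", "LANGUAGE", "LC_ADDRESS", "LC_ALL", "LC_COLLATE", "LC_CTYPE",
    "LC_IDENTIFICATION", "LC_MEASUREMENT", "LC_MESSAGES", "LC_MONETARY",
    "LC_NAME", "LC_NUMERIC", "LC_PAPER", "LC_TELEPHONE", "LC_TIME",
)


def build_child_env(parent_env, extra=None, x11=False):
    """Build a minimal environment dict for a child process."""
    env = {"PATH": SAFE_PATH}          # PATH is never inherited
    allowed = IDENTITY_VARS + LOCALE_VARS
    if x11:
        allowed += X11_VARS            # only for X programs
    for name in allowed:
        if name in parent_env:
            env[name] = parent_env[name]
    if extra:
        # Application-specific variables, named by trusted code only.
        env.update(extra)
    return env                         # everything else is dropped


child_env = build_child_env(os.environ)
```

The resulting dict can be passed as the env argument of subprocess.Popen or subprocess.run.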

Robust Argument List Processing

When invoking a program, it is sometimes necessary to include data from untrusted sources. Such data should be checked for embedded NUL characters because the system APIs will silently truncate argument strings at the first NUL character.

The following recommendations assume that the program being invoked uses GNU-style option processing using getopt_long. This convention is widely used, but it is only a convention, and individual programs might interpret a command line in a different way.

If the untrusted data has to go into an option, use the --option-name=VALUE syntax, placing the option and its value into the same command line argument. This avoids any potential confusion if the data starts with -.

For positional arguments, terminate the option list with a single -- marker after the last option, and include the data at the right position. The -- marker terminates option processing, and the data will not be treated as an option even if it starts with a dash.
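The two recommendations can be combined in a small helper. The following Python sketch uses hypothetical names (build_argv, _reject_nul) and assumes that option names are hard-coded by the caller, that only the values are untrusted, and that the invoked program follows the getopt_long conventions described above.

```python
def _reject_nul(value):
    """Refuse data that the system APIs would silently truncate."""
    if "\0" in value:
        raise ValueError("embedded NUL character in argument")
    return value


def build_argv(program, options, positionals):
    """Build an argument vector from trusted option names and untrusted values."""
    argv = [program]
    for name, value in options.items():
        # Option and value share one argument, so a leading "-" in the
        # value cannot be mistaken for another option.
        argv.append("--{}={}".format(name, _reject_nul(value)))
    argv.append("--")                  # terminates option processing
    for value in positionals:
        argv.append(_reject_nul(value))
    return argv
```

For example, build_argv("/usr/bin/prog", {"output": "-rf"}, ["--help"]) yields ["/usr/bin/prog", "--output=-rf", "--", "--help"], where /usr/bin/prog is a placeholder program name.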

Passing Secrets to Subprocesses

The command line (the name of the program and its arguments) of a running process is traditionally available to all local users. The called program can overwrite this information, but only after it has run for a bit of time, during which the information may have been read by other processes. However, on Linux, the process environment is restricted to the user who runs the process. Therefore, if you need a convenient way to pass a password to a child process, use an environment variable, and not a command line argument. (See Specifying the Process Environment.)

Portability notice

On some UNIX-like systems (notably Solaris), environment variables can be read by any system user, just like command lines.

If the environment-based approach cannot be used due to portability concerns, the data can be passed on standard input. Some programs (notably gpg) use special file descriptors whose numbers are specified on the command line. Temporary files are an option as well, but they might give digital forensics access to sensitive data (such as passphrases) because it is difficult to safely delete them in all cases.
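The standard input approach might look like this in Python. The sketch is illustrative only: the helper name run_with_secret_on_stdin is hypothetical, and the child command used here is a stand-in that just reports the length of what it read.

```python
import subprocess
import sys


def run_with_secret_on_stdin(argv, secret):
    """Run argv and deliver secret through a pipe on standard input."""
    return subprocess.run(
        argv,
        input=secret.encode("utf-8"),  # never placed on the command line
        stdout=subprocess.PIPE,
        check=True,
    )


# Stand-in child process: reads standard input and prints its length.
child = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
result = run_with_secret_on_stdin(child, "s3cret")
```

The secret travels through a pipe, so it appears neither in the command line nor in the environment of the child.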

Handling Child Process Termination

When child processes terminate, the parent process is signalled. A stub of the terminated process (a zombie, shown as <defunct> by ps) is kept around until the status information is collected (reaped) by the parent process. Over the years, several interfaces for this have been invented:

  • The parent process calls wait, waitpid, waitid, wait3 or wait4, without specifying a process ID. This will deliver any matching process ID. This approach is typically used from within event loops.

  • The parent process calls waitpid, waitid, or wait4, with a specific process ID. Only data for the specific process ID is returned. This is typically used in code which spawns a single subprocess in a synchronous manner.

  • The parent process installs a handler for the SIGCHLD signal, using sigaction, and specifies the SA_NOCLDWAIT flag. This approach could be used by event loops as well.

None of these approaches can be used to wait for child process termination in a completely thread-safe manner. The parent process might execute an event loop in another thread, which could pick up the termination signal. This means that libraries typically cannot make free use of child processes (for example, to run problematic code with reduced privileges in a separate address space).

At the moment, the parent process should explicitly wait for termination of the child process using waitpid or waitid, and hope that the status is not collected by an event loop first.
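A minimal Python sketch of the explicit wait follows, using os.posix_spawn and os.waitpid with a specific process ID. The helper name spawn_and_wait is hypothetical, and the sketch assumes that no other thread or event loop in the process reaps the child first; as noted above, that cannot be guaranteed in general.

```python
import os
import sys


def spawn_and_wait(argv):
    """Spawn argv[0] without a shell and wait for exactly that child."""
    pid = os.posix_spawn(argv[0], argv, {"PATH": "/bin:/usr/bin"})
    # Wait for this process ID only, not for arbitrary children.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)    # negative value: killed by a signal
    raise RuntimeError("unexpected wait status: {}".format(status))


# Stand-in child process that exits with status 3.
exit_code = spawn_and_wait([sys.executable, "-c", "raise SystemExit(3)"])
```

Returning a negative number for signal-terminated children is a convention chosen for this sketch; it mirrors what the subprocess module does for its returncode attribute.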

SUID/SGID processes

Programs can be marked in the file system to indicate to the kernel that a trust transition should happen if the program is run. The SUID file permission bit indicates that an executable should run with the effective user ID equal to the owner of the executable file. Similarly, with the SGID bit, the effective group ID is set to the group of the executable file.

Linux supports fscaps, which can grant additional capabilities to a process in a finer-grained manner. Additional mechanisms can be provided by loadable security modules.

When such a trust transition has happened, the process runs in a potentially hostile environment. Additional care is necessary not to rely on any untrusted information. These concerns also apply to libraries which can be linked into such processes.

Accessing Environment Variables

The following steps are required so that a program does not accidentally pick up untrusted data from environment variables.

  • Compile your C/C++ sources with -D_GNU_SOURCE. The Autoconf macro AC_GNU_SOURCE ensures this.

  • Check for the presence of the secure_getenv and __secure_getenv functions. The Autoconf directive AC_CHECK_FUNCS([__secure_getenv secure_getenv]) performs these checks.

  • Arrange for a proper definition of the secure_getenv function. See Obtaining a definition for secure_getenv.

  • Use secure_getenv instead of getenv to obtain the value of critical environment variables. secure_getenv will pretend the variable has not been set if the process environment is not trusted.

Critical environment variables are debugging flags, configuration file locations, plug-in and log file locations, and anything else that might be used to bypass security restrictions or cause a privileged process to behave in an unexpected way.

Either the secure_getenv function or the __secure_getenv function is available from GNU libc.

Example 1. Obtaining a definition for secure_getenv
#include <stdlib.h>

#ifndef HAVE_SECURE_GETENV
#  ifdef HAVE__SECURE_GETENV
#    define secure_getenv __secure_getenv
#  else
#    error neither secure_getenv nor __secure_getenv are available
#  endif
#endif

Daemons

Background processes providing system services (daemons) need to decouple themselves from the controlling terminal and the parent process environment:

  • Fork.

  • In the child process, call setsid. The parent process can simply exit (using _exit, to avoid running clean-up actions twice).

  • In the child process, fork again. Processing continues in the child process. Again, the parent process should just exit.

  • Replace the descriptors 0, 1, 2 with a descriptor for /dev/null. Logging should be redirected to syslog.
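The steps above can be sketched in Python with the os module, which exposes the same system calls. This is a minimal illustration under stated assumptions: the function name daemonize is hypothetical, redirection of logging to syslog is left out, and the sketch is meant to be called early, before threads are started.

```python
import os
import sys


def daemonize():
    """Detach from the terminal; returns only in the final daemon process."""
    # Flush buffered output so it is not lost when the parents call _exit.
    sys.stdout.flush()
    sys.stderr.flush()
    if os.fork() > 0:
        os._exit(0)            # original parent: exit without clean-up actions
    os.setsid()                # child: new session, no controlling terminal
    if os.fork() > 0:
        os._exit(0)            # session leader exits as well
    # Grandchild: replace descriptors 0, 1, 2 with /dev/null.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
```

After the second fork, the remaining process is not a session leader, so it cannot reacquire a controlling terminal by opening a terminal device.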

Older instructions for creating daemon processes recommended a call to umask(0). This is risky because it often leads to world-writable files and directories, resulting in security vulnerabilities such as arbitrary process termination by untrusted local users, or log file truncation. If the umask needs setting, a restrictive value such as 027 or 077 is recommended.

Other aspects of the process environment may have to be changed as well (environment variables, signal handler disposition).

It is increasingly common that server processes do not run as background processes, but as regular foreground processes under a supervising master process (such as systemd). Server processes should offer a command line option which disables forking and replacement of the standard output and standard error streams. Such an option is also useful for debugging.

Semantics of Command-line Arguments

After process creation and option processing, it is up to the child process to interpret the arguments. Arguments can be file names, host names, or URLs, and many other things. URLs can refer to the local network, some server on the Internet, or to the local file system. Some applications even accept arbitrary code in arguments (for example, python with the -c option).

Similar concerns apply to environment variables and to the contents of the current directory and its subdirectories.

Consequently, careful analysis is required to determine whether it is safe to pass untrusted data to another program.

fork as a Primitive for Parallelism

A call to fork which is not immediately followed by a call to execve (perhaps after rearranging and closing file descriptors) is typically unsafe, especially from a library which does not control the state of the entire process. Such use of fork should be replaced with proper child processes or threads.