12 Apr 2022

Security Containers: System Call and Permissions

The majority of the time, you’ll see containers running a Linux operation system. Containers run Linux processes that are visible from the host.

A regular and a containerized process use system calls and need permission and privileges, but the difference is how these permissions are assigned at runtime or during the image build process for containers.

System Calls

Application run in what’s called user space.

The application will ask the kernel on its behalf if the application needs to do something like accessing a file or even find the time of the day. System Call or syscall interface is when the user space code uses to make these requests of the kernel.

Here are some examples of System calls:

read

read data from a file

write

write data to a file

open

open a file for subsequent reading or writing

execve

run an executable program

chown

change the owner of a file

clone

create a new process

Uncommonly application developers will need to worry about system calls. The lowest level that they can come across is the Glibc library or the Golang syscall package.

The system call is used the same way either if it’s running in a container or not. The challenge is due to containers are making a system call to the same kernel, they’re on a single host share.

Following the principle, least privilege, describe here we can allow users to limit the set of system calls that different programs can access.

File Permissions

There is a saying that in Linux, everything is a file.

Permissions on files determine which users are allowed to access those files and what actions they can perform on the files where it’s referred to as discretionary access control or DAC.

The below image describes the output command of ls -l. It shows information about the files and their attributes.

For the file permission group, there are three actions that users might be able to perform on this file: read (r), write (w) or execute (x).

In the above example the owner, group, and any user can execute the file.

Permissions on the file can be affected by the use of setuid, setgid and sticky bits. Setuid and setgid can be used by attackers to obtain additional permissions.

setuid and setgid

When you execute a file, the process that gets started inherits your user ID.

The below example shows a file owned by the user called lessandrougulino. Run this under root by executing sudo sleep 100

As you can see, the sleep process is running under the root UID.

Now, let’s try turning on the setuid bit.

chmod +s sleep.sh

Running the sleep.sh again, we can see the file (sleep.sh) has taken its user ID from the owner of the file

This bit is typically used to give a program privileges that it needs but that is not usually extended to regular users.

Let’s try with the command ping. As you can see, I can run the command as non-root.

Now, let’s take a copy of the ping command.

cp /usr/bin/ping ./myping

The permissions of myping command.

Let’s try to run it using myping command.

As you can see above, when you copy an executable, the file ownership attributes are set according to the user ID you’re operating as, and the setuid bit is not carried over.

The original ping executable has the s or setuid bit instead of a regular x.

Let’s set the setuid on the executable and try again.

sudo chmod +s ./myping

It’s working because myping executable has the setuid bit, which allows it to operate as root.

If you open a new terminal, you can see the myping executable isn’t running as root. It’s happening because the modern version of ping starts off running as root, but it explicitly sets just the capabilities that it needs and resets its user ID to be that of the original user, it jumps through some hoops.

You can explore this in more detail, using trace.

Security implications of setuid

It’s very easy to write your own program that does setuid on itself and then, having already transitioned to root, calls the shell.

Some container image scanners will report on the presence of files with the setuid bit set.

You can prevent it from being used with the --no-new-privileges flag on a docker run command.

Version 2.2 of the Linux kernel introduced more granular control over these extra privileges through capabilities. I’ll cover it in the next post.

Security Containers: System Call and Permissions

Related Articles

System Calls

File Permissions

setuid and setgid

Security implications of setuid