A Deep Dive into Linux Kernel CVE-2017-18017 in netfilter TCP_MSS

During a recent security audit of a device, I stumbled upon a known security vulnerability in the Linux kernel. Although CVE-2017-18017 has been public for quite some time, I could not find a full description of the bug and the mechanism it resides in, or proof-of-concept (PoC) code exploiting the vulnerability.

To exploit the bug, I decided to research the mechanism and develop an exploit myself. The bug is fixed in recent versions of the Linux kernel, but as I realized, devices running older kernel versions are still prevalent across today’s consumer electronics.


Let’s start out by understanding the different modules TCP_MSS is embedded in:

Netfilter and Iptables

Netfilter is a set of kernel modules allowing network traffic management, filtering and inspection by implementing hooks in the kernel on the network stack. Netfilter hooks are usually managed by a user-space app such as Iptables.

In Iptables, traffic management and filtering are achieved by defining rules in tables. A full description of the features and design of Iptables is outside the scope of this writeup, as it is a very comprehensive system. It is a very useful feature in many networking devices such as routers and IP cameras, where it is used to implement firewall capabilities or to filter out possibly unwanted traffic.

For example, a root user could allow incoming connections only to port 443 with a rule such as:

The TCP MSS Option

As we saw above, the bug resides in the xt_TCPMSS.c file that is responsible for implementing iptables rules that involve handling or inspection of the MSS parameter of TCP packets.

In TCP, the client and server use a 3-way handshake to establish the connection. During this process, the client sends a SYN packet, waits for the server to send back a SYN ACK packet, and finally sends an ACK packet to finalize the connection. After that, data can be sent by either side according to the TCP protocol.

The length of the TCP header is determined by the number and type of options included in the header. The TCP stack determines the size of the header from the “Header Length” (Data Offset) field, which occupies the four most significant bits of the 16-bit word that also holds the flags.

The length value is multiplied by four to arrive at the header’s final length in bytes. Each option type may have a different length and the options structure is an array of TLV type objects.
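To make the layout concrete, here is a minimal Python sketch that reads the Header Length field and walks the TLV options of a raw TCP header. The function name `parse_tcp_header` and the sample header bytes are illustrative assumptions, not taken from the audited device or from the kernel.

```python
import struct

def parse_tcp_header(raw: bytes):
    """Return (header_len_bytes, options) for a raw TCP header (hypothetical helper)."""
    # Byte 12 carries the Header Length (Data Offset) in its upper four bits.
    header_len = (raw[12] >> 4) * 4
    options = []
    i = 20  # options start right after the fixed 20-byte header
    while i < header_len:
        kind = raw[i]
        if kind == 0:          # End of Option List
            break
        if kind == 1:          # NOP: a single byte with no length field
            options.append((kind, b""))
            i += 1
            continue
        if i + 1 >= header_len:
            break              # truncated option
        length = raw[i + 1]
        if length < 2 or i + length > header_len:
            break              # malformed option: stop instead of looping
        options.append((kind, raw[i + 2:i + length]))
        i += length
    return header_len, options

# Made-up SYN header: Header Length = 7 words (28 bytes),
# options = MSS 1460, NOP, NOP, SACK-permitted.
fixed = struct.pack("!HHIIBBHHH", 40000, 443, 0, 0, 7 << 4, 0x02, 64240, 0, 0)
opts = bytes([2, 4]) + struct.pack("!H", 1460) + bytes([1, 1, 4, 2])
header_len, options = parse_tcp_header(fixed + opts)
```

Note that this sketch bounds every read by `header_len`, which is exactly the kind of check that matters later in this writeup.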

The following is taken from SYN packet options in the TCP header:

Some options will only be included in either SYN or SYN ACK packets. In this case, options relating to the global connection parameters will apply until the connection is torn down. Examples of such parameters are SACK (Selective ACK) and MSS (Maximum segment size) – as seen in the screenshot above.


Maximum segment size (MSS) describes the largest amount of TCP payload a host is willing to receive in a single segment. In a 3-way handshake, both client and server include their MSS value in the respective SYN and SYN ACK packets to signal the maximum segment size they expect to receive from the other end.

MSS is different from, but related to, the Layer 2 MTU parameter, which signifies the largest PDU that will be sent. The relation between the two is that MSS is bound by the value of the MTU. The default MSS value on most systems is 1460, which corresponds to the default Ethernet MTU of 1500 bytes minus the 20-byte IP header and the 20-byte TCP header.
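The arithmetic behind that default can be written out as a short sketch. It assumes IPv4 and TCP headers without options; the function name `mss_for_mtu` is made up for illustration.

```python
IPV4_HEADER_LEN = 20  # bytes, assuming no IP options
TCP_HEADER_LEN = 20   # bytes, assuming no TCP options

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER_LEN - TCP_HEADER_LEN

ethernet_mss = mss_for_mtu(1500)  # the common Ethernet MTU
```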

Code analysis

The first thing I did was check the fix – the commit can be found here.

It looks like prior to the fix, the tcp_hdrlen parameter could be set to a value lower than the minimum length of a default TCP header by manipulating the tcph->doff field. This field corresponds to the TCP Header Length field described previously. It should have a minimum value of 0x5 (as explained, this value is multiplied by four to arrive at the minimum length of a TCP header, which is 20 bytes).

Following the initial examination, I expected the issue to be related to code that uses tcp_hdrlen while assuming it is at least the minimum 20 bytes, so I looked at the original implementation of the function where the bugfix was committed.

The function responsible for MSS manipulation is tcpmss_mangle_packet. It updates the MSS value in the incoming/outgoing packet according to the rule defined in iptables and according to other restrictions (e.g. the MSS will not be updated to a lower value than the minimum defined).

In order to locate the MSS parameter in the TCP header, the function iterates over all options included in the TCP header using the following loop:

In the loop, the iterator is initialized to the size of the minimum-length TCP header and incremented by the length of the current option on each iteration. The loop stops once the iterator exceeds tcp_hdrlen minus TCPOLEN_MSS, that is, once there is no longer room for an MSS option inside the header.

If tcp_hdrlen is forced to a value lower than TCPOLEN_MSS, that upper bound becomes negative. Because the iterator i is defined as unsigned int, the comparison is carried out on unsigned values and the negative bound is treated as a very large positive number, so the exit condition is effectively never reached.
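The behaviour described above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the kernel source: it assumes the loop condition has the form `i <= tcp_hdrlen - TCPOLEN_MSS` evaluated as an unsigned 32-bit comparison, and that single-byte options advance the iterator by one. The names `loop_bound`, `optlen`, and `scan_for_mss` are chosen for illustration, and running off the end of the `memory` buffer stands in for touching an unmapped page.

```python
TCP_MIN_HEADER = 20   # size of a TCP header without options
TCPOPT_MSS = 2        # option kind for MSS
TCPOLEN_MSS = 4       # length of the MSS option
U32_MASK = 0xFFFFFFFF

def loop_bound(tcp_hdrlen: int) -> int:
    """Upper bound for i as an unsigned 32-bit comparison would see it."""
    return (tcp_hdrlen - TCPOLEN_MSS) & U32_MASK

def optlen(buf: bytes, i: int) -> int:
    """Assumed step size: one byte for EOL/NOP or a zero length field."""
    if buf[i] <= 1 or buf[i + 1] == 0:
        return 1
    return buf[i + 1]

def scan_for_mss(memory: bytes, tcp_hdrlen: int):
    """Model the option scan over the header and whatever follows it."""
    bound = loop_bound(tcp_hdrlen)
    i = TCP_MIN_HEADER
    while i <= bound:
        if i + 1 >= len(memory):
            return ("fault", i)   # stand-in for an out-of-bounds read
        if memory[i] == TCPOPT_MSS and memory[i + 1] == TCPOLEN_MSS:
            return ("found", i)   # the two bytes after this would be rewritten
        i += optlen(memory, i)
    return ("done", i)

# Well-formed 24-byte header: the scan stays inside the header.
good = bytes(20) + bytes([2, 4, 0x05, 0xB4])
good_result = scan_for_mss(good, 24)

# Header Length of 0: the bound wraps and the scan runs off the buffer.
bad = bytes(20) + b"\x01" * 44
bad_result = scan_for_mss(bad, 0)
```

With a header length of 0 the bound evaluates to 0xFFFFFFFC, so in this model only the end of the buffer, or a chance match on the MSS byte pattern, stops the scan.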

This is a classic missing bounds check combined with an unsigned integer wraparound. Theoretically, it causes a near-infinite loop, but in reality it leads to one of two scenarios:

  1. The kernel continues iterating over memory it assumes is still part of the TCP header until it leaves its readable address space and hits an access violation (page fault).
  2. The loop’s inner condition is met because a byte sequence matching the MSS option number and length is encountered, in which case the following two bytes are overwritten, since the loop assumes they contain the MSS value that should be updated.

Both scenarios could lead to Denial-of-service of the device.

Mitigations for CVE-2017-18017

Issues like CVE-2017-18017 remind us how important it is to keep our software up-to-date during the development process. Similar issues are discovered on a monthly basis and continuously updating vulnerable software versions throughout the lifecycle of the embedded device software is essential to device security.

Moreover, it is important to make sure we know the exploitability of critical embedded software vulnerabilities in order to prioritize and mitigate them properly. In the case of this netfilter_TCP_MSS bug, the CVSS rating was high but the exploitability was not clear, and we needed to do research to determine how this vulnerability impacts embedded devices.

In cases of large scale projects involving many teams and developers, it is a challenge to efficiently ensure software security in the different modules that make up the firmware. In this case it is recommended to integrate an automated cybersecurity monitoring tool into the development process.

For this audit, I used Cybellum’s Product Security Platform’s smart context filtering in order to detect all security vulnerabilities that exist in the firmware, and are also exposed by the system configuration.

Appendix A – Building the debugging environment

My intention was to debug the issue using the kernel-debug feature on a QEMU system.

In order to run an x86-64 Linux in QEMU, I required a filesystem and a kernel of a pre-fix version compiled with debug symbols and debug capabilities.

Creating a filesystem

I used the following script based on debootstrap to build a Debian filesystem.

I built the Linux kernel from a source downloaded here.

I downloaded the source for kernel version 4.9.1.

Compiling The Kernel

Compiling the Linux kernel proved difficult for an older version since the compilation relies on multiple tools (e.g. GCC) of specific versions.

To get all the tools in the right versions, I used a Docker image of the older Ubuntu Xenial release.

After downloading and running the container

I downloaded and extracted the kernel source, then created the .config file with make.

Since I wanted to be able to debug the kernel, I had to add the following flags to the .config file:

In order to regenerate configurations and run compilation:

Now I was able to run the QEMU VM using:

Finally, triggering the bug requires enabling the TCP MSS filter so that the tcpmss_mangle_packet callback is executed for incoming SYN packets with the MSS option in the header.

Appendix B – PoC

I used Scapy to assemble the TCP packet in such a way to reach the loop I identified earlier.

The packet needs to have the SYN flag set and a Header Length value of 0. To produce that length value, I added an array of 41 NOP-type options to the TCP packet. Each NOP option is a single byte and the default header is 20 bytes, so 41 NOPs push the header past the 60-byte maximum that the 4-bit Header Length field can express, and the encoded value overflows.
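The overflow arithmetic can be checked with a small sketch. It assumes the packet builder pads the options to a 4-byte boundary and keeps only the low four bits of the resulting word count, which is consistent with the result observed here. The function name `encoded_data_offset` is hypothetical.

```python
def encoded_data_offset(options_len: int) -> int:
    """Header Length value as it would land in the 4-bit field (assumed behaviour)."""
    padded = (options_len + 3) // 4 * 4   # options padded to a 4-byte boundary
    words = (20 + padded) // 4            # header length in 32-bit words
    return words & 0xF                    # only four bits fit in the field

largest_valid = encoded_data_offset(40)   # 60-byte header, the legal maximum
overflowed = encoded_data_offset(41)      # one NOP too many
```

Under these assumptions, 40 bytes of options still encode as 15, while 41 bytes round up to a 64-byte header whose word count of 16 no longer fits in four bits.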

Sure enough, the Header Length is equal to 0, and Wireshark flags the value as erroneous:

In some test runs, this scenario caused a page fault as expected: