Busting the Myths on Binary Analysis vs. Source Code Analysis

Despite being around for years (maybe even decades), the practice of automated code reviews to identify security vulnerabilities and other flaws still leaves product security professionals with many misconceptions. Source code analysis provides complete coverage, some say. Binary analysis is inaccurate, others cry.

These beliefs are misleading, if not entirely wrong, yet they persist. It’s time to dispel these myths once and for all.

To help clarify the truth and guide you in making informed decisions about security analysis for embedded software, I have compiled and debunked some of the most common myths on this topic.

What You’ll Learn

  • Differences Between Source Code and Binary Analysis: Understanding the fundamental differences between these two methods of software security analysis and their respective roles in identifying vulnerabilities.
  • Common Myths Debunked: Clarity on prevalent misconceptions about source code analysis and binary analysis, and why these beliefs are misleading.
  • Limitations of Source Code Analysis: Insight into the inherent limitations of source code analysis, including issues with false positives and missed vulnerabilities.
  • Advantages of Binary Analysis: How binary analysis can provide a more comprehensive security assessment by considering post-compilation configurations and actual operational code.

Myth #1 – Source Code Analysis Detects All Software Vulnerabilities

I presume the origin of this myth is the misconception that binary code in the embedded software world is a simple transformation of source code into machine-readable bits, so analysis of your source code for software vulnerabilities will detect most if not all security issues.

But that is far from the truth.

Let’s briefly explore how source code turns into a functional application in the product we want to secure, say an IVI – an in-vehicle infotainment system (though it could just as well be a connected medical device or a smart energy meter).

The source code powering the IVI is a text version of a computer program that contains instructions for the machine to follow – play music, locate the vehicle on the navigation app and so on. It is written in a programming language which a human can read and change (for example C, C++, Java).

To enable the vehicle’s hardware/on-board computer to execute the program, the source code is compiled – a process that generates the machine code that computers can interpret. That’s the binary executable code of the application.

But that’s not the end of it. To get your infotainment system working, you need way more than a string of 0’s and 1’s representing your own code:

  • The operating system (e.g., Linux, Android) must be added as part of the final functional binary software
  • Software drivers are added to enable the IVI to communicate with the network or with the GPS
  • You may have additional scripts and frameworks added post compilation

And there’s more – many software components are added (and need configuration) post-compilation.

If you use source code scanning (or SAST – static application security testing), you’ll miss vulnerabilities in the operating system and additional software “bundled” with your own code. As a result, you’ll effectively end up with ‘false negatives’ – a type of fool’s gold that fails to give you what you really need – comprehensive software security analysis.
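To make the coverage gap concrete, here is a minimal sketch that lists which parts of a firmware image a source-only scan never sees. The component names and the `has_source` flag are illustrative assumptions, not an inventory of any real IVI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    origin: str       # e.g. "first_party", "os", "driver", "post_compilation"
    has_source: bool  # True only if your SAST tool actually scans this code


def coverage_gap(components):
    """Return the names of components a source-only scan cannot see."""
    return [c.name for c in components if not c.has_source]


# Hypothetical contents of an IVI firmware image.
firmware = [
    Component("ivi_app", "first_party", True),
    Component("linux_kernel", "os", False),
    Component("gps_driver", "driver", False),
    Component("startup_scripts", "post_compilation", False),
]

print(coverage_gap(firmware))
```

In this made-up inventory only one of four components is covered by source scanning; everything else reaches the product unscanned.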

In contrast, binary software composition analysis (SCA) deals with all aspects of the software that makes up your product, identifying vulnerabilities and security risks that may impact device operation and your customers.

Pro-Tip:

When integrating binary analysis into your security practices, consider setting up continuous monitoring systems that track changes in binary configurations and dependencies over time. This proactive approach helps in identifying new vulnerabilities that may arise due to updates or changes in the software environment, enhancing overall product security and resilience.
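One simple way to start tracking such changes is to fingerprint every file in an unpacked firmware image and compare two builds. The sketch below assumes the images have already been extracted to directories; it only flags what changed, so a real vulnerability lookup would still be needed for the flagged files.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def diff_snapshots(old, new):
    """Report files added, removed, or changed between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

Calling `diff_snapshots(snapshot("firmware_v1"), snapshot("firmware_v2"))` (both directory names are hypothetical) gives a short list of files worth re-analyzing after an update.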

Myth #2 – Source Code Analysis is Foolproof

I guess what stands behind this myth is the assumption that access to a program’s source code provides the best visibility into how code performs in real life and thus the analysis would be accurate. But that’s not the case.

First of all, let’s be honest – neither source code analysis nor binary analysis is perfect.

But this myth masks the fact that source code analysis is often plagued with false positives that can consume valuable product security resources.

Why is that? Applications rarely use all of the source code in a project; they use only the pieces that provide the desired functionality (the unused code is known as “dead code”). The compiler and linker, on the other hand, typically build the binary from the code you actually use and leave much of the rest out.

Here is the catch – you may have pieces of source code with vulnerabilities that aren’t necessarily used in your product. Source code scanners would flag those, and your product security team would spend precious time analyzing them even though they are completely irrelevant to your product.

These are “false positives”.

By contrast, binary analysis looks at the actual code used in your software, yielding a lower false-positive rate while helping you focus on vulnerabilities that are truly relevant to your product.
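As a rough illustration of the idea, the sketch below splits source-scanner findings by whether the flagged function shows up as a defined code symbol in the binary. It assumes you have symbol output in the default format of the `nm` tool (address, type letter, name) and that findings carry a `function` field; both the sample output and the findings are made up. Treat the "absent" list as candidates for deprioritization rather than proof, since inlined functions and stripped binaries leave no symbol behind.

```python
def defined_functions(nm_output):
    """Collect names of code symbols (type T or t) from nm-style text."""
    symbols = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three fields: address, type, name.
        if len(parts) == 3 and parts[1] in ("T", "t"):
            symbols.add(parts[2])
    return symbols


def triage(findings, linked):
    """Split findings into those present in the binary and those absent."""
    present = [f for f in findings if f["function"] in linked]
    absent = [f for f in findings if f["function"] not in linked]
    return present, absent


# Hypothetical symbol listing and scanner findings.
sample_nm = """\
0000000000001139 T main
0000000000001180 T play_track
                 U memcpy
0000000000004010 B volume
"""
findings = [
    {"function": "play_track", "issue": "unchecked buffer length"},
    {"function": "legacy_parse_playlist", "issue": "format string"},
]

present, absent = triage(findings, defined_functions(sample_nm))
print([f["function"] for f in present])
print([f["function"] for f in absent])
```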

[Figure: False positives and negatives in source code scanning]

Myth #3 – Source Code Analysis is All You Need to Secure Production Code

Like myth #1, this one likely stems from the belief that binary code in the embedded software world is a simple transformation of source code into bits that machines can understand.

When it comes to binaries, however, the issue is not just the code that gets added to or bundled with them, but also the configuration that happens post-compilation. The binary isn’t typically used as is; it is configured extensively for deployment. For example:

  • Instead of generic cryptographic keys used during the development phase, you have to use customer-specific “production” keys
  • The OS is configured to enable/disable certain services
  • The OS-level firewall rules are configured
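None of these settings live in your source code, so they have to be checked on the deployed image. Here is a minimal sketch of such a check. The configuration keys, the development key identifier, and the allowed service list are all hypothetical; a real product would pull them from its own image format and security policy.

```python
# Hypothetical policy values, for illustration only.
DEV_KEY_IDS = {"dev-key-0001"}
ALLOWED_SERVICES = {"navigation", "media"}


def audit_deployment(config):
    """Return a list of policy violations found in a deployment config."""
    issues = []
    if config.get("signing_key_id") in DEV_KEY_IDS:
        issues.append("development key used in production image")
    unexpected = set(config.get("enabled_services", [])) - ALLOWED_SERVICES
    for service in sorted(unexpected):
        issues.append(f"unexpected service enabled: {service}")
    if config.get("firewall_default_policy") != "drop":
        issues.append("firewall default policy is not 'drop'")
    return issues


# Hypothetical configuration extracted from a firmware image.
image_config = {
    "signing_key_id": "dev-key-0001",
    "enabled_services": ["navigation", "media", "telnet"],
    "firewall_default_policy": "accept",
}

for issue in audit_deployment(image_config):
    print(issue)
```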

Binary Analysis Can Detect Compiler Issues

It’s not simply the compilation of the source code that gets you to the final (binary) product. There are compiler settings (e.g., whether code hardening options are enabled), the linker that combines 3rd party libraries, the whole DevSecOps flow, packaging of firmware, deployment to the user and more.

All of these steps alter the binary and may introduce cybersecurity risks such as violations of internal security policies or of certain cybersecurity standards and regulations.

Unlike source code analysis, binary analysis can detect such security issues and help your team fix them in time.
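Hardening is a good example of something visible only in the binary. The sketch below reads two fields straight from an ELF file: the file type (position-independent executables, like shared libraries, are marked `ET_DYN`) and the flags on the `PT_GNU_STACK` program header (an execute bit there requests an executable stack). It is a simplified check under those assumptions, not a full hardening audit, and it only handles ELF files.

```python
import struct

ET_DYN = 3                 # file type used by PIE executables and shared libraries
PT_GNU_STACK = 0x6474E551  # program header describing stack permissions
PF_X = 0x1                 # "executable" permission bit


def elf_hardening(data):
    """Inspect raw ELF bytes for two basic hardening properties."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = data[4] == 2
    endian = "<" if data[5] == 1 else ">"
    e_type = struct.unpack_from(endian + "H", data, 16)[0]
    if is_64bit:
        e_phoff = struct.unpack_from(endian + "Q", data, 32)[0]
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 54)
        flags_offset = 4   # p_flags follows p_type in 64-bit program headers
    else:
        e_phoff = struct.unpack_from(endian + "I", data, 28)[0]
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 42)
        flags_offset = 24  # p_flags comes after p_memsz in 32-bit program headers
    executable_stack = None  # stays None when no PT_GNU_STACK header exists
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        p_type = struct.unpack_from(endian + "I", data, base)[0]
        if p_type == PT_GNU_STACK:
            p_flags = struct.unpack_from(endian + "I", data, base + flags_offset)[0]
            executable_stack = bool(p_flags & PF_X)
    return {
        "position_independent": e_type == ET_DYN,
        "executable_stack": executable_stack,
    }
```

You would call it on the bytes of a binary pulled from the firmware, for example `elf_hardening(open(path, "rb").read())`, where `path` points at any ELF file in the unpacked image.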

[Figure: Source-to-binary flow]

Why Binary Analysis is Essential for Product Cyber Resilience

Comparing source code analysis to binary analysis is a bit like checking the nutritional value of the ingredients of an uncooked meal against those of a cooked one.

The nutrients in your food are not what they were when you first got them. From vitamins to calcium to potassium, cooking improves absorption of some nutrients, but also changes the levels of the vitamins and minerals in the final product. Cooking oils and spices add much-needed flavoring but also contribute to the caloric value of a meal.

Source code analysis is a valuable method for detecting certain vulnerabilities during the development process. There is no arguing about that.

But source code analysis (SAST) is not feasible in many cases because the source code is simply not available (as is often the case in industries that rely on a complex software supply chain, such as automotive, medical and IIoT). And even in situations where your product’s source code is available, binary analysis is essential for enhancing your product’s cyber resilience.

Key Takeaways

  • Incomplete Coverage of Source Code Analysis: Source code analysis alone cannot detect all software vulnerabilities, especially those introduced by additional components and configurations added post-compilation.
  • False Positives in Source Code Analysis: Source code analysis can yield a high rate of false positives, wasting valuable security resources on irrelevant issues.
  • Essential Role of Binary Analysis: Binary analysis offers a more accurate view of software vulnerabilities by analyzing the final executable code, including all integrated components and configurations.
  • Enhanced Cyber Resilience: Combining both source code and binary analysis is crucial for achieving robust product security, as each method addresses different aspects of the software lifecycle.
  • Practical Application: For industries reliant on complex software supply chains, such as automotive, medical, and IIoT, binary analysis is indispensable for ensuring comprehensive cyber resilience.