
Forensic Code Review: What Courts Need to See in Software IP Disputes

How forensic source code analysis works in litigation, what it can reveal about authorship and copying, and what makes code evidence defensible in court.

Software IP, Source Code, Forensic Analysis, Expert Witness

Software intellectual property disputes frequently turn on what the source code itself reveals. Whether the allegation is copyright infringement, misappropriation of trade secrets, or breach of a licensing agreement, the code is the primary evidence, and analysing it requires a structured forensic approach that will withstand judicial scrutiny.

Having acted as expert in a number of source code disputes across a range of industries and technologies, I set out below what forensic code review involves in practice, and what solicitors should be aware of when instructing an expert in this area.

What forensic code review is

Forensic code review is the systematic examination of source code, compiled binaries, or both, to answer specific questions relevant to a dispute. Those questions typically fall into one of several categories:

  • Authorship and copying: Was code written independently, or was it copied or derived from another codebase? Are there patterns of similarity that go beyond what would be expected from common programming practices?
  • Ownership and provenance: Who wrote the code, when was it written, and how did it evolve? Can version control history be relied upon, or has it been manipulated?
  • Quality and compliance: Does the code meet the standards set out in a specification or contract? Are there defects, security vulnerabilities, or departures from industry best practice?
  • Functionality: What does the code actually do? Is there hidden or malicious functionality, such as logic bombs, backdoors, or undisclosed data collection?

The methodology depends on the question. A copying analysis requires different tools and techniques from a quality assessment, and both differ from a functional analysis of what a system does at runtime.

How similarity is assessed

One of the most common questions in software IP disputes is whether one codebase was copied from another. This is rarely as simple as it sounds. Modern software development involves extensive use of open-source libraries, common frameworks, and well-established design patterns. Two independent developers solving the same problem will often produce structurally similar code, simply because there are a limited number of sensible ways to approach certain tasks.

Forensic code review distinguishes between these expected similarities and those that indicate copying. The analysis typically examines multiple layers:

  • Textual similarity: Direct comparison of source code text, including variable names, comments, formatting, and whitespace patterns. Identical or near-identical passages, particularly where they include unusual naming conventions, idiosyncratic comments, or the same typographical errors, are strong indicators of copying.
  • Structural similarity: Comparison of code architecture, function organisation, data flow, and control logic. Two programs may use different variable names and syntax but share an identical structure that is unlikely to have arisen independently.
  • Non-literal similarity: Assessment of higher-level design choices, including the selection and arrangement of algorithms, the structure of APIs, and the organisation of modules. This is the most contested area in software IP law, and the expert’s role is to identify what is technically significant and distinguish it from what is dictated by external factors or common practice.
  • Anomalous artefacts: Residual traces of copying that the alleged infringer may not have removed, such as debugging statements, configuration files referencing the original project, or metadata embedded in the code that points to a different author or organisation.

No single indicator is conclusive. The strength of a forensic code review lies in the accumulation of evidence across these layers, forming a coherent picture that a judge can evaluate.
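To make the distinction between the textual and structural layers concrete, the following is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any particular forensic tool: the function names and the three sample snippets are hypothetical, and reducing a program to a sequence of syntax-tree node types is only one simple way of approximating "structure". The point it shows is that renaming variables and deleting comments defeats a line-by-line text comparison but leaves the structural comparison untouched.

```python
import ast
import difflib


def textual_similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of matching non-blank lines, ignoring indentation."""
    lines_a = [ln.strip() for ln in a.splitlines() if ln.strip()]
    lines_b = [ln.strip() for ln in b.splitlines() if ln.strip()]
    return difflib.SequenceMatcher(None, lines_a, lines_b).ratio()


def structural_shape(source: str) -> list[str]:
    """Sequence of syntax-tree node type names.

    Identifiers, literal values and comments do not appear in this
    sequence, so it reflects only the shape of the code.
    """
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def structural_similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] comparing the two node-type sequences."""
    return difflib.SequenceMatcher(
        None, structural_shape(a), structural_shape(b)
    ).ratio()


# Hypothetical samples: RENAMED is ORIGINAL with every name changed and
# the comment removed; UNRELATED does something different altogether.
ORIGINAL = '''
def total_price(items, tax_rate):
    # sum up the basket
    subtotal = 0
    for item in items:
        subtotal += item.price * item.qty
    return subtotal * (1 + tax_rate)
'''

RENAMED = '''
def compute(rows, r):
    acc = 0
    for row in rows:
        acc += row.cost * row.count
    return acc * (1 + r)
'''

UNRELATED = '''
def greet(name):
    if name:
        print("hello", name)
'''

print("textual   ORIGINAL vs RENAMED:  ", textual_similarity(ORIGINAL, RENAMED))
print("structural ORIGINAL vs RENAMED: ", structural_similarity(ORIGINAL, RENAMED))
print("structural ORIGINAL vs UNRELATED:", structural_similarity(ORIGINAL, UNRELATED))
```

A high structural score on its own proves little, for the reasons given above: short, conventional functions will often share a shape. A score of this kind is a pointer to passages that merit closer manual examination, not a finding in itself.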

Version control as evidence

Most modern software is developed using version control systems such as Git, Subversion, or Mercurial. These systems record each committed change to a codebase: who made it, when, and what was altered. Version control history can be powerful evidence of authorship, development timelines, and the provenance of specific features.

However, version control metadata is not tamper-proof. Commit timestamps can be altered. Entire histories can be rewritten. The integrity of a repository’s commit history can itself become a central issue in a dispute, particularly where there are questions about whether metadata has been manipulated to misrepresent the timeline of development.

A forensic code review must therefore assess not only what the version control history shows, but whether it can be relied upon. This involves examining commit patterns, cross-referencing timestamps with external evidence, and identifying anomalies that suggest manipulation.
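One simple class of anomaly can be sketched in code. The example below assumes that commit metadata has already been exported from a Git repository into plain records, for instance with `git log --format='%H|%P|%at|%ct'`, which prints each commit's hash, parent hashes, author timestamp and committer timestamp. The `Commit` class, the function name and the sample data are hypothetical, and the two checks shown are illustrative rather than exhaustive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    parents: tuple[str, ...]
    author_time: int     # Unix seconds: when the change was originally authored
    committer_time: int  # Unix seconds: when this commit object was created


def find_timestamp_anomalies(commits: list[Commit]) -> list[tuple[str, str]]:
    """Return (sha, reason) pairs for commits with inconsistent timestamps."""
    by_sha = {c.sha: c for c in commits}
    findings: list[tuple[str, str]] = []
    for c in commits:
        # A change is normally authored before, or at, the time it is committed.
        if c.author_time > c.committer_time:
            findings.append((c.sha, "author date is later than committer date"))
        # A commit is normally created after the parent it builds upon.
        for parent_sha in c.parents:
            parent = by_sha.get(parent_sha)
            if parent is not None and c.committer_time < parent.committer_time:
                findings.append((c.sha, f"committed before its parent {parent_sha}"))
    return findings


# Hypothetical history, oldest first.
history = [
    Commit("a1", (), 1000, 1000),
    Commit("b2", ("a1",), 2000, 2000),
    Commit("c3", ("b2",), 1500, 1500),  # earlier than its parent b2
    Commit("d4", ("c3",), 4000, 3000),  # authored after it was committed
]

for sha, reason in find_timestamp_anomalies(history):
    print(sha, "-", reason)
```

A flagged commit is a prompt for further enquiry, not proof of manipulation. Incorrect system clocks, rebasing and other routine operations can produce the same patterns, which is why the timestamps need to be cross-referenced against external evidence.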

Decompilation and binary analysis

In some disputes, the source code is not available, either because it has been withheld or destroyed, or because the dispute concerns a compiled binary or firmware. In these cases, the expert may need to decompile or reverse-engineer the software to understand its functionality.

Decompilation converts compiled machine code back into a higher-level representation that can be analysed. It does not perfectly reconstruct the original source code (variable names and comments are typically lost during compilation) but it can reveal the logic, structure, and behaviour of the software. This approach can be applied to a range of software types, from embedded firmware to mobile applications, where the original source code is contested or unavailable.

This type of analysis requires care. The legal framework governing decompilation varies by jurisdiction, and the expert must work within the boundaries set by the instructing team. The analysis itself must be reproducible, meaning that another competent expert, given the same binary and the same tools, should reach the same technical conclusions.
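A much lighter step than decompilation, and one that is easy for another expert to repeat, is to extract the printable text embedded in a binary. The sketch below is not decompilation and reveals nothing about program logic; it simply lists runs of printable ASCII characters, which is where residual file paths, debugging messages and project names tend to surface. The function name, the minimum length of six characters, the sample bytes and the project name "acme-billing" are all hypothetical.

```python
import re


def printable_strings(blob: bytes, min_len: int = 6) -> list[str]:
    """Return runs of at least min_len printable ASCII characters, in order."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(blob)]


# Hypothetical fragment of a compiled binary: non-printable bytes
# interleaved with a leftover source path and a debugging message.
sample = (
    b"\x00\x01\x7fELF\x02"
    b"/home/alice/acme-billing/src/invoice.c\x00"
    b"\x90\x90"
    b"DEBUG: rate=%d\x00"
    b"ab\x00"
)

found = printable_strings(sample)
for text in found:
    print(text)

# Narrow the list to references to the other party's (hypothetical) project.
references = [text for text in found if "acme-billing" in text]
print(references)
```

Whether a given string is significant is a matter for analysis. A path naming another organisation's project calls for an explanation, whereas text originating from a shared library would not.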

What makes code evidence defensible

A forensic code review is only as useful as the report that communicates it. In my experience, the following principles bear on whether the evidence will withstand scrutiny:

Reproducibility. Every step of the analysis, including the tools used, the versions of code examined, and the comparison methodology, must be documented clearly enough that an opposing expert can repeat the process and verify or challenge the results.

Proportionality. Software codebases can be vast. A commercial platform might contain millions of lines of code. The review must be proportionate, focused on the specific modules, files, or functions relevant to the dispute, rather than attempting to analyse everything.

Context. Code does not exist in a vacuum. The expert must explain what the code does in terms the court can understand, placing technical findings in the context of the allegations. A finding that two functions share a high degree of textual similarity is meaningless without explaining what those functions do, whether such similarity is unusual, and what alternative explanations exist.

Objectivity. The expert’s duty is to the court. If the forensic evidence does not support the instructing party’s position, the expert must say so. In software IP disputes, it is not uncommon for an initial review to reveal that alleged copying is in fact the use of a shared open-source library, or that similarities reflect industry-standard design patterns rather than misappropriation. A credible expert reports what the evidence shows, regardless of which party instructed them.
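On reproducibility in particular, one practical aid is a manifest that records a cryptographic hash of every file examined, together with the tools used, so that an opposing expert can confirm they are working from identical material. The following is a minimal sketch, assuming the material under review sits in a single directory; the function names, the manifest layout and the sample files are hypothetical, and a real report would list every tool and version actually used rather than only the Python interpreter.

```python
import hashlib
import json
import platform
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """SHA-256 of a file's contents, read in chunks to cope with large files."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path, tools: dict[str, str]) -> dict:
    """Map each file under root (by relative path) to its SHA-256 hash."""
    files = {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
    return {"tools": tools, "files": files}


# Demonstration on a throwaway directory containing two hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "invoice.py").write_bytes(b"print('invoice')\n")
    (root / "README.txt").write_bytes(b"exhibit copy\n")
    manifest = build_manifest(root, tools={"python": platform.python_version()})

print(json.dumps(manifest, indent=2))
```

If a single byte of any file differs between the two experts' copies, the corresponding hash will differ, which makes discrepancies in the underlying material visible before any argument about the conclusions begins.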

When to instruct

Software IP disputes benefit from early expert involvement. Code repositories can be altered or deleted. Development environments change. Employees leave and take knowledge with them. The earlier forensic analysis begins, the more complete the picture is likely to be.

Where solicitors are advising on a potential software IP claim, or defending against one, early engagement with a technology expert can assist in assessing the technical merits before proceedings are issued, identifying what evidence needs to be preserved, and framing the technical questions that will shape the case.