- SonarCloud discovered a critical Zip Slip vulnerability in OpenRefine.
- If a user running a vulnerable version is tricked into importing a malicious project, an attacker could execute arbitrary code on the user’s machine.
- SonarCloud not only discovered the vulnerability but also provides valuable guidance on how to mitigate this kind of vulnerability and prevent common pitfalls.
- The vulnerability was fixed with version 3.7.4.
OpenRefine is a Java-based open-source tool for cleaning and transforming data. It can load different types of data, clean it, convert it, and extend it, all from the browser through OpenRefine’s web interface. With almost 10k stars and ~1.8k forks, it is one of the more popular GitHub projects.
In our continuous effort to help secure open-source projects and improve our Clean Code solution, we regularly scan open-source projects via SonarCloud and evaluate the findings. Anyone can do the same: SonarCloud is a free code analysis product for open-source projects, regardless of their size or language.
One of the findings reported by SonarCloud was a Zip Slip vulnerability in OpenRefine that made us curious. A Zip Slip vulnerability is caused by inadequate path validation when extracting archives, which may allow attackers to overwrite existing files or extract files to unintended locations.
In this article, we outline the impact of this vulnerability and explain how this and other code vulnerabilities can be detected with SonarCloud. Furthermore, we explain how attackers could exploit the vulnerability and describe a typical pitfall developers may fall into when trying to fix it.
OpenRefine versions 3.7.3 and below are prone to a Zip Slip vulnerability in the project import feature (CVE-2023-37476). Although OpenRefine is designed to run only locally on a user's machine, an attacker can trick a user into importing a malicious project file. Once this file is imported, the attacker can execute arbitrary code on the user’s machine.
The vulnerability was fixed with OpenRefine version 3.7.4.
In this section, we dive into the technical details of the vulnerability.
SonarCloud is our cloud-based code analysis service. It uses state-of-the-art techniques in static code analysis to find quality issues, bugs, and security vulnerabilities in your code. With the recently added deeper SAST technology, it can even uncover hidden security vulnerabilities introduced by the usage of third-party dependencies.
During our regular scan of public open-source projects, the engine reported the following issue in OpenRefine (see it yourself on SonarCloud):
As the highlighted code flow shows, the `untar` method iterates over all files within an archive and uses the `tarEntry.getName()` method to create a new `File` object, which is then passed to `FileOutputStream` to extract this file. This introduces a Zip Slip vulnerability that allows an attacker to write files outside the intended folder (`destDir`) by creating an archive with a file whose name contains path traversal sequences such as `../`.

The `untar` method is called from the `FileProjectManager.importProject` method, which handles the import of existing Refine project files.
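To illustrate why an unchecked entry name is dangerous, here is a minimal sketch of the path resolution step only. It is not OpenRefine's code: the class `UncheckedEntryDemo`, the helper `resolveUnchecked`, the folder `/tmp/project-import`, and the entry name `../outside.txt` are all hypothetical and chosen for illustration.

```java
import java.io.File;
import java.nio.file.Path;

class UncheckedEntryDemo {
    // Mirrors the vulnerable pattern: the archive entry name is trusted as-is.
    static File resolveUnchecked(File destDir, String entryName) {
        return new File(destDir, entryName);
    }

    public static void main(String[] args) {
        // Hypothetical extraction folder and malicious entry name.
        File destDir = new File("/tmp/project-import");
        File target = resolveUnchecked(destDir, "../outside.txt");

        // normalize() collapses the ".." lexically, without touching the filesystem.
        Path effective = target.toPath().normalize();
        System.out.println("Write would land at: " + effective);
        System.out.println("Inside destDir: " + effective.startsWith(destDir.toPath()));
    }
}
```

Passing such a `File` to `FileOutputStream` without any further check would create the file next to the extraction folder rather than inside it.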
Projects can be imported through the web interface either by directly uploading an archive or by providing the URL of an archive.
The corresponding endpoint is called
The vulnerability gives attackers a strong primitive: writing files with arbitrary content to an arbitrary location on the filesystem. For applications running with `root` privileges, there are dozens of possibilities to turn this into arbitrary code execution on the operating system: adding a new user to the `passwd` file, adding an SSH key, creating a cron job, and more. For applications running with the permissions of a low-privilege user, the opportunities are more limited but still exist. Earlier this year, we documented a unique way to achieve code execution by writing a site-specific configuration hook, which is limited to Python applications.
Besides these generic techniques, there might be features of the application itself that attackers could leverage. In the case of OpenRefine, the application implements an auto-reload feature, which regularly scans the `WEB-INF` folder for changes and restarts the `WebAppContext` when a file is changed.
All classes within the `WEB-INF/classes` folder are reloaded during the restart of the `WebAppContext`. This means that attackers could overwrite an existing `.class` file within this folder, which triggers the reload and subsequently executes the attacker's `.class` file, resulting in the ability to execute arbitrary code.
To mitigate this vulnerability, all files must be extracted under the intended base folder. One way you might think of doing this is to use the `getCanonicalPath` method to retrieve the absolute and unique path as a String and then use the `startsWith` method to verify that the destination path is part of the intended base folder.
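A minimal sketch of such a check could look like the following. The class name `NaiveContainmentCheck`, the method `isInsideNaive`, and the folder and file names are hypothetical and only serve as an illustration; this is not the code from OpenRefine.

```java
import java.io.File;
import java.io.IOException;

class NaiveContainmentCheck {
    // Compares the canonical paths of base folder and target as plain strings.
    static boolean isInsideNaive(File destDir, File target) throws IOException {
        String basePath = destDir.getCanonicalPath();
        String targetPath = target.getCanonicalPath();
        return targetPath.startsWith(basePath);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical base folder below the system temp directory.
        File destDir = new File(System.getProperty("java.io.tmpdir"), "refine-import");

        // A regular entry and an entry that climbs out of the base folder.
        System.out.println(isInsideNaive(destDir, new File(destDir, "data/project.txt")));
        System.out.println(isInsideNaive(destDir, new File(destDir, "../escape.txt")));
    }
}
```

At first glance this looks sufficient: the regular entry is accepted and the `../escape.txt` entry is rejected.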
Caution: This does not fully fix the vulnerability! Can you spot the problem here?
The `getCanonicalPath` method removes trailing path separators, which leaves this check vulnerable to a partial path traversal!
Assuming the base folder (`destDir`) is defined as the home directory of the user john (`"/home/john/"`), the trailing slash is removed, resulting in `"/home/john"`. This means that attackers could still perform a partial path traversal into another user’s home directory that begins with the same characters, e.g., `"/home/johnny/"`, since this passes the check.
We continuously keep track of freshly unveiled pitfalls like this and add them to our engine. To correctly fix a vulnerability, you can click on the "How can I fix it?" tab directly attached to the corresponding issue on SonarCloud.
In order to prevent this partial path traversal, there are two different approaches:
- Reinsert the path separator for the base folder after calling
- Retrieve the `Path` object related to the `File` and use its `startsWith` method. This does not compare the paths as strings but element by element.
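Both approaches can be sketched side by side with the flawed string comparison, using the `/home/john/` example from above. This is a sketch under stated assumptions, not the patch applied in OpenRefine: the class `ContainmentChecks`, its three methods, and the file name `notes.txt` are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Path;

class ContainmentChecks {
    // Flawed: plain string prefix comparison of the canonical paths.
    static boolean isInsideNaive(File destDir, File target) throws IOException {
        return target.getCanonicalPath().startsWith(destDir.getCanonicalPath());
    }

    // Approach 1: re-append the separator to the base folder before comparing strings.
    // Note: the base folder itself is not accepted, only paths below it.
    static boolean isInsideWithSeparator(File destDir, File target) throws IOException {
        String basePath = destDir.getCanonicalPath() + File.separator;
        return target.getCanonicalPath().startsWith(basePath);
    }

    // Approach 2: Path.startsWith compares whole path elements, not characters.
    static boolean isInsideWithPath(File destDir, File target) throws IOException {
        Path basePath = destDir.getCanonicalFile().toPath();
        return target.getCanonicalFile().toPath().startsWith(basePath);
    }

    public static void main(String[] args) throws IOException {
        File destDir = new File("/home/john/");
        // Hypothetical entry that targets a sibling folder with the same prefix.
        File sibling = new File(destDir, "../johnny/notes.txt");

        System.out.println("naive:     " + isInsideNaive(destDir, sibling));
        System.out.println("separator: " + isInsideWithSeparator(destDir, sibling));
        System.out.println("path:      " + isInsideWithPath(destDir, sibling));
    }
}
```

With these inputs, the naive check should accept the sibling path while both corrected checks reject it. The `Path`-based variant is usually the easier one to get right, since it leaves no separator handling to the caller.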
For OpenRefine, the maintainers avoided falling into this trap. They correctly fixed the vulnerability by leveraging the
This effectively prevents files from being written outside the intended
| Date | Action |
|---|---|
| 2023-07-07 | We report the issue to the maintainers |
| 2023-07-08 | Maintainers confirm the issue and start working on a patch |
| 2023-07-17 | OpenRefine Version 3.7.4 is released, which fixes the issue |
| 2023-07-17 | CVE-2023-37476 is assigned |
In this article, we took a deep dive into a critical Zip Slip vulnerability in OpenRefine. We also outlined how attackers can leverage an application’s features to turn a file write into arbitrary code execution. Furthermore, we highlighted common pitfalls developers may face when trying to fix this path traversal vulnerability.
With the help of SonarCloud, this vulnerability was not only detected in a matter of seconds, it could also be fixed properly by relying on the comprehensive information SonarCloud provides for each raised issue. This applies to security issues, but also code quality problems, which helps developers to write Clean Code, increasing security, maintainability, and reliability.
Finally, we would like to thank the OpenRefine maintainers for quickly responding to our notification, providing a comprehensive patch, and transparently informing all users.