Code Interoperability: The Hazards of Technological Variety

Stefan Schiller

Vulnerability Researcher

12 min read

  • Security

Key Information

  • In April 2023, the Sonar Research team discovered and reported two critical vulnerabilities in Apache Guacamole.
  • Apache Guacamole is a popular remote desktop gateway commonly used in enterprise environments to access hosts and isolated applications from a web browser.
  • The vulnerabilities, tracked as CVE-2023-30575 and CVE-2023-30576, would have allowed low-privileged users to gain remote code execution (RCE) on the Guacamole server by attacking the external web interface.
  • Attackers could leverage this access to spy on every connection, harvest sensitive credentials, and pivot to an organization’s internal network.
  • Thanks to our report, the Guacamole maintainers fixed the vulnerabilities in May 2023 with version 1.5.2, and there were no signs of in-the-wild exploitation.


Can you think of even one project that does not use several programming languages, protocols, or communication standards? Today’s variety of technologies introduces a significant challenge when it comes to interoperability. If two different software components interact with each other but disagree about certain specifics of their communication protocol, this may introduce vulnerabilities known as parser differentials.


This two-part blog series dives into two critical vulnerabilities in the remote desktop gateway Apache Guacamole, which allows users to access remote machines via a web browser. The Guacamole gateway is usually the only externally accessible instance, granting access to remote machines isolated in an organization’s internal network.


This first article will explain how Guacamole’s architecture connects a Java component with a C backend server, which introduces the aforementioned challenge of interoperability. We will determine how Java’s internal processing of Unicode strings can lead to unexpected behavior, which results in a severe vulnerability an attacker can exploit.


In the second article, we will see that the requirement of high parallelism to serve and share hundreds of connections at the same time makes an application like Guacamole also prone to concurrency issues. We will dive into the world of glibc heap exploitation and explain how attackers could ultimately gain remote code execution.

We also presented the content of this blog post series at Hexacon23. A recording of the talk can be found here: YouTube: HEXACON2023 - An Avocado Nightmare by Stefan Schiller.


Apache Guacamole 1.5.1 and below incorrectly calculates the length of instructions sent during the Guacamole protocol handshake, which allows attackers to inject Guacamole instructions during the handshake (CVE-2023-30575).

Furthermore, Apache Guacamole 0.9.10 through 1.5.1 continues to reference a freed RDP audio input buffer, leading to a Use-After-Free vulnerability (CVE-2023-30576).

Both vulnerabilities can be combined by a low-privileged user with access to an RDP connection to gain remote code execution on the Guacamole server. This access could be used to spy on every connection, harvest sensitive credentials, and pivot to an organization’s internal network:

The vulnerabilities have been fixed with Apache Guacamole version 1.5.2.

Technical Details

In this section, we briefly describe how Apache Guacamole works under the hood and then dive into the first vulnerability, which is a parser differential between Guacamole’s Java and C components. We explore the root cause of this vulnerability and outline how an attacker could exploit it.

Apache Guacamole Architecture

From a user’s perspective, Guacamole is very simple:

  • You access the external web interface,
  • You enter your credentials, and
  • You are automatically connected to your configured internal host.

You can access this host like a virtual machine fully from the browser:

Because of the simplicity of this solution, it is utilized for many different use cases in enterprise environments:

  • It can be used by remote employees to access computers in their company.
  • It can be used for bring-your-own-device (BYOD) deployments to access company resources safely from a personal device.
  • It is used in popular browser isolation solutions.
  • It can be used for server administration.
  • And it can also be integrated with cloud platforms.

Behind the scenes of this handy solution are two different components: the Guacamole Client and the Guacamole Server:

The Guacamole Client is the externally exposed component that communicates with the browser. It is written in Java and provides the web interface. It serves the required client-side JavaScript code, is responsible for user authentication, and provides a WebSocket endpoint. Once a connection is established, it basically passes all communication through to the Guacamole Server.

The Guacamole Server is written in C and is usually not externally exposed. In a default setup, it runs on the same machine as the Guacamole Client and listens on localhost only. This component is responsible for making the specific remote connection to the internal hosts via RDP, SSH, or VNC.

Guacamole Protocol

All communication between both components is done via the custom Guacamole Protocol. This protocol is a generic remote desktop protocol and abstraction of the specific protocols RDP, VNC, and SSH. Due to the abstraction, the JavaScript client-side code in the browser does not need to care about these particular protocols and only needs to support the Guacamole Protocol. Like for any remote desktop protocol, the user input is, for example, a keyboard stroke or a mouse movement, and the output is the screen display of the internal host.

The whole communication is based on single Guacamole Instructions. An example of such an instruction to move the mouse looks like this:

It consists of three elements in this case, which are comma-separated and terminated by a semicolon. The first element is the opcode, and all the following elements are arguments to this opcode. Each element consists of a decimal integer LENGTH, a separating period, and the actual VALUE, whose length is given by that integer.
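To make the element layout concrete, here is a minimal sketch of an encoder that follows the format described above. The class name, method names, and the coordinates 10 and 20 are assumptions chosen for illustration; this is not Guacamole's own code. It counts Unicode code points for the LENGTH field, which is what the protocol specification asks for.

```java
final class GuacInstructionSketch {
    // Encodes one element as LENGTH.VALUE, where LENGTH is the number
    // of Unicode code points in VALUE.
    static String element(String value) {
        int length = value.codePointCount(0, value.length());
        return length + "." + value;
    }

    // Joins the opcode and its arguments with commas and terminates
    // the instruction with a semicolon.
    static String encode(String opcode, String... args) {
        StringBuilder sb = new StringBuilder(element(opcode));
        for (String arg : args) {
            sb.append(',').append(element(arg));
        }
        return sb.append(';').toString();
    }

    public static void main(String[] argv) {
        // Hypothetical coordinates, used only as an example.
        System.out.println(encode("mouse", "10", "20")); // prints 5.mouse,2.10,2.20;
    }
}
```

The opcode "mouse" has five characters, so its element becomes 5.mouse, and each two-digit coordinate becomes an element with LENGTH 2.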

In the case of the above example, the instruction is sent by the client to set the mouse position to the provided x and y coordinates. In order to send this instruction, the connection to the internal host must be established first, of course. For this purpose, the Guacamole Client and Server perform a Handshake. During this phase, the client tells the server how to set up the connection to the internal host:

At first, the client sends a select instruction to inform the server which remote protocol to use, in this case RDP. Then, the client sends a few other instructions, followed by an image instruction, which tells the server which image types the user’s browser supports. At the very end of the handshake, the client sends a connect instruction. This instruction contains all the information required to make the RDP connection, for example:

  • The IP address of the internal host
  • The port of the RDP service, and
  • The credentials to use for the RDP connection.

Once the server receives this instruction, it establishes the RDP connection to the internal host.

Attack Surface

From an attacker’s point of view, we can already make some interesting observations. The first observation is related to the handshake: The supported image types sent from the client to the server are taken from this query parameter sent by the browser:

This is the only value a low-privileged user can influence during the Handshake. All other values are populated from the configuration in the database. Another thing that we noticed is this excerpt from the protocol specification:

Each element of the list has a positive decimal integer length prefix separated by the value of the element by a period. This length denotes the number of Unicode characters in the value of the element, which is encoded in UTF-8.

It states that the LENGTH field of a Guacamole instruction is not a byte length. Instead, it denotes the number of UTF-8 encoded Unicode characters. From an attacker’s point of view, this promises encoding issues, which have already proven to be security-relevant in the past.
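The difference between a byte length and a character count is easy to see in a few lines. The following is a minimal sketch, assuming well-formed UTF-8 input; the class and method names are made up for this example. It counts characters the way a byte-oriented parser can: every byte that is not a continuation byte (bit pattern 10xxxxxx) starts a new character.

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthSketch {
    // Counts Unicode characters in UTF-8 data by counting only the
    // bytes that are NOT continuation bytes (10xxxxxx).
    // Assumes the input is well-formed UTF-8.
    static int countCodePoints(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Greek small letter beta: two bytes in UTF-8, one character.
        byte[] beta = "\u03B2".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes: " + beta.length);                 // prints bytes: 2
        System.out.println("characters: " + countCodePoints(beta));  // prints characters: 1
    }
}
```

A LENGTH field based on this character count only works if both sides of the connection count characters in exactly the same way.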

This is particularly interesting because of the language difference: The Guacamole Client is written in Java, and the Guacamole Server is a C application. Both of these components need to handle the encoding in the same way, which leads us to a significant challenge software faces nowadays: Technological Variety.

Technological Variety

There are two crucial aspects of why this is so relevant nowadays: On the one hand, we have a vast landscape of technologies, and on the other hand, everything is heavily connected. The result of this is that there is a lot of communication between very different types of components. If these components even slightly disagree about a certain specific of their communication, this can have a devastating impact:

Guacamole is an excellent example of this challenge because it employs two different programming languages in a single project. To determine any inconsistencies between both components, we set up a small and straightforward fuzzing environment:

  1. At first, we feed some random input to the Java part.
  2. Then, we generate an image Instruction with the input as an argument and send it to the C part.
  3. The C part consumes and parses the instruction.
  4. After this, we check whether the C parser just crashed or its internal state is inconsistent.

All of this uses Guacamole’s existing source code, wrapped only in a small harness.
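The idea behind such a differential check can be sketched in plain Java. This is not the harness we used: instead of sending instructions to the C parser, the sketch below uses Java's own code point count as a stand-in for the receiving side, and all names in it are hypothetical. It generates random single-character inputs and reports the first one for which the two ways of measuring length disagree.

```java
import java.util.Random;

final class LengthDifferentialFuzz {
    // Returns the first random single-character string whose length()
    // (what the sender writes) differs from its code point count
    // (what a spec-following receiver expects), or null if none is found.
    static String findMismatch(long seed, int attempts) {
        Random random = new Random(seed);
        for (int i = 0; i < attempts; i++) {
            int codePoint = random.nextInt(Character.MAX_CODE_POINT + 1);
            // Lone surrogates are not valid characters on their own.
            if (codePoint >= Character.MIN_SURROGATE && codePoint <= Character.MAX_SURROGATE) {
                continue;
            }
            String input = new String(Character.toChars(codePoint));
            int senderLength = input.length();
            int receiverLength = input.codePointCount(0, input.length());
            if (senderLength != receiverLength) {
                return input;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String mismatch = findMismatch(1L, 1000);
        if (mismatch != null) {
            System.out.printf("mismatch at U+%X%n", mismatch.codePointAt(0));
        }
    }
}
```

A fixed seed keeps the run reproducible. A real harness would replace the stand-in check with the actual parser of the second component.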

It didn’t take long before this simple setup dumped an interesting input:

The green byte sequence f0 af a0 a2 was randomly generated and the surrounding bytes were added by the Java code. This byte sequence is a 4-byte UTF-8 sequence representing a CJK Unicode character. Although this is a single Unicode character, Java populated the LENGTH field with the value 2. The C parser ended up in an inconsistent state because another Unicode character was expected, which is correct according to the specification.

Let’s do some basic tests to determine why Java included the value 2 here.

Java agrees that a single A character has a length of 1:

Java also doesn’t seem to have a problem with this Greek beta character and also agrees that its length is 1:

For more fancy characters like this victory hand, Java still agrees that its length is 1:

But if we insert our CJK character, Java suddenly assumes that this character has a length of 2:
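These four checks can be reproduced with a few lines of Java. The sketch below builds each character from its code point so that it does not depend on the source file encoding. The class name is arbitrary, and U+2F822 is the code point that the byte sequence f0 af a0 a2 decodes to.

```java
final class JavaLengthDemo {
    // Builds a String holding exactly one Unicode character and
    // returns what String.length() reports for it.
    static int lengthOf(int codePoint) {
        return new String(Character.toChars(codePoint)).length();
    }

    public static void main(String[] args) {
        System.out.println(lengthOf(0x41));    // Latin letter A: prints 1
        System.out.println(lengthOf(0x3B2));   // Greek beta: prints 1
        System.out.println(lengthOf(0x270C));  // victory hand: prints 1
        System.out.println(lengthOf(0x2F822)); // CJK character: prints 2
    }
}
```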

How does this make sense? Also, we have been talking about UTF-8 the whole time. Doesn’t Java use a UTF-16 encoding?

Java’s internal String representation

Yes and no, actually: in Java 9, the concept of Compact Strings was introduced, which dynamically encodes a String internally with LATIN-1 or UTF-16:

For a simple A character, which is encoded as the 1-byte sequence 41 in UTF-8, Java also stores this character internally with a single byte because it can be encoded with LATIN-1. To keep track of the internal encoding, the String object has a private member called coder. For LATIN-1, in the case of a simple A character, the value of coder is set to 0.

For a UTF-8 sequence, which cannot be represented with LATIN-1 like the Greek beta character, the coder value is set to 1 for UTF-16. This means that the initial UTF-8 sequence is now converted to UTF-16, and the resulting two bytes are stored internally.

So now we are aware of how Java internally stores a String. But how is the length of such a String determined? Actually, very easy. This is the internal Java implementation of the String length method:

public int length() {
  // value is the internal byte array; coder is 0 for LATIN-1 and 1 for UTF-16
  return value.length >> coder();
}

The length of the internal byte array is right-shifted by the coder value, and the result is returned:

For the simple A character, which is encoded with LATIN-1, the coder value is 0. Because of this, the shift doesn’t change the value, and the length is just 1. For the UTF-16 encoded Greek beta character, the coder value is 1. This means that the number of bytes stored in the internal byte array is effectively divided by 2. Thus, the resulting length is also 1. This also works fine for a 3-byte UTF-8 sequence since these are stored as 2 bytes in UTF-16.
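The shift can be replayed with a small model. The sketch below does not touch the JDK's private fields; it rebuilds an equivalent byte array with public APIs and applies the same right shift. The class, constants, and helper names are assumptions made for this illustration, and the model always uses big-endian UTF-16, which does not affect the byte count.

```java
import java.nio.charset.StandardCharsets;

final class CompactStringModel {
    static final int LATIN1 = 0;
    static final int UTF16 = 1;

    // Models the coder choice: LATIN-1 if every char fits into one
    // byte, UTF-16 otherwise.
    static int coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Rebuilds a byte array like the internal one and derives the
    // length from it with the same shift as String.length().
    static int modeledLength(String s) {
        int coder = coderFor(s);
        byte[] value = s.getBytes(
                coder == LATIN1 ? StandardCharsets.ISO_8859_1 : StandardCharsets.UTF_16BE);
        return value.length >> coder;
    }

    public static void main(String[] args) {
        System.out.println(modeledLength("A"));      // 1 byte  >> 0: prints 1
        System.out.println(modeledLength("\u03B2")); // 2 bytes >> 1: prints 1
        System.out.println(modeledLength("\u270C")); // 2 bytes >> 1: prints 1
    }
}
```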

Now, let's go back to the CJK Unicode character. We have already seen that this character is encoded with 4 bytes in UTF-8. The question is, how are these bytes converted to UTF-16?

And here we are actually getting to the problem. The 2 bytes of a UTF-16 character are enough to map the 1, 2, and 3-byte UTF-8 sequences. However, they are not sufficient to map the 4-byte sequences. These sequences are mapped to a UTF-16 Surrogate Pair:

A Surrogate Pair consists of a High Surrogate and a Low Surrogate, which are specific code units reserved explicitly for this purpose. This is also applied to the CJK Unicode character:

The internal byte array contains 4 bytes for this character: 2 bytes for the High Surrogate and 2 bytes for the Low Surrogate. This also affects the length calculation:

The size of the byte array (4) is shifted right by one, resulting in a length of 2. This 2 is inserted into the Guacamole Instruction, followed by the CJK Unicode character. According to the Guacamole specification, Java encodes this character using UTF-8, resulting in a 4-byte sequence that still represents one single character. Thus, the C parser fails because it expects yet another character.
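The whole chain for the CJK character can be checked directly. The following sketch decodes the byte sequence f0 af a0 a2 and prints its surrogates and the different length measures. Only the class and method names are invented here.

```java
import java.nio.charset.StandardCharsets;

final class SurrogatePairDemo {
    // Decodes UTF-8 bytes into a Java String.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xF0, (byte) 0xAF, (byte) 0xA0, (byte) 0xA2};
        String s = decode(utf8);

        System.out.printf("code point:     U+%X%n", s.codePointAt(0));  // U+2F822
        System.out.printf("high surrogate: %04X%n", (int) s.charAt(0)); // D87E
        System.out.printf("low surrogate:  %04X%n", (int) s.charAt(1)); // DC22
        System.out.println("UTF-16 bytes:   " + s.getBytes(StandardCharsets.UTF_16BE).length); // 4
        System.out.println("length():       " + s.length());                                   // 2
        System.out.println("code points:    " + s.codePointCount(0, s.length()));              // 1
        System.out.println("UTF-8 bytes:    " + s.getBytes(StandardCharsets.UTF_8).length);    // 4
    }
}
```

Four bytes on the wire, two UTF-16 code units in memory, and one character according to the specification: the sender writes the 2, while a receiver that follows the specification counts 1.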

So, what the heck? The Java String length method is broken?

But, no. It is not:

Or at least, it behaves according to the specification, which clearly states that it returns the number of Unicode code units (storage unit - 2 bytes for UTF-16), not code points (unique number assigned to a character).

In this case, this little difference introduces a parser differential between the Java client and the C server. The vulnerability is triggered by UTF-8 sequences that consist of 4 bytes like the CJK character or, ironically, this Unicode bug character:

Exploitation: Guacamole Protocol Injection

So, how can this little difference be exploited? We have already figured out that the query parameter for the supported image types is inserted into the Guacamole image instruction sent during the Handshake. The query parameter can also be set multiple times to define more than just one image type:

All of these query parameters are inserted as an argument to the image instruction.

What an attacker would like to achieve is to break out of the green VALUE field and inject a whole new instruction. This can be done by sending a swarm of bugs:

The first query parameter is set to 4 UTF-8 encoded bugs, and in the second one, an additional instruction is inserted, which consists of:

  • A semicolon, followed by
  • the digit 8,
  • a dot, and
  • the string “injected”

The Guacamole Client then sends this byte sequence to the Guacamole Server during the Handshake:

It begins with the opcode, which is image, followed by two arguments:

  • the 4 bugs, which Java assumes to have a length of 8, and
  • the additionally injected string.

When the Guacamole Server receives this byte sequence, it is unaware of its structure and starts to parse it:

The parser begins by reading the LENGTH field of the first element, which is 5. Thus, five Unicode characters are consumed for the VALUE field. This completes the opcode. Next, the LENGTH field of the argument is read, which is 8 this time. Accordingly, the parser consumes all four Unicode bug characters but proceeds to consume bytes beyond the boundary of the argument. After eight characters are processed, the parser encounters a semicolon, which designates the end of the image instruction. Thus, the injected string in the second argument becomes a whole new instruction!
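The parsing steps above can be replayed with a simplified model of both sides. This is a sketch under several assumptions: the encoder only mimics the length calculation via String.length(), the parser is a minimal stand-in for the C parser that counts code points as the specification demands, the bug character is assumed to be U+1F41B (any character that needs four bytes in UTF-8 behaves the same), and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ProtocolInjectionSketch {
    // Models the sending side: LENGTH is taken from String.length(),
    // which counts UTF-16 code units.
    static String encodeWithCodeUnits(String opcode, String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append(opcode.length()).append('.').append(opcode);
        for (String arg : args) {
            sb.append(',').append(arg.length()).append('.').append(arg);
        }
        return sb.append(';').toString();
    }

    // Models the receiving side: LENGTH counts Unicode code points.
    // Returns one list of elements per instruction. Malformed input
    // simply throws; a real parser needs proper error handling.
    static List<List<String>> parse(String stream) {
        List<List<String>> instructions = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int pos = 0;
        while (pos < stream.length()) {
            int dot = stream.indexOf('.', pos);
            int length = Integer.parseInt(stream.substring(pos, dot));
            int start = dot + 1;
            int end = stream.offsetByCodePoints(start, length);
            current.add(stream.substring(start, end));
            if (stream.charAt(end) == ';') { // end of instruction
                instructions.add(current);
                current = new ArrayList<>();
            }
            pos = end + 1; // skip the ',' or ';' separator
        }
        return instructions;
    }

    public static void main(String[] args) {
        String bug = new String(Character.toChars(0x1F41B));
        String stream = encodeWithCodeUnits("image", bug + bug + bug + bug, ";8.injected");
        for (List<String> instruction : parse(stream)) {
            System.out.println("opcode: " + instruction.get(0));
        }
    }
}
```

The encoder emits a single image instruction with two arguments, but the parser returns two instructions: image, whose only argument now swallows the four characters `,11.` that belonged to the second element's prefix, and a separate instruction with the opcode injected.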

The instruction an attacker can inject is placed right after the image instruction, which is sent during the handshake before the connect instruction:

This means that an attacker can inject a new connect instruction, which is inserted before the legit connect instruction. This makes the Guacamole Server ignore the second, actually legit connect instruction and instead connect to any host an attacker would like to.

One way to leverage this is to exfiltrate data. The connect instruction sent by the Guacamole Client contains sensitive items like the credentials used to make the connection, optional passwords and private keys for shared drives, and gateway credentials:

This sensitive information is inserted as arguments of the legit connect instruction, which directly follows the attacker’s injected connect instruction. This means it is possible to let the injected connect instruction end with a big LENGTH value for one of the settings. This way, the Guacamole Server assumes that all data following is the VALUE of this setting. This includes the legit connect instruction with all the sensitive items, which is now not an instruction anymore but a simple VALUE of one of the settings of the attacker’s connect instruction:

As the attacker cannot know the length of the legit connect instruction in advance, the correct argument boundary might be missed, as shown by the red arrow at the bottom of the animation. This can be overcome by slightly adjusting the injected LENGTH value until the correct boundary is hit.
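Why the LENGTH value has to hit a boundary can be shown with a small model of the receiving side. The sketch below is an illustration only: the connect instruction, its host, and its password are hypothetical placeholders, and the method merely models what a parser that counts code points accepts. An oversized element is only valid if the character directly after the consumed VALUE is a comma or a semicolon.

```java
final class LengthSwallowSketch {
    // Reads the element at the start of the stream the way a parser
    // counting code points would. Returns the consumed VALUE, or null
    // if the element does not end exactly before a ',' or ';'.
    static String readElement(String stream) {
        int dot = stream.indexOf('.');
        int length = Integer.parseInt(stream.substring(0, dot));
        int start = dot + 1;
        if (stream.codePointCount(start, stream.length()) <= length) {
            return null; // not enough data for VALUE plus a separator
        }
        int end = stream.offsetByCodePoints(start, length);
        char next = stream.charAt(end);
        return (next == ',' || next == ';') ? stream.substring(start, end) : null;
    }

    public static void main(String[] args) {
        // Hypothetical data following the injected LENGTH prefix: the
        // terminator of the image instruction and a made-up legit
        // connect instruction.
        String following = ";" + "7.connect,8.10.0.0.5,6.secret;";

        // The sender would have to guess; the model just tries every value.
        for (int length = 1; length < following.length(); length++) {
            String value = readElement(length + "." + following);
            if (value != null) {
                System.out.println(length + " -> " + value);
            }
        }
    }
}
```

In this made-up stream, only a few LENGTH values land on a boundary, and only the largest of them (30) turns the complete connect instruction, including the placeholder password, into one single VALUE.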

The specific setting of the attacker’s connect instruction, which is now populated with the legit connect instruction, is called load-balance-info. This setting is sent to the RDP server as the RoutingToken during the initial protocol negotiation. Since the RoutingToken is sent at the beginning of the RDP connection, the attacker doesn’t even need to set up a custom RDP server to dump some internal RDP handshake data. It is just enough to set up a TCP listener on the host specified in the injected connect instruction. The Guacamole Server tries to connect to this server and happily transmits all sensitive data in the form of the RoutingToken:

The above video demonstrates how an attacker with low-privileged access can leverage the protocol injection to exfiltrate the connection settings. Of course, an attacker can also populate the injected connect instruction with the exact same settings and only change specific values.

One feature which can be enabled this way is the RDP Drive Redirection. When instructed to do so, Guacamole can map a configured share to the RDP connection. This share can then be accessed from the browser via a dedicated file browser. The problem is that the shared folder is also one setting of the connect instruction. This means that an attacker can leverage the injection to define the Guacamole Server’s root as the shared drive. This allows the attacker to leak any world-readable file on the server:

Writing is not possible by default because the Guacamole user is very restricted. Nevertheless, an attacker can still use this to leak valuable information like the memory layout of the Guacamole process by reading the /proc/self/maps file. This information could be very useful when exploiting a memory corruption vulnerability, as we will see in the second part of the blog series!


Date        Action
2023-04-11  We report both issues to the maintainers
2023-04-11  Maintainers acknowledge receipt of our report
2023-04-12  Maintainers confirm both issues
2023-05-09  Maintainers finish the patch for both issues
2023-05-09  We review and confirm the patch
2023-05-25  Maintainers release patched version 1.5.2


In this first article, in a series of two, we briefly introduced the remote desktop gateway Apache Guacamole and explained its common enterprise use cases and technical architecture. We then dived into the first vulnerability we discovered, which allows an attacker to inject instructions into the Guacamole handshake. An attacker could leverage this to establish arbitrary connections from the Guacamole server, leak sensitive information, and read arbitrary files.

This vulnerability is not only interesting from a technical point of view but also highlights a more generic insight: the increasing variety of technologies poses a big security risk because of interoperability problems. If two inherently different components need to communicate with each other but even slightly disagree about certain specifics of their communication protocol, this can introduce severe security vulnerabilities. These parser differentials will likely remain the source of very impactful bugs in the coming years.

Although our main approach is to audit source code, studying protocol specifications can be very beneficial in identifying these kinds of inconsistencies. If a specification contains strange specifics like the Unicode length we have seen, gaps, or certain corner cases, it is probably worth checking whether different parsers handle them the same way.

In the next article of this series, we will see that the requirement of high parallelism to serve and share hundreds of connections at the same time makes an application like Guacamole also prone to concurrency issues. We will dive into the world of glibc heap exploitation and ultimately gain remote code execution.

Finally, we would like to thank the Guacamole maintainers for quickly responding to our report and providing a comprehensive patch!
