From SVG and back: yet another mutation XSS via namespace confusion to bypass DOMPurify < 2.2.2
For those who are only interested in the final payload, here you go (I won’t judge). For those interested in why it works, please bear with me.
First of all, I would like to point out that this article and the bypass described here are heavily based on research by Michał Bentkowski (@SecurityMB). The original article and the previous bypass, which made it possible for me to find this new vector, are available here. The article also contains all of the foundations the reader may need to properly understand how and why the bypass described here works. It took me a few rounds of reading and playing with the LiveDOM++ tool, also provided by @SecurityMB, to really understand why Michał’s original bypass worked in the first place. Gareth Heyes (@garethheyes) also built upon Michał’s work and found a variation of the original bypass days after it was published. I also took my time with Gareth’s bypass to properly understand what was going on.
What is DOMPurify?
DOMPurify is a widely used HTML sanitizer library. It is mainly used to sanitize user input in web applications that permit the creation of HTML/rich text content, such as a webmail client or a blog platform. A common usage pattern for DOMPurify is the following:
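A minimal sketch of that pattern, assuming DOMPurify is loaded as the global `DOMPurify` and that `container` is any DOM element. The helper name `renderSanitized` is hypothetical; only `DOMPurify.sanitize` and `innerHTML` are real APIs here.

```javascript
// Hypothetical helper: sanitize untrusted markup, then assign it to innerHTML.
// `sanitizer` is expected to expose sanitize(string) -> string, as DOMPurify does.
function renderSanitized(container, html, sanitizer) {
  const clean = sanitizer.sanitize(html); // parse, walk the tree, serialize
  container.innerHTML = clean;            // the browser parses the string again
  return clean;
}

// In a browser, with DOMPurify loaded:
//   renderSanitized(document.getElementById('content'), html, DOMPurify);
```

Passing the sanitizer in as an argument is only a convenience for this sketch; it makes the two steps (sanitize, then assign) easy to see and to exercise outside a browser.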
According to Michał’s article:
In terms of parsing and serializing HTML as well as operations on the DOM tree, the following operations happen in the short snippet above:
html is parsed into the DOM tree;
DOMPurify sanitizes the DOM Tree (in a nutshell, the process is about walking through all elements and attributes in the DOM tree, and deleting all nodes that are not in the allow-list);
The DOM tree is serialized back into the HTML markup;
After assignment to innerHTML, the browser parses the HTML markup again;
The parsed DOM tree is appended into the DOM tree of the document;
The important takeaway is that the HTML markup is parsed twice and serialized into a string in between.
Namespaces, and why they are important
HTML is not itself an XML vocabulary, but its parser borrows the XML concept of namespaces. A document that complies with the current HTML specification may contain elements from three different namespaces.
- HTML (
- SVG (
- MathML (
Namespaces solve the ambiguity problem that arises when a single XML document contains homograph elements and/or properties from different “vocabularies”. For example, both HTML and SVG define a tag named style, and the parser will also create a style element inside MathML content. The style tag has different properties and rendering behaviors depending on the namespace in which it is used. This means that a browser will behave differently when it parses a style tag depending on its ancestors (Figure 1, Figure 2).
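As a hedged illustration, the three namespaces are identified by the URIs below, and in a browser the namespace the parser assigned to an element can be read from its `namespaceURI` property. The helper name `styleNamespaceUnder` is hypothetical, and it only does real work where `DOMParser` exists; elsewhere (for example under Node.js) it returns `null`.

```javascript
const NAMESPACES = {
  html: 'http://www.w3.org/1999/xhtml',
  svg: 'http://www.w3.org/2000/svg',
  mathml: 'http://www.w3.org/1998/Math/MathML',
};

// Parse <wrapper><style></style></wrapper> as HTML and report the namespace of <style>.
function styleNamespaceUnder(wrapper) {
  if (typeof DOMParser === 'undefined') return null; // no HTML parser outside a browser
  const markup = `<${wrapper}><style></style></${wrapper}>`;
  const doc = new DOMParser().parseFromString(markup, 'text/html');
  const style = doc.querySelector('style');
  return style ? style.namespaceURI : null;
}
```

In a browser, `styleNamespaceUnder('div')`, `styleNamespaceUnder('svg')` and `styleNamespaceUnder('math')` should return the HTML, SVG and MathML URIs respectively, even though the tag is spelled `style` in all three cases.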
DOM mutation in a nutshell
Although it may sound counter-intuitive, parsing and serializing a DOM fragment is not an idempotent operation. This means that in some cases, depending on how the fragment is constructed, the serialized version of a DOM tree won’t result in the same DOM tree when parsed. This double-parsing behavior is inherent to DOMPurify’s standard usage. However, some unexpected results from double-parsing DOM fragments are not unexpected at all, as they are documented in HTML’s current specification. One of these cases regards nested forms. A DOM fragment with nested form tags is not to be considered a valid construction according to HTML’s current specification. However, the following HTML fragment when parsed once will result in a DOM tree with nested forms.
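A fragment of this kind, together with a hedged sketch of the parse, serialize, parse round trip, might look as follows. The markup follows the nested-form example from Michał’s article and is an assumption about the exact fragment meant here. The helper name `roundTripForms` is hypothetical, and it returns `null` where `DOMParser` is unavailable.

```javascript
// Assumed fragment: the stray </form> removes the outer form from the parser's
// stack of open elements while <div> stays open, so the second <form> is accepted.
const NESTED_FORM_GADGET = '<form id="outer"><div></form><form id="inner"><input>';

function roundTripForms(markup) {
  if (typeof DOMParser === 'undefined') return null; // browser-only sketch
  const parse = (html) => new DOMParser().parseFromString(html, 'text/html');
  const owner = (doc) => {
    const input = doc.querySelector('input');
    return input && input.form ? input.form.id : null;
  };

  const first = parse(markup);             // first parse: the tree a sanitizer walks
  const serialized = first.body.innerHTML; // serialize back to a string
  const second = parse(serialized);        // second parse: what innerHTML would build

  return {
    serialized,
    formsBefore: first.querySelectorAll('form').length,
    formsAfter: second.querySelectorAll('form').length,
    ownerBefore: owner(first),
    ownerAfter: owner(second),
  };
}
```

In a browser, `roundTripForms(NESTED_FORM_GADGET)` should report two form elements with the input owned by `inner` before the round trip, and a single form with the input owned by `outer` afterwards.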
We can use the LiveDOM++ tool to inspect how the DOM behaves, before and after being sanitized with DOMPurify, when it is fed with the HTML fragment mentioned above.
As demonstrated, when parsed for the first time, the HTML fragment results in a non-compliant DOM tree containing nested form tags. After sanitizing it, DOMPurify will serialize the DOM tree and the resulting string will be parsed again by the browser. It is then possible to verify that the direct parenthood of the input tag is transferred from the inner form tag to the outer form tag in the final fragment.
We will refer to the type of HTML markup that results in a mutation that changes the direct parent of a tag as an ownership mutation gadget.
The table tag can also be used to construct an ownership mutation gadget as shown below.
In the example above the direct parent of the inner anchor tag is changed from the outer anchor to the div tag. Understanding how these gadgets behave is crucial to understanding the bypass construction. I recommend experimenting with the gadgets on LiveDOM++ to get a better understanding of how the gadgets behave.
By default, all elements in an HTML document must be parsed by a browser according to the rules defined by the HTML namespace; however, if the parser encounters an <svg> or a <math> tag, it should then parse that element and its descendants according to the SVG or MathML namespace, respectively. One also needs to consider that both the MathML and SVG namespaces support foreign content as well, meaning a chain of namespace transitions can be built as shown below.
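These transition rules can be approximated with a small model. This is a deliberately simplified sketch, not the HTML parsing algorithm: it ignores "breakout" tags (for example a `div` placed directly inside `svg`) and `annotation-xml`, and the function names `childNamespace` and `namespacesAlong` are hypothetical. The integration-point tag lists are the ones named in the HTML specification.

```javascript
// MathML text integration points and SVG HTML integration points (lowercased).
const MATHML_TEXT_INTEGRATION = new Set(['mi', 'mo', 'mn', 'ms', 'mtext']);
const SVG_HTML_INTEGRATION = new Set(['foreignobject', 'desc', 'title']);

// Namespace label ('html' | 'svg' | 'mathml') a child tag gets under a given parent.
function childNamespace(parent, childTag) {
  const tag = childTag.toLowerCase();
  const parentTag = parent.tag.toLowerCase();
  let htmlRules = parent.ns === 'html';

  if (parent.ns === 'mathml' && MATHML_TEXT_INTEGRATION.has(parentTag)) {
    // mglyph and malignmark stay in MathML; everything else follows HTML rules.
    if (tag === 'mglyph' || tag === 'malignmark') return 'mathml';
    htmlRules = true;
  }
  if (parent.ns === 'svg' && SVG_HTML_INTEGRATION.has(parentTag)) htmlRules = true;

  if (!htmlRules) return parent.ns; // plain foreign content: inherit the namespace
  if (tag === 'svg') return 'svg';
  if (tag === 'math') return 'mathml';
  return 'html';
}

// Walk a chain of nested tags, starting inside an HTML <body>.
function namespacesAlong(path) {
  let parent = { tag: 'body', ns: 'html' };
  return path.map((tag) => {
    const ns = childNamespace(parent, tag);
    parent = { tag, ns };
    return ns;
  });
}

console.log(namespacesAlong(['math', 'mtext', 'div', 'svg', 'foreignObject', 'p']));
// → [ 'mathml', 'mathml', 'html', 'svg', 'svg', 'html' ]
```

The same model also shows why the parent matters for homograph tags: `childNamespace({ tag: 'form', ns: 'html' }, 'mglyph')` yields `'html'`, while `childNamespace({ tag: 'mtext', ns: 'mathml' }, 'mglyph')` yields `'mathml'`.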
Namespace confusion, putting it all together
As mentioned before, homograph elements like the style tag have different properties and rendering rules depending on the namespace they are in. This means that if we use an ownership mutation gadget to change the direct parent of a homograph tag and cause its namespace to change, we can trick the sanitizer into producing a malicious serialized HTML fragment. That is exactly what Michał did. Michał’s original bypass used an ownership mutation gadget to change the direct parent of an mglyph tag from a form element in the HTML namespace to an mtext tag in the MathML namespace. As a result, the descendants of the mglyph element are members of the MathML namespace in the final DOM tree, while they exist in the HTML namespace in the tree sanitized by DOMPurify.
At the time of this writing, it is still possible to use the update parser feature of the LiveDOM++ tool to reproduce the original bypass using a vulnerable version of DOMPurify.
The details of why and how this works can be referenced in Michał’s original article. Understanding the original bypass is important because it provides the foundation for the variation described in this article.
Building the final payload
Now that I have described the building blocks of a potential bypass, I will describe the methodology I used to build mine. I began by pondering the following:
Both Michał and Gareth used mutations coming from HTML to MathML and back. What if we throw SVG in the mix?
The picture below illustrates the basic structure of the bypass. Using a nested form ownership mutation gadget, it is possible to change the parent of the mglyph tag to mtext, which places the descendants of the mglyph tag in the MathML namespace in the final DOM tree. Nothing new so far. The difference is that instead of using it to go from HTML to MathML, we are using it to go from SVG to MathML. To be precise, the chain goes from HTML→MathML→HTML→SVG to HTML→MathML.
At this point, I knew I was on to something but it took me a while to build a working payload. The whole point was figuring out a chain of tags that were safe in the SVG namespace and that contained an XSS payload when transferred to the MathML namespace.
This is what I came up with:
In the SVG namespace, mtext is handled as an unknown tag. Unlike its homograph in the HTML namespace, the SVG style tag has its content parsed as markup rather than raw text, so its descendants become elements, and the id attribute of the SVG path tag is nothing more than a safe, free-form text identifier.
What happens when this snippet is parsed in the MathML namespace though?
Here is the result:
First of all, mtext is handled as a MathML text integration point. This means its descendants will be in the HTML namespace, according to MathML’s current specification.
… when MathML is embedded in HTML, or another document markup language, the example is probably best rendered with only the two inequalities represented as MathML at all, letting the text be part of the surrounding HTML.
Once in the HTML namespace, style behaves differently and treats everything until its closing tag as raw text. Finally, the malicious img tag is parsed, followed by a harmless piece of raw text that reads ">.
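The raw-text rule can be sketched without a browser. The toy function below is not an HTML tokenizer; it only mimics the one rule that matters here, namely that the content of an HTML style element runs up to the first literal `</style>`. The `snippet` string is an illustrative reconstruction from the description above and is an assumption, as is the helper name `splitAtStyleEnd`.

```javascript
// Toy model of HTML raw-text handling for <style>: no attributes, no case folding.
function splitAtStyleEnd(markup) {
  const open = '<style>';
  const close = '</style>';
  const start = markup.indexOf(open);
  if (start === -1) return null; // no style element in the input
  const contentStart = start + open.length;
  const end = markup.indexOf(close, contentStart);
  if (end === -1) return { rawText: markup.slice(contentStart), rest: '' };
  return {
    rawText: markup.slice(contentStart, end), // swallowed by <style> as plain text
    rest: markup.slice(end + close.length),   // parsed as ordinary HTML afterwards
  };
}

// Assumed shape of the snippet: style, then a path whose id "contains" the rest.
const snippet = '<mtext><style><path id="</style><img onerror=alert(1) src>">';
console.log(splitAtStyleEnd(snippet));
// → { rawText: '<path id="', rest: '<img onerror=alert(1) src>">' }
```

Under SVG rules the whole `id` value is a single attribute string, so a sanitizer sees nothing dangerous. Under HTML rules the quote no longer protects anything, because the style element ends at the first `</style>` and the `img` start tag becomes live markup.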
The following snapshot depicts the DOM tree state before being sanitized by DOMPurify.
This next snapshot shows the DOM tree produced by parsing DOMPurify’s output.
I want to thank Michał Bentkowski and Gareth Heyes for their incredible work and constant contributions to the security community. I also want to thank Dr.-Ing. Mario Heiderich (@Cure53) for acting so fast and for being a gentleman despite having to stay up late to work on the fix. Lastly, thanks to @LiveOverFlow for reminding me in one of his videos that we need to move away from the basics as soon as we master them and constantly challenge ourselves.