1. Introduction

This post shows how to build a Cross-Site Scripting (XSS) filter for Java web apps using Jersey. When building a web application of any sort, security always deserves attention, and XSS is one of the most common vulnerabilities found across sites. Here is a brief summary on the current XSS situation and a good Cheat Sheet provided by OWASP.

Update: I recently came across this very good page by Google on Cross-Site Scripting.

2. Objective

Our particular web application is built in Java using Jersey to expose functionality as REST services. Using Jersey already avoids the problem of people attempting to hack the URL itself (Jersey will throw a wobbly if the MediaType and the Path don’t match properly). So with this in mind, the following areas were left for sanitization:

  • Query strings
  • Headers
  • Cookies
  • Parameters
  • Parts (multi-part content)

Most people suggest stripping out any potential XSS vulnerabilities when the contents of the request are displayed to your users. This is great for our administration panel. However, most of our data will be accessed externally through our APIs and we get a lot more reads than writes. So, I decided to strip malicious code from query strings, headers and cookies for every request and to secure the parameters and parts when saving the data.

This was also in part due to a limitation in Jersey. I could not clean the Parts submitted as there is no way to store a clean version of the form data (see here).

So what I ended up building was a filter for Jersey that would handle the Query Strings, Headers and Cookies. The parameters and parts would instead be handled when validating our models prior to saving them to the Database.

I did an extensive search on Google and couldn’t find any solution I was happy with. In the process I came across some very interesting libraries which I decided to adopt for XSS cleansing: the OWASP ESAPI library and the Jsoup library. Most solutions I found online rely on Regular Expressions, which are a poor fit for handling XSS because a pattern only catches the exact spellings of an attack it was written for. ESAPI canonicalizes encoded input and Jsoup parses the value as HTML, so together they are more robust against this type of vulnerability.
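To make the point about Regular Expressions concrete, here is a minimal sketch of a naive regex filter and two inputs that get past it. The class and method names (NaiveRegexFilterDemo, naiveStrip) are made up for this illustration and are not taken from any of the solutions I found; the pattern is just one typical example of the approach.

```java
import java.util.regex.Pattern;

// Hypothetical naive filter, for illustration only.
final class NaiveRegexFilterDemo {

    // Removes opening and closing script tags in a single pass.
    private static final Pattern SCRIPT_TAG =
            Pattern.compile("</?script>", Pattern.CASE_INSENSITIVE);

    static String naiveStrip(String value) {
        return SCRIPT_TAG.matcher(value).replaceAll("");
    }

    public static void main(String[] args) {
        // Nested tags: removing the inner tags re-assembles the outer ones.
        String nested = "<scr<script>ipt>alert(1)</scr</script>ipt>";
        System.out.println(naiveStrip(nested));
        // prints: <script>alert(1)</script>

        // Event handlers never mention "script", so the pattern does not match.
        String handler = "<img src=x onerror=alert(1)>";
        System.out.println(naiveStrip(handler));
        // prints: <img src=x onerror=alert(1)>
    }
}
```

You can keep adding patterns for each new trick, but a parser based approach avoids that arms race.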

3. The Code – Jersey 1.x

Finally, let’s look at the code that was used. The first thing we need to do is import the necessary libraries. We use Maven for all our dependencies and you can find all the libraries below using the MVN Repository. Once you have all your imports sorted out, you can move onto the Jersey Filter class:

package com.domain.security.filter;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import javax.ws.rs.core.MultivaluedMap;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities.EscapeMode;
import org.jsoup.safety.Whitelist;
import org.owasp.esapi.ESAPI;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.sun.jersey.spi.container.ContainerRequest;
import com.sun.jersey.spi.container.ContainerRequestFilter;

public class XSSFilter implements ContainerRequestFilter
{
	private static final Logger LOG = LoggerFactory.getLogger( XSSFilter.class );

	/**
	 * @see ContainerRequestFilter#filter(ContainerRequest)
	 */
	@Override
	public ContainerRequest filter( ContainerRequest request )
	{
		// Clean the query strings
		cleanParams( request.getQueryParameters() );

		// Clean the headers
		cleanParams( request.getRequestHeaders() );

		// Clean the cookies
		cleanParams( request.getCookieNameValueMap() );

		// Return the cleansed request
		return request;
	}

	/**
	 * Apply the XSS filter to every value in the given parameter map
	 * @param parameters the query string, header or cookie values to clean in place
	 */
	private void cleanParams( MultivaluedMap<String, String> parameters )
	{
		LOG.debug( "Checking for XSS Vulnerabilities: {}", parameters );

		for( Map.Entry<String, List<String>> params : parameters.entrySet() )
		{
			String key = params.getKey();
			List<String> values = params.getValue();

			List<String> cleanValues = new ArrayList<String>();
			for( String value : values )
			{
				cleanValues.add( stripXSS( value ) );
			}

			parameters.put( key, cleanValues );
		}

		LOG.debug( "XSS Vulnerabilities removed: {}", parameters );
	}

	/**
	 * Strips any potential XSS threats out of the value
	 * @param value
	 * @return
	 */
	public String stripXSS( String value )
	{
		if( value == null )
			return null;
	
		// Use the ESAPI library to avoid encoded attacks.
		value = ESAPI.encoder().canonicalize( value );

		// Avoid null characters
		value = value.replaceAll("\0", "");

		// Clean out HTML
		Document.OutputSettings outputSettings = new Document.OutputSettings();
		outputSettings.escapeMode( EscapeMode.xhtml );
		outputSettings.prettyPrint( false );
		value = Jsoup.clean( value, "", Whitelist.none(), outputSettings );

		return value;
	}
}
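If you want to see what the cleanParams loop does without deploying anything, here is a minimal sketch that runs on a plain Map. Everything in it is an assumption made for illustration: the names ParamCleanerSketch and standInSanitize are hypothetical, and the stand-in sanitizer only escapes HTML metacharacters, so it is not a replacement for the ESAPI and Jsoup based stripXSS above. It also uses Map.replaceAll rather than calling put inside the loop, which is simply another way of swapping each value list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not part of the filter itself.
final class ParamCleanerSketch {

    // Stand-in for stripXSS (an assumption): drops null characters
    // and escapes the HTML metacharacters & < > ".
    static String standInSanitize(String value) {
        if (value == null) {
            return null;
        }
        return value.replace("\0", "")
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Replaces every value list in the map with a sanitized copy.
    static void cleanParams(Map<String, List<String>> parameters,
                            UnaryOperator<String> sanitizer) {
        parameters.replaceAll((key, values) -> {
            List<String> clean = new ArrayList<>(values.size());
            for (String value : values) {
                clean.add(sanitizer.apply(value));
            }
            return clean;
        });
    }

    public static void main(String[] args) {
        Map<String, List<String>> query = new LinkedHashMap<>();
        query.put("q", Arrays.asList("<b>hi</b>", "a\0b"));
        cleanParams(query, ParamCleanerSketch::standInSanitize);
        System.out.println(query);
    }
}
```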

In order to add this filter to Jersey you must amend your web.xml:

<servlet>
	<servlet-name>WebService</servlet-name>
	<servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
	<init-param>
		<param-name>com.sun.jersey.spi.container.ContainerRequestFilters</param-name>
		<param-value>com.domain.security.filter.XSSFilter;</param-value>
	</init-param>
	<load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
	<servlet-name>WebService</servlet-name>
	<url-pattern>/api/*</url-pattern>
</servlet-mapping>

The configuration above uses the standard Jersey ServletContainer. Our own application uses Spring, so we register Jersey’s SpringServlet as the servlet class instead. If you are not using Spring, the standard configuration shown here is all you need.

Note: Joe posted a comment highlighting that the changes to the web.xml aren’t actually necessary. The same can be achieved with the @ResourceFilters annotation, which in Jersey 1.x is placed on a resource class or method and names the filter to apply. This saves a bunch of verbose configuration and is rather elegant.

Ok, so with the code above we are now filtering all the requests which hit our REST APIs. Note: the same filtering could be implemented as a standard servlet Filter with a wrapper around the HttpServletRequest. In fact, ESAPI offers a security wrapper around the request, but I didn’t like the fact that it relies heavily on Regular Expressions.

Now, onto securing the contents of POST requests. We receive our content in JSON format or as multipart forms. In both cases the content is put into our own models, which we then validate using the Hibernate JSR-303 implementation. In particular we use the @SafeHtml annotation: every String field of our models that is populated by our users is annotated with @SafeHtml. Below is a sample model:

import org.hibernate.validator.constraints.SafeHtml;
import org.hibernate.validator.constraints.SafeHtml.WhiteListType;

public class MySecureModel {
	@SafeHtml( whitelistType = WhiteListType.NONE )
	private String userInput;
}

The good thing about the @SafeHtml implementation is that you can select how lenient your verification should be. In our case we mostly don’t allow HTML, which means the WhiteListType.NONE is what we want.

4. The Code – Jersey 2.x

Jersey 2.x has changed how we can apply our XSSFilter and the code needs to be altered to fit the new API. Unfortunately it has become impossible to strip our cookies from evil XSS but the headers and query strings can still be cleaned.

package com.domain.security.filter;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.container.PreMatching;
import javax.ws.rs.core.MultivaluedMap;
import javax.ws.rs.core.UriBuilder;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities.EscapeMode;
import org.jsoup.safety.Whitelist;
import org.owasp.esapi.ESAPI;

import org.apache.commons.collections4.CollectionUtils;

@PreMatching
public class XSSFilter implements ContainerRequestFilter
{
	/**
	 * @see ContainerRequestFilter#filter(ContainerRequestContext)
	 */
	@Override
	public void filter( ContainerRequestContext request )
	{
		cleanQueryParams( request );
		cleanHeaders( request.getHeaders() );
	}


	/**
	 * Replace the existing query parameters with ones stripped of XSS vulnerabilities
	 * @param request
	 */
	private void cleanQueryParams( ContainerRequestContext request )
	{
		UriBuilder builder = request.getUriInfo().getRequestUriBuilder();
		MultivaluedMap<String, String> queries = request.getUriInfo().getQueryParameters();

		for( Map.Entry<String, List<String>> query : queries.entrySet() )
		{
			String key = query.getKey();
			List<String> values = query.getValue();

			List<String> xssValues = new ArrayList<String>();
			for( String value : values ) {
				xssValues.add( stripXSS( value ) );
			}

			Integer size = CollectionUtils.size( xssValues );
			builder.replaceQueryParam( key );

			if( size == 1 ) {
				builder.replaceQueryParam( key, xssValues.get( 0 ) );
			} else if( size > 1 ) {
				builder.replaceQueryParam( key, xssValues.toArray() );
			}
		}

		request.setRequestUri( builder.build() );
	}


	/**
	 * Replace the existing headers with ones stripped of XSS vulnerabilities
	 * @param headers
	 */
	private void cleanHeaders( MultivaluedMap<String, String> headers )
	{
		for( Map.Entry<String, List<String>> header : headers.entrySet() )
		{
			String key = header.getKey();
			List<String> values = header.getValue();

			List<String> cleanValues = new ArrayList<String>();
			for( String value : values ) {
				cleanValues.add( stripXSS( value ) );
			}

			headers.put( key, cleanValues );
		}
	}

	/**
	 * Strips any potential XSS threats out of the value
	 * @param value
	 * @return
	 */
	public String stripXSS( String value )
	{
		if( value == null )
			return null;
	
		// Use the ESAPI library to avoid encoded attacks.
		value = ESAPI.encoder().canonicalize( value );

		// Avoid null characters
		value = value.replaceAll("\0", "");

		// Clean out HTML
		Document.OutputSettings outputSettings = new Document.OutputSettings();
		outputSettings.escapeMode( EscapeMode.xhtml );
		outputSettings.prettyPrint( false );
		value = Jsoup.clean( value, "", Whitelist.none(), outputSettings );

		return value;
	}
}

Done! You can now filter out XSS vulnerabilities using Jersey 2.x. Note that the cleaning of objects is still done as with Jersey 1.x.

5. Conclusion

So this is how we implemented our XSS prevention. As you may have noticed, the filter itself uses both ESAPI and Jsoup. I suspect this is a bit of an overkill, as the tests I’ve put together pass with or without ESAPI. Everything I have read strongly recommends keeping it in, but I’m still not convinced. If anyone has any feedback on this, it would be awesome!
I would also love to hear about any potential flaws in this approach. There are a lot of discussions online about how to prevent XSS, but I could not find a fully detailed explanation or implementation.
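One scenario where canonicalization could matter is double-encoded input: after the container has decoded a value once, it can still look like harmless text to an HTML cleaner, yet turn into markup if something downstream decodes it again. The sketch below illustrates the idea of decoding until the value stops changing. It is a simplified stand-in that only handles percent-encoding with the JDK’s URLDecoder; it is not ESAPI’s canonicalize, and the names CanonicalizeSketch, decodeUntilStable and MAX_PASSES are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical sketch of "decode until stable"; not ESAPI's implementation.
final class CanonicalizeSketch {

    // Upper bound on decoding passes so crafted input cannot loop for long.
    private static final int MAX_PASSES = 5;

    static String decodeUntilStable(String value) {
        String current = value;
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            String decoded;
            try {
                // Note: URLDecoder also turns '+' into a space.
                decoded = URLDecoder.decode(current, "UTF-8");
            } catch (UnsupportedEncodingException | IllegalArgumentException e) {
                // Malformed escape such as "100%": keep what we have so far.
                return current;
            }
            if (decoded.equals(current)) {
                return current; // nothing left to decode
            }
            current = decoded;
        }
        return current;
    }

    public static void main(String[] args) {
        // "%25" is an encoded '%', so this value is encoded twice.
        System.out.println(decodeUntilStable("%253Cscript%253E"));
        // prints: <script>
    }
}
```

A value like this would only be flagged by the HTML cleaner after the second decode, which is the kind of case a test suite made of singly encoded payloads would not exercise.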

The sample code used in this post is available by subscribing below. The downloaded source code usually comes with a few goodies which aren’t in the post! This code, for example, contains a test page for the filter and the JSR 303 validation, together with a list of example XSS vulnerabilities to try out.
The source code is all configured using Maven, so you can be up and running by simply typing “mvn clean install tomcat7:run-war-only”. This will spin up the demo on http://localhost:8080/.

Alessandro