Forum OpenACS Q&A: Cross Site Scripting FAQ (fwd)

Incidentally, a few days ago I wrote this. I hope people will find it useful.

How to make your web software immune to Cross-Site Request Forgeries (DRAFT)

by Branimir Dolicki, May 16, 2002, version 0.1

Introduction

I am going to describe a security problem that makes many web services vulnerable to a very dangerous kind of attack known as CSRF. Then I'll explain how server-side software can be fixed to prevent this kind of attack. CSRF (pronounced "sea surf") stands for Cross-Site Request Forgeries. The name comes from http://www.tux.org/~peterw/csrf.txt.

There are quite a few web pages and archived email discussions out on the net (including the one just mentioned) discussing this problem, although not as many as one would hope given the number of affected systems and the severity of the problem. The reason I've decided to contribute one more is that so far I haven't been able to find an article that clearly states what vendors of server-side software should do to make web services based on their software immune to CSRF.

Most of the material for this article comes from discussing these issues with Dave (who first drew my attention to this problem) and Hrvoje.

Am I affected?

Yes, if your web service:

1. Uses the GET method for significant actions on the server
2. Doesn't check the origin of requests that take significant actions on the server
3. Uses only cookies for authentication

If point 1 from the list above doesn't apply to your web service, that is, if you have actually read this line from RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1, section 9.1.1:

In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval.

then you are still vulnerable, but to a less dangerous kind of attack if points 2 and 3 still apply. (See below - The Click Here Attack.)

The IMG Attack

Joe Admin is an administrator of a web-based discussion forum in his organization. The admin interface uses the following URL to delete all messages in a forum:

http://example.com/adm/rm-rf?id=1234&areyousure=yes

Joe uses the same browser to surf the Wild Wild Web outside of his safe corporate firewall, for example to access his private web-based e-mail account.

One day Joe came across a very interesting posting on a public web-based bboard that allows users to format their messages with HTML. He read the story with great interest. However, he didn't know that the story contained the following piece of HTML:

<IMG SRC="http://example.com/adm/rm-rf?id=1234&areyousure=yes" WIDTH="0" HEIGHT="0">

Joe's browser simply obeyed what it was told and issued a GET request on the URL. As Joe was already logged in (with a cookie) to example.com, the server software simply fulfilled the request. Joe didn't notice anything until a few days later, when he saw that all the messages in the forum were gone. There was no way for him to figure out that the interesting article in that public bboard had anything to do with the disastrous attack.

A magnifying aspect of the IMG Attack is that the attacker could put many such URLs on a single page, either to get Joe's browser to perform a whole range of destructive actions on the site or to try out different URL variants. In our example, he could include hundreds of IMG elements with various ids in the URL, hoping that at least one would match an actual object to delete. Joe's browser would diligently make all those requests to the server.

The Click Here Attack

The idea of the Click Here Attack is very similar to the IMG Attack. The only difference is that instead of using the IMG tag the attacker uses an HTML form. For example, this snippet of HTML:

<FORM ACTION="http://example.com/adm/rm-rf" METHOD="POST">
 <INPUT TYPE="hidden" NAME="id" VALUE="1234">
 <INPUT TYPE="hidden" NAME="areyousure" VALUE="yes">
 <INPUT TYPE="submit" VALUE="Click Here to Read More">
</FORM>

renders in your browser as a single button labeled "Click Here to Read More".

Alternatively, the attacker could use an image map instead of the button.

To most users this looks quite harmless. Unlike with ordinary links, most browsers won't even show the URL in the status bar when the mouse pointer comes over the button.

The reason why I'm treating this as a separate kind of attack is that the Click Here Attack can also be used to attack systems which strictly use POST for performing significant actions on the server and GET only for requests that don't have significant side effects.

Two main differences from the IMG Attack make the Click Here Attack less dangerous:

Whereas with the IMG Attack all the victim has to do is to visit a page (perhaps through a link from a page that can be considered trusted), with the Click Here Attack the victim must be lured into explicitly clicking on the button on a page that generally can't be trusted (a public forum message for example).
Whereas with the IMG Attack the victim never learns what happened to him, with the Click Here Attack he will most likely get a page from his web site saying something like "You have successfully deleted all messages". That way the victim would at least learn that he is vulnerable and could try to identify the attacker.
NOTE 1: I don't know much about JavaScript, but I suspect that if it is turned on the attacker could find a way to hide the confirmation page.
NOTE 2: Some systems don't give the user a page confirming the action that he just performed but rather redirect him to a different page. This is good because it prevents the infamous Netscape 4.x message "Data Missing. This document resulted from a POST operation ...". However, some of those systems (and I don't want to name them) do it by including a form variable called, say, return_url in the request. The form handler then redirects the user to the URL in that variable. For these systems the attacker has quite an elegant way to conceal what happened: return_url=http://cnn.com!

However, this is still a serious security problem and you should consider it if you are developing server-side web software.

Clearly, there are also other scenarios that exploit this vulnerability, such as using <A HREF="...">...</A>-style links or JavaScript. I've limited myself to the two scenarios above because the ultimate purpose of this page is to provide practical ways to fix the affected server-side systems: if you make your system immune to these two attacks, it should also be immune to the variants. Of course, I might be wrong - send me an email if you think this is incomplete.

What can we do about it?

Many discussions on CSRF floating around on the Net propose various remedies and workarounds, ranging from preventing people from using the IMG tag in public forums and HTML emails to suggesting that users use a separate browser for promiscuous browsing.

However, if you are responsible for the security of the affected web services (either as a software vendor or as a custom software developer) you can't rely on external sites filtering malicious HTML or on users using two different browsers. If your software is vulnerable to CSRF you must realize that you have a bug in your software and fix it.

1. The form handler should refuse GET requests if it is doing significant things

Developers of server-side web applications must make sure that pages that do significant things refuse GET requests.

If you are trying to fix an already existing application there is often no other way but to go through all pages that do something significant (rather than merely return some information) and make them refuse GET requests.

Of course, this also means that if your own application uses <A HREF="...">...</A>-style links to access those pages, those links should be changed to forms with METHOD="POST".
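A minimal sketch of such a method check, written as a plain WSGI application in Python using only the standard library. The name delete_forum_app and the response texts are hypothetical, and the place where authentication and the actual deletion would happen is only marked with a comment.

```python
def delete_forum_app(environ, start_response):
    """Hypothetical handler for /adm/rm-rf that refuses anything but POST."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method != "POST":
        # 405 Method Not Allowed; the Allow header lists what is accepted.
        start_response(
            "405 Method Not Allowed",
            [("Allow", "POST"), ("Content-Type", "text/plain")],
        )
        return [b"This action requires POST.\n"]

    # ... authenticate the user and perform the significant action here ...
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Done.\n"]
```

With a check like this the request issued by the IMG tag is answered with 405 and nothing is deleted. A forged POST from the Click Here Attack still passes, so this check alone is not sufficient.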

2. Form handlers should check the origin of the form

One way to make sure that the request is coming from a form that was indeed issued by our system is to check the Referer header. However, some browsers and proxy servers remove or don't send this header, so this solution is limited.
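For illustration, a Referer check might look roughly like the following Python sketch. TRUSTED_HOSTS and referer_is_trusted are hypothetical names, and the choice to reject requests with a missing header is an assumption; it is exactly the choice that locks out users whose browser or proxy strips the header.

```python
from urllib.parse import urlsplit

# Assumption: the set of hosts that serve our own forms.
TRUSTED_HOSTS = {"example.com"}

def referer_is_trusted(referer):
    """Return True only if the Referer header names one of our own hosts.

    A missing or unparsable header returns False.
    """
    if not referer:
        return False
    try:
        parts = urlsplit(referer)
        host = parts.hostname
    except ValueError:
        # urlsplit rejects some malformed URLs, e.g. broken IPv6 brackets.
        return False
    return parts.scheme in ("http", "https") and host in TRUSTED_HOSTS
```

Comparing the parsed host name rather than searching the header for a substring matters: a Referer such as http://example.com.evil.org/ contains "example.com" but is not our site.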

A more robust solution is to generate a one-time random token and include it in a hidden field on the form:

<INPUT TYPE="hidden" NAME="onetimetoken" VALUE="D0E0F0TA9SEF089EJGVW0">
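As a rough sketch of how such a token and field could be produced in Python: the function names are hypothetical, and the token format (40 hexadecimal characters from the operating system's random source) is an assumption rather than the format of the example value above.

```python
import html
import secrets

def new_token():
    """Return a 40-character random hex token (20 random bytes)."""
    return secrets.token_hex(20)

def hidden_token_field(token):
    """Render the hidden form field that carries the token."""
    value = html.escape(token, quote=True)
    return '<INPUT TYPE="hidden" NAME="onetimetoken" VALUE="%s">' % value
```

The important property is that the attacker cannot guess the value, so the token must come from a cryptographically strong generator and not from something like a timestamp or a counter.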

You'll store all tokens issued so far in your database on the server. Here's a possible data model:

create table onetimetokens (
     token char(40) not null,
     user_id integer not null,
     form_url varchar(4000) not null,
     expiration_date date not null,
     primary key (token, user_id)
)

Part of the table might look like this:

token	user_id	form_url	expiration_date
D0E0F0TA9SEF089EJGVW0	332	/adm/rm-rf-form	2002-06-01
C0F234ZY8TAG0WN0YBVW0	332	/adm/approve-user-form	2002-06-03
A35BTK0DT0BRT0MLTNNPG	1442	/adm/rm-rf-form	2002-06-02
...	...	...	...

token: This is a randomly generated secret token
user_id, form_url: It's a good idea to include the user_id (or some other thing uniquely identifying the user) and form_url (the relative URL of the form that issued the token) to limit the opportunity for an attacker who might have access to a different part of the system to abuse tokens generated for him.
expiration_date: Although one-time tokens can be deleted or invalidated as soon as they are used, there will always be cases where we issue a token but the user never submits the form. These tokens can then be deleted or invalidated after they have expired.
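Issuing a token against this data model could be sketched as follows. This uses Python with an in-memory SQLite database purely for illustration; issue_token, the valid_days default and the ISO date strings are assumptions, not part of the model above.

```python
import secrets
import sqlite3
from datetime import date, timedelta

SCHEMA = """
create table if not exists onetimetokens (
    token char(40) not null,
    user_id integer not null,
    form_url varchar(4000) not null,
    expiration_date date not null,
    primary key (token, user_id)
)
"""

def issue_token(db, user_id, form_url, valid_days=14, today=None):
    """Generate a token, record the user and form it belongs to, return it."""
    if today is None:
        today = date.today()
    token = secrets.token_hex(20)
    # Dates are stored as ISO strings (YYYY-MM-DD), which sort chronologically.
    expires = (today + timedelta(days=valid_days)).isoformat()
    db.execute(
        "insert into onetimetokens (token, user_id, form_url, expiration_date) "
        "values (?, ?, ?, ?)",
        (token, user_id, form_url, expires),
    )
    db.commit()
    return token

# Called while rendering /adm/rm-rf-form for user 332.
db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
token = issue_token(db, 332, "/adm/rm-rf-form")
```

The returned token is what goes into the hidden onetimetoken field of the form sent to that user.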

The form handler would query this table to find out whether the token submitted with the POST request is among the tokens issued in the past for that user and form.
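A sketch of that lookup in Python with SQLite follows. token_is_valid is a hypothetical name, and treating a token as valid up to and including its expiration date is an assumption. The demo table is filled with one row from the example above.

```python
import sqlite3
from datetime import date

def token_is_valid(db, token, user_id, form_url, today=None):
    """True if the token was issued to this user for this form and is unexpired."""
    if today is None:
        today = date.today()
    row = db.execute(
        "select 1 from onetimetokens "
        "where token = ? and user_id = ? and form_url = ? "
        "and expiration_date >= ?",
        (token, user_id, form_url, today.isoformat()),
    ).fetchone()
    return row is not None

# Demo data: an in-memory table holding one row of the example table.
db = sqlite3.connect(":memory:")
db.execute(
    "create table onetimetokens ("
    " token char(40) not null, user_id integer not null,"
    " form_url varchar(4000) not null, expiration_date date not null,"
    " primary key (token, user_id))"
)
db.execute(
    "insert into onetimetokens values (?, ?, ?, ?)",
    ("D0E0F0TA9SEF089EJGVW0", 332, "/adm/rm-rf-form", "2002-06-01"),
)
```

The user_id passed in should come from the authenticated session (the cookie), never from the submitted form, otherwise an attacker could supply his own user id together with his own token.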

Previously I called these tokens "one-time". That means that you can delete the corresponding row from onetimetokens as soon as the job is done. (Alternatively, mark the token as invalid with an additional flag column and physically delete rows in batches during off-peak times. Depending on which database you use, this might be better for performance.) In addition, you will need a garbage collector that periodically removes all tokens that were issued for forms the user never actually submitted.

Besides one-time tokens you might also allow tokens that can be used multiple times. That is useful for forms that people use to insert multiple similar records by using the Back button, changing only one or two fields and re-submitting the form. The reusable tokens can then be removed by the garbage collector. Note that this is not necessary if you have a confirmation page: in that case a new one-time token should be generated on the confirmation page.
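The one-time behaviour and the garbage collector could be sketched like this, again in Python with SQLite. consume_token and collect_garbage are hypothetical names, and combining the validity check and the delete into one statement is a design assumption, not something the data model requires.

```python
import sqlite3
from datetime import date

def consume_token(db, token, user_id, form_url, today=None):
    """Delete the matching unexpired token; return True if one existed.

    Checking and deleting in one statement means a second submission
    of the same token finds nothing to delete and is refused.
    """
    if today is None:
        today = date.today()
    cur = db.execute(
        "delete from onetimetokens "
        "where token = ? and user_id = ? and form_url = ? "
        "and expiration_date >= ?",
        (token, user_id, form_url, today.isoformat()),
    )
    db.commit()
    return cur.rowcount == 1

def collect_garbage(db, today=None):
    """Remove tokens that expired without being submitted; return the count."""
    if today is None:
        today = date.today()
    cur = db.execute(
        "delete from onetimetokens where expiration_date < ?",
        (today.isoformat(),),
    )
    db.commit()
    return cur.rowcount

# Demo data: the three rows of the example table.
db = sqlite3.connect(":memory:")
db.execute(
    "create table onetimetokens ("
    " token char(40) not null, user_id integer not null,"
    " form_url varchar(4000) not null, expiration_date date not null,"
    " primary key (token, user_id))"
)
db.executemany(
    "insert into onetimetokens values (?, ?, ?, ?)",
    [
        ("D0E0F0TA9SEF089EJGVW0", 332, "/adm/rm-rf-form", "2002-06-01"),
        ("C0F234ZY8TAG0WN0YBVW0", 332, "/adm/approve-user-form", "2002-06-03"),
        ("A35BTK0DT0BRT0MLTNNPG", 1442, "/adm/rm-rf-form", "2002-06-02"),
    ],
)
```

A form handler would call consume_token before doing its work and refuse the request when it returns False. A reusable token would instead be checked without deleting it and left for collect_garbage to remove after it expires.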

3. Your developers should know about all this stuff

After you have discovered the framework for doing this in your development environment (or, more likely, written one yourself because there was none), you have to make sure all your developers know how to use it. As you could see in the discussion above, this can't be accomplished with a magic module that you plug into your webserver and forget about. Depending on what the application does and how it does it, the method check, token generation and token invalidation have to happen at different places and times. Also, different forms might have different expiration times, depending on what they do. The developer has to be familiar with both the problem and the tools at his disposal in order to be able to make the right decision. Additionally, your QA personnel should know about this problem and include testing for vulnerabilities in their test plans.

A note about cookie-based authentication

Some people would argue that the existence of CSRF calls for replacing cookie-based authentication with purely URL-based authentication.

However, I'm against this for the following reasons: URL-based authentication has its own security risks caused by URLs being stored in logfiles, proxy caches and the like. It also has usability problems: the inability to take advantage of visited links being shown in a different color by the browser, problems with passing the URLs around, etc. The solution with security tokens proposed above is a kind of hybrid: you normally use cookie-based authentication (or HTTP Authentication), but in addition to that, for critical requests, you can use a secret token embedded in the POST request (or even in URLs).

Conclusion

CSRF attacks pose a severe security risk. People who produce server-side software for web services must make sure their systems are secure against this kind of problem. That includes fixing their current systems and educating their developers on how to write code that is immune to CSRF.

References

http://www.tux.org/~peterw/csrf.txt - an email exchange addressing this problem
RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1, section 9.1.1
CERT® Advisory CA-2000-02 Malicious HTML Tags Embedded in Client Web Requests - describes a related class of problems.
RFC 2617: HTTP Authentication: Basic and Digest Access Authentication

Branimir Dolicki