CSRF protection with 'self-validating' tokens

By crisp on Saturday 17 April 2010 01:10 - Comments (14)
Categories: PHP, Tweakers.net, Views: 15,415

Cross-site Request Forgery (CSRF) is a very common exploit that tricks people into unknowingly performing actions, under their own account, on a targeted website. It is number four on the 2010 CWE/SANS Top 25 Most Dangerous Programming Errors list.

The main reason this problem exists on most websites is that they don't check the origin of an incoming request that results in an action on that website. There are several ways a website can protect itself against this sort of attack, and I'm going to explain how we, at Tweakers.net, have implemented our own protection method.

I'm not saying that Tweakers.net is 100% safe against CSRF attacks; I know we're not, and a recent topic on our forum made that somewhat painfully clear again: with the redesign of our account pages we separated certain data into several forms, but forgot to add our token-based security measure to each of them.

Adding a special token to an HTML form is a way to prevent these CSRF attacks. Such tokens can be used to verify whether a request to change data on your website actually originated from one of your own pages and thus may be considered 'valid'. We have been using our own token system for quite some time and in various (but not all) places on our website, but our frontpage token system suffered from some serious drawbacks.

On our forum we use the user's session ID as a token for forms that can be used to manipulate data. This in itself is fine, but it means that session IDs are exposed in the markup, and as tokens they are not bound to anything (except perhaps the IP address of the originating request).

When we designed our token system for the frontpage we wanted more security: we wanted tokens that could be used only once and that were tied to a user's session ID (or IP address for users who are not logged in). So we simply set up a system that created random tokens and wrote them to the database with a creation time and the user's session ID or IP address. Tokens older than 24 hours were removed (and thus invalidated).

This worked well for a long time, but it eventually became a real performance bottleneck: inserting tokens at random positions (the random token ID being the primary key of the table) into a table that always held around 500,000 'active' keys sometimes slowed down page generation enormously.

When you have a table containing over half a million tokens of which only a very small percentage will actually be used, you are most certainly doing something wrong, so we decided to change that. We only wanted to store tokens that were actually used, to prevent re-use, and let all other tokens expire by themselves. The question was how to accomplish that without having to store those tokens. The answer was: creating tokens that can validate themselves against their use and expiration :)

The idea is simple: take the items that you want to be validated and hash them with a secret key. The result is the token used in the form, and server-side we can recreate that same token and compare it to the value received in the request. That takes care of almost everything, except for the expiration part. We need to be able to read the expiration 'plainly' from the token itself. Luckily, if you include the same expiration in the items that are being hashed, it is no problem to add it to the token in a readable format. We can also use this expiration as a timestamp when storing the tokens that have been used: together with the actual token it forms a unique primary key that is not random but rather sequential. We added a 6-byte random identifier so that tokens sharing the same expiration timestamp still differ.
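
The core of that idea can be sketched in a few lines. This is only an illustration, not the class shown further down: the names demoSign() and demoVerify(), the secret, and the example session and action values are all made up for this example, and it assumes PHP 7.0 or later.

```php
<?php
// Hypothetical secret for illustration; use a long random value in practice.
const DEMO_SECRET = 'replace with a long random secret';

// Sign the values the token should be bound to: expiry, session ID and action.
function demoSign(int $expires, string $sessionId, string $action): string
{
    $data = implode('|', [$expires, $sessionId, $action]);

    return hash_hmac('sha1', $data, DEMO_SECRET);
}

// Recreate the signature server-side and compare it to the received one.
function demoVerify(string $signature, int $expires, string $sessionId, string $action): bool
{
    if ($expires < time()) {
        return false; // expired
    }

    return hash_equals(demoSign($expires, $sessionId, $action), $signature);
}

$expires = time() + 7200;
$signature = demoSign($expires, 'session-abc', 'edit_profile');

var_dump(demoVerify($signature, $expires, 'session-abc', 'edit_profile'));      // bool(true)
var_dump(demoVerify($signature, $expires + 60, 'session-abc', 'edit_profile')); // bool(false): expiry was changed
var_dump(demoVerify($signature, $expires, 'session-xyz', 'edit_profile'));      // bool(false): other session
```

Nothing has to be stored to answer the question "did we hand out this token, and is it still valid?"; only the secret has to stay on the server.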

So in short, this is what we do:
  • determine expiration timestamp
  • create a random 6-byte string
  • hash together expiration timestamp + an action identifier (that binds the token to a certain action or page) + session ID or IP address, using a key consisting of the random 6-byte string and a secret key
  • combine the above three values into one single token
Even though the expiration timestamp is directly readable from the combined token, it cannot be changed, because a modified timestamp would produce a different hash upon validation. By adding an 'action' identifier we were also able to restrict tokens to a certain use. Because the expiration is part of the token we are more flexible in how long our tokens stay valid, and we have reduced that significantly compared to the 24 hours that our old token system used.
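
To make the layout concrete, here is a minimal sketch of how the three values can be combined and taken apart again, and why editing the readable expiry does not help an attacker. It mirrors the 6 + 4 + 20 byte layout of the class below, but the function names (demoBuildToken(), demoParseToken(), demoCheckToken()), the key and the '$binding' string are assumptions made for this example; it also assumes PHP 7.1 or later for random_bytes() and the nullable return type.

```php
<?php
// Hypothetical key for illustration only.
const DEMO_KEY = 'replace with a long random secret';

// Layout: 6-byte random prefix + 4-byte expiry + 20-byte HMAC-SHA1 = 30 bytes.
function demoBuildToken(string $binding, int $ttl = 7200): string
{
    $prefix  = bin2hex(random_bytes(6));                       // 12 hex characters
    $expires = time() + $ttl;
    $hmac    = hash_hmac('sha1', $expires . '|' . $binding, $prefix . DEMO_KEY);

    return base64_encode(pack('H*', $prefix . sprintf('%08x', $expires) . $hmac));
}

// Split a token back into its three parts, or return null if it is malformed.
function demoParseToken(string $token): ?array
{
    $raw = base64_decode($token, true);
    if ($raw === false || strlen($raw) !== 30) {
        return null;
    }

    $hex = bin2hex($raw);

    return [
        'prefix'  => substr($hex, 0, 12),
        'expires' => hexdec(substr($hex, 12, 8)),
        'hmac'    => substr($hex, 20),
    ];
}

// Recompute the HMAC from the readable parts and compare it to the embedded one.
function demoCheckToken(string $token, string $binding): bool
{
    $parts = demoParseToken($token);
    if ($parts === null || $parts['expires'] < time()) {
        return false;
    }

    $expected = hash_hmac('sha1', $parts['expires'] . '|' . $binding, $parts['prefix'] . DEMO_KEY);

    return hash_equals($expected, $parts['hmac']);
}

$token = demoBuildToken('session-abc|edit_profile');
var_dump(strlen($token));                                        // int(40)
var_dump(demoCheckToken($token, 'session-abc|edit_profile'));    // bool(true)

// Push the readable expiry one hour ahead without knowing the key.
$parts  = demoParseToken($token);
$forged = base64_encode(pack('H*', $parts['prefix'] . sprintf('%08x', $parts['expires'] + 3600) . $parts['hmac']));
var_dump(demoCheckToken($forged, 'session-abc|edit_profile'));   // bool(false)
```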

Here's a simplified version of our token class that illustrates this method:

PHP:
/*
CREATE TABLE `form_tokens` (
  `token_expires` int(10) unsigned NOT NULL,
  `token` binary(20) NOT NULL,
  PRIMARY KEY (`token_expires`,`token`)
) ENGINE=InnoDB;
*/

/**
 * Class to validate form-submits to prevent CSRF attacks.
 *
 * This allows us to check whether a user had earlier requested the form from us
 * and can also be used to prevent resubmitting a form. This specific implementation
 * uses "self validating" tokens, for more information, see this blogpost:
 * http://crisp.tweakblogs.net/blog/3928/csrf-protection-with-self-validating-tokens.html
 *
 * To simply get a token to use 'somewhere', use FormToken::generateToken(),
 * to get a bit of ready made html, use FormToken::generateTokenHtml(). 
 */
class FormToken
{
    const defaultTtl = 7200;
    const formTokenName = 'token';
    const tokenKey = 'this is my little secret';
    
    private $action;
    private $token;

    private $tokenError;

    /**
     * Construct a new FormToken instance used for validating a submitted form token.
     *
     * @param string|bool $action The action to use for validation, use true to fall back to the current script's filename.
     * @param string|bool $token The token to use for validation, use false to indicate retrieving it from the HTTP request parameters.
     */
    public function __construct($action=false, $token=false)
    {
        $this->action = $action === true ? $this->getCurrentAction() : $action;
        $this->token = $token === false ? $this->getTokenFromRequest() : $token;
    }

    /**
     * Generate a token for a specific form.
     *
     * This ties the generated token to the given action, so it can't be reused
     * outside the specific form. Leaving the action empty, both here and while
     * validating, lifts that restriction.
     *
     * @param string|bool $action The form-action where this token is placed in, use true to use the current script's filename.
     * @param int $ttl The number of seconds the token stays valid, counted from the moment it is generated.
     * @return string A base64 encoded string representing the token data.
     */
    public static function generateToken($action=false, $ttl=self::defaultTtl)
    {
        if ($action === true)
            $action = self::getCurrentAction();

        $timeValid = time() + $ttl;

        $prefix = self::getRandomHexString(6);
        $timeHex = sprintf('%08x', $timeValid);

        $hmac = self::getTokenHmac($prefix, $timeValid, $action);

        return base64_encode(pack('H*', $prefix . $timeHex . $hmac));
    }

    /**
     * Generate a token wrapped in a ready-made hidden input HTML-element.
     *
     * @param string|bool $action The form-action where this token is placed in, use true to use the current script's filename.
     * @param int $ttl The time to live.
     * @return string A string representing a hidden input with a new token. 
     * @see generateToken()
     */
    public static function generateTokenHtml($action=false, $ttl=self::defaultTtl)
    {
        $token = self::generateToken($action, $ttl);

        return '<input type="hidden" name="' . self::formTokenName . '" value="' . $token . '">';
    }

    /**
     * Validate the token held by this instance.
     *
     * This method may also be used to actually prevent reuse of the token, by storing it in the database.
     *
     * @param bool $storeToken Whether to store the token to prevent reuse.
     * @return bool Whether the token was valid and not used previously.
     */
    public function validateToken($storeToken=false)
    {
        if (empty($this->token))
        {
            $this->tokenError = 'token_missing';
            return false;
        }

        if (strlen($this->token) == 40)
        {
            $hex = bin2hex(base64_decode($this->token, true));
            if ($hex)
            {
                // Reconstruct the token's elements
                $timeValid = hexdec(substr($hex, 12, 8));

                if ($timeValid < time())
                {
                    $this->tokenError = 'token_expired';
                    return false;
                }

                $shortToken = substr($hex, 20);
                $hmac = self::getTokenHmac(substr($hex, 0, 12), $timeValid, $this->action);

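                // Constant-time, string-only comparison (PHP 5.6+). A loose == on hex
                // strings can treat two different "0e..." digests as equal numbers.
                if (hash_equals($hmac, $shortToken))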
                {
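                    // Write token to DB to prevent re-use.
                    // Note: the legacy mysql_* functions were removed in PHP 7.0;
                    // port this query to mysqli or PDO on current PHP versions.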
                    $tokenValid = !$storeToken || @mysql_query("
                        INSERT INTO
                            form_tokens
                        (
                            token_expires,
                            token
                        )
                        VALUES
                        (
                            " . $timeValid . ",
                            '" . mysql_real_escape_string(pack('H*', $shortToken)) . "'
                        )"
                    );

                    if (!$tokenValid) // duplicate key
                        $this->tokenError = 'token_already_used';

                    return $tokenValid;
                }
            }
        }

        $this->tokenError = 'token_invalid';
        return false;
    }

    /**
     * Get the reason why the token was considered invalid.
     *
     * @return string The reason.
     */
    public function getTokenError()
    {
        return $this->tokenError;
    }

    /**
     * Generate a hex-encoded HMAC-SHA1 string with the given prefix and some additional identifying elements.
     *
     * This mainly uses the user's session ID or IP address to prevent token-sharing.
     *
     * @param string $prefix A (preferably random) prefix for the token generation.
     * @param int $time The time until which the token will be valid.
     * @param string $action The form-action to encode with the token.
     * @return string The final HMAC as a 40-character hex string.
     */
    private static function getTokenHmac($prefix, $time, $action)
    {
        $data = array($time);

        $data[] = session_id() ? session_id() : $_SERVER['REMOTE_ADDR'];

        if ($action)
            $data[] = $action;

        /* you might want to add user agent as well
        if (!empty($_SERVER['HTTP_USER_AGENT']))
            $data[] = $_SERVER['HTTP_USER_AGENT'];
        */

        $key = $prefix . self::tokenKey;
        $data = implode('|', $data);

        // hash_hmac() performs the same HMAC-SHA1 construction (key hashing and
        // padding, ipad/opad) and returns the digest as a hex string.
        return hash_hmac('sha1', $data, $key);
    }

    /**
     * Retrieve a random hex-encoded string.
     *
     * @param int $byteCount The number of bytes for the return string.
     * @return string A random hex-encoded string.
     */
    private static function getRandomHexString($byteCount)
    {
        $byteString = '';

        while ($byteCount--)
            $byteString .= sprintf('%02x', mt_rand(0, 255));

        return $byteString;
    }

    /**
     * Retrieve the current name of the PHP script file.
     *
     * @return string The name of the current PHP script file.
     */
    private static function getCurrentAction()
    {
        return $_SERVER['SCRIPT_NAME'];
    }

    /**
     * Retrieve a token from the POST or otherwise GET data.
     *
     * @param string $tokenName The token's html-element name in the form.
     * @return bool|string The token to retrieve or false if none was found.
     */
    private function getTokenFromRequest($tokenName=self::formTokenName)
    {
        if (isset($_POST[$tokenName]))
            return $_POST[$tokenName];

        if (isset($_GET[$tokenName]))
            return $_GET[$tokenName];

        return false;
    }
}
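
The getRandomHexString() helper in the class relies on mt_rand(), which is not a cryptographically secure generator. A minimal sketch of an alternative, assuming PHP 7.0 or later where random_bytes() is available, could look like this; the name demoRandomHexString() is made up for this example and is not part of the class above.

```php
<?php
// Hypothetical replacement for getRandomHexString(), assuming PHP 7.0+.
// random_bytes() draws from the operating system's secure random source.
function demoRandomHexString(int $byteCount): string
{
    return bin2hex(random_bytes($byteCount));
}

$prefix = demoRandomHexString(6);

var_dump(strlen($prefix));                         // int(12): 6 bytes, two hex characters each
var_dump(preg_match('/^[0-9a-f]{12}$/', $prefix)); // int(1)
```

As in the original helper, asking for 6 bytes yields a 12-character string, because every byte is written as two hex digits.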


To add a token to your HTML form, just add the output from
FormToken::generateTokenHtml([action]);
somewhere within the form itself, where [action] is an identifier for the current action (you can pass the boolean 'true' to fall back to the current script's name). To validate, you can do something like this:

PHP:
$formToken = new FormToken([action]);
if (!$formToken->validateToken(true))
{
    switch ($formToken->getTokenError())
    {
        case 'token_missing':
            echo 'There was no token in the request.';
        break;

        case 'token_expired':
            echo 'The token has expired.';
        break;

        case 'token_already_used':
            echo 'The token has already been used.';
        break;

        case 'token_invalid':
        default:
            echo 'The token was invalid.';
    }
}
else
{
    // proceed
}
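
The 'token_already_used' case depends on the primary key of form_tokens rejecting a second INSERT of the same (token_expires, token) pair. The sketch below mimics that behaviour with an in-memory array so it can be run without a database. DemoUsedTokenStore is a hypothetical stand-in, and the prune() step is an assumption on my part: the class above does not show how expired rows are cleaned up.

```php
<?php
// In-memory stand-in for the form_tokens table (illustration only).
final class DemoUsedTokenStore
{
    /** @var array<string, int> "expires:hmac" => expiry timestamp */
    private $used = [];

    // Returns true the first time a token is seen and false on re-use,
    // like an INSERT that fails on a duplicate primary key.
    public function markUsed(int $expires, string $hmacHex): bool
    {
        $key = $expires . ':' . $hmacHex;

        if (isset($this->used[$key])) {
            return false;
        }

        $this->used[$key] = $expires;

        return true;
    }

    // Drop entries whose expiry has passed; such tokens no longer validate
    // anyway. Returns the number of removed entries.
    public function prune(int $now): int
    {
        $before = count($this->used);

        $this->used = array_filter($this->used, function ($expires) use ($now) {
            return $expires >= $now;
        });

        return $before - count($this->used);
    }
}

$store   = new DemoUsedTokenStore();
$expires = time() + 60;

var_dump($store->markUsed($expires, 'abc123')); // bool(true): first use
var_dump($store->markUsed($expires, 'abc123')); // bool(false): replay
var_dump($store->prune($expires + 1));          // int(1): expired entry removed
```

Because the expiry is the leading part of the key, removing old entries in a real table only touches the oldest end of the index.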


You may use this code in any way you like; it is not licensed in any way. Please help make the web a safer place!


Comments


By Wouter, Saturday 17 April 2010 09:01

If a malicious individual gains enough control over a victim's browser to POST requests, wouldn't they easily be able to do a GET first and obtain a perfectly valid token to do their nasty business with?

By Wouter, Saturday 17 April 2010 09:46

Never mind, apparently GET requests have much better XSS protection than POST ones, which creates the need for developers to apply tricks like this one.

By Tweakers user Voutloos, Saturday 17 April 2010 10:49

@Wouter: Nope, they don't have better protection (less actually), but the attacker doesn't get to see the responses. So you could trigger some requests, but since GET is idempotent it doesn't hurt anybody.

And beware, XSS != CSRF

By Tweakers user ACM, Saturday 17 April 2010 10:51

Wouter: if browser security is indeed breached that badly, it is quite hard to do anything about it. If you have enough control to do a GET request, analyze the page contents and then do a POST request, there is simply no way of distinguishing it from actual usage.
This is not the kind of security issue we're protecting against with these tokens. But if someone constructs a POST form (for instance with JavaScript) and submits "some" action to your site, you can prevent that, since the POST request will lack a valid token for the specific browser/user.

(It's somewhat ironic that a spam reaction occurred here... let's remove that one.)

[Comment edited on Saturday 17 April 2010 10:51]


By Tweakers user Cheatah, Saturday 17 April 2010 10:51

No Wouter, you're wrong. The "problem" is that with a CSRF attack, an attacker can only let the victims submit a single request. They don't get the "return page" that the victim receives. So the attacker is pretty much blind. This is exploitable in combination with another XSS exploit which makes it possible to return information to the attacker, but by itself this is an excellent method.

@crisp, all of this reminds me a bit of Kerberos. It might be interesting for you to have a look at that. It may inspire you with some new ideas.

By JensDT, Saturday 17 April 2010 13:09

In a 'Safe Software' course I took at university last semester we spent some time discussing CSRF. The course teachers also developed a Firefox plugin that provides some extra protection from CSRF by stripping HTTP authentication headers and session cookies from cross-site requests. Unfortunately, this also causes some usability issues, certainly in a web 2.0 context.

Of course, preventing CSRF is ultimately the responsibility of the website owners, not the users. But unfortunately we all know that not all web applications are as secure as they should be...

The link for those interested: https://addons.mozilla.org/nl/firefox/addon/58189

By Wouter, Saturday 17 April 2010 15:23

Hehe, three people explaining the same thing within a few minutes, I guess I didn't word my second post very well ;).

The attacker being "blind" is what I meant by XSS protection against GET requests, as in: something the browser does to keep me safe. Someone can always request a page as if they were me, but unless there are serious exploits in the browser, or the server software is not so "idempotent" as it should be, that will not matter.

But apparently there is no such protection that the browser gives me against POST. Would it not be pretty straightforward to check form targets against the same origin policy and get rid of this CSRF problem? I imagine this change would break at least some legacy applications, but they could at least add a warning, or perhaps make the switch for HTML5 mode only.

By DuoCoding, Sunday 18 April 2010 19:42

Crisp, first of all, a great blog post! But I want to point out a small error in your code. The function "getRandomHexString($byteCount)" will return a string with twice the specified length, because two random chars are concatenated in the main loop. I think you meant to append a single random char instead of two.

PHP code:
while ($byteCount--)
$byteString .= $hexstring[mt_rand(0, 15)] . $hexstring[mt_rand(0, 15)];

By Tweakers user crisp, Sunday 18 April 2010 20:27

@DuoCoding: no, the function is correct; 6 bytes hex-encoded will give you a string of 12 characters.

However, I can see how this function is confusing so I replaced it with a simpler (and slightly faster) function (at Tweakers.net we read from /dev/urandom, but since that's not supported cross platform I wrote this drop-in replacement function).

By w3cvalidation, Thursday 29 April 2010 15:19

Nice information, I really appreciate the way you presented it. Thanks for sharing.

By Tweakers user Cartman!, Saturday 31 July 2010 23:11

While typing and submitting a comment I get "The security token is invalid", so apparently it isn't 'voutloos' (flawless) after all. A moment ago I had to enter my details, and now I'm suddenly logged in with my GoT account. Bugs hide in small corners.

But otherwise: nice article, hopefully many people will adopt this good idea.

By Tweakers user crisp, Monday 2 August 2010 09:07

Cartman! wrote on Saturday 31 July 2010 @ 23:11:
While typing and submitting a comment I get "The security token is invalid", so apparently it isn't 'voutloos' (flawless) after all. A moment ago I had to enter my details, and now I'm suddenly logged in with my GoT account. Bugs hide in small corners.
That your token is no longer valid is a consequence of being logged out; your session then no longer matches, and the session is part of the token. What caused you to be logged out is another question; that can have several causes, and it usually points to a client-side issue...

By Tweakers user Cartman!, Monday 2 August 2010 16:12

crisp wrote on Monday 02 August 2010 @ 09:07:
[...]

That your token is no longer valid is a consequence of being logged out; your session then no longer matches, and the session is part of the token. What caused you to be logged out is another question; that can have several causes, and it usually points to a client-side issue...
Before that I had simply been browsing and posting on GoT. I followed the link in PRG and ended up here. At the bottom of the page it was not detected that I was logged in, because I had to enter a whole bunch of details. When I tried to post my reply I got the error message. Then I suddenly got "Your comment will be posted under Cartman!" etc. That the token didn't work for that reason makes sense: the form apparently created a token without my session ID but with my IP address, after which it failed. It doesn't really matter, but something wasn't quite right somewhere ;)

By Tweakers user OnTracK, Friday 10 September 2010 11:56

I wonder whether you could use add_rewrite_var (PHP) to automatically add this sort of thing to a form. The validation could then also be done automatically in a small framework.
