RFD 169
Console Authentication and Session Management

Per [rfd223], the web console is built as a static JS bundle served by Nexus. This means that it makes API requests directly from the browser. This RFD is about how we authenticate those requests. [owasp-session] is an excellent resource and is well worth reading for background. The determinations below largely represent things that have already been implemented in Nexus at the time of this revision.


Session cookie

In order to authenticate requests from the Console running in the user’s browser, Nexus must accept session cookies as an authentication method. The value of the cookie is a simple random string (see [owasp-session]) that points to at most one row in a sessions table. A session row contains a user ID foreign key. If a request comes in with a session cookie that points to a session in the DB that is not expired, the user is authenticated and identified as themselves.
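
The lookup described above can be sketched as follows. This is a minimal illustration, not the Nexus implementation: the in-memory `SESSIONS` dict stands in for the sessions table, the names `Session`, `create_session`, and `authenticate` are hypothetical, and the 32-byte token length is an assumption. Expiry is deliberately left out here.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Session:
    token: str
    user_id: str  # plays the role of the user ID foreign key


# Stand-in for the sessions table. Keying by token means a cookie
# value can point to at most one row.
SESSIONS: Dict[str, Session] = {}


def create_session(user_id: str) -> Session:
    # 32 random bytes, hex-encoded. The length is an assumption.
    token = secrets.token_hex(32)
    session = Session(token=token, user_id=user_id)
    SESSIONS[token] = session
    return session


def authenticate(cookie_value: Optional[str]) -> Optional[str]:
    """Return the user ID for a known session cookie, else None."""
    if cookie_value is None:
        return None
    session = SESSIONS.get(cookie_value)
    return session.user_id if session else None
```

The token carries no meaning of its own; everything about the session lives in the row it points to.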

Security attributes

The session cookie is set with the Secure, HttpOnly, and SameSite=Lax attributes and an expiration time, and without an explicit domain. Details are in the [security] section below.
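
As a rough sketch of what that attribute set looks like on the wire, the following builds a Set-Cookie value with Python's standard `http.cookies` module. The cookie name `session`, the `Path=/` attribute, and the function name are assumptions for illustration only.

```python
from http.cookies import SimpleCookie


def session_cookie_header(token: str, max_age_seconds: int) -> str:
    """Build the value of a Set-Cookie header for the session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = token  # cookie name is an assumption
    morsel = cookie["session"]
    morsel["secure"] = True      # only sent over HTTPS
    morsel["httponly"] = True    # not readable from JavaScript
    morsel["samesite"] = "Lax"
    morsel["max-age"] = max_age_seconds
    morsel["path"] = "/"         # assumption
    # No "domain" attribute is set, so the browser treats the cookie
    # as host-only and does not send it to subdomains.
    return morsel.OutputString()
```

For example, `session_cookie_header("abc123", 3600)` yields a string that begins with `session=abc123` and carries the `Secure`, `HttpOnly`, `SameSite=Lax`, and `Max-Age=3600` attributes.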

We mitigate CSRF with SameSite=Lax plus no mutations in GET requests (which we’re already doing). Because of gaps in browser support and SameSite’s weakness against subdomains, we combine it with a session-scoped CSRF token: put the token in the HTML response (or a non-HttpOnly cookie) and have API requests send it back in a custom header. A custom header with an arbitrary placeholder value (no token) may actually be sufficient due to browser restrictions on custom headers from other sites.

Behavior for bad requests

If the session is expired or nonexistent, Nexus’s response depends on whether the request is an API request or a console route request. For API requests, we respond with a 401 and the client decides what to do with that (in most cases we will likely redirect to a login page). But for console pages, we will want to return a redirect directly to the customer’s auth provider as indicated in the login flow diagram below.
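
The branching described above might look roughly like this. It is a sketch under assumptions: that API routes can be told apart by an `/api/` path prefix, that the redirect uses a 302, and that `Response` and `unauthenticated_response` are hypothetical names rather than anything in Nexus.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


def unauthenticated_response(path: str, idp_login_url: str) -> Response:
    """Pick a response for a request whose session is expired or missing."""
    # Assumption: API routes live under /api/.
    if path.startswith("/api/"):
        # The client decides what to do with the 401.
        return Response(status=401)
    # Console page: send the browser straight to the auth provider.
    return Response(status=302, headers={"Location": idp_login_url})
```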

Expiration and extension

As recommended in [owasp-session], there are two different TTLs that can cause a session to expire: an idle timeout and an absolute timeout. Precise numbers for these do not need to be determined here. They are configurable in Nexus.

Idle timeout is meant to be short, on the order of 30 minutes or an hour. Idle time is measured since the last successful use of the session. If the user does not do anything to trigger an authenticated request for the length of the idle timeout, the session expires. Upon successful use of a session cookie for authentication, the time of last use for that session is updated to now in the database.

Absolute timeout is a bound on the total lifetime of a session, so it is measured from the time of session creation rather than time of last use. If a request comes in and time created was longer ago than the absolute timeout, the session is considered expired. It limits how long a session can be extended for. It is meant to be longer than the idle timeout.
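
The two timeouts can be sketched together as below. The one-hour idle value comes from the range mentioned above; the eight-hour absolute value is purely an assumption, as are the names `Session`, `is_expired`, and `use_session`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

IDLE_TIMEOUT = timedelta(hours=1)      # within the range suggested above
ABSOLUTE_TIMEOUT = timedelta(hours=8)  # assumption, not a decided value


@dataclass
class Session:
    time_created: datetime
    time_last_used: datetime


def is_expired(session: Session, now: datetime) -> bool:
    # Idle time is measured from the last successful use.
    idle_expired = now - session.time_last_used > IDLE_TIMEOUT
    # Absolute time is measured from creation and is never extended.
    absolute_expired = now - session.time_created > ABSOLUTE_TIMEOUT
    return idle_expired or absolute_expired


def use_session(session: Session, now: datetime) -> bool:
    """Authenticate with the session; on success, extend the idle window."""
    if is_expired(session, now):
        return False
    session.time_last_used = now
    return True
```

Note that only `time_last_used` moves forward on use, so steady activity keeps a session alive until the absolute timeout cuts it off.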

Hard deletion and cleanup

For now we are hard-deleting sessions when they are found to be expired. For sessions that expire without a request actually coming in to trigger a deletion, we will also run a regular job to delete old sessions. Details TBD.
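
Since the details are TBD, the following is only one possible shape for the cleanup job, using SQLite as a stand-in database. The table name, column names, integer-seconds timestamps, and timeout values are all assumptions made for the sake of a runnable example.

```python
import sqlite3

IDLE_TIMEOUT_SECONDS = 3600          # assumption
ABSOLUTE_TIMEOUT_SECONDS = 8 * 3600  # assumption

# Hypothetical schema; timestamps are seconds since the epoch.
SCHEMA = """
CREATE TABLE session (
    token TEXT PRIMARY KEY,
    time_created INTEGER NOT NULL,
    time_last_used INTEGER NOT NULL
)
"""


def delete_expired_sessions(conn: sqlite3.Connection, now: int) -> int:
    """Hard-delete sessions past either timeout; return the number removed."""
    cursor = conn.execute(
        "DELETE FROM session WHERE time_last_used < ? OR time_created < ?",
        (now - IDLE_TIMEOUT_SECONDS, now - ABSOLUTE_TIMEOUT_SECONDS),
    )
    conn.commit()
    return cursor.rowcount
```

A periodic task would call `delete_expired_sessions` with the current time; how often it runs is one of the open details.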


Logging in to the console

This diagram illustrates what happens when a user logs in. I’m using an OAuth 2.0 auth code flow as an example, but it’s going to look similar regardless of the protocol. The mechanism by which we redirect to the original target page (the last step in the flow) is discussed below in [login-redirect].

[Diagram: login flow]

Logged in console request

Now that the user is logged in, they can make the same request to /private_route again, but this time it will go through.

[Diagram: logged in request]


Cross-site request forgery (CSRF)

If a user is logged in to the console and has a session cookie set in the browser, that cookie will be sent along with any requests to Nexus. CSRF attacks trick the user into sending a request to Nexus from some other site, usually by setting a URL of ours as the action on a form embedded in a page.

SameSite cookie attribute

SameSite=Lax: Cookies are not sent on normal cross-site subrequests (for example to load images or frames into a third party site), but are sent when a user is navigating to the origin site (i.e., when following a link).

SameSite=Strict: Cookies will only be sent in a first-party context and not be sent along with requests initiated by third party websites.

SameSite cookies

The SameSite cookie attribute neutralizes CSRF more or less completely by telling the browser to only send the cookie along with requests that originate from our site. We will want to use the Lax value so that when a user clicks a link to the console from somewhere else, their session cookie is sent along with that request and we do not redirect them to login.

However, there are two problems with SameSite: browser support and subdomains.

Browser support is high but not quite at 100%: you basically have to be using a 2-3 year old browser to not have it. Some devs are not comfortable relying on it yet. Virtually all of our users will be using a browser that supports it, but it’s always possible that one might not be. So we should use the SameSite attribute alongside some other mitigation. This StackOverflow answer makes the clever point that browsers supporting TLS 1.3 all support SameSite cookies, so one way to guarantee SameSite is supported is to disable TLS 1.2 in Nexus. Browser support for TLS 1.3 is pretty good, so this is actually a live possibility.

The other problem is that the "site" in question is only the registrable domain (the public suffix plus one label, e.g. example.com), not the full hostname. This means that SameSite offers no protection against CSRF attacks from other subdomains of the same site. Because we will not control the domain Nexus gets served through, the customer may well host other sites we cannot trust at subdomains of the same registrable domain as the rack. For this reason we should use SameSite=Lax in combination with tokens as described in the next section.

CSRF token

The traditional approach to mitigating CSRF is to embed a special token in the page to send along with form posts and refuse any request missing such a token, as POSTs from third party sites would not have it. The traditional way of doing this with server-rendered web apps is to put a CSRF token inside each <form> as a hidden input. That way every instance of every form gets its own token. But given that the console is rendered client-side, it’s hard to do that: we would have to ask the API for a token every time we render a form.

An easier approach for a single-page app is a single CSRF token for the entire session. It can be stored in a column on the session table and sent down in the HTML on initial pageload. When the console makes an API request, it can stick the token in a special header. This is slightly less secure than per-page or per-form tokens because the token has a longer lifetime, but considering that a custom header may be sufficient even without a token (see next section) it seems like a good middle ground.

If the CSRF token is a random token that’s generated alongside the session token and sent along with every request, why do you need it in addition to the session token? The key is that it doesn’t live in a cookie and therefore it is not automatically sent by the browser with every request to Nexus. That automatic sending-along is the root of the CSRF vulnerability. Malicious sites cannot send the CSRF token because they can’t put custom headers on requests to our domain from JS. They also cannot access the token at all because it only shows up in the response to a GET request, and the same-origin policy prevents third-party scripts from reading that response unless our CORS configuration allows it.

The CSRF token can also be sent down to the client in a non-HttpOnly cookie (so it can be accessed from JS), which sounds like it shouldn’t work, but it does as long as the server doesn’t take it back as a cookie: it still has to be in a header or form post when you send a request back. Because it’s a cookie it will still be sent along with all requests, but the server must ignore the cookie copy. And because it’s scoped to your site, other sites cannot access it from JS.
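
A session-scoped token check along these lines could be sketched as follows. The header name `x-csrf-token` is hypothetical, header names are assumed to arrive lowercased, and the function names are made up for the example.

```python
import hmac
import secrets
from typing import Mapping

CSRF_HEADER = "x-csrf-token"  # hypothetical header name


def new_csrf_token() -> str:
    """Generate the token stored alongside the session at creation time."""
    return secrets.token_hex(32)


def csrf_ok(session_csrf_token: str, headers: Mapping[str, str]) -> bool:
    """Accept only if the request header carries the session's token.

    Only the header is consulted. A copy of the token arriving in a
    cookie is ignored, since cookies are attached automatically.
    """
    sent = headers.get(CSRF_HEADER)
    if sent is None:
        return False
    # Constant-time comparison to avoid leaking the token via timing.
    return hmac.compare_digest(sent.encode(), session_csrf_token.encode())
```

The constant-time comparison is a design choice of this sketch, not something the text above requires.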

Custom header

We may be able to avoid CSRF tokens altogether by using custom request headers.

An alternate defense that is particularly well suited for AJAX or API endpoints is the use of a custom request header. This defense relies on the same-origin policy (SOP) restriction that only JavaScript can be used to add a custom header, and only within its origin. By default, browsers do not allow JavaScript to make cross origin requests with custom headers.
Cross-Site Request Forgery Prevention Cheat Sheet

This seems good enough to me and is trivial to implement. All you have to do is look for the header on all API requests from the console, which can be identified by their use of session cookie authentication. Note also:

CORS configuration should also be robust to make this solution work effectively (as custom headers for requests coming from other domains trigger a pre-flight CORS check).
Cross-Site Request Forgery Prevention Cheat Sheet
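
A minimal sketch of that check, assuming a hypothetical header name `x-console-request`, a session cookie named `session`, and header names that arrive lowercased:

```python
from typing import Mapping

REQUIRED_HEADER = "x-console-request"  # hypothetical header name
SESSION_COOKIE = "session"             # hypothetical cookie name


def passes_custom_header_check(
    headers: Mapping[str, str], cookies: Mapping[str, str]
) -> bool:
    """Require the custom header on requests authenticated by session cookie."""
    if SESSION_COOKIE not in cookies:
        # Not a console request; other auth methods are not subject
        # to this check.
        return True
    # The value is an arbitrary placeholder; only presence matters.
    return REQUIRED_HEADER in headers
```

On the console side, this amounts to having the API client attach the header to every request it makes.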

Cross-site scripting (XSS)

XSS is mostly about input sanitization and validation and not about cookies (see [owasp-xss]), but setting the HttpOnly attribute on the session cookie (and any others you don’t need client-side) is a simple way to prevent malicious scripts from stealing the session token.

Other considerations

Redirecting to target page after login

The basic idea is you have to persist the target URL somewhere while you do the login rigmarole, and then retrieve the target URL at the end in order to go there. [login-redirect-auth0] covers the options in detail. The short version is either you store it in the browser, in a cookie or in a web storage thing like sessionStorage, or in the state param on the OAuth request. Cookie/web storage is easier and more protocol agnostic (if we’re supporting OAuth from one place and SAML from another, for example) but it might not always work, for example if the user’s browser is overzealous in blocking cookies. The state param is more work to implement (for one thing, it probably needs to be implemented for each auth protocol) but is guaranteed to work. This is a pretty small detail and nothing is blocked by uncertainty about it, so we can figure this out at implementation time.
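
For the state param option, one possible encoding is sketched below: the target path is packed with a random nonce into a URL-safe string that can ride along on the auth request and be unpacked on return. The JSON layout and the function names are assumptions, and the sketch does not cover checking the returned state against a stored value.

```python
import base64
import json
import secrets


def encode_state(target_path: str) -> str:
    """Pack the original target path into an opaque, URL-safe state value."""
    payload = {"nonce": secrets.token_urlsafe(16), "target": target_path}
    raw = json.dumps(payload).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_target(state: str) -> str:
    """Recover the target path from a state value made by encode_state."""
    payload = json.loads(base64.urlsafe_b64decode(state.encode()))
    return payload["target"]
```

The cookie or sessionStorage option needs no encoding at all, which is part of why it is the easier of the two.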