Your OAuth access token doesn't know who you are
Engineers often implement OAuth 2.0 in the confident belief that they can identify users from access tokens. This is a mistake as there’s no guarantee that the holder of an access token is who they say they are. In this sense, OAuth 2.0 is not an authentication protocol.
This follows from the difference between authentication and authorization. Authentication is the process of verifying a user’s identity, often by checking credentials such as a username and password. Authorization is the process of verifying what that user is allowed to do, i.e. their permissions.
This distinction matters because the two usually call for separate solutions that scale very differently. Authentication is typically a one-time check that lets a user into a system, while some form of authorization check should happen every time a user requests a resource.
OAuth 2.0 access tokens just represent delegated authorization to use a resource. They do not represent a user or provide any guarantee of authenticated identity. All they are saying is that a request is authorized for the audience and scope defined in the token.
This is more than just security pedantry. There is a very real and underappreciated risk of impersonation here. An access token says nothing about who is holding it. It just says that whoever holds it has been granted authorization for a particular scope and audience.
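As a rough sketch of what that means for a resource server, the check below looks only at audience and scope. The function name, the claim values and the space-delimited "scope" claim are assumptions made for illustration, and the sketch assumes the token's signature and expiry have already been verified elsewhere.

```python
from typing import Any, Mapping


def is_request_authorized(
    claims: Mapping[str, Any], expected_audience: str, required_scope: str
) -> bool:
    """Return True if already-verified token claims permit this request.

    Assumes the token's signature and expiry were checked beforehand.
    """
    # "aud" may be a single string or a list of strings.
    audience = claims.get("aud", [])
    if isinstance(audience, str):
        audience = [audience]
    # "scope" is assumed here to be a space-delimited string.
    granted_scopes = str(claims.get("scope", "")).split()
    return expected_audience in audience and required_scope in granted_scopes


# Hypothetical claims: nothing in here says who is holding the token.
claims = {
    "iss": "https://auth.example.com",
    "aud": "orders-api",
    "scope": "orders:read orders:write",
}
print(is_request_authorized(claims, "orders-api", "orders:read"))   # True
print(is_request_authorized(claims, "billing-api", "orders:read"))  # False
```

The decision is made entirely from what the token permits. At no point does the resource server learn, or need to learn, who presented it.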
There doesn’t even have to be a user involved in issuing an access token. In machine-to-machine scenarios a token server can issue access tokens without any user authentication taking place. The token grants authorization to perform actions, but there may not be any human identity associated with it at all.
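To make the machine-to-machine case concrete, here is a minimal sketch of the form body for a client credentials grant. The client ID, secret and scope are made-up values, and many token servers expect the client credentials in an HTTP Basic header rather than in the body, so treat this as one possible shape rather than a recipe.

```python
from urllib.parse import urlencode


def build_client_credentials_request(client_id: str, client_secret: str, scope: str) -> str:
    """Build the form-encoded body for a client credentials token request."""
    return urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,
        }
    )


# Hypothetical service credentials: there is no username, password or login page here.
body = build_client_credentials_request("reporting-service", "s3cret", "reports:read")
print(body)
```

The request carries a service's credentials and a scope, and nothing about a person. Any access token issued in response cannot say anything about a person either.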
Access tokens are not for client applications
One problem here is that OAuth 2.0 is a very loose protocol that is largely silent on the structure and format of access tokens. It doesn’t even mandate JSON Web Tokens (JWT). There’s no contract here, so no guarantee that a client will be able to read a token. The token format may change at any time and the implementation will still be perfectly OAuth 2.0 compliant.
This makes sense when you understand that access tokens are not intended for client applications. The protocol does not expect a client to read an access token and try to extract identity information from it. A client should just retrieve one from the token server and submit it to a resource server as part of an API request.
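In code, treating the token as opaque might look something like the sketch below. The URL and token value are placeholders, and the function name is only illustrative.

```python
from urllib.request import Request


def build_api_request(url: str, access_token: str) -> Request:
    """Attach the access token as an opaque bearer credential."""
    # The token is passed through untouched: no decoding, no parsing.
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


request = build_api_request("https://api.example.com/orders", "opaque-token-value")
print(request.get_header("Authorization"))  # Bearer opaque-token-value
```

Because the client never looks inside the string, it keeps working if the token server switches formats, say from a JWT to a random reference value.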
Access tokens contain information that is useful to the resource that is being secured. They can tell the resource who issued the token, what authentication method was used, the scope of the token, and the intended audience. All this is useful information when you are trying to verify that the token is valid for a resource. It is not an invitation for the client to process the token.
Given that a client application is not the intended audience for an access token, it has no reliable way of validating one. A client that extracts identity information from an access token is trusting data that was never meant for it. Even if the token is a JWT and appears readable, the client may not have enough information to validate it correctly. This creates opportunities for token substitution attacks, where a valid access token issued for some other purpose is mistaken for proof of a user's identity.
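The sketch below illustrates why "readable" is not the same as "trustworthy". It shows a hypothetical client helper that decodes a JWT payload without any validation, and a hand-assembled token that it happily accepts. All names and claim values are invented for the example.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def naive_read_claims(token: str) -> dict:
    """What a client should NOT do: decode the payload with no validation."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Anyone can assemble a JWT-shaped string with arbitrary claims and a junk signature.
header = b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "alice", "aud": "some-other-api"}).encode())
token = f"{header}.{payload}.not-a-real-signature"

# The naive client now believes it is dealing with "alice".
print(naive_read_claims(token)["sub"])  # alice
```

Decoding succeeds even though the signature is nonsense and the audience is a different API. The same would be true of a genuine token that was issued to another application and replayed against this one.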
What problems does OpenID Connect solve?
If an implementation adds user profile claims to an access token (e.g. the user’s given name), it’s normally a sure sign that the access token is being used as a proxy for identity.
OpenID Connect provides a set of extensions that were added on top of OAuth 2.0 to solve this problem. The ID token is a one-way communication of identity between server and client. Validating this token provides a guarantee of authenticated identity. This is where you are supposed to put stuff like the email address, family name and given name.
OpenID Connect’s ID tokens are meant to be read by client applications - that’s their purpose. They are also mandated as JWTs so there is a more reliable contract between server and client.
These ID tokens are not a replacement for access tokens though. They are a one-shot mechanism that allows a server to reliably tell a client about an authenticated user. ID tokens should not be submitted as part of a request to an API. This is because they are only concerned with authentication rather than authorization. Access tokens are completely unaffected by OpenID Connect and remain the mechanism you use to authorize API requests.
Access tokens are not magically reframed by OpenID Connect so they can be used for identity. An access token still does not provide any proof that a user has logged in.
If a client wants to know the user’s identity it should obtain an ID token, verify it, extract the user profile information from it, and then discard it. Access tokens should still be used for any subsequent API requests, but the client should not try to read or verify the access token - this is a job for the resource server that receives the API request.
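A minimal sketch of the client-side checks might look like the following. It assumes the ID token's signature has already been verified against the issuer's keys (normally a job for a JWT library) and only covers the core claim checks: issuer, audience, expiry and nonce. The function name and all values are hypothetical.

```python
import time
from typing import Any, Mapping, Optional


def validate_id_token_claims(
    claims: Mapping[str, Any],
    expected_issuer: str,
    client_id: str,
    expected_nonce: str,
    now: Optional[float] = None,
) -> dict:
    """Check core ID token claims and return profile data for the client.

    Assumes the JWT signature was already verified against the issuer's keys.
    """
    now = time.time() if now is None else now
    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")
    # "aud" may be a single string or a list; it must include this client.
    audience = claims.get("aud", [])
    if isinstance(audience, str):
        audience = [audience]
    if client_id not in audience:
        raise ValueError("ID token was not issued to this client")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("ID token has expired")
    if claims.get("nonce") != expected_nonce:
        raise ValueError("nonce mismatch")
    # Keep only the profile data; the token itself can now be discarded.
    return {key: claims.get(key) for key in ("sub", "email", "given_name", "family_name")}


# Hypothetical claims from an ID token whose signature has been verified.
id_claims = {
    "iss": "https://auth.example.com",
    "aud": "my-web-app",
    "exp": 2000,
    "nonce": "n-123",
    "sub": "user-42",
    "email": "alice@example.com",
    "given_name": "Alice",
    "family_name": "Smith",
}
profile = validate_id_token_claims(
    id_claims, "https://auth.example.com", "my-web-app", "n-123", now=1000
)
print(profile["email"])  # alice@example.com
```

The audience check is the part an access token cannot offer a client: an ID token names the client it was issued to, so a token minted for a different application is rejected rather than silently accepted.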