Apache Hadoop is a base component for Big Data processing and analysis. Hadoop servers, in general, allow interaction via two protocols: a TCP-based RPC (Remote Procedure Call) protocol and the HTTP protocol.
The RPC protocol currently allows only one primary authentication mechanism: Kerberos. The HTTP interface allows enterprises to plug in different authentication mechanisms. In this post, we are focusing on enhancing Hadoop with a simple framework that allows us to plug in multiple authentication mechanisms for Hadoop web interfaces.
Note that the Hadoop HTTP Authentication module (deployed as the hadoop-auth-version.jar file) is reused by different Hadoop servers like NameNode, ResourceManager, NodeManager, and DataNode, as well as by other Hadoop-based components like HBase and Oozie.
We can follow the steps below to plug in a custom authentication mechanism:
- Implement the AuthenticationHandler interface, which is in the org.apache.hadoop.security.authentication.server package.
- Specify the implementation class in the configuration. Make sure that the implementation class is available on the classpath of the Hadoop server.
AuthenticationHandler interface
The implementation of the AuthenticationHandler will be loaded by the AuthenticationFilter, which is a servlet Filter loaded during startup of the Hadoop server’s web server.
The definition of the AuthenticationHandler interface is as follows:
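package org.apache.hadoop.security.authentication.server;

import java.io.IOException;
import java.util.Properties;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.security.authentication.client.AuthenticationException;

public interface AuthenticationHandler {
    public String getType();
    public void init(Properties config) throws ServletException;
    public void destroy();
    public boolean managementOperation(AuthenticationToken token, HttpServletRequest request, HttpServletResponse response) throws IOException, AuthenticationException;
    public AuthenticationToken authenticate(HttpServletRequest request, HttpServletResponse response) throws IOException, AuthenticationException;
}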
The init method accepts a Properties object. This contains the properties read from the Hadoop configuration. Any config property that is prefixed by hadoop.http.authentication.Type will be added to the Properties object.
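To make the prefix rule concrete, here is a minimal, self-contained sketch of how prefixed settings could be collected into a Properties object before being handed to init. It is not Hadoop's code: the class PrefixedConfigSketch and the method extractWithPrefix are hypothetical names, and the sketch assumes that the prefix is stripped from each key and that "Type" in the property name above stands for a handler-specific segment such as kerberos.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the Hadoop code base.
class PrefixedConfigSketch {

    // Copies every entry whose key starts with the given prefix into a
    // Properties object. Dropping the prefix from the key is an assumption
    // made by this sketch.
    static Properties extractWithPrefix(Map<String, String> conf, String prefix) {
        Properties props = new Properties();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(prefix)) {
                props.setProperty(key.substring(prefix.length()), entry.getValue());
            }
        }
        return props;
    }
}
```

Under these assumptions, calling extractWithPrefix with the prefix "hadoop.http.authentication." on a configuration that contains hadoop.http.authentication.kerberos.principal would expose that value to the handler under the key kerberos.principal, while unrelated keys are left out.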
The authenticate method performs the actual authentication. On successful authentication, it returns an AuthenticationToken. The AuthenticationToken implements java.security.Principal and contains the following set of properties:
- Username
- Principal
- Authentication type
- Expiry time
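The four properties listed above can be pictured with a small stand-in class. This is a simplified sketch, not Hadoop's AuthenticationToken: the class name TokenSketch, its accessor names, and the convention that -1 means "never expires" are assumptions made for illustration.

```java
import java.security.Principal;

// Simplified stand-in for illustration; NOT Hadoop's AuthenticationToken.
final class TokenSketch implements Principal {
    private final String userName;   // short user name, e.g. "alice"
    private final String principal;  // full principal, e.g. "alice@EXAMPLE.COM"
    private final String type;       // authentication type, e.g. "kerberos"
    private final long expires;      // expiry as epoch millis; -1 = never (sketch convention)

    TokenSketch(String userName, String principal, String type, long expires) {
        this.userName = userName;
        this.principal = principal;
        this.type = type;
        this.expires = expires;
    }

    // Principal.getName() returns the full principal in this sketch.
    @Override
    public String getName() {
        return principal;
    }

    String getUserName() {
        return userName;
    }

    String getType() {
        return type;
    }

    long getExpires() {
        return expires;
    }

    // True once the given instant is past the expiry time.
    boolean isExpired(long nowMillis) {
        return expires != -1 && nowMillis > expires;
    }
}
```

A handler following this shape would build such a token after verifying the caller's credentials and set the expiry to the current time plus the configured token validity.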
Existing AuthenticationHandlers
There are a few implementations of the AuthenticationHandler interface that are part of the Hadoop distribution.
- KerberosAuthenticationHandler: Performs SPNEGO authentication.
- PseudoAuthenticationHandler: Performs simple authentication. It authenticates the user based on the identity passed via the user.name URL query parameter.
- AltKerberosAuthenticationHandler: Extends KerberosAuthenticationHandler. Allows you to provide an alternate authentication mechanism by extending AltKerberosAuthenticationHandler. The developer has to implement the alternateAuthenticate method and add the custom authentication logic there.
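As an illustration of the simple authentication idea, the sketch below pulls the user.name value out of a raw query string. It is not the PseudoAuthenticationHandler code: the class PseudoAuthSketch and the method userNameFromQuery are hypothetical, and the sketch works on a plain string rather than a servlet request so that it stays self-contained.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for illustration; not Hadoop's PseudoAuthenticationHandler.
class PseudoAuthSketch {

    // Returns the decoded value of the user.name query parameter,
    // or null when the parameter is missing or empty.
    static String userNameFromQuery(String queryString) {
        if (queryString == null) {
            return null;
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && "user.name".equals(pair.substring(0, eq))) {
                try {
                    String value = URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                    return value.isEmpty() ? null : value;
                } catch (UnsupportedEncodingException e) {
                    // UTF-8 is guaranteed to be available on every JVM.
                    throw new IllegalStateException(e);
                }
            }
        }
        return null;
    }
}
```

For a query string such as op=LISTSTATUS&user.name=alice, the method returns alice. Because the identity is taken from the request as-is, this style of authentication offers no protection against impersonation.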
Composite AuthenticationHandler
At eBay, we like to provide multiple authentication mechanisms in addition to Kerberos and anonymous authentication. Our operators prefer to turn off any authentication mechanism by modifying the configuration rather than by rolling out new code. For this reason, we implemented a CompositeAuthenticationHandler.
The CompositeAuthenticationHandler accepts a list of authentication mechanisms via the property hadoop.http.authentication.composite.handlers. This property contains a list of classes that implement AuthenticationHandler, one for each authentication mechanism.
The properties for each individual authentication mechanism can be passed via configuration properties prefixed with hadoop.http.authentication.Type. The following table lists the different properties supported by CompositeAuthenticationHandler.
| # | Property | Description | Default Value |
|---|---|---|---|
| 1 | hadoop.http.authentication.composite.handlers | List of classes that implement AuthenticationHandler for various authentication mechanisms | |
| 2 | hadoop.http.authentication.composite.default-non-browser-handler-type | The default authentication mechanism for non-browser access | |
| 3 | hadoop.http.authentication.composite.non-browser.user-agents | List of user agents whose presence in the User-Agent header marks the client as a non-browser | java,curl,wget,perl |
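The user-agent check described in the third row could work roughly as follows. This is a sketch under stated assumptions, not the code attached to the JIRA: the class UserAgentSketch and its methods are hypothetical, matching is assumed to be a case-insensitive substring test, and a request without a User-Agent header is assumed to count as non-browser.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not the CompositeAuthenticationHandler source.
class UserAgentSketch {

    // Splits a comma-separated property value such as "java,curl,wget,perl"
    // into trimmed, lower-cased entries, skipping empty ones.
    static List<String> parseAgents(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // Returns true when the User-Agent header contains any configured agent.
    // Treating a missing header as non-browser is an assumption of this sketch.
    static boolean isNonBrowser(String userAgentHeader, List<String> nonBrowserAgents) {
        if (userAgentHeader == null) {
            return true;
        }
        String ua = userAgentHeader.toLowerCase(Locale.ROOT);
        for (String agent : nonBrowserAgents) {
            if (ua.contains(agent)) {
                return true;
            }
        }
        return false;
    }
}
```

With the default list, a header like curl/7.68.0 would be classified as non-browser and routed to the handler named by default-non-browser-handler-type, while a typical Mozilla/5.0 browser header would not match.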
The source code for CompositeAuthenticationHandler is attached to the JIRA page HADOOP-10307.