Allowing Xplenty access to my data on Hadoop Distributed File System (HDFS)

Xplenty can access data residing on any Hadoop Distributed File System (HDFS). This article explains how to create an HDFS connection in Xplenty.

You must grant Xplenty access to the cluster's HDFS. If the HDFS is behind a firewall, please consult our support team.

To define a connection in Xplenty to Hadoop Distributed File System (HDFS)

  1. Click your avatar, then click Account settings.
  2. On the left menu, click Connections. Existing connections are listed.
  3. To create a connection, click New connection.
  4. Click Hadoop Distributed File System (HDFS).
  5. In the new HDFS connection window, name the connection and enter the connection information:
    • User Name - the user name to use when connecting to HDFS (Kerberos authorization is not currently supported).
    • NameNode Hostname - the host name of the NameNode server or the logical name of the NameNode in a high availability configuration.  
    • NameNode Port - the TCP port of the name node. Leave empty if the NameNode is in a high availability configuration.
    • HttpFS Hostname - the host name of the Hadoop HttpFS gateway node. This should be available to Xplenty's platform.
    • HttpFS Port - the TCP port of the Hadoop HttpFS gateway node (Default is 14000).
  6. Click Test connection. If the credentials are correct, a message appears confirming that the connection test was successful.
  7. Click Create HDFS connection.
  8. The connection is created and appears in the list of file storage connections.
  9. Now you can create a package and test it on your actual data stored in Hadoop Distributed File System (HDFS).
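Before testing the connection in step 6, it can help to confirm that the HttpFS gateway is reachable from outside the cluster. HttpFS serves the standard WebHDFS REST API, so a directory listing request is a quick sanity check. The sketch below builds such a request URL; the host `httpfs.example.com`, port `14000`, and user `hadoop` are placeholders to replace with your own values.

```python
# Sketch: build a WebHDFS LISTSTATUS URL served by the HttpFS gateway,
# matching the HttpFS Hostname/Port and User Name entered in step 5.
# Hostname, port, and user below are placeholders, not real values.
from urllib.parse import urlencode

def httpfs_url(host, port, user, path="/"):
    """Return the HttpFS (WebHDFS) URL that lists the given HDFS path."""
    query = urlencode({"op": "LISTSTATUS", "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"

url = httpfs_url("httpfs.example.com", 14000, "hadoop")
print(url)
```

Requesting the printed URL (for example with `curl`) from a machine with the same network access as Xplenty should return a JSON `FileStatuses` listing if the gateway is reachable and the user name is valid.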
