HDFS security is tightly coupled with Hadoop authentication, service identity, and data governance controls.
- Enable Kerberos for Hadoop services and users.
- Use dedicated service principals for NameNode, DataNode, and gateway services.
- Protect keytabs and rotate them per policy.
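
As a minimal sketch of how the first two items map to configuration, the snippet below renders a few Kerberos-related properties into Hadoop's `<configuration>` XML layout. The property names are standard Hadoop keys. The realm `EXAMPLE.COM`, the keytab paths, and the helper `render_site_xml` are illustrative assumptions, not values prescribed here.

```python
import xml.etree.ElementTree as ET

# Standard Hadoop property names; every value below is a placeholder (assumption).
KERBEROS_PROPS = {
    # Normally set in core-site.xml
    "hadoop.security.authentication": "kerberos",
    "hadoop.security.authorization": "true",
    # Normally set in hdfs-site.xml: one dedicated principal and keytab per service.
    # Hadoop replaces _HOST with the local hostname at startup.
    "dfs.namenode.kerberos.principal": "nn/_HOST@EXAMPLE.COM",
    "dfs.namenode.keytab.file": "/etc/security/keytabs/nn.service.keytab",
    "dfs.datanode.kerberos.principal": "dn/_HOST@EXAMPLE.COM",
    "dfs.datanode.keytab.file": "/etc/security/keytabs/dn.service.keytab",
}


def render_site_xml(props):
    """Render a dict of properties as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_xml(KERBEROS_PROPS))
```

The properties are shown in one dictionary only for brevity; in a real deployment they would be split between `core-site.xml` and `hdfs-site.xml` as the comments indicate.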
¶ Service and RPC Endpoint Protection
- Keep NameNode and DataNode ports on private networks.
- Restrict administrative web UIs to admin subnets.
- Enforce TLS for web interfaces and inter-service communication where supported.
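
The TLS item above can be checked mechanically. The following is a rough sketch that compares an effective configuration against a hardened baseline. The property names are standard Hadoop keys, but treating exactly these values as the required baseline, and the helper name `find_weak_settings`, are assumptions for illustration. Network placement of ports and web UIs has to be verified separately at the firewall or security-group level.

```python
# Baseline for transport hardening. Property names are real Hadoop keys;
# the choice of these exact values as "required" is an assumption.
HARDENED_BASELINE = {
    "dfs.http.policy": "HTTPS_ONLY",            # web UIs and WebHDFS over TLS only
    "hadoop.rpc.protection": "privacy",          # SASL: authenticate, sign, and encrypt RPC
    "dfs.data.transfer.protection": "privacy",   # SASL on DataNode block transfer
}


def find_weak_settings(effective, baseline=HARDENED_BASELINE):
    """Return {property: (actual, expected)} for every baseline mismatch.

    A property missing from `effective` is reported with actual=None.
    Simplification: comma-separated multi-value settings are not parsed.
    """
    findings = {}
    for name, expected in baseline.items():
        actual = effective.get(name)
        if actual != expected:
            findings[name] = (actual, expected)
    return findings


if __name__ == "__main__":
    # Hypothetical effective configuration, e.g. loaded from the site XML files.
    sample = {"dfs.http.policy": "HTTP_ONLY", "hadoop.rpc.protection": "privacy"}
    for name, (actual, expected) in find_weak_settings(sample).items():
        print(f"{name}: found {actual!r}, expected {expected!r}")
```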
¶ Authorization and Data Access
- Use HDFS permissions and ACLs with least privilege.
- Separate tenant data paths and ownership.
- Review the HDFS superuser, the supergroup, and user-to-group mappings regularly.
- Enable HDFS transparent data encryption zones for sensitive datasets.
- Encrypt data in transit: TLS for HTTP endpoints, SASL protection for RPC and DataNode block transfer.
- Protect KMS and key material with strict access controls.
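
To make the path separation, ACL, and encryption zone items concrete, here is a small sketch that builds (but does not run) the corresponding CLI invocations. The subcommands used (`hdfs dfs -chown`, `-chmod`, `-setfacl`, `hadoop key create`, `hdfs crypto -createZone`) are standard Hadoop commands. The `/data/<tenant>` layout, the mode `750`, and the function names are assumptions chosen for the example.

```python
import shlex


def tenant_setup_commands(tenant, owner, group, readers=()):
    """Build the HDFS commands that give one tenant its own restricted path."""
    path = f"/data/{tenant}"  # hypothetical directory layout
    cmds = [
        ["hdfs", "dfs", "-mkdir", "-p", path],
        ["hdfs", "dfs", "-chown", f"{owner}:{group}", path],
        ["hdfs", "dfs", "-chmod", "750", path],  # no access for "other"
    ]
    # Grant read-only access to named users via ACL entries instead of
    # widening the base permissions.
    for user in readers:
        cmds.append(["hdfs", "dfs", "-setfacl", "-m", f"user:{user}:r-x", path])
    return cmds


def encryption_zone_commands(key_name, path):
    """Build the commands that create a key and an encryption zone using it.

    The zone path must be an empty directory when -createZone runs.
    """
    return [
        ["hadoop", "key", "create", key_name],
        ["hdfs", "dfs", "-mkdir", "-p", path],
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", path],
    ]


if __name__ == "__main__":
    plan = tenant_setup_commands("tenant_a", "svc_tenant_a", "tenant_a", ["auditor"])
    plan += encryption_zone_commands("tenant_a_key", "/data/tenant_a/secure")
    for argv in plan:
        print(shlex.join(argv))
```

Returning argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run` without shell quoting issues, and lets the plan be reviewed before anything is executed.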
¶ Audit and Compliance
- Enable and centralize audit logs.
- Alert on privileged operations and policy changes.
- Keep log retention aligned with compliance requirements.
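
As an illustration of alerting on privileged operations, the sketch below parses NameNode audit lines and flags those worth a closer look. It assumes the default audit layout, where tab-separated `key=value` fields follow the `FSNamesystem.audit:` logger prefix. The set of commands in `WATCHED_COMMANDS` is an assumed starting point and should be adapted to local policy.

```python
AUDIT_MARKER = "FSNamesystem.audit: "

# Assumed watch list of permission, ownership, ACL, and encryption operations.
WATCHED_COMMANDS = {
    "setPermission",
    "setOwner",
    "setAcl",
    "modifyAclEntries",
    "removeAcl",
    "createEncryptionZone",
}


def parse_audit_line(line):
    """Parse one audit line into a dict of fields, or None if it is not one."""
    _, marker, payload = line.partition(AUDIT_MARKER)
    if not marker:
        return None
    fields = {}
    for part in payload.rstrip("\n").split("\t"):
        key, eq, value = part.partition("=")
        if eq:
            fields[key] = value
    return fields


def should_alert(fields):
    """Flag denied requests and any command on the watch list."""
    return fields.get("allowed") == "false" or fields.get("cmd") in WATCHED_COMMANDS


if __name__ == "__main__":
    # Hypothetical sample line in the assumed format.
    sample = (
        "2024-01-01 12:00:00,000 INFO FSNamesystem.audit: "
        "allowed=true\tugi=alice@EXAMPLE.COM (auth:KERBEROS)\tip=/10.0.0.5\t"
        "cmd=setPermission\tsrc=/data/tenant_a\tdst=null\t"
        "perm=alice:analysts:rwxr-x---\tproto=rpc"
    )
    record = parse_audit_line(sample)
    if record and should_alert(record):
        print(f"ALERT: {record['ugi']} ran {record['cmd']} on {record['src']}")
```

In practice a filter like this would run in the central log pipeline rather than on the NameNode host, so that alerts survive even if the host itself is compromised.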