We are excited to announce the general availability of fine-grained data access control (FGAC) via AWS Lake Formation for Apache Spark with Amazon EMR on EKS. You can now enforce the full set of FGAC policies defined in Lake Formation (database, table, column, row, and cell level) on your data lake tables from EMR on EKS Spark jobs. We are also announcing the general availability of AWS Glue Data Catalog views with EMR on EKS for Spark workflows.
Lake Formation simplifies building, securing, and managing data lakes by letting you define fine-grained access controls through grant and revoke statements, similar to those in a relational database management system (RDBMS). The same Lake Formation rules now apply to Spark jobs on EMR on EKS for the Hudi, Delta Lake, and Iceberg table formats, further simplifying data lake security and governance.
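As an illustration of the grant and revoke model, the following minimal sketch builds the request for a column-level SELECT grant using the Lake Formation GrantPermissions API as exposed by boto3. The role ARN, database, table, and column names are hypothetical placeholders, and the helper `build_column_grant` is not part of any AWS library.

```python
from __future__ import annotations

from typing import Any


def build_column_grant(principal_arn: str, database: str, table: str,
                       columns: list[str]) -> dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": columns,
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical values, for illustration only.
request = build_column_grant(
    "arn:aws:iam::111122223333:role/example-emr-job-role",
    "sales_db",
    "orders",
    ["order_id", "order_date"],
)

# With AWS credentials configured, the grant could be applied with:
#   import boto3
#   boto3.client("lakeformation").grant_permissions(**request)
# and later withdrawn with revoke_permissions(**request).
```

Keeping the request in a plain dictionary means the same structure can be passed to both the grant and the revoke call.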
AWS Glue Data Catalog views with EMR on EKS allow customers to create views from Spark jobs that can be queried from multiple engines without requiring access to the referenced tables. Administrators can control access to the underlying data using the rich SQL dialect provided by EMR on EKS Spark jobs. Access is managed with AWS Lake Formation permissions, including named resource grants, data filters, and Lake Formation tags. All requests are logged in AWS CloudTrail.
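As a rough sketch of what creating such a view from a Spark job might look like, the snippet below assembles a `CREATE PROTECTED MULTI DIALECT VIEW ... SECURITY DEFINER` statement, the form AWS documents for Glue Data Catalog views. The view name, table name, and query are hypothetical, and `catalog_view_sql` is an illustrative helper, not an AWS API. Check the exact syntax against the EMR documentation for your release.

```python
def catalog_view_sql(view_name: str, select_query: str) -> str:
    """Assemble a Glue Data Catalog view definition as a SQL string."""
    return (
        f"CREATE PROTECTED MULTI DIALECT VIEW {view_name}\n"
        "SECURITY DEFINER\n"
        f"AS {select_query}"
    )


# Hypothetical view and table names, for illustration only.
statement = catalog_view_sql(
    "sales_db.daily_revenue",
    "SELECT order_date, SUM(total_price) AS revenue "
    "FROM sales_db.orders GROUP BY order_date",
)
print(statement)

# Inside a Spark job on EMR on EKS, the statement would be run with
# an existing SparkSession, for example: spark.sql(statement)
```

Consumers granted SELECT on the view through Lake Formation could then query `sales_db.daily_revenue` without any permission on `sales_db.orders`.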
Fine-grained access control for Apache Spark batch jobs on EMR on EKS is available with the EMR 7.7 release in all AWS Regions where EMR on EKS is available. To get started, see Using AWS Lake Formation with Amazon EMR on EKS.
Source: Amazon AWS