Support record-level insert, update, and delete on Amazon S3 with Amazon EMR
Amazon EMR release 5.28.0 now supports Apache Hudi (Incubating). Data engineers who use Amazon EMR to build data pipelines and process data can now use Apache Hudi to simplify incremental data management and data privacy use cases that require record-level insert, update, and delete operations. With Apache Hudi, Amazon S3-based data lakes can comply with data privacy laws, consume real-time streams and change data capture logs, reinstate late-arriving data, track change history, and roll back changes. Apache Hudi is open source and stores data on Amazon S3 in vendor-neutral, open-source formats such as Apache Parquet and Apache Avro.
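As a rough illustration of a record-level write, the sketch below builds the Spark datasource options that Hudi reads when writing a table and passes them to a DataFrame writer. It is a minimal sketch under stated assumptions, not an official Amazon EMR example: the helper names `hudi_write_options` and `write_to_hudi` are hypothetical, as are any table, column, and S3 path names used with them. The `hoodie.*` keys are Hudi's datasource write options, but the operations that are supported, `delete` in particular, vary by Hudi version, so check the version bundled with your EMR release.

```python
from typing import Dict

# Operations this sketch accepts; which ones actually work depends on the Hudi version.
ALLOWED_OPERATIONS = {"insert", "upsert", "bulk_insert", "delete"}


def hudi_write_options(
    table_name: str,
    record_key: str,
    precombine_field: str,
    partition_field: str,
    operation: str = "upsert",
) -> Dict[str, str]:
    """Build the option map for a Hudi write (hypothetical helper)."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    return {
        "hoodie.table.name": table_name,
        # Column that uniquely identifies a record, used to match updates and deletes.
        "hoodie.datasource.write.recordkey.field": record_key,
        # Column used to pick the latest version when a key appears more than once.
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": operation,
    }


def write_to_hudi(df, base_path: str, options: Dict[str, str]) -> None:
    """Append a Spark DataFrame to a Hudi table at base_path (hypothetical helper).

    `df` is expected to be a pyspark.sql.DataFrame on a cluster where the
    Hudi Spark bundle is on the classpath.
    """
    df.write.format("org.apache.hudi").options(**options).mode("append").save(base_path)
```

On a cluster, a call might look like `write_to_hudi(changes_df, "s3://my-bucket/hudi/customers", hudi_write_options("customers", "customer_id", "updated_at", "region"))`, where the DataFrame, bucket, and column names are placeholders. Keeping the option map in a separate function makes the configuration easy to inspect without a running Spark session.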
Source: Amazon AWS