Amazon S3 Metadata creates and manages metadata that provides information about datasets stored in S3, so you can more easily discover and use your S3 data. Previously, S3 Metadata supported only new and updated objects. Now, S3 Metadata also creates and manages metadata for all your existing S3 data, so you can write a single SQL query across the metadata for any amount of S3 storage.
S3 Metadata creates and maintains two Apache Iceberg tables that give you a queryable and continuously updated view of metadata associated with an S3 bucket. A journal table records changes made to your data in near real time, helping you to identify new data uploaded to your bucket, track recently deleted objects, monitor lifecycle transitions, and more. A new live inventory table provides and maintains an up-to-date view of all objects in your bucket. Both S3 Metadata tables are stored in an AWS managed Iceberg table bucket in your account, and can be queried using standard SQL through AWS analytics services such as Amazon Athena or Amazon EMR, and open source tooling such as DuckDB and PyIceberg.
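To illustrate the kind of standard SQL query described above, here is a minimal sketch that finds recently deleted objects. It uses an in-memory SQLite table as a local stand-in for the journal table, so it runs without an AWS account. The table name `journal`, the columns `key`, `record_type`, and `record_timestamp`, the `'DELETE'` value, and the sample rows are all assumptions made for illustration; check the S3 Metadata documentation for the actual schema before adapting the query to Athena, Amazon EMR, DuckDB, or PyIceberg.

```python
import sqlite3


def make_demo_journal():
    """Build an in-memory stand-in for a journal table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE journal ("key" TEXT, record_type TEXT, record_timestamp TEXT)'
    )
    # Made-up sample rows; timestamps are ISO 8601 strings, which sort correctly as text.
    conn.executemany(
        "INSERT INTO journal VALUES (?, ?, ?)",
        [
            ("reports/a.csv", "CREATE", "2025-07-01T10:00:00"),
            ("reports/a.csv", "DELETE", "2025-07-03T09:30:00"),
            ("logs/b.json", "CREATE", "2025-07-02T12:00:00"),
            ("tmp/c.bin", "DELETE", "2025-06-20T08:00:00"),
        ],
    )
    return conn


def recently_deleted(conn, since):
    """Return (key, timestamp) pairs for delete records at or after `since`."""
    query = (
        'SELECT "key", record_timestamp FROM journal '
        "WHERE record_type = 'DELETE' AND record_timestamp >= ? "
        "ORDER BY record_timestamp DESC"
    )
    return conn.execute(query, (since,)).fetchall()


if __name__ == "__main__":
    demo = make_demo_journal()
    for key, ts in recently_deleted(demo, "2025-07-01T00:00:00"):
        print(key, ts)
```

The same `SELECT ... WHERE ... ORDER BY` shape should carry over to an engine that reads the Iceberg tables directly, with the table reference and column names adjusted to the real schema.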
S3 Metadata is currently available in US East (N. Virginia), US East (Ohio), and US West (Oregon). You pay a small fee for every change to the underlying dataset that is recorded in S3 Metadata journal tables. As of today, we’re reducing the journal table price by 33% to make real-time change tracking and backfilling more cost-effective for large datasets. Additionally, you pay a small per-object fee to fill your live inventory tables with information about existing S3 datasets. For more details on pricing, visit the Amazon S3 pricing page. To learn more and get started, visit the product page, see the documentation, and read the AWS News Blog.
Source: Amazon AWS