
AWS Glue DataBrew now supports the ORC file format as an input

AWS Glue DataBrew customers can now clean and transform data stored in the Optimized Row Columnar (ORC) file format, a widely used format for storing Hive data. When creating a dataset in AWS Glue DataBrew, you can now use ORC files in addition to the already supported Apache Avro, Apache Parquet, Microsoft Excel, CSV, and JSON file formats.
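As a rough illustration of creating such a dataset programmatically, the following is a minimal sketch using the boto3 `databrew` client's `create_dataset` call with `Format` set to `"ORC"`. The helper names `build_orc_dataset_request` and `create_orc_dataset`, and the dataset name, bucket, and key in the usage note, are hypothetical and not taken from the announcement. The sketch assumes the ORC file sits in Amazon S3 and that AWS credentials and a region are already configured.

```python
def build_orc_dataset_request(name: str, bucket: str, key: str) -> dict:
    """Build keyword arguments for a DataBrew CreateDataset call on an ORC file.

    Hypothetical helper: it only assembles the request and never contacts AWS.
    """
    if not name or not bucket or not key:
        raise ValueError("name, bucket, and key must all be non-empty")
    return {
        "Name": name,
        "Format": "ORC",  # tell DataBrew the input is an ORC file
        "Input": {
            "S3InputDefinition": {"Bucket": bucket, "Key": key},
        },
    }


def create_orc_dataset(name: str, bucket: str, key: str) -> dict:
    """Create the dataset in DataBrew (hypothetical wrapper).

    Requires boto3 plus configured AWS credentials and region.
    """
    # Imported lazily so that building the request does not require the SDK.
    import boto3

    client = boto3.client("databrew")
    return client.create_dataset(**build_orc_dataset_request(name, bucket, key))
```

Calling `create_orc_dataset("sales-orc", "example-bucket", "data/sales.orc")` would then register the file as a DataBrew dataset, assuming the placeholder bucket and key point at a real ORC object. Keeping the request builder separate from the network call makes the parameters easy to inspect before anything is sent.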

Source: Amazon AWS