AWS Glue announces general availability of a new AWS Glue Data Quality(Glue DQ) capability that uses ML-powered anomaly detection algorithms to detect hard-to-find data quality issues and anomalies. This helps customers proactively identify and fix data quality issues.
Data engineers and analysts use rules in Glue DQ to measure and monitor their data. While Glue DQ’s existing rule-based approach works well for known data patterns, it may miss unexpected anomalies . Now, data engineers and analysts can use Glue DQ’s Anomaly Detection capability to easily detect unanticipated data quality issues. To use this feature, customers can write rules or analyzers and then turn on Anomaly Detection in Glue ETL. Glue DQ collects statistics for columns specified in rules and analyzers, applies ML algorithms to detect anomalies, and generates easy-to-understand visual observations explaining the detected issues. Customers can use recommended rules to capture the anomalous patterns and provide feedback to tune the ML model for more accurate detection.
To learn more, visit read the blog, watch the introductory video, or refer to the documentation. This capability is available in US East (N. Virginia), US East (Ohio), US West (N. California), Europe (London), Europe (Stockholm), Europe (Frankfurt), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Tokyo).
Source:: Amazon AWS