Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon SageMaker Studio, the first fully integrated development environment (IDE) for ML. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization, from a single visual interface. SageMaker Data Wrangler runs on ml.m5.4xlarge by default. SageMaker Data Wrangler includes built-in data transforms and analyses written in PySpark so you can process large data sets (up to hundreds of gigabytes (GB) of data) efficiently on the default instance.
Source:: Amazon AWS