We are excited to announce the launch of the Amazon Kinesis Data Streams Connector for Spark Structured Streaming on Amazon EMR. The new connector makes it easy to build real-time streaming applications and pipelines that consume Amazon Kinesis Data Streams using Apache Spark Structured Streaming. Starting with Amazon EMR 7.1, the connector comes pre-packaged on Amazon EMR on EKS, EMR on EC2, and EMR Serverless. You no longer need to build or download any packages, and can focus on your business logic, using the familiar and optimized Spark Data Source APIs to consume data from your Kinesis data streams.
Amazon Kinesis Data Streams is a serverless streaming data service that makes it easy to capture, process, and store streaming data at massive scale. Amazon EMR is the cloud big data solution for petabyte-scale data processing, interactive analytics, and machine learning using Apache Spark and other open-source frameworks. The new Amazon Kinesis Data Streams Connector for Apache Spark is faster, more scalable, and more fault-tolerant than alternative open-source options. The connector also supports Enhanced Fan-out consumption, which gives each consumer dedicated read throughput. To learn more and see a code example, go to "Build Spark Structured Streaming applications with the open source connector for Amazon Kinesis Data Streams".
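As a rough illustration of what consuming a stream through the Spark Data Source API could look like, here is a minimal PySpark sketch. It is not taken from this announcement: the format name `aws-kinesis` and the `kinesis.*` option keys are assumptions based on the open source connector's README, so verify them against the connector version bundled with your EMR release. The helper names `kinesis_source_options` and `read_kinesis_stream`, the stream name, and the consumer name are hypothetical, and the endpoint URL pattern is assumed to be the standard one for commercial AWS regions.

```python
from typing import Dict, Optional


def kinesis_source_options(
    stream_name: str,
    region: str,
    starting_position: str = "LATEST",
    efo_consumer_name: Optional[str] = None,
) -> Dict[str, str]:
    """Build the option map for the Kinesis streaming source.

    The option keys are assumptions taken from the open source connector's
    README; check them against the connector version you actually run.
    """
    options = {
        "kinesis.streamName": stream_name,
        "kinesis.region": region,
        # Assumed endpoint pattern for standard commercial regions.
        "kinesis.endpointUrl": f"https://kinesis.{region}.amazonaws.com",
        "kinesis.startingposition": starting_position,
        # Default: shared-throughput polling consumer.
        "kinesis.consumerType": "GetRecords",
    }
    if efo_consumer_name is not None:
        # Enhanced Fan-out: dedicated read throughput per registered consumer.
        options["kinesis.consumerType"] = "SubscribeToShard"
        options["kinesis.consumerName"] = efo_consumer_name
    return options


def read_kinesis_stream(spark, options: Dict[str, str]):
    """Return a streaming DataFrame for an existing SparkSession.

    `spark` is a pyspark.sql.SparkSession created by the caller, for example
    inside an EMR 7.1+ job where the connector is already on the classpath.
    """
    return spark.readStream.format("aws-kinesis").options(**options).load()
```

In a job, you would typically call `read_kinesis_stream(spark, kinesis_source_options("my-stream", "us-east-1"))` and then attach a sink with `writeStream`, including a checkpoint location so the query can recover after a failure. Passing `efo_consumer_name` switches the sketch to Enhanced Fan-out consumption.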
Source: Amazon AWS