StreamSets, a Software AG company, announced its support for Amazon EMR Serverless, the latest Amazon Web Services (AWS) deployment option that makes it easy for data analysts and engineers to run open-source big data analytics frameworks without configuring, managing, and scaling clusters or servers. This new integration is now available to StreamSets users to process and analyze large amounts of data in a cost-effective, scalable, and serverless manner.
Amazon EMR Serverless simplifies the deployment and management of large data processing workloads by automatically provisioning and scaling infrastructure based on the workload demand.
Amazon EMR Serverless lets users submit Apache Spark jobs to a managed EMR environment that automatically scales up and down based on each job’s requirements. Under this model, users pay only for the resources they use and do not have to manage any servers or clusters themselves.
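For readers who want to see what submitting a Spark job looks like at the AWS SDK level, the following is a minimal sketch using the `start_job_run` operation of the boto3 `emr-serverless` client. It is not StreamSets' own integration code, which the announcement does not describe. The application ID, role ARN, and S3 path in the usage note are hypothetical placeholders, and the helper names `build_spark_job_request` and `submit_spark_job` are introduced here for illustration only.

```python
from typing import Any, Dict, List, Optional


def build_spark_job_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
    arguments: Optional[List[str]] = None,
    spark_params: str = "",
) -> Dict[str, Any]:
    """Assemble keyword arguments for the EMR Serverless StartJobRun call."""
    spark_submit: Dict[str, Any] = {"entryPoint": entry_point}
    if arguments:
        spark_submit["entryPointArguments"] = list(arguments)
    if spark_params:
        spark_submit["sparkSubmitParameters"] = spark_params
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "jobDriver": {"sparkSubmit": spark_submit},
    }


def submit_spark_job(request: Dict[str, Any]) -> str:
    """Send the request with boto3 and return the job run ID.

    Requires boto3 and valid AWS credentials; not exercised in this sketch.
    """
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("emr-serverless")
    response = client.start_job_run(**request)
    return response["jobRunId"]
```

A call might look like `submit_spark_job(build_spark_job_request("my-app-id", "arn:aws:iam::123456789012:role/example-role", "s3://example-bucket/job.py"))`, where every identifier is a placeholder. Keeping the request builder separate from the network call makes the payload easy to inspect before anything is sent to AWS.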
Added benefits of Amazon EMR Serverless include automatic software updates, high availability, and built-in security features. Users can also develop and run Spark applications through Amazon EMR Studio, a web-based integrated development environment (IDE) that makes it easy for data scientists and data engineers to develop, visualize, and debug data engineering and data science applications written in R, Python, Scala, and PySpark.
“Implementing this capability is the next essential step in our mission to provide organizations with the tools to modernize data integration and operations, offering a more seamless and efficient method for data engineers to do their jobs,” said Dima Spivak, COO of Products at StreamSets.
Users who pair StreamSets’ data integration platform with Amazon EMR Serverless gain the following benefits:
- Scalability: Amazon EMR Serverless automatically scales compute resources based on workload demand, so users can process large amounts of data without worrying about over-provisioning or under-provisioning.
- Cost-efficiency: With Amazon EMR Serverless, data engineers only pay for the compute resources they use, without having to manage any servers or clusters themselves. StreamSets helps optimize data integration pipelines and reduce the costs of managing infrastructure.