
Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 1: Getting Started
AWS Glue is a serverless, scalable data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources. AWS Glue provides an extensible architecture that enables users with...

Automate deployment and version updates for Amazon Kinesis Data Analytics applications with AWS CodePipeline
Amazon Kinesis Data Analytics is the easiest way to transform and analyze streaming data in real time using Apache Flink. Customers are already using Kinesis Data Analytics to perform real-time analytics on fast-moving...

A dive into redBus’s data platform and how they used Amazon QuickSight to accelerate business insights
This post is co-authored with Girish Kumar Chidananda from redBus. redBus is one of the earliest adopters of AWS in India, and most of its services and applications are hosted on the AWS Cloud. AWS provided redBus the...

Enable cross-account sharing with direct IAM principals using AWS Lake Formation Tags
With AWS Lake Formation, you can build data lakes with multiple AWS accounts in a variety of ways. For example, you could build a data mesh, implementing a centralized data governance model and decoupling data producers...

Code conversion from Greenplum to Amazon Redshift: Handling arrays, dates, and regular expressions
Amazon Redshift is a fully managed service for data lakes, data analytics, and data warehouses for startups, medium enterprises, and large enterprises. Amazon Redshift is used by tens of thousands of businesses around...

How Thomson Reuters delivers personalized content subscription plans at scale using Amazon Personalize
This post is co-written by Hesham Fahim from Thomson Reuters. Thomson Reuters (TR) is one of the world’s most trusted information organizations for businesses and professionals. It provides companies with the...

Near-real-time fraud detection using Amazon Redshift Streaming Ingestion with Amazon Kinesis Data Streams and Amazon Redshift ML
The importance of data warehouses and analytics performed on data warehouse platforms has been increasing steadily over the years, with many businesses coming to rely on these systems as mission-critical for both...

Get to production-grade data faster by using new built-in interfaces with Amazon SageMaker Ground Truth Plus
Launched at AWS re:Invent 2021, Amazon SageMaker Ground Truth Plus helps you create high-quality training datasets by removing the undifferentiated heavy lifting associated with building data labeling applications and...

Monitor AWS workloads without a single line of code with Logz.io and Kinesis Firehose
Observability data provides near real-time insights into the health and performance of AWS workloads, so that engineers can quickly address production issues and troubleshoot them before widespread customer impact. As...

Next generation Amazon SageMaker Experiments – Organize, track, and compare your machine learning trainings at scale
Today, we’re happy to announce updates to our Amazon SageMaker Experiments capability of Amazon SageMaker that lets you organize, track, compare and evaluate machine learning (ML) experiments and model versions from...