Near-Zero Downtime: Transforming Data Access with Real-Time Insights and Apache Hudi

Transforming data access with Apache Hudi enabled real-time insights, reduced downtime, and improved query performance.

~0%

downtime, down from more than an hour

80%

reduction in the number of files, minimizing S3 throttling issues

45%

improvement in query response times through enhanced data clustering and aggregation

Industry: Financial services
Company type: Enterprise
Country: United States
Services used: AI/Data

About the client

A global leader in financial technology solutions, providing software, services, and infrastructure to banks, capital markets, and merchants to facilitate secure, scalable financial transactions and operations.

The problem

The client's initial Direct Data Delivery (DDD) system stored its data as plain Parquet files, which required a full data reload every day and led to significant downtime. The large number of files spread across S3 also caused severe request throttling, degrading query performance and the user experience. Partners could reach the data only through a limited set of API calls, which hindered in-depth analysis, and slow, restricted portal reports further hampered data-driven decision-making.

The solution

The client, with Turing's support, implemented a comprehensive solution centered around Apache Hudi to transform their Direct Data Delivery (DDD) system:

  • Incremental updates: Transitioned the data lake format to Apache Hudi, enabling incremental updates that eliminated the need for daily full data reloads, significantly reducing downtime.
  • Data partitioning and clustering: Optimized file sizes and reduced the number of files stored in S3 through Hudi's support for partitioning and clustering, mitigating throttling issues and enhancing query performance.
  • Aggregation and indexing: Constructed aggregate tables atop the primary dataset to expedite query processing and reduce the overall data footprint. A comprehensive data dictionary was also created to help partners better understand their data and generate meaningful insights.
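
To make the first two points more concrete, here is a minimal Python sketch of the kind of writer configuration that turns a full reload into an upsert and lets Hudi cluster small files into larger ones. It is an illustration under stated assumptions, not the client's actual configuration: the helper name, the table and column names, and the size and commit thresholds are all hypothetical. The option keys themselves are standard Apache Hudi writer and clustering settings.

```python
from typing import Dict, List


def hudi_upsert_options(
    table_name: str,
    record_key: str,
    precombine_field: str,
    partition_field: str,
    sort_columns: List[str],
    target_file_bytes: int = 1024 * 1024 * 1024,  # illustrative: ~1 GiB files
    small_file_bytes: int = 300 * 1024 * 1024,    # illustrative: ~300 MiB cutoff
) -> Dict[str, str]:
    """Build Hudi writer options for incremental upserts with inline clustering.

    Hypothetical helper for illustration; all values are returned as strings
    because Spark data source options are passed as strings.
    """
    if small_file_bytes >= target_file_bytes:
        raise ValueError("small_file_bytes must be below target_file_bytes")

    return {
        "hoodie.table.name": table_name,
        # Upsert changed records instead of rewriting the whole dataset.
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        # When two records share a key, the larger precombine value wins.
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        # Periodically rewrite small files into fewer, larger, sorted files.
        "hoodie.clustering.inline": "true",
        "hoodie.clustering.inline.max.commits": "4",  # illustrative cadence
        "hoodie.clustering.plan.strategy.small.file.limit": str(small_file_bytes),
        "hoodie.clustering.plan.strategy.target.file.max.bytes": str(target_file_bytes),
        "hoodie.clustering.plan.strategy.sort.columns": ",".join(sort_columns),
    }


# Assumed usage with a PySpark DataFrame `df` of changed rows (not run here):
#   opts = hudi_upsert_options("transactions", "txn_id", "updated_at",
#                              "txn_date", ["merchant_id", "txn_date"])
#   df.write.format("hudi").options(**opts).mode("append").save("s3://<bucket>/<path>")
```

Fewer, larger files mean fewer S3 requests per query, which is the mechanism behind the reduced throttling described above. Sorting on columns that partners commonly filter by is what lets clustering also help query times.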

The result

  • Near-zero downtime: The transition to Apache Hudi reduced downtime from over an hour to near-zero levels.
  • 80% reduction in the number of files: Optimized data storage minimized S3 throttling issues and improved overall efficiency.
  • 45% improvement in query response times: Enhanced data clustering and aggregation sped up query processing, improving the user experience.
  • Improved partner access: Enabled direct SQL querying, improving data accessibility and analytical capabilities for partners.
