Yellowbrick Data Warehouse dramatically improves performance and reliability of a critical fraud detection application

ThreatMetrix – Split-second Fraud Detection

Industry: Financial Services

Business use cases: Risk Management & Fraud Detection

Technical use case: Data Lake Augmentation

Overview

LexisNexis Risk Solutions, initially known as ThreatMetrix, is a leader in global digital fraud detection and identity authentication services. Central to its operations is the LexisNexis Digital Identity Network (DIN), powered by a sophisticated ML model, which serves over 5,000 brands in 244 countries.

Key Statistics and Operation

The DIN processes over 8 billion transactions monthly across 8.2 billion devices.
The system streams 200+ data points and calculates 1,000 extra properties for each transaction, all within an average time of less than 60 milliseconds.
Clients query a 300TB multi-tenant database over 25,000 times daily, and the database integrates up to 1TB of new data from a data lake via Kafka.
The platform adeptly handles complex, simultaneous queries from hundreds of users, accessing data across months and millions of records.

Challenges faced by business users

LexisNexis’s data pipeline was initially built using a variety of technologies, including Apache Kafka, Apache Cassandra, Apache Apex, Apache Impala, and Greenplum. Despite leveraging these advanced technologies, LexisNexis encountered significant operational challenges, especially during peak activity periods. The growing size of data sets and an increasing number of users put a strain on their infrastructure, leading to several critical issues:

Data Ingestion Delays: Ingesting data took up to a minute due to small-file writes and necessary compaction.
Long Query Completion Times: Customers faced query times up to three minutes, significantly hindering efficiency.
Frequent Outages: Unpredictable outages in the data pipeline led to customer frustration.
Complex to Change: Implementing business process changes, such as adding new data columns, was a lengthy process, often taking weeks.

Next-Gen Database Needs for DIN:

Flexible Query Capability: Facilitate customer-initiated, ad-hoc queries over a 6-month data period for datasets larger than 3 billion records without preset queries.
Rapid Data Ingestion: Ingest over 5,000 rows per second, with the data being ready for querying within a minute.
Wide Tables: Store a main table with 40,000 rows, 1,200 columns, and more than 1 petabyte of data.
High User and Query Volume: Support over 250 users simultaneously and process more than 100,000 daily queries, keeping query response times below 50 milliseconds.
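
To put the ingestion and query-volume targets above on a common footing, the short Python sketch below converts them into daily and per-second averages. It is a back-of-the-envelope illustration only: the function names are hypothetical, the inputs are the stated minimums ("over 5,000 rows per second", "more than 100,000 daily queries", "over 250 users"), and it assumes load is spread evenly across a 24-hour day, which real traffic with peak periods will not be.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def rows_per_day(rows_per_second: int) -> int:
    """Rows ingested in one day at a constant per-second rate."""
    return rows_per_second * SECONDS_PER_DAY


def average_queries_per_second(queries_per_day: int) -> float:
    """Average query arrival rate, assuming an even spread over the day."""
    return queries_per_day / SECONDS_PER_DAY


def queries_per_user_per_day(queries_per_day: int, users: int) -> float:
    """Average daily queries per user, assuming equal usage across users."""
    return queries_per_day / users


if __name__ == "__main__":
    # Inputs are the minimum figures stated in the requirements list.
    print(f"{rows_per_day(5_000):,} rows ingested per day")
    print(f"{average_queries_per_second(100_000):.2f} queries per second on average")
    print(f"{queries_per_user_per_day(100_000, 250):.0f} queries per user per day")
```

Under these assumptions, the ingestion target works out to at least 432 million new rows per day, and the query target to roughly 1.16 queries per second on average, or about 400 queries per user per day. Because the source figures are lower bounds and usage peaks are not described, actual sizing would need to allow for considerably higher bursts.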

3X speed from 4X fewer nodes

By transitioning to Yellowbrick, LexisNexis achieved a significant performance boost, integrating smoothly with the existing data pipeline. End-users experienced marked improvements, with most operations completing in milliseconds. This enhancement was realized using only 15 nodes, which is a quarter of the previous number, and with 80% less memory than the prior solution.

Results include:

Improved Customer Experience: Leveraging Yellowbrick’s rapid processing, LexisNexis delivers up-to-date and in-depth insights more efficiently.
Minimal Management: Yellowbrick’s automated resource allocation reduces administrative needs, with no manual performance tuning required.
Enhanced Customer Experience: Stability and global distribution of Yellowbrick instances mean reliable service and flexible workload management, improving overall customer satisfaction.

“Compared to other data warehouses and Hadoop-based solutions, Yellowbrick Data provides superior performance.”

- Matthias Baumhof, CTO, LexisNexis Risk Solutions
