
Yellowbrick Data Warehouse dramatically improves performance and reliability of a critical fraud detection application

ThreatMetrix – Split-second Fraud Detection
Industry: Financial Services
Business use cases: Risk Management & Fraud Detection
Technical use case: Data Lake Augmentation


One of the key business applications of ThreatMetrix (now known as LexisNexis Risk Solutions), a global digital fraud detection and identity authentication service, is an online portal accessed by thousands of users around the world. The portal serves over 5,000 global brands, helping them verify more than 20 billion financial transactions each year.

  • Customers query a 300TB multi-tenant database over 25,000 times per day, with up to 1TB of new data ingested daily in real time from a data lake via Kafka.
  • Hundreds of external users generate queries simultaneously.
  • Many queries are also complex, accessing over 6 months of stored data spread across millions of records.
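
To put the workload figures above in perspective, here is a rough back-of-envelope sketch of the average rates they imply. It assumes decimal units (1 TB = 10**12 bytes) and load spread evenly across a 24-hour day; neither assumption comes from the case study, and the variable names are illustrative only.

```python
# Back-of-envelope averages from the workload figures quoted above.
# Assumptions (not from the case study): decimal units (1 TB = 10**12 bytes)
# and load spread evenly over a 24-hour day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

queries_per_day = 25_000            # "over 25,000 times per day"
ingest_bytes_per_day = 1 * 10**12   # "up to 1TB of new data ingested daily"
database_bytes = 300 * 10**12       # "300TB multi-tenant database"

# Average query arrival rate over a full day.
avg_queries_per_second = queries_per_day / SECONDS_PER_DAY

# Average sustained ingest rate, in megabytes per second.
avg_ingest_mb_per_second = ingest_bytes_per_day / SECONDS_PER_DAY / 10**6

# Fraction of the stored data that one day of ingest represents.
daily_ingest_share = ingest_bytes_per_day / database_bytes

print(f"Average query rate:  {avg_queries_per_second:.2f} queries/s")
print(f"Average ingest rate: {avg_ingest_mb_per_second:.1f} MB/s")
print(f"Daily ingest vs. stored data: {daily_ingest_share:.2%}")
```

These are averages only. With hundreds of users querying simultaneously, peak load during busy periods would be well above these figures.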

Frustrated business users

The application relied on several data processing technologies, including Greenplum Database and Apache Impala. Even with complex, hard-to-manage optimizations, those solutions could not respond interactively during busy periods as data sets and user numbers grew.

  • Some customers had to wait up to 3 minutes for queries to complete.
  • Unpredictable Impala outages were common, frustrating customers.
  • Business process changes, such as adding new columns, took weeks to implement.

3X speed from 4X fewer nodes

After replacing Impala with Yellowbrick, portal end users noticed performance improvements immediately, with most operations completing in milliseconds or seconds. Yellowbrick achieves this with 4X fewer nodes and 20X less memory than Impala required.

[Figure: Apache Impala vs Yellowbrick]

Results include:

  • Faster, more accurate insights for customers. With Yellowbrick’s faster and more consistent performance, even with real-time ingestion in the background, LexisNexis can deliver richer insights to its customers, more quickly, and with fresher data.
  • Far less time needed for management. Yellowbrick automatically reallocates resources to respond to spikes or unusual usage patterns, and performance tuning is no longer needed.
  • Better customer experience. Downtime is no longer a concern, and with Yellowbrick instances located in different global regions, workloads can shift seamlessly between clusters when needed.

“Compared to other data warehouses and Hadoop-based solutions, Yellowbrick Data provides superior performance.”

- Matthias Baumhof, CTO, LexisNexis Risk Solutions