The Yellowbrick Data Warehouse is a cloud-native, parallel SQL database designed for the most demanding batch, ad hoc, real-time, and mixed workloads. Fully elastic clusters with separate storage and compute run complex queries at multi-petabyte scale with sub-second response times.
Yellowbrick innovates in three key areas:
- Enabling a more efficient business model by allowing customers to consume a modern, elastic, SaaS user experience in their own cloud account with predictable cost.
- Optimizing price/performance for new use cases requiring more concurrent users and more ad hoc analytics.
- Providing deployment flexibility by offering an identical data warehouse for on-premises use cases.
Yellowbrick’s architectural approaches support these innovations. They span everything from the creative use of Kubernetes through cluster management, data storage, and query planning and execution, to the user interface itself.
Yellowbrick is designed to run Tier 1 enterprise-grade data warehouses with the following characteristics:
- Modern, elastic architecture: Elasticity, separate storage & compute, driven completely through SQL.
- Run in the customer’s cloud account: Yellowbrick customers pay for their own cloud infrastructure, make use of their own cloud storage and control their own data security. Yellowbrick doesn’t see or store user data or queries.
- Run across all public clouds: At the time of writing, Yellowbrick supports AWS, Azure, and GCP public clouds.
- Run on-premises: Provide the same elastic user experience on-premises as in the cloud.
- Hardened SQL support: Yellowbrick can reliably execute complex ANSI-standard SQL queries across massive schemas, supporting complex join hierarchies, correlated subqueries, deeply nested CTEs, stored procedures, etc.
- Execute complex workloads: Yellowbrick handles highly concurrent mixed workloads with continual bulk and real-time ingest, merging, and highly concurrent queries with guaranteed quality of service through workload management. Full ACID transaction semantics are present throughout the stack.
- Reliability: Yellowbrick is highly available and resilient to node, storage, and network failure. Workloads across compute clusters are isolated from one another.
- Support for disaster recovery: Asynchronous replication of both data and DDL with read-only hot standby instances is built in.
- Support for data retention and business continuity: A mature enterprise-level backup scheme for off-site data retention supports incremental, cumulative, and full backups and object-level restore.
- Mature ecosystem: Yellowbrick has enterprise support agreements with all major BI, ETL, data mining, CDC, and machine learning vendors.
- Best query efficiency in the industry: Yellowbrick executes ad-hoc queries against large data sets incredibly efficiently, making use of many technical innovations in query execution.
- Flexible pricing and consumption: Support for subscription contracts as well as workload management allows variable numbers of concurrent users with a predictable price.
- Open interfaces: Because it uses PostgreSQL’s wire protocols, *DBC drivers (such as ODBC and JDBC), and metadata schema, Yellowbrick is comfortable for developers and DBAs to work with. Open-source integrations for tools like Kafka and Spark are standard.
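As a rough illustration of the open-interfaces point, the sketch below builds a standard PostgreSQL-style connection URI of the kind that PostgreSQL-compatible drivers accept. It is a minimal sketch under stated assumptions, not vendor documentation: the helper name `build_postgres_uri`, the host, database, user, password, and the port 5432 (the PostgreSQL default) are all hypothetical placeholders.

```python
from urllib.parse import quote, urlencode


def build_postgres_uri(host, database, user, password, port=5432, **params):
    """Build a libpq-style connection URI (postgresql://user:pass@host:port/db).

    Hypothetical helper for illustration only. Port 5432 is the PostgreSQL
    default and is an assumption here, not a documented Yellowbrick setting.
    """
    # Percent-encode credentials so characters such as '@' or '/' stay unambiguous.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    uri = f"postgresql://{auth}@{host}:{port}/{quote(database, safe='')}"
    if params:
        # Extra settings (for example sslmode) travel as query parameters.
        uri += "?" + urlencode(params)
    return uri


# All values below are made-up placeholders.
uri = build_postgres_uri(
    "yb.example.internal", "analytics", "report_user", "p@ss/word",
    sslmode="require",
)
print(uri)
```

A URI in this form can typically be handed to a PostgreSQL client library (for example, `psycopg2.connect(uri)`); which drivers and versions are supported in a given deployment should be confirmed against the vendor's documentation.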
Yellowbrick isn’t a SQL-on-Hadoop type of product, or a standalone query engine running on top of other open-source infrastructure. It’s a database that organizations can trust to be the system of record for their most valuable enterprise data, and to generate the business-critical, auditable financial reports they depend on. It requires almost no management, tuning, diagnosing, or handholding and is familiar to modern developers accustomed to working with PostgreSQL.
Download the full whitepaper to learn how Yellowbrick has rebuilt the cloud database software stack by leveraging Kubernetes and modern computer architecture to implement a modern, fully elastic SQL analytic database with separate storage and compute.