Data Warehouse on Kubernetes

Andromeda Optimized Instances

The Yellowbrick Data Warehouse is a cloud-native, parallel SQL database designed for the most demanding batch, ad hoc, real-time, and mixed workloads.

For on-premises use cases, Yellowbrick has developed the Andromeda server hardware instance and our new Kalidah processor.

Yellowbrick’s Database & Andromeda Server

Together, the Kalidah processor and Andromeda system optimize price/performance, driving new efficiencies.

The Yellowbrick cloud-native data warehouse also provides deployment flexibility by offering a cloud-compatible data warehouse and an identical data warehouse for on-premises use cases.

Yellowbrick enables a more efficient business model by allowing customers to consume a modern, elastic, SaaS user experience in their own cloud account with predictable costs.

Significant Andromeda Price/Performance

With our database and Andromeda server, it’s not uncommon to find one server node providing the equivalent query throughput of a dozen or more nodes of competitive cloud and on-premises databases, at a fraction of the total cost.

The Andromeda system provides optimized price/performance for new use cases that require more concurrent users and more ad hoc analytics.

Instance Design for Data Warehousing 

Parallel data warehouse workloads place substantial stress on servers, networks, and storage, similar to supercomputer applications. Unlike storage systems that just read or write data from disks and send it over a network, MPP database servers require large amounts of compute to process and transform the data before it’s read or written, and as much memory bandwidth as possible to support random lookups of data for operations such as aggregates and joins.

Furthermore, all the servers in a cluster need to continually coordinate query processing (requiring ultra-low network latency to rapidly execute short queries) and exchange data (requiring massive amounts of streaming bandwidth for large queries). During query processing, throughput will be bound by the network (latency or bandwidth), computation (cores or memory channels), or storage (reads or writes for spilling), depending on the operators in use.
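The idea that throughput is bound by whichever resource is slowest for a given operator can be illustrated with a small model. This is a minimal sketch, not Yellowbrick's scheduler or cost model: the function name `bottleneck`, the resource names, and all of the numbers are hypothetical and chosen only to show the reasoning.

```python
def bottleneck(demand, capacity):
    """Return (resource, seconds) for the resource that takes longest.

    demand:   work required per resource (hypothetical units, e.g. GB)
    capacity: work each resource can complete per second (same units)
    """
    times = {name: demand[name] / capacity[name] for name in demand}
    slowest = max(times, key=times.get)
    return slowest, times[slowest]


# Made-up per-node figures for one query stage (assumptions, not measurements).
capacity = {"network": 20.0, "compute": 40.0, "storage": 24.0}  # units/sec
demand = {"network": 60.0, "compute": 40.0, "storage": 48.0}    # units

resource, seconds = bottleneck(demand, capacity)
print(f"bound by {resource}: {seconds:.1f} s")
```

With these made-up inputs the stage is network-bound. Increasing the storage demand (for example, a sort that spills heavily) would move the bound to storage, which is the behavior the paragraph above describes.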

Data warehouses are becoming Tier 1, business-critical applications. Instances must therefore be highly available at the hardware and system level and fully resilient to the failure of hardware components (fans, power supplies, drives, adapters, etc.), network failure, server node failure, and partial power failure.

Compute

For compute, we care about the cost of each CPU core, which largely dictates how fast we can execute instructions, and the cost per memory channel, which largely dictates how fast we can perform large aggregates, joins, and sorts. With the introduction of AMD’s EPYC processors, it is affordable to acquire 64 cores of compute with eight memory channels, resulting in the lowest possible price per core and per memory channel.

Network

100Gb networks are now the sweet spot in cost per unit of bandwidth. Since a redundant network architecture is required for high availability, each server node has access to two network interfaces running over two separate switches. In addition, we use features of the EPYC processor and the network interface to closely couple the fabric and query processing, enabling us to drive an incredible 200Gb/sec of data per node across the network: roughly 20GB/sec per node, full duplex, or 400GB/sec per chassis. To make this process efficient, we use a remote direct memory access (RDMA) fabric that moves data, typically cache-resident, directly between nodes, with no TCP/IP stack or Linux kernel in the way to slow things down.
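The relationship between the gigabit and gigabyte figures above can be checked with simple unit arithmetic. The sketch below assumes that the 200Gb/sec figure is the sum of the two 100Gb interfaces; the constant and function names are illustrative. The gap between the raw line rate and the quoted "roughly 20GB/sec" is simply reported as a ratio, since the text does not say what accounts for it.

```python
LINKS_PER_NODE = 2          # two network interfaces per node (from the text)
LINK_GBIT_PER_SEC = 100     # 100Gb links (from the text)
QUOTED_GBYTE_PER_SEC = 20   # "roughly 20GB/sec per node" (from the text)


def gbit_to_gbyte(gbit):
    """Convert gigabits to gigabytes (8 bits per byte)."""
    return gbit / 8


# Assumption: both links carry traffic concurrently.
raw_gbit = LINKS_PER_NODE * LINK_GBIT_PER_SEC
raw_gbyte = gbit_to_gbyte(raw_gbit)
fraction_of_raw = QUOTED_GBYTE_PER_SEC / raw_gbyte

print(f"{raw_gbit} Gb/sec raw = {raw_gbyte} GB/sec raw")
print(f"quoted {QUOTED_GBYTE_PER_SEC} GB/sec is {fraction_of_raw:.0%} of raw")
```

Under these assumptions the raw line rate is 25GB/sec per node, so the quoted figure corresponds to about 80% of raw.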

Storage

Each Andromeda server supports 8x 7mm NVMe U.2 drives, offering 24GB/sec of read bandwidth per node and 16GB/sec of write bandwidth. Because data is compressed, the effective read bandwidth per node is over 3x higher, sometimes peaking at over 100GB/sec of user data scanned per server node. To scan data at this rate, we need a hardware accelerator.
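The storage figures above can be sanity-checked with a short calculation. This is a rough sketch: it assumes bandwidth is spread evenly across the eight drives and treats "3x" as an exact compression ratio, neither of which the text states precisely. The function names are illustrative.

```python
DRIVES_PER_NODE = 8          # from the text
RAW_READ_GB_PER_SEC = 24     # per node, from the text
RAW_WRITE_GB_PER_SEC = 16    # per node, from the text


def effective_read(raw_gb_per_sec, compression_ratio):
    """User data scanned per second when stored data is compressed."""
    return raw_gb_per_sec * compression_ratio


def ratio_needed(target_gb_per_sec, raw_gb_per_sec):
    """Compression ratio required to reach a target effective bandwidth."""
    return target_gb_per_sec / raw_gb_per_sec


# Assumption: bandwidth is divided evenly across drives.
read_per_drive = RAW_READ_GB_PER_SEC / DRIVES_PER_NODE
write_per_drive = RAW_WRITE_GB_PER_SEC / DRIVES_PER_NODE

at_3x = effective_read(RAW_READ_GB_PER_SEC, 3)
peak_ratio = ratio_needed(100, RAW_READ_GB_PER_SEC)

print(f"per drive: {read_per_drive} GB/sec read, {write_per_drive} GB/sec write")
print(f"effective read at 3x compression: {at_3x} GB/sec")
print(f"ratio needed for 100 GB/sec: {peak_ratio:.2f}x")
```

On these assumptions, a 3x ratio yields 72GB/sec of user data per node, and the 100GB/sec peak corresponds to data that compresses a little better than 4x.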

Learn More

Download the full whitepaper to learn how Andromeda-optimized instances are designed to bring significant performance, efficiency, and economic advantages to customers deploying Yellowbrick inside private clouds.

Andromeda: Performance, Efficiency, and Economic Advantages

For on-premises use cases, Yellowbrick has developed the Andromeda server hardware instance and our new Kalidah processor. Together, they drive new efficiencies in price/performance.

Cloud-Compatible Data Warehouse

The result is a new kind of cloud-compatible data warehouse that provides the best economics in the industry, along with all the other features and functions expected of a mature product that can be trusted to help run your business faster and more efficiently.

Yellowbrick’s cloud-native data warehouse innovates in three key areas:

  • Enabling a more efficient business model by allowing customers to consume a modern, elastic, SaaS user experience in their own cloud account with predictable cost.
  • Optimizing price/performance for new use cases that require more concurrent users and more ad hoc analytics.
  • Providing deployment flexibility by offering a cloud-compatible data warehouse and an identical data warehouse for on-premises use cases.

Top Rated in Customer Reviews

Yellowbrick is a leader in Data Warehouse on G2

Join Us for a Webinar

Meet our experts and learn how to leverage Yellowbrick's secure and fast query response.

Book a Demo

Blazing-fast performance at petabyte scale awaits you.


Learn More About the Only Modern Data Warehouse for Hybrid Cloud

Faster
Run analytics 10 to 100x faster to achieve analytic insights that were never before possible.

Simpler to Manage
Configure, load, and query billions of rows in minutes.

Economical
Shrink your data warehouse footprint by as much as 97% and save millions in operational and management costs.

Accessible Anywhere
Achieve high-speed analytics in your data center or in any cloud.