Migrating your data warehouse to the cloud doesn’t have to be a giant, expensive project. Instead, incrementally upgrade or replace your on-premises technology one step at a time to reduce risk and expense.
Details:
Many customers such as the US Navy and Zurich North America chose Yellowbrick for its compatibility with traditional on-premises ETL technology as well as its ability to run on-premises and in the cloud. These capabilities enable low-risk incremental migrations of users and workloads to the cloud.
Surrounding a typical on-premises enterprise data warehouse is a web of complex ETL flows designed to import and transform data from on-premises OLTP databases using a variety of techniques, such as CDC tooling, Informatica PowerCenter or script-based ELT. Diverse groups of users connect to the warehouse with different BI tools, SQL clients, data mining tools or Python scripts.
The typical steps involved in a cloud migration look like this:
Step 1: Migrate databases, optionally on-premises
Databases (or warehouses) are incrementally migrated from the legacy platform to a Yellowbrick on-premises or cloud instance. Customers choose to start with an on-premises instance if such a deployment lends itself best to current users and network topologies.
From PostgreSQL-compatible platforms such as Redshift, Netezza and Greenplum, little to no porting work is required. For customers that have taken advantage of complex platforms like Teradata, a fully automated toolset is available that provides migration assessment, 90+% automation of database object migration, and even automated porting of BTEQ scripts to Python.
ETL is kept largely intact at this stage: Yellowbrick is uniquely compatible with Informatica PowerCenter, including full pushdown ELT support, as well as a variety of other standard on-premises ETL tools.
As each database is migrated, the ETL is redirected to Yellowbrick and the users of the database are moved over. Users experience faster response times and financial reports are generated faster. Moving only the database and its clients, while leaving ETL processes intact, makes it easier to test and verify correctness.
Once a legacy database instance has been moved to Yellowbrick, its ETL retargeted and its correctness validated, the legacy instance can be turned off, yielding substantial cost savings right away.
Step 2: Replicate the databases to the cloud
Yellowbrick supports full asynchronous replication of data and DDL, including between on-premises instances and cloud instances in AWS, Azure or GCP. Yellowbrick cloud instances run inside customers’ own cloud accounts and networks, shielding them from the risks of public network data breaches and lowering costs since customers pay for their own cloud infrastructure.
If Yellowbrick was initially deployed on-premises, asynchronous replication is used to mirror on-premises databases to the cloud in real time. The replica isn’t just a low-cost disaster recovery alternative; it also serves as a fully queryable, elastically scalable instance for cloud-based users and applications.
Step 3: Deal with ETL
ETL migration is performed incrementally, with a variety of viable approaches. As a first option, ETL flows can be left as-is on-premises and simply switched to target the new Yellowbrick instance. Even Yellowbrick in the cloud is completely compatible with Informatica PowerCenter and all other common ETL technology.
A second option is to incrementally migrate ETL to a cloud-based technology such as Informatica Cloud (IICS), Airbyte or Fivetran and dbt.
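To illustrate how small the first option's change can be for a script-based job, here is a hedged sketch in which only the connection target changes. It assumes the job connects through a PostgreSQL-style driver that accepts keyword/value connection strings. The WarehouseTarget class, host names, port, database and user are all hypothetical placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WarehouseTarget:
    """Connection settings for one warehouse (hypothetical helper)."""
    host: str
    port: int
    database: str
    user: str

    def dsn(self) -> str:
        # Keyword/value connection string in the style used by
        # PostgreSQL-compatible drivers. Credentials are omitted here.
        return (
            f"host={self.host} port={self.port} "
            f"dbname={self.database} user={self.user}"
        )


# Placeholder values, not real endpoints.
legacy = WarehouseTarget(
    host="legacy-dw.example.internal",
    port=5432,
    database="sales",
    user="etl_svc",
)

# Retargeting: everything except the host stays the same.
yellowbrick = replace(legacy, host="yb-cloud.example.internal")

if __name__ == "__main__":
    print(legacy.dsn())
    print(yellowbrick.dsn())
```

Keeping the target in one immutable settings object means the job's SQL and scheduling are untouched when the host is switched, which also makes it easy to switch back during testing.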
A third option is to port the ETL to Python. Our partners have automated tooling available to port ETL and ELT flows from a variety of source platforms including Informatica to Python.
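As a rough idea of what a ported flow might look like, the sketch below shows one ELT step written against the Python DB-API: rows are loaded into a staging table, then the transformation runs as SQL inside the database. This is an illustrative assumption, not the output of any partner tool. The table names (stg_orders, sales_by_region) and the run_elt_step function are hypothetical, and SQLite stands in for the warehouse so the example is runnable.

```python
import sqlite3


def run_elt_step(conn, staged_rows):
    """Load rows into staging, then aggregate them inside the database."""
    cur = conn.cursor()
    # Load: sqlite3 uses "?" placeholders; other DB-API drivers may use
    # a different paramstyle such as "%s".
    cur.executemany(
        "INSERT INTO stg_orders (order_id, region, amount) VALUES (?, ?, ?)",
        staged_rows,
    )
    # Transform: rebuild the summary table with SQL pushed to the database.
    cur.execute("DELETE FROM sales_by_region")
    cur.execute(
        "INSERT INTO sales_by_region (region, total_amount) "
        "SELECT region, SUM(amount) FROM stg_orders GROUP BY region"
    )
    conn.commit()
    cur.execute(
        "SELECT region, total_amount FROM sales_by_region ORDER BY region"
    )
    return cur.fetchall()


def make_demo_connection():
    """Create an in-memory database with the two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stg_orders (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.execute(
        "CREATE TABLE sales_by_region (region TEXT, total_amount REAL)"
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_connection()
    rows = [(1, "east", 10.0), (2, "east", 5.0), (3, "west", 7.5)]
    print(run_elt_step(conn, rows))
```

Because the transformation is plain SQL executed by the database, the Python layer stays thin and mostly handles orchestration.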
Step 4: Turn off the on-premises instance, or keep it
Yellowbrick costs a fraction of legacy data warehouse platforms, leading to cost savings early in the migration. Once migration is complete, customers may choose to turn off an on-premises instance or keep it running. Since Yellowbrick’s licensing (per vCPU/hour) is the same on-premises and in the cloud, capacity can be transferred or recouped as soon as practical.
Partners to help
Yellowbrick has an extensive network of partners able to assist with all aspects of migration, including automated discovery of the existing environment, migration cost and effort assessment, automated migration of ETL, DDL and data, BTEQ emulation, rewriting and re-targeting of client tools, testing, quality assurance and consulting services.
Software automation solutions include:
DATOMETRY Hyper-Q | In-flight translation of queries from Oracle and Teradata with zero change to ETL, BI or queries. |
SmartAssociates | Automated discovery and migration from Netezza, Snowflake and Greenplum. |
NextPathway | Automated discovery, migration, and test tooling from Teradata and other platform sources. |
Summary:
Migrating your data warehouse to the cloud with Yellowbrick can be accomplished incrementally, avoiding the costs and risks of large, expensive migration projects, delighting users and achieving business ROI earlier. Once complete, licensed capacity can be transferred from on-premises to the cloud or recouped.
Yellowbrick runs in customers’ own cloud accounts and networks, allowing customers to pay for their own infrastructure with existing contracts and credits as well as shielding their data from public network data breaches.