Organizations are being more strategic in their approach to the cloud, recognizing the value of data in the enterprise to drive informed, data-driven decisions.
Yellowbrick’s Head of Product Marketing Umair Waheed joins EM360 Podcast Editor Matt Harris to discuss why hybrid and multi-cloud are the new normal for data warehouses. Read on to learn about the biggest trends in data warehousing, the adoption of hybrid and multi-cloud across industries, the future of data warehouses, and more.
Welcome to the EM360 podcast where we have a weekly conversation with people who are impacting the enterprise tech landscape. My name is Matt Harris, Editor here at EM360 and your host on today’s episode. Make sure you stay up to date with all of our latest episodes by subscribing on Apple Podcasts, Spotify, Google Podcasts, or wherever you go for your podcast needs. In today’s episode, I’m joined by Umair Waheed, head of product marketing at Yellowbrick, and we’re going to be talking about why hybrid and multi-cloud are the new normal for data warehouses. Umair, welcome to the show.
Yeah, lovely to have you today. Could you just give us a brief outline of who you are and what you do?
Sure. So I head up product marketing for Yellowbrick Data. Joined them recently, about six months ago. Really innovative startup in the data warehousing space. It’s kind of a startup, but it’s been going for about eight years, so we’re kind of in that sort of transition from startup to non-startup.
I’ve spent my career, the last 25-odd years, in the data analytics space as a practitioner helping customers build data warehouses, do data-related projects and migrations. And then stepped into the vendor space through working with a company called DATAllegro about 12 years ago, which got bought by Microsoft and is now the basis of the Synapse product line there or the history behind that. And my career’s gone in a number of different directions from pre-technical, presales to sales to kind of consulting, and lastly into the product marketing space where I’m really enjoying helping customers understand what the value proposition around some of these technologies is and how they should be using them.
Oh, sweet. Sweet. So I wanted to ask you today about, obviously, what you mentioned: the data warehouse space. What are some of the top-level trends that you’ve been noticing? Because I’ve seen the topic of hybrid cloud has been getting quite a lot of attention recently.
Yeah, absolutely. So at the top level, I think people are being a lot more strategic in their approach to data warehousing this time around. We’ve gone through a lot of hype cycles of technology that have promised to do everything, promised to be the panacea for all data needs. And I think people are starting to understand that putting data together is more complex, but equally, it’s more essential, and people recognize the value of data in the enterprise.
So I think people are being a lot more strategic in their approach. They’re thinking more deeply about the future and how things are going to evolve over time. They’re thinking about how people are going to be engaged in that process. So it’s not just about the technology – and I think we’ve over-indexed on the technology in the past – but really about how people engage with data. By people, I don’t just mean data professionals and data scientists and the roles that you might expect, but everyone within the organization.
How do people actually experience data? How do they gain insights? Where and how do we extract value from the experience that people have? How do we allow them to take advantage of data without being a SQL guru? I think there’s a lot of work being done on that side, and that flows into things like data quality – and not just fixing data, but helping organizations explain to people how and why data quality makes sense, for example.
And you mentioned hybrid cloud. I think that’s another interesting trend. Obviously there has been a big push towards cloud in the last sort of decade. I think that would not surprise anybody. And I think potentially we may have over-indexed on the cloud, particularly in the data warehousing space. So with hybrid cloud, I think people are starting to realize that certain workloads work well in the cloud and there are other workloads that actually work better on-premises.
And being able to do both is important. So the cloud brings agility, allows us to really start new projects and get things going without having to wait six months to go order some hardware and get it deployed in a data center and get it provisioned and get it set up and deploy a database technology.
So cloud is definitely important. But when you get to a steady state, when you’re running something 24/7 and it kind of becomes mission critical, sometimes for regulatory reasons, you always need it to be on. And when you need to have absolute certainty and you can’t rely on a cloud vendor’s assurances, then sometimes you need some stuff on-prem. Most of the data warehouse technology vendors haven’t really stepped up in that space. So they’ve kind of created some of these hybrid offerings which are a little bit weak.
So you get the ability to reach out to on-prem for data, but not to host that same service on both cloud and on-prem. And that’s really what we’ve been doing at Yellowbrick. We truly believe in hybrid cloud and actually in multi-cloud. I think that’s the other thing. Someone mentioned the term “poly cloud” the other day – which may not take off – but hybrid cloud and multi-cloud I think are certainly things that we should look for in the future.
And as I said, with Yellowbrick, we kind of built those in from day one. And if you’re building modern software on modern cloud stacks and cloud-native technologies, it shouldn’t be a challenge. You should be able to deploy the same technology on-prem and in the cloud. Having the same technology means that you can start in one place and move to the other, but it also means that you can manage both in the same way at the same time without adding a huge amount of complexity to your architecture. So yeah, absolutely, hybrid cloud is getting a lot of attention.
Yeah, I like the fact that Yellowbrick has products that span both. One thing that I wanted to ask as well, we spoke quite a lot about cloud there. Are you seeing all industries equally going towards the cloud or is it just tech or is it just financial? How are you seeing that space evolving at the moment?
Yeah, good question. I think we saw different industries move at different paces. Industries like retail moved really quickly. Things like web applications, and web delivery, and mobile phone-type applications and services. A lot of the retail services moved really quickly. Areas like Healthcare and Financial Services were lagging slightly and now that’s starting to catch up.
So we’re seeing a little bit of catch-up from some of those industries starting to think about how data lives in the cloud. But actually, the solutions for those organizations that are a little bit more regulated, that are a little bit more risk averse, are going to be slightly different to those of the early adopters that went all in on the cloud early on. For example, in Financial Services, we’re seeing this topic of cloud concentration risk come up. So we’re all familiar with when our bank, or a bank we hear about in the news, has a certain service go down and people don’t receive their paychecks on time because something’s out, there’s a backlog in payments and transfers, and that affects that one bank.
But there are also issues. For example, if multiple Financial Services institutions all pick certain technologies or cloud vendors and they’re all looking at the same thing, if there is an outage on one of those services, then that could potentially impact the Financial Services industry at a systemic level. So cloud concentration risk is an umbrella term to encompass that risk of the potential systemic impacts from over-reliance on one or more cloud providers or software vendors. And that’s an interesting topic and potentially that’s affecting where people deploy technologies and their choice of what goes on the cloud and what stays on-prem.
Yeah, I think it’s interesting, they’re obviously moving at different speeds. Do you feel as though there are some big misconceptions that some companies have got about data warehouses that might be affecting the take-up? And just to add to that as well, are there any common fixes that a lot of companies should be looking at when it comes to this?
Yeah, I think the biggest challenge and the biggest misconception that we have, or the biggest problem that we have within the data warehouse industry – and actually, we went through lots of debates at Yellowbrick as to whether we call ourselves a data warehouse. The term data warehouse in itself, it’s quite an overloaded term. It comes with connotations of projects from sort of 15, 20, or even 30 years ago. Projects that were really slow moving in terms of change, didn’t always deliver the value that people expected from deploying the technology itself, and cost a lot more than people were anticipating.
So people recognized the value of data, but then the cost of the platform and the complexity of deploying a data warehouse, particularly towards the centralized model which people were moving towards, was pretty onerous. So I think that’s the biggest misconception. There are modern technologies and modern advances in the data warehouse space: Yellowbrick, but there are other technologies out there too.
The ability to scale elastically, the ability to leverage things like cheap storage in the cloud, like object storage, means that we can build really large data warehouses in the cloud with huge amounts of data, with very little complexity. Yellowbrick for example – we don’t require indexes, so you’re not doing maintenance of the database in the same way that you would’ve done 20 years ago if you were building on an Oracle or a SQL Server or even a Teradata type platform. So it can be very hands off.
We see this trend towards data lakes as a sort of backlash against data warehousing, in terms of an easier way to store large volumes of data that are potentially less frequently used or less structured. But there’s no reason why, with one technology, you can’t store data in a relational data lake. Data lakes have uses.
But if you’re actually going to query that data regularly or you want it available to be queried regularly, then actually it doesn’t cost much more to put it in a modern cloud data warehouse platform or on-premises data warehouse platform these days. So that mix, I think, is something that people should be looking at: what information needs to go in a data lake? What is really only accessed very infrequently, where you need it nearline but you don’t need it online? And I think that balance between data in a relational context and data in files in a data lake, people should be looking at the balance and what balance is correct for them.
Yeah, this is an area of tech that you feel quite excited about and quite optimistic about. Where do you see the data warehouse space in say five to 10 years’ time then?
Yeah, that’s a tough one. A lot of people have forecasted things around data warehousing and got them wrong in terms of the long-term story. But I’m hopeful that in five to 10 years’ time we will see a lot more automation happening in terms of how data flows from source systems through to systems that are available for people then to query and get insights from. Concepts like data fabric, which are kind of emerging now, are really talking to that need to drive automation.
But that need to drive automation also drives a need for technologies that describe data, so when you bring in a system like a cloud system or a SaaS system or an ERP system, those systems will start to automatically provide feeds of data, which are well described. Because at the moment, I always feel like data’s a bit of an afterthought when people acquire new technologies and new datasets.
So hopefully it will become more business as usual to plumb in new data sets, a bit like we plumb a washing machine into our system, and not have that heavy lift of, “Oh, I’ve got a new system, a new ERP system. I need a three-year project to figure out how I’m going to gain insights from this thing,” which is kind of where we’re at today.
So a lot more automation, a lot more plug and play. And I think as a whole, my hope is that we’ll see a lot more education, people being able to exploit data. People actually understanding how to use data in their day-to-day and get value from it, and it not being a skill that’s left to those sort of gurus that can write hardcore SQL queries or build really complex reports in BI tools, which are accessible, but getting the real value can still be pretty complex, building the right models, et cetera. So yeah, it’s not what I hope for, it’s what I’m forecasting.
Yeah. Absolutely. Absolutely. Well, in fact, thank you so much for joining us today. It was really, really great to hear your insights into the data warehouse space.
Perfect. Thanks, Matt. Great speaking to you.
Yeah, and you. Thank you to everyone listening as well. We hope you took a lot away from today’s podcast. But for further information on what we talked about, please head on over to yellowbrick.com. We’ll be back next week with another episode in our podcast series. But until then, make sure you subscribe to this podcast on all major platforms, follow the conversation on our socials at EM360 Tech on Twitter, and for great daily content, please head on over to em360tech.com.