
Yellowbrick
How to Optimize Large-Scale Reporting and Complex Analytics


For organizations in insurance, financial services, and other industries, data demands are growing faster than legacy data warehouses – making cloud migration a necessity. 

In this video, Mark Cusack, Yellowbrick’s CTO; Srinivasan Mani, Zurich North America’s Vice President and Application Portfolio Manager; and David Stodder, TDWI’s Senior Director of Research for BI, discuss how to meet demands for faster performance and scalability in reporting and analytics, cloud migration, and hybrid data environments.

Learn strategies for improving report production at scale, how to reduce costs and risks with cloud migrations and hybrid environments, best practices for insurance and financial services reporting, and more.

Yellowbrick has been a true partner with us, worked with us in crossing all those hurdles and making sure that we get the best support possible in all aspects. Starting from network connectivity, to the engineering problems, to the performance issues – all those things have been effectively handled by Yellowbrick.

Srinivasan Mani
Transcript: 

Andrew Miller:

Hello, everyone. Welcome to the TDWI webinar program. I’m Andrew Miller and I’ll be your moderator. For today’s program, we’re going to talk about how to optimize large-scale reporting and complex analytics with cloud data warehousing. Our sponsor is Yellowbrick and for our presentations today, we’ll hear first from Dave Stodder with TDWI. After Dave speaks, we’ll have a presentation from Srinivasan Mani with Zurich North America, and then we’ll be joined by Mark Cusack with Yellowbrick for a round table discussion.

Andrew Miller:

Before I turn over the time to our speakers, I’d like to go over a few basics. Today’s webinar will be about an hour long. At the end of their presentations, our speakers will host a question and answer period, and if at any time during these presentations you’d like to submit a question, just use the ask a question area on your screen to type in your question. If you have any technical difficulties during the webinar, click on the help area located below the slide window and you’ll receive technical assistance. If you’d like to discuss this webinar on Twitter with fellow attendees, just include the hashtag TDWI in your tweets. And finally, if you’d like a copy of today’s presentation, use the click here for a PDF line. In addition, we are recording today’s event and we’ll be emailing you a link to an archived version, so you can view the presentation again later if you choose or share it with a colleague.

Andrew Miller:

Alrighty, again today, we’re going to talk about optimized large-scale reporting and complex analytics with cloud data warehousing. Our first speaker today is Dave Stodder, senior director of research for business intelligence at TDWI. As an analyst, writer, and researcher, Dave has provided thought leadership on key topics in BI, analytics, IT, and information management for over two decades. Previously, he headed up his own independent firm and served as vice president and research director with Ventana Research. He was the founding chief editor of Intelligent Enterprise, a major publication, and media site dedicated to the BI and data warehousing community, and served as editorial director there for nine years. With TDWI Research, Dave focuses on providing research-based insight and best practices for organizations implementing BI, analytics, performance management, and related technologies and methods. Dave, I’ll turn things over to you now.

David Stodder:

Thanks very much, Andrew, and welcome to everyone here. I’m glad you’re all joining us today. I think we’ve got a good topic that I know is certainly of interest to organizations in financial services and insurance, but really a lot of organizations: how to optimize large-scale reporting and complex analytics with cloud data warehousing. So, I’m going to get started with an opening presentation of my own, and then as Andrew mentioned, we’ll hear from Srini and then have a round table discussion. I’m certainly interested in your questions, so please supply those as we go along. Well, just to start: data-driven business. I think that’s really what’s happening out there. We just returned from our conference in San Diego, and it’s incredible always, for one thing, to see everyone in person of course, but the other is to really hear how data-driven ambitions have advanced as organizations try to get to that data-driven business.

David Stodder:

But the other side of the coin is that – to quote Gene Kranz, the great chief flight director at NASA – failure is not an option. More users need more data at the right time as those organizations try to become more data-driven. Of course, they want to extend data to all the different functions in the organization and make it the language they use to collaborate with each other and to see different perspectives on things. And so, they need the right data at the right time, and this could not be more true than in financial services, insurance, and other industries where firms are using data heavily to understand customer patterns and to look at the different risk factors that need to be taken into account. Legacy systems are really hitting capacity and performance limits.

David Stodder:

So, things that were set up for fixed limits in the past become a problem as organizations try to become more data-driven and extend outward. Those capacity and performance limits are an issue, and of course, that’s a big driver of organizations heading to the cloud. Dependable and scalable performance is critical to reaching business objectives. So the data, the engines of the data, the data management systems, the data integration – scalable performance is critical, being able to adjust as you take on new projects and start new initiatives, new products, and services. Feeding analytics about policies and claims is really an area, particularly for new products and services, that needs to be understood quite well, and we’ll be talking about that in our round table. And then managing risk tolerance: understanding what the risk tolerance is, what the risks are, and being able to do the analytics necessary to understand that.

David Stodder:

Improving operational efficiency – we see in our research this is always, if not number one, very close to number one in terms of goals. Organizations are always looking at ways to be more efficient to reduce costs and also to extend that to customers, because customers like efficiency too, and like good data interchange with them, so this is very important. Visibility to monitor regulatory compliance is again something that’s affecting really all industries, but certainly in financial services and insurance it’s a big factor. So being able to monitor regulatory compliance, and to have visibility into how people are using the data, how it’s being shared, and so forth, is really critical. And then innovation – data-driven innovation for new revenue streams and new levels of personalization. This is how organizations are competing today, always needing to look at new data sources and to see data relationships they hadn’t seen before.

David Stodder:

And so, all of these are really factors, and failure cannot be an option with any of them. So, large-scale reporting challenges: organizations are really trying to push out reporting to more users, to make the reporting more personalized, to give users more options. These are three factors, and then I’m going to quote a little bit of our research data on the challenges. Democratization is one challenge: a variety of users, trying to move beyond just giving reports to a few users and analysts and those who maybe have the technical capabilities to do the right queries, to be able to drill down and look at the data more carefully. Of course, we’re in a whole new age now of self-service business intelligence and reporting. And so, there’s a great variety of users – business users, analytics teams, data scientists – depending in many cases on hundreds if not thousands of reports, particularly in large organizations, so that scale is a big issue.

David Stodder:

Slow performance. It impacts the business – as I mentioned, so much is depending on it. You’ve got these hundreds of concurrent reports, and you’re going to have spikes at the end of the month. Data insights that don’t arrive in time – again, as I mentioned, to the right users at the right time – can really slow everything down and create problems that extend downstream, eventually reaching customer products and services. Data volume, variety, and distribution are perennial data management challenges, but there’s no doubt that data sets are getting bigger. There’s more variety of data that organizations want to look at, and they want to be able to incorporate new data into systems faster, because as we’ve just seen in our research, it can often take months just to bring new data into data warehouses, and for certain kinds of applications that’s really too slow.

David Stodder:

It’s drawing on multiple sources, and in many cases it’s increasingly distributed. So as you get to new data sources, as different departments maybe develop their own self-service data platforms, you get a lot of distributed data out there, so that becomes an issue. We’ve seen in our research that cloud migration is definitely a dominant trend, and organizations want to do it for growth and to reduce costs. Just three pieces of data here: 43% – that’s really the highest in this research survey – want to expand access with lower investments, so they’re certainly looking for better price performance as they expand to more users and more data. 41% want to increase scale and speed, so they’re looking at the options in the cloud not just to reduce costs and the delays of configuring systems and so forth, but also to increase the scale and speed of the data and how they’re using it.

David Stodder:

And then 38% want to unify the silos – I mentioned that increasingly distributed data and data fragmentation are issues, so that’s a problem they’re also trying to solve by consolidating data into the cloud. Supporting increasingly complex analytics: we talked about reporting, but analytics themselves are also growing in organizations, from citizen data scientists up to more advanced data scientists who are developing AI and machine learning programs. A key modernization driver for data management systems is to support growth in analytics – handling bigger and faster data for customer personalization, so understanding the data relationships across sources to provide personalization based on what customer behavior has been, what their buying patterns have been, what you’re looking at in terms of segmentation. Omnichannel: customers are interacting with organizations across lots of different channels, even multiple eCommerce channels, both directly and indirectly, so it’s important to be able to analyze all of that data and look at customer behavior. And then the other is assessing risk tolerance.

David Stodder:

So as I mentioned: looking at different factors, different variables, complex queries, and updating the risk models, having to keep those up to date as conditions change. These are all big challenges, running on enormous data sets – in many cases hundreds of terabytes – and then supplying AI- and machine learning-infused automation in these applications. You want these to be running continuously, 24/7 in many cases, and that in itself is a data management challenge for data warehouses. So automation and operational efficiency there are very important. Cloud migration to handle analytics – again, it’s a pretty big driver.

David Stodder:

32% want to modernize their data integration and management to create a new foundation for advanced analytics and machine learning. And then 23% are looking to align business needs with short, flexible analytics development cycles. So, they’re trying to give the business agility, basically, to develop analytics models quickly and deploy them quickly, operationalize them, see what the results are, and make any changes that are necessary. All right, let’s ask a poll question of our audience to see what your biggest business driver is. The question is: what is the biggest business driver behind your organization’s plans to modernize reporting and analytics? Is it to improve speed to insight so you can gain better business outcomes? To reduce risk exposures and comply with regulations more effectively? To increase satisfaction with risk analytics models?

David Stodder:

Next would be increased user satisfaction with reports and interactive access to big data, and then reduced costs and optimized reporting and analytics workloads. And if you have another answer that I did not list here, please supply it – just type an answer in our ask a question box. This will give us something we can talk about in the audience Q and A. All right, maybe I’ll give it another second here. I know all of these are important, so I’m just interested to see which is your biggest business driver at the moment. So, let’s take a look at the answers and we can see… Let me just refresh this one more time because I don’t know if I caught everybody. Well, it’s changing a little bit, and we see decidedly 52.9% saying improve speed to insight. I’m going to try to jot some of these down.

David Stodder:

Let me just refresh this one more time to make sure we’ve got everybody. It changed again, so now we’re down to 42.9% for speed to insight. Risk exposure is not big, but 4.8% say increased satisfaction with risk models, 14.3% user satisfaction, and then 38.1% reduce costs. Great, thank you. That gives us some interesting answers. So, cloud challenges in maximizing benefits: as I mentioned, most organizations are very interested in going to the cloud, but there are issues managing costs. Costs are a big driver, but as you have more users and more workloads, we see 36% have concerns about all the things that can be cost drivers – data ingestion, data egress, extracting data, migration costs – so they’re looking for ways to manage those costs. Then 25% cite accessibility, with the number of users constrained due to cost concerns; there we see organizations actually setting limits – scalability limits, really – because they want to manage costs. Keeping pace with the data explosion.

David Stodder:

So, even though you’re moving to the cloud with lots of scalability, the data explosion is not ending. 30% see speed to data insights as not fast enough; particularly as data explodes and they try to get to more data, there is a lot of waiting for results, with queries still taking hours or unable to run due to competition and conflicts with other workloads, so those are key issues. And then hybrid multi-cloud challenges: organizations moving to the cloud may still have systems on-premises, or may be moving to multiple cloud platforms, so there’s just managing all of this. Organizations may also be adopting a multi-cloud strategy to avoid having a single point of failure, which is an important idea, and to avoid vendor lock-in, but then of course there’s the issue of new silos that can arise. All right, I think I’m going to save this poll question for our panels.

David Stodder:

Let me just skip over that and we’ll come back to it later. So, to close up my presentation: realizing the cloud data warehouse’s potential. Setting priorities is certainly very important, and moving to the cloud is often, we see, the number one thing an organization is trying to do, but in a way that’s not good enough. You’ve got to ask: what are the priorities? What are you trying to accomplish by doing this, especially in terms of reducing time to value? Users definitely need speed. They need improved query performance for large-scale reporting and analytics. That really can often be the priority, and we did see 52.9% – then 42.9% as it kept changing – choose speed to insight to get better business outcomes, so users really needing data is a driver. Make performance more dependable, anticipate growth, and take advantage of the cloud’s elasticity for better price performance.

David Stodder:

Support growth in analytics: you want to be able to use the cloud’s potential for data warehousing. What is the intended ROI of the analytics? And again, what is the purpose? What are we trying to do? What’s the business case for it? Often it can be, say, to get more accurate and up-to-date risk models for customer policies and services, which of course extends immediately to customer satisfaction and the bottom line. Analytics are key to new data-driven competitive advantages – it’s really an innovation engine to use the data, and moving to the cloud can play a big role in that. And then, adapt to hybrid multi-cloud. Most organizations are going to be in a hybrid multi-cloud environment, and a single platform may not be enough; as I mentioned, avoiding that single point of failure is key. So, think about those as priorities as you plan your move to the cloud and your use of cloud data warehousing. Andrew, I think that’s going to do it for my presentation. I think you were going to introduce Srini.

Andrew Miller:

That’s right. Thank you very much, Dave. And before I introduce our next speaker, just a quick reminder to our audience: if you have a question, you can enter it at any time in the ask a question window. We’ll be answering audience questions in the final portion of our program. As already mentioned, our next speaker is Srinivasan Mani. He’s a Vice President and Application Portfolio Manager at Zurich North America. With over 20 years of data delivery, IT service delivery, and consulting experience in the P&C insurance industry, Srinivasan Mani now manages data delivery engagements in Zurich North America’s data and analytics organization. He has delivered multiple large-scale programs that include complex financial data integration initiatives, analytical ecosystem modernization, and advanced analytical reporting and modeling. He’s an MBA graduate of Northern Illinois University and received his postgraduate degree in applied mechanics from the Indian Institute of Technology in Chennai, India. Please welcome Srini.

Srinivasan Mani:

Thank you, Andrew, for the great introduction, and good to have you all here. Let me introduce myself a little more, including whom I work for. This is Srinivasan Mani, and I work in the data management organization at Zurich North America; we are part of the city office as well. To briefly introduce the company: Zurich in North America is a property and casualty insurance company, as many may be aware, and the firm offers property and casualty and life products to multiple large-scale customers as well as small and medium-sized customers, plus global companies who need insurance coverage across continents – so we support their business model as well. From the challenge perspective on the data space, one of the challenges we had – as many other companies probably would have had – was synthesizing this data at scale.

Srinivasan Mani:

And the volume is what drives a lot of the complexity as well: how to use this data effectively in day-to-day business while we deal with lots of data, and how the reporting can be done effectively. That’s one of the challenges. Another is about the choices we can make in the marketplace. There are so many products and partners out there – 100-plus partners available in the DW space who are ready to provide a solution. The challenge is that we have to pick the optimal one that suits our current needs and solves all the problems we are trying to solve for as part of this modernization initiative. These are some of the challenges I just want to highlight here; eventually we’ll talk about how we resolved them.

Srinivasan Mani:

The challenges are on two fronts: one is the technical aspect, and one is the business aspect. On the technical side, the warehouse itself was running up against its performance and configuration limits because of the volume as well as the compute needs. Another is that the complexity of the workloads changes every day because user expectations are changing. The next is the volume of the data: we add more and more workloads to the system, plus the regulatory requirements and the financial reporting requirements drive a lot of data granularity-related questions, and that adds a lot of volume.

Srinivasan Mani:

And the other is the legacy part of the system that we continue to carry over when we do these migrations, in a practical sense. So, that’s another challenge. We want to take a look and solve all of this holistically on the technical side. On the business front, there are a lot of other business processes related to reserving as well as estimating losses and getting the financials reported to the CFO office on time, so that the P&L can be reported to inform about the business performance, plus any corrective measures that have to be taken from the financial standpoint.

Srinivasan Mani:

All those challenges come into play because the timeliness of the data readiness drives all of this. Everyone would like to get the data on day one or day two after the month close, irrespective of how many processes we have to run. All of those drive the timing-critical needs. Availability of data on time is one of the important aspects we were trying to solve as part of this delivery implementation, and we have been successful in that; we’ll talk more about how we got there. Another aspect to look at is the operational side of the data. On availability, just to give an example: data that should be available at 7:00 a.m. was, comparatively, only available in the afternoon. There are a lot more nuances around that, primarily to drive operational efficiency.

Srinivasan Mani:

So, that’s where the timeliness and availability of the data become more useful in the operational environment, as well as being business efficiency driving factors. I think that summarizes the analytical challenges – now, the Yellowbrick outcome. What was the outcome after the Yellowbrick implementation? On the technical front, we saw a great level of performance improvement that supports all the users across the organization, including finance, actuaries, and other operational users. They’re able to get the data on time, we were able to beat the performance of the incumbent product, Netezza, and we were able to decommission the C++ solution we had for the reserving process, which sometimes took 72 hours to complete because it was all written in C++ code.

Srinivasan Mani:

We had not been able to do that in SQL because the existing platform did not support the workloads, but Yellowbrick is much more capable of supporting such workloads. So, we were able to successfully migrate that C++ code to SQL-based code, and it runs flawlessly in Yellowbrick. That cut a lot of time from the reserving cycle. Another one is from the KPI standpoint. The performance KPIs have beaten the benchmark levels we had in Netezza by at least 2 to 3X. So, the KPIs we have seen on performance from Yellowbrick’s standpoint were extremely fruitful, and extremely impressive from that KPI standpoint.

Srinivasan Mani:

Another one is the operational reports that users are executing, as well as the financial users who used to wait 10 or 15 minutes to do an interactive analysis. To give a real use case scenario: users sometimes have a discussion in a meeting room, and they want to talk about some business outcomes by looking at the data. Those reports used to run for 30 or 45 minutes – by then the entire meeting was done – but now the users are able to pull the data while they discuss with their managers or partners. They’re able to get the data insights within 10 to 15 minutes, quickly review the data, and give the answers then and there, so the meeting becomes much more fruitful and the insights are delivered as needed within the same meeting hour.

Srinivasan Mani:

And, they’re able to see that kind of improvement after this Yellowbrick implementation. That’s a great end-to-end situation, I would say, because the business outcome is much more influenced by the Yellowbrick platform. On the other side, from the data availability and readiness perspective, the financial close cycle, which we used to execute on Netezza, used to run for three-plus days, and a large number of people used to be involved to monitor the performance, the workloads, and the timeliness of the data delivery – a lot of resources were involved. With the Yellowbrick implementation, we were able to cut that time by at least 70%, which frees up the resources’ time, and the data is available much more effectively than before. We’re able to report the financial data to the CFO office well in advance of before. The CFO office has a deadline to meet.

Srinivasan Mani:

By day five, the P&L has to be ready for review by management. Now there’s a lot more time saved because of the Yellowbrick implementation. Same on the operational aspect – a very simple use case I would like to call out here. Every day we have to generate a report about the workflow data that we manage for claims. The workflow data says how many claims are pending today and how many FMLAs there are today, to make sure we can optimally allocate the claim adjusters to work on those pending FMLAs and stay within the stipulated SLAs. Those reports used to be available only in the afternoon prior to Yellowbrick, but now with Yellowbrick we’re able to deliver them before seven o’clock Eastern time, and the claim supervisors are able to assign the task information to the claim adjusters before they start the day, so that we can see an optimal outcome from all the claim adjusters.

Srinivasan Mani:

That’s a great example of operational efficiency improvement, and it became possible after we completed the Yellowbrick implementation. So, that’s one of the use cases I can call out; there are many other operational efficiency improvements we have seen, and for the sake of brevity I’ll just share this one example. And we found Yellowbrick to be a very good partner to work with, supporting us in all aspects whenever there’s a technical problem – bringing in their engineering team and making sure the problems are resolved on time. That’s the overall outcome, and there are some performance metrics I would like to share here. The performance metrics you’ll see compare the Netezza runs with Yellowbrick. Here, I’m sharing some examples of the October and November runs – those are the baselines we captured in Netezza – compared against the execution times we captured in Yellowbrick in the months from March onward.

Srinivasan Mani:

So, you have five months in total: the first two months are related to Netezza; the last three months are related to Yellowbrick. If you look, the performance levels are extremely impressive. We have seen a lot of timing improvements, and we were able to cut down the close execution time by 70%. When I say close, these are all back-office processes. We have multiple business processes embedded in the back-office system, and we run those processes to make sure the desired outcome is achieved from the financial data perspective – like the IBNR calculation and the net earnings calculation. All of these are critical IFRS 17 calculations, and they are critical to finally understanding the financial numbers that are reported in the SAP system for P&L (profit and loss) reporting.

Srinivasan Mani:

All these processes executed well on time and as per our expectations, and we achieved a massive improvement in the timings. That’s where we were able to reduce the close cycle, at least from the back-office standpoint, from three days down to within a day – less than 24 hours, I would say. That’s where the total gain of 70% adds up. Those are the performance metrics on the month-end process. Similarly, we had improvements on the daily incremental loads, where we process the data from multiple systems; for the sake of space I have not included those here, but the month-end process is a critical one from an organization standpoint, and it’s where we can highlight more of the value being delivered in data processing, user experience, and all of this. That’s the performance metric on one of the critical processes.

Srinivasan Mani:

On the overall experience with Yellowbrick: it’s a low-risk migration because of the compatibility of Netezza with Yellowbrick. There are challenges – there are always some differences, as is always observed, and no technology migration is ever seamless. It’s not just “press the button and it happens.” There are challenges, but Yellowbrick has been a true partner with us, working with us in crossing all those hurdles and making sure we get the best support possible in all aspects. From network connectivity, to the engineering problems, to the performance issues – all those things have been effectively handled by Yellowbrick.

Srinivasan Mani:

As well, we migrated a huge set of data from Netezza to Yellowbrick. They supported those migrations, giving us tools and options, and making changes to those tools based on the data requirements. In all those aspects Yellowbrick did a fantastic job. Overall, the data delivery timelines have been constantly reduced, multiple concurrent processes execute on time, and it has also helped us drive higher operational efficiencies. All of this was made possible through the Yellowbrick implementation. That’s where the true partnership comes into play, and as a company we are thankful to Yellowbrick for working with us and making us successful on this journey.

Srinivasan Mani:

That’s pretty much it on the Yellowbrick experience. Thank you for listening to us and to our success story here. We’re looking forward to working with Yellowbrick and the partners on the upcoming analytical improvements, as well as the next roadmap-related work items that might be coming into play in the next few months. And with that, thank you for the opportunity, and let me hand this over to Andrew to introduce the next speaker. Thank you once again.

Andrew Miller:

Thank you very much, Srini, and we are also going to be joined today by Mark Cusack, the CTO at Yellowbrick. He’ll be the third speaker in our panel discussion coming up. Before joining Yellowbrick, Mark was vice president for data and analytics at Teradata, where he led a variety of product management and technology teams in data warehouse and in advanced analytics groups. He was also Chief Architect of Teradata’s IoT analytics efforts. Mark joined Teradata in 2014 when Teradata acquired the startup RainStor where he was co-founding developer and chief architect. Prior to RainStor, Mark was a lead scientist in the UK Ministry of Defence. Mark holds a Ph.D. in computational physics from Newcastle University in the UK with a thesis centered on discovering the electronic and non-linear optical properties of quantum dots. As a research fellow at Newcastle, he developed new technologies to model these novel quantum structures using large-scale, parallel, and distributed computing approaches. Welcome, Mark, and with that, I’ll hand it back over to you, Dave, for the round table discussion.

David Stodder:

Thanks, Andrew, and welcome, Mark and Srini. So, we'll have our panel discussion now, and I just wanted to talk through some of the issues. The first is really the journey. Srini, maybe we can get your thoughts on how you arrived at cloud data warehousing, what the migration experience was like, and what some of the dominant business objectives or drivers were that you were trying to solve with cloud data warehousing?

Srinivasan Mani:

Well, there are three important aspects I would say here: one is agility, another is the execution and run cost, and the third is speed to deliver. These are the areas where we were being challenged by our leadership team, and really from everywhere, because everyone would like to move to the cloud to get an edge on performance and to manage the operational cost. Those are the three key elements we looked into before we ventured to Yellowbrick, and we found that Yellowbrick was suitable to meet all those objectives.

David Stodder:

How about the migration experience itself? What were any issues you faced or how’d that go?

Srinivasan Mani:

The migration experience, to be transparent, was a very good one, though as always there are challenges to deal with. The technologies are different: Netezza has one way of engineering, and Yellowbrick is engineered differently, to perform better than Netezza. So we had to make our code compatible and go through some customizations in places to make sure the processes worked as expected. Some processes executed well on time and beat Netezza performance without any changes, but for others we had to make tweaks – tweaks in the sense that some re-engineering was required. I would say the re-engineering effort was roughly 10 to 15% of the project, and the reason for that 10 to 15% is that the business processes are complex and the calculations are complex.

Srinivasan Mani:

So that's what was driving those changes, but from a compatibility standpoint, Yellowbrick was really good compared to Netezza; that's what we have seen. The other aspect of the migration experience is that the support from Yellowbrick was very effective, so we were able to solve problems on time. There are always problems – every day we would have 100 calls, and we were seeing performance issues and compatibility issues – but everything was solved without hiccups because of the seamless support we received from Yellowbrick, and the partnership was very good in that aspect. Overall, the migration experience had both positives and negatives, but at the end of the day we achieved the results that we wanted. That's what I would say.

David Stodder:

That’s good to hear. Well, Mark, we’ll bring you in on this discussion. What are you seeing just broadly in terms of customers, their journey to cloud data warehousing, the drivers, and maybe some of the issues they may be facing?

Mark Cusack:

Go ahead.

Srinivasan Mani:

Go ahead.

Mark Cusack:

Sorry, Srini, go ahead.

Srinivasan Mani:

No, the drivers would be, in any case... The important aspect is to be sure of what we are solving for. Sometimes we go and pick up a solution which is very shiny, but really speaking, the value that comes out of it is less. We have seen cloud data warehouses which are really shiny, but is it going to be useful for us? Will the return on investment be there? The answer was no, and that's what we as an organization realized. Running your own shiny thing is not wrong, but at the same time, how much are you going to get out of it? So that's the principle we followed: Yellowbrick has its capabilities – there are strengths and weaknesses – but can it meet the requirements of the problem we are trying to solve for? We found Yellowbrick suitable to resolve those issues, address those problems, and support the ongoing business needs, and we found it suitable in all three of those aspects.

David Stodder:

It's definitely good to keep your priorities in mind and not be distracted, as you say, by the shiny things out there; that's important. Mark, what are you seeing in terms of organizations trying to get to the cloud, particularly for large-scale reporting? There's a lot riding on that, and they probably have established systems in place, so it's a big deal to migrate to the cloud.

Mark Cusack:

It is, and I think most organizations have a cloud strategy in one way or another, but what I'm seeing, particularly in financial services and insurance, is that a lot of companies are looking to take a considered approach to how they move workloads to the cloud, and when. And many are going to be in a hybrid cloud stance for many years to come, partly because of the regulatory aspects of their industry or concerns around data security and privacy. That means that some of their workloads are going to be retained on-prem.

Mark Cusack:

And so, the customers I'm working with at the moment are taking a very judicious look at what makes sense to move to the cloud, and why and when, and putting together a solid business case around that – as you mentioned earlier, Dave, doing a deep TCO and ROI calculation on what it means. But there are great drivers for why folks want to get there. It's not all-or-nothing, cloud or bust, anything like that; it's about how the cloud can help us be more agile, develop new use cases, get to market quicker, not be constrained by the physical confines of a data center – how can we do things faster than we could before, and have that industry advantage? That's what I'm seeing there.

David Stodder:

Is it a challenge at all to get business and IT on the same page? We've talked about some good points around keeping your eye on the ball, as Srini put it – what you're solving for. That's really the business side, but is IT in good communication, understanding what you're trying to solve, and bringing its own perspective on the challenges it sees in moving to the cloud? How do you see business and IT working together?

Srinivasan Mani:

I think that process works well once you define the benefits and you have collaboration. Everyone knows IT is not a P&L unit; you're not generating revenue in most instances, you're a support unit, and the cost allocation goes back to the business. So business alignment is most important, and it comes together very well once we define the benefits the solution can deliver and where the business finds the value. In this project execution, what we did as a team was articulate the benefits very well, be practical, and communicate those benefits to our business partners like finance and the actuaries, because these are all back-office systems, and they were able to see the value coming out of it. Once they recognized the benefit, we didn't see any pushback on the budget or on the execution front, and we got full collaboration.

Srinivasan Mani:

And the UAT cycle for this initiative – the user acceptance testing – ran roughly three months because of the complexity of the environment. We were able to get full alignment because from day one we engaged by articulating the benefits and how we were going to look at this platform for the next five years, and everyone embraced those ideas and the benefits communicated to the business partners. The important things we learned are to be transparent, to communicate about the change so there are no surprises, and to quantify the value element that is going to come out of the project. Those are the different communication paths that helped us get a smooth delivery and full alignment from our business partners.

Mark Cusack:

Actually, Dave, if I could just add onto that from a vendor perspective in general: when you are selling an analytic solution into an enterprise, you absolutely need that line-of-business and IT integration and collaboration, even as a vendor, or else we will not be successful. If we focus all our efforts on selling our solution to a particular line of business, that line of business then has the responsibility of trying to convince their own IT to adopt it, and very often IT departments have architectural committees you have to go through. They don't like having vendor solutions thrust down their throats by a line of business. On the flip side, if we as a vendor just focus on selling into IT without the use cases and the business value associated with it, we're not going to get anywhere either. So it's absolutely incumbent on vendors to establish strong relationships with the lines of business at our customers, and with central IT too.

David Stodder:

I think those are really great points. Actually, at this point, let me go back to that poll question I was going to run before, so we can get our audience's contribution on this. So yes, this is the question I was looking for. To our audience: what is your biggest challenge in optimizing large-scale reporting and complex analytics with data warehousing – the title of our webinar? Is it lack of scalability to handle more data, workloads, and users? Is it data quality concerns? Is it managing and accessing distributed data, for example across a hybrid multi-cloud environment? Not having enough skilled personnel, or that performance is too slow for business-critical analytics? And finally, forecasting costs and staying on budget.

David Stodder:

So, we did see before in some of the TDWI research that’s an issue. So, let’s just see what the results are for this, what our audience thinks. Well, maybe they need a little bit longer time there. So, some of these I think we’ve discussed already, but we’ll talk a little bit further in our round table. Interesting, we’re not getting the… There we go. Wow, so it’s actually divided, 50% saying managing and accessing distributed data, and then the other is not enough skilled personnel. Mark, let’s bring you in, any thoughts on those results? Actually, we were going to talk about just the distributed data and hybrid and difficulties there in particular.

Mark Cusack:

I would start off on the managing and accessing distributed data and hybrid. I think it’s a well-trodden statistic that around 80% of time is spent moving data and preparing data before you even do any analysis on it. And so, whether it’s a siloed organization within a single data center within IT, or whether it’s multiple clouds, it’s that movement of data that’s both expensive from a time and cost perspective. So, managing that is extremely difficult to do. And so, as you look at newer technologies and ways of doing this, you go, “Well, how do we minimize data movement between clouds or between silos or whatever?”

Mark Cusack:

And then you start to think of ideas like data fabrics being applied here, where you try to leave the data where it lies, push processing down to where that data is, and do some level of federation on top of that data. But that leads to other problems around data management and governance – a whole separate line of discussion, I think, about how you understand the quality of the data, the lineage of that data, who owns it, who's responsible for maintaining its quality, and so on and so forth. And so the whole problem starts to snowball from there.
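A minimal sketch of that pushdown-plus-federation idea, purely illustrative – the sources, rows, and predicate here are all invented for this example, not any vendor's actual federation engine:

```python
# Federated query pushdown: instead of copying whole tables to one place,
# each source applies the filter locally ("pushdown"), and only the reduced
# result set crosses the network to the federation layer.

def query_source(rows, predicate):
    # Runs at the data source; only matching rows leave the silo.
    return [r for r in rows if predicate(r)]

# Two hypothetical silos holding claims data.
on_prem = [{"claim": 1, "amount": 500}, {"claim": 2, "amount": 9000}]
cloud = [{"claim": 3, "amount": 12000}]

# Federation layer: push the same predicate to each source, then combine.
large = lambda r: r["amount"] > 1000
large_claims = query_source(on_prem, large) + query_source(cloud, large)
print(large_claims)  # only 2 of the 3 rows crossed the network
```

The design choice being traded off is exactly what the panel describes: less data movement, but the federation layer now has to answer the governance questions (lineage, ownership, quality) for every source it touches.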

David Stodder:

Srini, do you have any thoughts about higher multi-cloud environments and how to handle that? Have you been looking at data fabrics or data mesh?

Srinivasan Mani:

I think the data fabric and data mesh concepts are really good, but the practicalities have to be looked at in detail. That's my view. Why am I saying that? If I take my company, we have so many legacy systems, built in the 1980s and the '90s, and all built with different data models in place. Relating that data through a data mesh or data fabric is sometimes practically impossible, so we have to move the data somewhere to consolidate it and try to find the relationships in the data. It's a tough task, and it depends on how much data you have and how long your company has existed; those are the driving factors. So the concept is still maturing, there has to be a fine balance somewhere, and we are experimenting as a company to see how we can leverage those concepts.

Srinivasan Mani:

That's one aspect of data mesh and data fabric. Hybrid cloud is something that's inevitable because of the capabilities available from the different cloud vendors and product partners. Everyone has a different offering, so hybrid cloud is always going to be there, and managing it optimally is where the skill of the managers and the IT leadership comes into play, to make sure we have limited wastage. But again, all of this is maturing. At Zurich North America, for example, we are maturing in the cloud space because we are slowly migrating applications, we have applications in multiple clouds, and we are learning as part of this journey. I don't think there's anything wrong with that; it takes time to mature.

Srinivasan Mani:

Plus, the application rationalization aspect will also help make sure data mesh and data fabric can work effectively. So it's not just about the technology: the organization also has to build a culture, and some discipline, to be successful with it. And on the skillset – obviously everything is new. We cannot expect someone to come with a skillset for something that was introduced yesterday; you cannot find someone tomorrow with that experience and everything he or she needs to know. That's not feasible, so let's be pragmatic about it, and that's where the product companies will help us more. Yellowbrick was also relatively new when we ventured in, and we went through the learning curve as an organization. I think we got through all those learning hurdles, and we were very successful at the end of the day after this project execution. That's where the product partner's relationship comes into play, to make the customer successful.

David Stodder:

For sure. Not having skilled personnel – it's interesting that we got such a high percentage there, and I think you're right: certainly in a transition from one technology to another, that's what you're going to face, and it's difficult to find people who have the right skillsets. Moving from on-premises to the cloud, though, in theory you don't need all the people who know the configurations and all the storage and processing assets you have on-premises, so that goes away, but there are other skills that are necessary in the cloud, I imagine.

Mark Cusack:

Yeah, there are, because you never just deploy a cloud data warehouse, spin it up, and you're off to the races. You've got that entire analytical ecosystem around it to consider. When you're looking at a cloud migration from your data center into the cloud, you're not taking along just the data warehouse; you're taking all the upstream data integration and pipeline feeds, and the downstream applications, analytics, and model scoring, for example. And along with that entire ecosystem is a whole set of cloud-native services you're likely to want to integrate with as well. So the skillsets go far beyond "how do I manage a cloud data warehouse" to "how do I manage my broader ecosystem and interconnect all of that within a particular cloud provider, or across cloud providers?"

David Stodder:

It's interesting stuff. Let me get back to our round table discussion and the other questions we have. One is: have you noticed a change in behavior among your customers in the business since you completed the project, Srini? How are the customers responding to the changes?

Srinivasan Mani:

Change is a challenging task. When we deal with an organization of a 1,000-plus user community and various partners at different levels, one pushback that comes is: why do I need to do this differently than what I was doing yesterday? What is the benefit I'm getting? I think that's where the challenge is. We have to explain those benefits and why the change has to happen – otherwise, as a company or a user base, we are restricting ourselves from the growth path. That information has to be shared with real transparency, and we also have to get alignment from the reporting managers and everyone, as well as from the partners within the organization. That's where the real challenge comes into play, but overcoming that challenge is not an impossible task. It takes time, and it takes a lot of collaborative discussion.

Srinivasan Mani:

And that's what we found from the change management perspective when we ventured into Yellowbrick. First of all, wherever possible, we made sure the user experience was seamless: we made the technology changes in the back end so that the users did not experience the change. Users are always going to see changes here and there – that's always expected, but it's for good reason. We were able to communicate the rationale very well, the user base was able to embrace those changes, and it went well from the transition perspective, I would say. The essence is proactive communication and being transparent, plus following the basics: user manuals, and making sure the changes are communicated up front so there are no surprises once you roll out. Beyond that, I would say it's the normal process to follow in any large-scale implementation.

David Stodder:

Sure.

Mark Cusack:

And I'd just love to make a point on data warehouse migration in general, because I think one of the dangers, and one of the things that is off-putting, is the boil-the-ocean notion where you've got to change everything: the data warehouse, your BI tooling, your data integration side of things. One of the things that we've done – and I know Srini has taken great advantage of this – is to make the migration process more bite-sized: keep your existing Informatica jobs in place, keep your downstream MicroStrategy or SAS applications in place, and replace the back-end data warehouse with Yellowbrick, as Srini said. So hopefully you're getting all of the price-performance benefits delivered to your end users, but they're still using the tools and techniques they're familiar with, and a lot of that ecosystem hasn't changed in this instance.

David Stodder:

That seems critical. It's an interesting balance, because you want to keep things so that the users are not disrupted, but at the same time you're opening up so many new things for them to do – getting to new kinds of data, and maybe supporting their self-service analytics. So it's an exciting time too. Is the diversity of users a challenge – business users on one hand and data scientists on the other – in terms of keeping them all satisfied? What's your approach to that?

Srinivasan Mani:

So, the business users were very easy to handle. The reason, as Mark stated already, is that we did not change the presentation layer very much; we did all the back-end rewriting ourselves so that the users did not experience it. But the data scientist group is different. They want to deal with the data directly, so queries had to be rewritten, and there were some challenges: Yellowbrick works a little differently than the Netezza platform, so we had to work through those transition-related nuances. Again, it was not that intense, I would say. It's more about communication and documenting the changes: how it works, observing the patterns, and then coming up with a tabular cheat sheet – if you see something like this in Netezza, it works in Yellowbrick in a different way – making a note of that and communicating those changes.

Srinivasan Mani:

So I think the key was over-communication, applied most to the people who deal directly with the technical aspects of the data – extracting data from the systems into their own software tools like DataIQ or Alation or Alteryx, the tools available in the marketplace. Some transition was required for those users, and I think it went well because of the communication, plus the amount of time we gave them to validate during the UAT phase, which was close to three months. That made the users feel comfortable during the transition.

David Stodder:

Well, good. I want to make sure we have some time for our audience questions. So Andrew, let me go hand it back to you for those.

Andrew Miller:

Terrific. Thank you, Dave. Thank you, Mark, and thank you, Srini. We will be moving into our Q and A period now to answer some of the audience questions. I’m going to begin with this question for you, Dave, but I think maybe all three of you will weigh in on this potentially. This is a question from our audience member, “Data ownership is an issue for us. We have many data silos owned by different departments that make it difficult for us to scale up our reporting in the cloud. What strategies will help us solve this problem?”

David Stodder:

Well, we could talk a long time about that. I'd be interested to hear what Mark and Srini think about this, but communication is obviously critical: showing the data owners the benefits of the new platform from their perspective, the business perspective. What more are they going to be able to do? What will be better or more efficient? Where could data quality improve, and where might it even be easier to handle risk calculations and adhere to data regulations? So I think the main thing is really communicating, in their terms, the advantages for them. Srini or Mark might have some thoughts.

Mark Cusack:

I would just add that in some cases it makes sense to eliminate some of those silos and consolidate some of that data where it makes sense, but you've really got to start, I think, by having a proper governance framework around all of this. Most of the successful companies I speak to have a central governance committee that puts data definitions and semantics in place, establishes proper data ownership and responsibilities, and then works to extract the most important data that's valued across the whole business, and proceeds in that way.

Srinivasan Mani:

Here's one perspective on that question. Eliminating silos – everyone has a practical challenge there. In the organization I work for, we have the same challenge: the longer we let legacy systems hang around, the more silos we build, and a single source of truth becomes a challenge. From the organizational standpoint, when we execute projects, the piece we always miss is that we leave something running for too long. When we try to decommission, I think we should be much harder about actually doing the decommissioning. When you build a new system to replace an old one, the old system should be gone as soon as we finish the project, or there should be a roadmap to take it down.

Srinivasan Mani:

Those things don't happen because of the various pressing priorities an organization is dealing with: I have to solve for something that's important before putting an end-of-road on another system that isn't required anymore. When we leave it like that, the silos are going to persist, and the practical question is where you would like to invest your resources and time – what brings more value to the table? That's what drives all those discussions, and the answer lies with the people who make those decisions. So most of the time it comes down to execution and prioritization; that's how silos get eliminated. Otherwise, technical solutions won't help here – whatever you bring in at the data virtualization layer or any of this, it's not going to help much. That's just my two cents on that.

Mark Cusack:

Good points.

Andrew Miller:

Terrific. I know that we're running a little long here, so I'll try to get maybe one more question in before the end of this webinar. This one I believe we'll start with Srini, but again, everyone can weigh in. The question is pretty straightforward: "How did you minimize the risk of migration?"

Srinivasan Mani:

Sorry, can you repeat that question once more?

Andrew Miller:

Yes. The question was, “How did you minimize the risk of migration?”

Srinivasan Mani:

The risk of migration is primarily about how you phase the rollout, and the approach we followed for this Yellowbrick implementation was a big-bang approach. Unfortunately, because of the way the data relationships work, we had a practical constraint: we could not run something on Netezza and something on Yellowbrick in parallel. Coexistence was not a solution for us because of the data dependencies. The migration risk was mitigated by how we compared the data between Netezza and Yellowbrick and how the baselines were built. The data baselines, and how they were compared, were key to our success; that's how we were able to get a proper baseline to compare against and make sure the Yellowbrick functionality completely matched the existing Netezza functionality.

Srinivasan Mani:

Plus, there were no data-driven problems due to this migration. We spent a lot of time making sure the processes were running as expected and that the data matched the baseline expectations. Once we had that comfort, the risk was essentially eliminated, I would say, in my view. The migration was done without any issues because enough testing was done to make sure the data matched the baselines. When you don't have a proper baseline to compare against and you attempt a large-scale migration, you're going to deal with a lot of migration risk. Those risks can only be eliminated when you have a baseline to compare, can certify completely against that baseline, and are convinced by those results.
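The baseline-comparison idea Srini describes can be sketched in a very simple form. This is an illustration only – the fingerprint scheme and the sample rows are our assumptions, not the actual tooling Zurich used:

```python
import hashlib

def table_fingerprint(rows):
    # Order-independent fingerprint: hash each row, combine with XOR, and
    # pair it with the row count so both content and volume must match.
    digest = 0
    for row in rows:
        h = hashlib.sha256(repr(sorted(row.items())).encode()).hexdigest()
        digest ^= int(h, 16)
    return len(rows), digest

# Hypothetical extracts of the same table from the old and new warehouse;
# row order may differ between systems, which the XOR makes irrelevant.
netezza_rows = [{"id": 1, "total": 10.5}, {"id": 2, "total": 7.0}]
yellowbrick_rows = [{"id": 2, "total": 7.0}, {"id": 1, "total": 10.5}]

# Certify the migration for this table only when both fingerprints match.
assert table_fingerprint(netezza_rows) == table_fingerprint(yellowbrick_rows)
print("baseline match")
```

In practice this kind of check would run per table, per load cycle, so that any divergence from the Netezza baseline surfaces before cutover rather than after.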

Andrew Miller:

All right, terrific. Well, we have come to the end of our time, so let me take a moment to thank our speakers today. We've heard from Dave Stodder with TDWI, Mark Cusack with Yellowbrick, and Srinivasan Mani with Zurich North America. I'd also like to thank Yellowbrick again for sponsoring today's webinar. Please remember that we recorded today's webinar and will be emailing you a link to the archived version of this presentation; feel free to share that with colleagues. And don't forget, if you'd like a copy of today's presentation, use the click here for a PDF link. Finally, I want to remind you that TDWI offers a wealth of information, including the latest research reports and webinars about BI, data warehousing, and a host of related topics, so I encourage you to tap into that expertise at tdwi.org. From all of us here today, thank you very much for attending. This concludes today's event.
