The proper use of data (data about team performance, data about customers, or data about the competition) can be a sort of force multiplier. It has the potential to dramatically help a business to scale. But sadly, many businesses have data but don’t know how to properly leverage it. What exactly is useful data? How can you properly utilize data? How can data help a business grow? To address this, we are talking to business leaders who can share stories from their experience about “How To Effectively Leverage Data To Take Your Company To The Next Level”.
Neil Carson is currently CEO and co-founder of Yellowbrick Data. He started the company in a garage in San Mateo, California, along with Mark Brinicombe. Under his leadership, the company built an award-winning SQL data platform, widely adopted by FIS Global, the US Navy, America Movil, Zurich North America, and other leading global financial services institutions in place of legacy data warehouse technology. Prior to Yellowbrick, Neil Carson was CTO of Fusion-io, Inc, a pioneer in the SSD (solid-state disk) industry.
Thank you so much for joining us in this interview series. Before we dive in, our readers would love to “get to know you” a bit better. Can you tell us a bit about your ‘backstory’ and how you got started?
I moved from England to California to work for Oracle back in 1998 along with my friend Mark, who is also the co-founder of Yellowbrick. We’ve now been building software together for around 30 years. We met during university times in the UK. My main talent was the languages of humankind, but I was one of the people who always had a knack for computers. I started programming when I was nine years old and was quite good at it. I was monetizing shareware by the time I was 13. For me, building complex software is explaining to a very simple entity (the computer) what to do by decomposing problems. I scraped through my degree at the bottom of the class, despite acing the programming classes, because I was spending my time on entrepreneurial pursuits instead of studying.
The idea for Yellowbrick came while at Fusion-io; as the CTO (Chief Travelling Officer) I was spending a disproportionate amount of time on the road visiting customers and prospects, fulfilling the role of head product manager, field CTO and evangelist. A lot of partners had come to us wanting to benchmark their SQL data warehouses on our fast SSD storage, but the benefits were marginal. We realized it was possible to produce a much higher-performing data warehouse at a far lower cost by making use of high-performance SSD storage and bypassing main memory. This patented architecture is the core of Yellowbrick’s secret sauce.
It has been said that sometimes our mistakes can be our greatest teachers. Can you share a story about a humorous mistake you made when you were first starting and the lesson you learned from that?
One of the funniest moments has to be learning how to manage and work with a start-up Board of Directors. There’s a natural tension between company (common) directors and the investor (preferred) directors.
Rapport and mutual trust take time to build. A good Board doesn’t just always say Yes but gives advice on strategy and pushes entrepreneurs to do better. There’s no shame in asking for help and sometimes there can be controversy or a difference of opinion in the room.
I failed to appreciate all of these dynamics during our first board meeting. I wore a brand new T-shirt from artist Plastered8, which I thought was quite funny — a cartoon of a man in a Chinese fortune cat outfit with giant dollar sunglasses, smoking a cigarette and proudly raising its middle finger. I stood up and presented to the newly formed Board of Directors, with the bright red cat front and centre. This wasn’t the best way of building a new relationship; some background phone calls between Directors asked if I was “giving them the flip now we got the money!”. Despite that setback, many years later, we all trust and appreciate each other.
Leadership often entails making difficult decisions or hard choices between two apparently good paths. Can you share a story with us about a hard decision or choice you had to make as a leader?
There have been several. As a leader, when you feel a department or project is on the wrong path, it almost certainly is, even though there’s often an instinctive belief that things can be rectified given time, and surely my own bad decisions didn’t lead to this path? This has happened a couple of times in the history of Yellowbrick, both with departments and projects. The hard choice is always “take time to build and develop teams and people, encourage them to get them where they need to go” or “accept that the people you have aren’t the people you need now, eat some humble pie that you should have realised that earlier, and rebuild the department or project.” In a larger company, you can take lots of time to follow the former path. In a smaller company where time is of the essence, you have to take the latter path: At small scale, it’s almost always quicker to rebuild than to get there incrementally. It’s been a hard lesson for me to learn, and has broken many close relationships, but has been a core factor in my personal growth.
One such decision was rebuilding our Kubernetes-based cloud platform from scratch. We’d been on a path with an experienced team I’d hired for a couple of years, but the architecture was needlessly complex and it caused reliability issues at customers. At some point we had to cut bait and rebuild the thing. One year later, we had a different team composed of engineers already in the company, and built a more reliable, easier to use product with 80% less code.
Are you working on any new, exciting projects now? How do you think that might help people?
Absolutely! We’re talking about data and data-driven organisations here. Data projects have to have an ROI — what’s the business value of the new insights compared with the cost of building them? Normally an enterprise-wide data warehouse is a shared resource, used by many projects, departments or lines of business. Historically it’s been really hard to allocate the costs of the data preparation, ETL, queries, reports and even AI assistants that comprise a modern data project. We’re building a new feature that does exactly this — allowing all costs, from queries to software licenses to infrastructure spend, to be surfaced, understood and passed on to lines of business, even when they might be sharing the same data models and schemas.
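As a rough illustration of the cost-allocation idea described above, the sketch below splits one shared monthly bill across lines of business in proportion to the compute time each consumed. It is not Yellowbrick's implementation: the function name `allocate_costs`, the record fields `line_of_business` and `compute_seconds`, and all figures are assumptions made up for this example.

```python
from collections import defaultdict


def allocate_costs(usage_records, shared_cost):
    """Split a shared cost across lines of business by share of compute-seconds.

    usage_records: iterable of dicts with hypothetical keys
    "line_of_business" and "compute_seconds".
    """
    seconds_by_lob = defaultdict(float)
    for record in usage_records:
        seconds_by_lob[record["line_of_business"]] += record["compute_seconds"]

    total_seconds = sum(seconds_by_lob.values())
    if total_seconds == 0:
        # Nothing was used, so nothing is charged back.
        return {lob: 0.0 for lob in seconds_by_lob}

    return {
        lob: round(shared_cost * seconds / total_seconds, 2)
        for lob, seconds in seconds_by_lob.items()
    }


# Made-up usage log: two departments querying the same shared warehouse.
usage = [
    {"line_of_business": "marketing", "compute_seconds": 1200.0},
    {"line_of_business": "finance", "compute_seconds": 600.0},
    {"line_of_business": "marketing", "compute_seconds": 200.0},
]
charges = allocate_costs(usage, shared_cost=10_000.0)
print(charges)
```

A real chargeback model would likely weigh storage, licenses and data preparation as well as query time; proportional allocation by compute is simply the easiest starting point to reason about.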
You are a successful business leader. Which three character traits do you think were most instrumental to your success? Can you please share a story or example for each?
- Persistence — never give up. All businesses have ups and downs. There have been a couple of times in our history where seismic shifts have taken place — the move to the cloud and remote work during Covid, for example, which essentially halted or killed our on-premises business. If we had stayed the course, we’d have gone out of business. You alter course, drive change, and eventually get to where you need to be. You see the same trait in the leaders who went on to build other great, long-lasting technology companies like Apple, Nvidia or Oracle.
- Pragmatic humility — we’re all learning and growing. Yellowbrick is doing things that have never been done before, and we’re bound to make mistakes along the way. Admitting to the whole team where you’ve made mistakes encourages others to do so, as well, and that way we all learn. At the same time, you’ve always got to have a direction that people believe in, a hill to climb, and you always need to be executing instead of standing still. The collective effort and outcome is more important than my own personal achievements.
- Stay hands-on — the product vision comes from founders, and it’s difficult for others to embrace and imagine what’s possible the same way. This is how we built our direct data accelerator technology, how we were the first product to put a SQL user interface on top of Kubernetes, and how we drive other innovations. It’s important to stay involved in product development and technical decisions as well as meeting customers and prospects for inspiration.
Thank you for all that. Let’s now turn to the main focus of our discussion about empowering organizations to be more “data-driven.” For the benefit of our readers, can you help explain what it looks like to use data to make decisions?
We’re in the position of building a product that enables companies to make faster and more accurate decisions with more data, more often. Obviously, we use our own technology internally to make decisions too! Large scale examples within our customer base include financial allocation — credit card companies predicting ahead of time how much cash to keep on-hand for members to exercise reward points vs. investing the cash at a higher yield; infrastructure optimization — telcos deciding where to place new cell towers and where to open new stores; inventory optimization — retailers deciding which products to place on which shelves depending on the weather forecast or size of parking lot; customer loyalty programmes; etc. Internally at Yellowbrick we manically embrace automation and data capture within our product development group, making sure that the results of all tests are continually analysed and optimised.
Based on your experience, which companies can most benefit from tools that empower data collaboration?
Almost all businesses that have any need for efficiencies or scale. Data collaboration is the foundation of modern enterprise decision-making. At Yellowbrick, we see this firsthand — our customers, from financial institutions to telcos and retailers, rely on seamless, high-performance data insights to drive critical decisions. Internally, we embrace a data-driven mindset, automating analytics across our teams to improve efficiency and innovation — from sales to customer support to product development to financial planning and operations to demand generation and marketing. Literally every function benefits from data. The companies that win today are those that remove data silos, enabling their teams to collaborate in real time, securely, and at scale.
Can you share some examples of how data analytics and data collaboration can help to improve operations, processes, and customer experiences? We’d love to hear some stories if possible.
Modernising the data infrastructure can yield massive operational and process benefits.
Just last week, I met with a customer who struggled with managing 3,000 disparate SQL Server databases spread across 75 servers. The databases were accessed through end-user mobile applications that wanted interactive response times, but the larger databases were unable to answer questions fast enough. We helped them modernize their data platform and architecture, moving everything into one database which not only dramatically simplified management with a centralized data model but also allowed them to query all data simultaneously. Queries became universally interactive, reports ran in seconds instead of hours, automation eliminated the need for manual database maintenance and scaling became seamless — rather than provisioning and balancing new servers, they could simply add more compute or storage when needed. The punchline here was they saved $10M over 5 years in operating expenses while improving their time to insight and end-user satisfaction.
From your vantage point, has the shift toward becoming more data-driven been challenging for some teams or organizations? What are the challenges? How can organizations solve these challenges?
It’s a never-ending challenge. The challenges now are much the same as always — poor data quality, data silos, resistance to change (often individuals would sooner make decisions based on experience than on data), data wrangling skills shortages, and sometimes even over-analysis — it’s almost always possible to manipulate analysis to support your point of view, just as with statistics.
Ok. Thank you. Here is the primary question of our discussion. Based on your experience and success, what are “Five Ways a Company Can Effectively Leverage Data to Take It To The Next Level”? Please share a story or an example for each.
Note that these aren’t in any particular priority order, and we’ve seen so many complex migration and new business analytics projects that the list is nowhere near exhaustive:
- Don’t create data silos. The first choice should be to consolidate and integrate data in one place. This isn’t always possible when data sovereignty and residency rules come into play, but it eliminates data duplication, increases data quality and opens access to more users across the business. For example, data is far easier to join and correlate when it is in a single database instance.
- Make data platform technology choices that are fit for purpose without technology proliferation. One technology doesn’t solve all the problems, but having 30 relational databases, 5 graph databases, vector stores and multiple ETL and data transport tools doesn’t make sense. Most valuable data is structured, and architects are far too keen to deploy large complex application stacks to solve problems that can be simply accomplished with a good database, open source data wrangling tool, ELT and IDE.
- Get some minimum viable data governance in place. This doesn’t have to be heavyweight, but adopt consistent naming and data type conventions and agree on primary keys and data quality metrics across business domains. Adopt enterprise standards for access control to data. This will make it much easier to integrate data across the business. This is primarily a people and process problem to address rather than something you can just throw technology at.
- Take data architecture seriously, using star and snowflake schemas and strict typing in preference to big wide tables, since they will be more future-proof, more extensible and more accurate when working with financial data. Semi-structured data has a place but should be used for arbitrary extensibility rather than core data due to integrity issues. Consider leveraging a “data product” architecture, where snapshots of each business consumer’s data sets can be copied into individual sandboxes for experimentation and discovery of new insights to encourage creativity and formulation of new business ideas.
- Consider query cost, response time or SLA requirements and future scalability ahead of time, rather than retroactively. Divide your workloads between “base” and “spikes.” Base workloads are typically continuously running data load, processing and query activities. In many cases it makes sense to run these on reserved cloud instances or even in a private cloud setting. Spike workloads are occasional in nature; these are well suited to public cloud if your on-prem infrastructure doesn’t have spare capacity. Typically, you will want to “own the base and rent the spike” when it comes to infrastructure.
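The “own the base and rent the spike” rule from the last point can be made concrete with a little arithmetic. The sketch below is a minimal illustration, assuming a made-up hourly demand profile and made-up unit prices; the function name `own_base_rent_spike_cost` and its parameters are hypothetical and do not come from any vendor's pricing model.

```python
def own_base_rent_spike_cost(hourly_demand, base_capacity, reserved_rate, on_demand_rate):
    """Total cost of reserving base_capacity units for every hour and
    renting any demand above that level at the on-demand rate."""
    # Reserved (owned) capacity is paid for every hour, whether used or not.
    reserved_cost = base_capacity * reserved_rate * len(hourly_demand)
    # Only the units above the reserved base are rented.
    spike_units = sum(max(0, demand - base_capacity) for demand in hourly_demand)
    return reserved_cost + spike_units * on_demand_rate


# Made-up profile: a steady base of about 4 compute units with one spike to 12.
demand = [4, 4, 5, 4, 12, 4]

all_on_demand = own_base_rent_spike_cost(demand, 0, reserved_rate=1.0, on_demand_rate=2.0)
all_reserved = own_base_rent_spike_cost(demand, max(demand), reserved_rate=1.0, on_demand_rate=2.0)
mixed = own_base_rent_spike_cost(demand, 4, reserved_rate=1.0, on_demand_rate=2.0)

print(all_on_demand, all_reserved, mixed)
```

With these illustrative numbers, reserving only the steady base and renting the excess is cheaper than either extreme. The real break-even point depends on how often spikes occur and on the actual gap between reserved and on-demand pricing.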
Based on your experience, how do you think the need for data might evolve and change over the next five years?
A good leading indicator would be the stack of letters I have on my desk at home informing me that my personal data was stolen. Late last year, data warehouses stored in a well known cloud data warehouse provider were breached, leading to personal data theft from AT&T, Santander, Ticketmaster, my healthcare provider and others. If, as Clive Humby stated, “data is the new oil,” we have to get a lot better at protecting it, understanding who has access to it and the value of the insights generated from it. Enterprises have got into the habit of throwing customer data into SaaS platforms and sticking with the usual dismissive narrative of “I trust ’s security much more than ”. Misconfiguration in cloud deployments is so common. It’s trivial to grant public access to services. Just a couple of clicks and all the data stored in an S3 bucket appears on the internet for everyone to access. The famous verifications.io data breach happened because of a MongoDB database with 763,000,000 email addresses sitting on the internet with no password. Netflix, TD Bank and Ford had data stolen due to a public bucket at a data services provider. The lists of cloud breaches go on and on.
AI is now driving a reevaluation of not just the value of data, but the value of insights that can be gained from it, and who has access. This reevaluation intersects with increasing requirements around data residency and borders. All of which make the case for distributed data architecture along with a resurgence in interest in on-premises data warehouse modernisation. Furthermore, as AI starts doing the work of data analysts and we can create them virtually, the cost of querying and analysing the data will become top of mind.
If you could inspire a movement that would bring the most amount of good to most people, what would that be?
I believe that the most important problem of our time, and the cause of most issues in society, is inequality. Starting a movement to recognize this as the root cause of so much anger and divisiveness in society and proactively suggest and drive ideas to address it would perhaps do the most amount of good, especially if applied internationally. Much of the world still suffers, even children, yet so much excess wealth is concentrated in the hands of so few. Raising and debating the issues in the context of bringing good to most people can yield more peaceful solutions than class conflict.
How can our readers further follow your work?
You can get the latest news about Yellowbrick at https://yellowbrick.com and my personal blog at https://neilcarson.me.
Thank you so much for sharing these important insights. We wish you continued success and good health!