The power of data: a CFO perspective (part 1)

Dufrain’s Chief Financial Officer Alex Meakin shares his perspective on the power of data and turning spreadsheets into analytics.

“In an age of data proliferation, where seemingly everything that can be measured is captured, stored and analysed, understanding your data, the wider data universe in which it lives and what this means for your business has never been more important.” – Alex Meakin, Chief Financial Officer


If you are a CFO or wider Finance professional, am I telling you something that you didn’t know? Hopefully not (if I am, this may be the point to reflect on whether you picked the right career). The value of using data to make commercially informed decisions is not a new concept for the Finance community. Finance professionals have been pushing this agenda for many years, often being one of the key (if not the primary) data champions within a business. 

So, what’s changed and why should this matter to CFOs? There are clearly many factors, including the range of data sources, accessibility, speed of change, the ability to gain greater insight, regulation and what the competition is doing, amongst others. In this blog I’ll be touching on a few of these topics and reflecting on my own experiences.


The world runs off of Excel, and that’s just ridiculous…


When I hear this type of narrative, I find it quite irritating (and frankly I often switch off at this point). Not because I think there’s no truth in it (there is some), or because I’m a particularly staunch lover of Excel (I’m not). It’s just a gross oversimplification of the situation, often made to suit an agenda, and it shows little insight into what the most appropriate solution is for a given problem.

Clearly spreadsheets have a very strong use case in certain situations. Where a quick answer is required, or where you want to wireframe some calculations before you’re quite sure what the structure will be or how they should be modelled, a spreadsheet is a very flexible tool. After all, why do you need a sledgehammer to crack a nut?!

The answer to that is clearly linked to how many nuts you want to crack and how tough those nuts are (extend this metaphor much further and the audience of this blog becomes squirrels, so let’s move on). For simple use cases and/or non-repetitive tasks, Excel is ideal. I’m sure we’ve all seen examples of (and probably been guilty of) using a spreadsheet way beyond what it’s appropriate for, though…

  • The file has grown to many megabytes in size
  • There’s a spider’s web of linked files (none of which you’re sure have updated correctly)
  • Sheets run to hundreds of thousands of rows and columns, and the scroll bars have become invisible to the naked eye
  • A formula has inadvertently been changed to reference the wrong cell, screwing up your forecast (which you usually don’t discover until it’s too late)
  • Your team is taking an interminable amount of time to produce regular reporting, and/or it’s wrong
  • A key file has become corrupted and “needs rebuilding”
  • There’s only one person in your team who understands how “the spreadsheet” works…

…I could go on. 

We all know there are times when there’s probably a more appropriate solution, but we either haven’t got the bandwidth to slow down and invest the time in a new solution, or don’t really know what the best alternative is (again a function of the time available), especially against the backdrop of rapid technology evolution. So often, we muddle on.

But if we don’t want to just muddle on, and instead want to gain a competitive edge from our data (often by achieving sharper insight more quickly than the competition), what should we be considering?


Where is my data?


When looking at options for how we could use data more effectively, one of the fundamental questions is: “what data are we talking about?” For Finance teams a core component is clearly financial data, which may be derived from sources such as the accounting / ERP system, bank feeds, the billing system, the stock control system and a myriad of other supporting data sources. Layered on top of that comes non-financial data, which may include data from CRM systems, HR systems and a variety of internal operational systems. Beyond that there are potential external data sources and unstructured data (data held in files, images, documents, emails, etc.).

For structured data held within its source system there will be clearly defined relationships between the various data points. But what about data coming from different sources? There will often be relationships between data held in different systems too. Some of these may be clear and explicit (e.g. common IDs / keys shared between interfaced systems), others may not be explicitly defined but still exist. If you want to work with these combined datasets regularly, though, manually re-combining the data each time is laborious, error prone and not a good use of time; far better to hold the data somewhere those explicit and implicit relationships are already embedded.
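
As a minimal sketch of why that matters, the example below joins hypothetical extracts from an ERP and a CRM system on a shared customer ID (all names and figures are made up), turning a repeated manual reconciliation into a one-line join:

    import pandas as pd

    # Hypothetical extracts: invoices from the ERP, accounts from the CRM.
    invoices = pd.DataFrame({
        "customer_id": [101, 102, 101, 103],
        "invoice_value": [1200.0, 850.0, 430.0, 2100.0],
    })
    crm_accounts = pd.DataFrame({
        "customer_id": [101, 102, 103],
        "region": ["North", "South", "North"],
    })

    # An explicit relationship: both systems share customer_id as a key,
    # so the datasets can be joined rather than reconciled by hand.
    combined = invoices.merge(crm_accounts, on="customer_id", how="left")

    # Revenue by region: financial and non-financial data blended in one view.
    print(combined.groupby("region")["invoice_value"].sum())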

Data lakes, data warehouses, and data lakehouses

This is where the concepts of data lakes, data warehouses, and data lakehouses come in. Let’s break these concepts down: 

  • Data lake – a vast pool where you can land all of your data, regardless of format and structure, including structured output from your ERP system, unstructured social media mentions, or big data from sensors on a production line or fleet of vehicles. Data lakes are great for flexibility, scale, exploring your data and experimenting with AI & ML. However, be aware that this is raw data which needs wrangling before analysis.
  • Data warehouse – a highly curated space containing processed and highly relevant data. Warehouses are built for speed and structure, and are perfect for traditional BI, reporting and analysis. However, their rigidity means they can become costly to expand, change and maintain.
  • Data lakehouse – a hybrid of the previous two, which can host curated data like a warehouse as well as raw data like a lake. This offers the flexibility to enable any planned AI & ML use cases whilst also meeting your current performance needs, helping to future-proof your data landscape.
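
To make the lake / warehouse distinction concrete, here is a minimal sketch (column names and values are entirely hypothetical): the raw extract lands in the lake untouched, and a cleaned, typed version of the same data forms a curated warehouse table:

    import pandas as pd

    # Lake layer: the raw extract lands as-is, however messy (illustrative sample).
    raw = pd.DataFrame({
        "cust": ["101", "102", None],
        "amt": ["1200.00", "850.00", "430.00"],
    })

    # Warehouse layer: the same data conformed into a clean, typed, curated table.
    curated = (
        raw.dropna(subset=["cust"])  # enforce a mandatory key
           .rename(columns={"cust": "customer_id", "amt": "invoice_value"})
           .astype({"customer_id": "int64", "invoice_value": "float64"})
    )
    print(curated.dtypes)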

How do I trust my data?


So you’ve successfully stored and structured your data in a central repository, ready to be analysed, mined and interrogated. The value you derive from this data, though, is a function of its quality and how up to date it is. Much like herding cats, without the correct governance framework and controls in place your data can quickly become disorganised and lose the underlying relationships you were trying to derive value from (cue lots of spreadsheet reconciliations…).

There are, however, a number of techniques and controls you can implement to avoid this potentially massive headache, ensuring that your data maintains a high level of integrity and consistency, and that you continue to get the value you were hoping for from it.
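
As a simple illustration of what such controls can look like in practice (the rules and data here are made up), routine integrity checks can be expressed as code and run automatically every time data lands, rather than being discovered in a spreadsheet reconciliation later:

    import pandas as pd

    invoices = pd.DataFrame({
        "invoice_id": [1, 2, 2, 4],
        "customer_id": [101, None, 102, 999],
        "invoice_value": [1200.0, 850.0, -430.0, 2100.0],
    })
    known_customers = {101, 102, 103}

    # Routine integrity checks, run on every load.
    issues = {
        "duplicate_invoice_ids": int(invoices["invoice_id"].duplicated().sum()),
        "missing_customer_ids": int(invoices["customer_id"].isna().sum()),
        "negative_invoice_values": int((invoices["invoice_value"] < 0).sum()),
        "unknown_customer_ids": int(
            (~invoices["customer_id"].dropna().isin(known_customers)).sum()
        ),
    }
    for check, count in issues.items():
        print(f"{check}: {count} row(s) flagged")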

Other considerations include data security, ensuring that only those who should have access to the data can see it, and data protection. Clearly, if you are storing any kind of sensitive or personal data then data protection laws come into play (GDPR being one of the most prominent).

Applying an appropriate control framework means that these need not become significant overheads, but rather sensible best practice that protects both your data and the wider business.
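
As a toy illustration of the access side (the roles and fields are invented for the example), the principle is simply that each user’s view is filtered to the columns their role permits, so sensitive fields never reach those who shouldn’t see them:

    import pandas as pd

    # Invented example data: salary is the sensitive / personal field.
    payroll = pd.DataFrame({
        "employee_id": [1, 2],
        "department": ["Finance", "Sales"],
        "salary": [55000, 48000],
    })

    # Each role is permitted a defined set of columns.
    permissions = {
        "hr_admin": ["employee_id", "department", "salary"],
        "analyst": ["employee_id", "department"],  # no sensitive fields
    }

    def view_for(role: str) -> pd.DataFrame:
        """Return only the columns this role is allowed to see."""
        return payroll[permissions[role]]

    print(view_for("analyst"))  # the salary column is never exposed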


How do I interrogate my data?


At this point your data is stored, accurate and secure. Now it becomes a question of what you want to do with it. Clearly you want to report on that data, and also to use it to help you forecast and predict the future.

Leaving these points aside momentarily, having got to the point where all of your data is brought into one place, wouldn’t it be great to explore the data you’ve got and the relationships you may derive value from?

In terms of data exploration and data analytics there are a wide range of tools on the market you can use to combine, blend, shape, pivot and automate various views of your data.

For those happy working at a more programmatic level, you can use open-source query and high-level programming languages such as SQL or Python which, given the size of their communities and the proliferation of online training videos and articles, have never been more accessible to a wide audience than they are today.
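
For example, a few lines of Python and SQL (using the sqlite3 module that ships with Python; the table and figures are invented) are enough to ask a commercial question of a dataset directly:

    import sqlite3

    # An in-memory database standing in for a central data store.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE invoices (customer_id INTEGER, invoice_value REAL);
        INSERT INTO invoices VALUES (101, 1200.0), (102, 850.0), (101, 430.0);
    """)

    # A plain SQL query answers the question directly.
    query = "SELECT customer_id, SUM(invoice_value) FROM invoices GROUP BY customer_id"
    for customer_id, total in conn.execute(query):
        print(customer_id, total)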

Alternatively, there are low-code and no-code tools that allow users to analyse their data without having to write any significant amount of code. Whilst these are generally paid-for products, the learning curve is shallower than for the options previously mentioned and they are accessible to a wider range of users, thereby potentially shortening the time to insight.

Data visualisation tools, such as Power BI and Tableau, can also be very useful in data discovery, as they can connect to a wide array of data sources (including those not captured within a central data store) and allow you to create visual representations of your blended datasets. This can reveal relationships not immediately obvious in other forms of analysis.
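
Power BI and Tableau are point-and-click tools rather than code, but the underlying idea can be sketched in a few lines (the figures below are made up): plot one system’s data against another’s and a relationship may jump out that neither dataset shows on its own:

    import matplotlib.pyplot as plt

    # Made-up blended data: marketing spend (from the CRM) vs revenue (from finance).
    spend = [10, 25, 40, 55, 70]         # £k per region
    revenue = [120, 180, 260, 310, 330]  # £k per region

    plt.scatter(spend, revenue)
    plt.xlabel("Marketing spend (£k)")
    plt.ylabel("Revenue (£k)")
    plt.title("Blended view across two source systems")
    plt.show()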

An overarching point here, though, is that ideally you want to put the tools in the hands of the users who have the commercial understanding of the data being interrogated and will ask the pertinent questions (which will likely be iterative) to gain the greatest value. Often, this will depend on the skills and experience within your team. For data-savvy teams who are willing to go up a bit of a learning curve, this may be the team members themselves; alternatively they may rely on a separate data team, or a blend of the two. Horses for courses!


Ready for more?

To read more of Alex’s insights, continue to part 2.

For more information on unstructured data and powerful visualisations take a look at some of our dashboards here.