The curious case of the mediaeval hippo

Why you need the right data foundation for AI and ML

In line with Moore’s law, the observation that the number of transistors on a chip doubles roughly every two years, computing power keeps growing, and big data and cloud data capabilities are rapidly evolving with it. The traditional manual process of creating and analysing spreadsheets of data is giving way to machine learning (ML) and artificial intelligence (AI), providing businesses with new and more efficient ways to manage information.

In this modern working environment, machines do the hard work and insightful data to drive better business decisions is just a click away. So, it’s easy to see why ML and AI have become the new buzzwords in boardrooms across the world, especially when they bring things like:

  • Predictive customer purchasing forecasts
  • Marketing insights
  • Resource efficiencies
  • Cost reductions

But first, let’s define the two terms.

Artificial intelligence

AI is the theory and development of computer systems that can perform tasks which normally require human intelligence. This includes things like visual perception, speech recognition, decision making and translating languages.

Machine learning

ML is a process that uses mathematical data models to help a computer learn without direct instruction. A subset of AI, ML uses algorithms to identify patterns within data. Those patterns are then used to create a data model that can make predictions.
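To make that a little more concrete, here is a minimal sketch of the idea: an algorithm finds a pattern in past data (in this case a straight line) and the resulting model is used to make a prediction. The spend and sales figures are made up purely for illustration, and a real project would normally use an established ML library rather than hand-written maths.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Use the fitted line to estimate y for a new x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: monthly marketing spend (in £k) and units sold.
spend = [1, 2, 3, 4, 5]
units = [12, 19, 31, 42, 48]

model = fit_line(spend, units)   # the "learning" step
forecast = predict(model, 6)     # the "prediction" step
print(f"Forecast units for a spend of 6: {forecast:.1f}")
```

The model never receives a rule such as "more spend means more sales". It infers that pattern from the examples, which is why the quality of those examples matters so much.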


Introducing the mediaeval hippo

While many modern businesses now consider ML and AI to be silver bullets to solve all their challenges, it’s important to emphasise that having the right foundations in place is vital. Otherwise, you might find yourself with a mediaeval hippo scenario.

At this point, you’re probably wondering what on earth a mediaeval hippo has to do with ML and AI. To answer this, we’d like to take you way back in time to circa 500 AD. A time of adventure and discovery when people across Britain were fascinated by new and exotic creatures.

In many ways, the artists of this era carried out the tasks that ML and AI fulfil today. They would often be given a second or third hand description of a place, object or event, and based on a set of assumptions and criteria, they would convert this information into a visual representation. From designing new maps to depicting new worlds or the latest inventions, it was down to these artists to make such concepts a reality on the page.

This might sound straightforward, but there’s always scope for error when information is passed on, especially when it’s passed on multiple times, and if the correct foundations aren’t in place, things can go very wrong very quickly.

Consider this description that may have been given to a mediaeval artist:

“A big herbivore that spends more time in the river than it does on land, although it can’t breathe underwater. It has big tusks, greyish skin and small legs. Also referred to as the elephant of the Nile.”

Now try to imagine you’re living in mediaeval times, and you have no idea what a hippo looks like. Using your knowledge of elephants and fish – you’ve seen plenty of pictures of those already – you probably wouldn’t be surprised to see this artist’s interpretation:

[Image: a mediaeval artist’s interpretation of a hippo]

To be fair, the image ticks all the boxes of the description and in this respect, there’s nothing to say it’s wrong. Of course, this isn’t a hippo by any stretch of the imagination. 

The main problems were:

  • No quality check on the source information
  • No review of potential historic bias
  • No checks to locate additional or missing data
  • No way to gauge whether the artist was inferring a false correlation

Applying the mediaeval hippo analogy to ML and AI

Before you dive into an ML and AI programme, you must carefully consider the following data aspects:

Data quality

The analysis and output of your data is only as good as the quality of the data you input. You might have the best algorithms in the world, but if your source data isn’t up to scratch, you’re not going to gain great insights from it.
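As a rough illustration of what a basic quality check might involve, the sketch below counts three common problems in a small set of records: missing required fields, values outside a plausible range, and exact duplicates. The field names, the age range and the sample customers are all hypothetical, and real data quality tooling goes much further than this.

```python
def quality_report(rows, required, ranges):
    """Count basic quality problems in a list of dict records."""
    report = {"missing": 0, "out_of_range": 0, "duplicates": 0}
    seen = set()
    for row in rows:
        # A required field that is absent, None or blank counts as missing.
        if any(row.get(field) in (None, "") for field in required):
            report["missing"] += 1
        # A numeric field outside its allowed range counts as out of range.
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if isinstance(value, (int, float)) and not low <= value <= high:
                report["out_of_range"] += 1
        # An exact repeat of an earlier record counts as a duplicate.
        key = tuple(sorted(row.items()))
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
    return report


# Hypothetical customer records with one problem of each kind.
customers = [
    {"id": 1, "age": 34, "email": "a@example.com"},
    {"id": 2, "age": 210, "email": "b@example.com"},  # implausible age
    {"id": 3, "age": None, "email": ""},              # missing values
    {"id": 1, "age": 34, "email": "a@example.com"},   # duplicate of the first
]

report = quality_report(customers, required=["age", "email"], ranges={"age": (0, 120)})
print(report)
```

Running checks like these before any modelling starts is the equivalent of asking the mediaeval artist’s source whether they had actually seen a hippo.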

Data bias

AI often looks for patterns in data and uses previous knowledge to decide how to interpret this information. So, if AI is fed certain information with a certain result, it will use this to predict future results.

Let’s use AI for staff selection as an example. Imagine you have an AI system for recruitment, which is fed historical CVs along with interview results. It uses that information to predict who you should hire. But the system may not consider changes in demographics, changes in job requirements, or the long-term performance of the people hired. This could lead to the wrong people being suggested for new roles. 

In addition, some AI systems automatically set aside records that the model scores as very likely, or very unlikely, to belong to a particular class, so that people only need to review the uncertain cases. A sample of the automatically handled records should still be verified manually to reduce potential bias.
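One way to picture that manual verification is the sketch below. It assumes each record comes with a predicted probability: uncertain records always go to a person, and a random sample of the confident ones is held back for auditing too. The thresholds, the audit rate and the CV identifiers are illustrative assumptions, not recommended values.

```python
import random


def route_predictions(scored, low=0.1, high=0.9, audit_rate=0.05, seed=0):
    """Split (item, probability) pairs into auto-decided and human-review queues."""
    rng = random.Random(seed)  # seeded so the audit sample is repeatable
    auto, review = [], []
    for item, prob in scored:
        confident = prob <= low or prob >= high
        # Uncertain items always go to a person. Confident items are decided
        # automatically, except for a random sample kept back for auditing.
        if not confident or rng.random() < audit_rate:
            review.append(item)
        else:
            auto.append(item)
    return auto, review


# Hypothetical CVs with the model's predicted probability of "invite to interview".
scored = [("cv-1", 0.97), ("cv-2", 0.55), ("cv-3", 0.03)]

auto, review = route_predictions(scored, audit_rate=0.0)
print("auto-decided:", auto)
print("human review:", review)
```

With the audit rate set to zero, as in the usage example, nobody ever looks at the confident decisions, which is exactly the situation in which a biased model can go unnoticed.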

Data accuracy

Many models require all data to be available, or they assume that it is. Missing values therefore need to be imputed with meaningful values. If this doesn’t happen, the AI may draw false conclusions.
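Filling those gaps is often called imputation. The sketch below shows one simple approach, assuming a numeric column where missing entries are stored as None: each gap is replaced with the median of the observed values, and a flag records which entries were filled in. The median is only one possible choice, and the ages are made up for illustration.

```python
from statistics import median


def impute_median(values):
    """Replace None with the median of the observed values.

    Returns the filled list plus a parallel list of flags marking which
    entries were imputed, so that information is not lost.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    filled = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return filled, was_missing


# Hypothetical customer ages with two gaps.
ages = [34, None, 29, 41, None]
filled, was_missing = impute_median(ages)
print(filled)
print(was_missing)
```

Keeping the flags matters because the fact that a value was missing can itself be informative, and it lets you check later how much of the model’s input was estimated rather than observed.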

Data interpretation

It’s important to understand that ‘correlation’ and ‘causation’ are not the same thing. Although the hippo was described in a similar way to an elephant and a fish, that doesn’t make it half-elephant/half-fish.
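A small numerical sketch can show how easily this happens. In the made-up data below, ice cream sales and sunburn cases are both calculated from the temperature alone, so neither causes the other, yet the two series are perfectly correlated. All figures and formulas are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical daily temperatures, and two quantities that depend only on them.
temperature = [12, 15, 19, 23, 27, 30]
ice_cream_sales = [20 + 3 * t for t in temperature]
sunburn_cases = [2 * t - 10 for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"Correlation between ice cream sales and sunburn: {r:.2f}")
```

A model given only the last two columns could happily conclude that ice cream causes sunburn. It takes domain knowledge, and the missing temperature data, to see what is really going on.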

Accurate data makes a huge difference

Let’s attempt to draw the hippo again using a different data framework. Based on countless sources, here’s the definitive description of a hippo.

[Image: a poster displaying a hippo in blue and white]

These high-quality data inputs are more likely to generate this accurate representation of a hippo.


Dufrain can prepare your business for AI and ML

Dufrain specialises in providing businesses like yours with the right data infrastructure foundations, upon which you can then build accurate ML and AI solutions.

These are the six essential steps on your journey to becoming ML and AI-ready:

1. Create a data strategy

2. Control your data

3. Data architecture

4. Data management

5. Data quality

6. Data governance

If you’re looking to implement ML and AI solutions, our data consultants can help you to build a solid foundation and support you at every stage.

Do you need support with ML and AI solutions?

Our data experts would love to hear from you. Contact us at https://www.dufrain.co.uk/contact/ or call us on 0800 130 3656.