About AI Beyond Blind Trust
Dufrain’s AI Beyond Blind Trust series explores the realities of modern AI, cutting through hype to show how explainability, governance, and strong data foundations build real confidence in AI-driven decisions.
As we prepare to launch AI Beyond Blind Trust, our new series of insights and opinions, Dufrain’s Data & AI team are exploring the foundations of trustworthy AI, starting with the basics of supervised machine learning.
Supervised Machine Learning: What It Is and Why It Still Matters in the Age of GenAI
In a world obsessed with generative AI, it’s easy to forget the power of traditional machine learning. While large language models dominate headlines, the methods that quietly predict fraud, optimise marketing, recommend products, and assess credit risk all stem from a field that’s been evolving for decades – supervised machine learning (ML).
“There is often a temptation to default to using a large language model, when actually a supervised machine learning model trained for a very specific job would be a much better choice. Choosing the right tool for the job is one of the most important decisions in building trustworthy AI.” Isobel Daley, Head of AI
At Dufrain, we recently ran one of our Data Science & AI fortnightly learning sessions exploring this very topic, breaking down what supervised ML actually is, how it works, and why it still underpins so many of the systems that make AI practical, measurable, and explainable.
What is supervised machine learning?
At its core, supervised ML is about teaching algorithms using labelled data: datasets where both the input and the correct output are known.
If you’re trying to predict house prices, for example, you might feed a model data about square footage, number of bedrooms, and location (your features), along with the actual sale price (your label). The algorithm learns patterns that link features to labels so it can later predict the price of a house it hasn’t seen before.
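To make the features-and-labels idea concrete, here is a minimal sketch that fits a straight line to a single feature (square footage) using ordinary least squares. The figures are invented for illustration, and a real project would normally use several features and an established ML library rather than hand-written maths.

```python
# Hypothetical training data: square footage (feature) and sale price (label).
square_feet = [800, 1000, 1200, 1500, 2000]
sale_price = [160_000, 200_000, 240_000, 300_000, 400_000]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the pattern linking the feature to the label.
slope, intercept = fit_line(square_feet, sale_price)


def predict_price(sqft):
    """Predict the price of a house the model has not seen before."""
    return slope * sqft + intercept


predicted = predict_price(1800)
```

Because the invented prices are exactly 200 per square foot, the fitted line recovers that relationship and can price an 1,800 square foot house that never appeared in the training data.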
There are two key types of supervised ML tasks:
- Regression – predicting a continuous number (like price or forecasted revenue).
- Classification – predicting a category (for example, whether an email is spam or not).
As one of our data scientists explained during the session:
“With regression you’re predicting a number; with classification you’re predicting a category. But both are about using labelled examples to help a model learn what to expect next.”
Algorithms made simple: Logistic regression, decision trees, and random forests
The session explored three of the most common and explainable algorithms in supervised ML, ones that combine power with transparency.
Logistic regression is often used to predict probabilities. Imagine you’re trying to determine whether it’ll rain today. The model considers features like cloud cover and humidity, then calculates the likelihood of rain. If the probability exceeds a threshold (say, 50%), the answer is “yes.” It’s simple, powerful and, crucially, explainable.
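As a rough sketch of the rain example, the snippet below turns two features into a probability and then applies the 50% threshold. The weights and bias are hypothetical values chosen for illustration; in practice they would be learned from labelled historical data.

```python
import math

# Hypothetical weights a trained model might have learned (not real values).
WEIGHTS = {"cloud_cover": 3.0, "humidity": 2.0}
BIAS = -3.0


def rain_probability(cloud_cover, humidity):
    """Weighted sum of the features, squashed into 0..1 by the sigmoid."""
    score = (
        BIAS
        + WEIGHTS["cloud_cover"] * cloud_cover
        + WEIGHTS["humidity"] * humidity
    )
    return 1.0 / (1.0 + math.exp(-score))


def will_rain(cloud_cover, humidity, threshold=0.5):
    """Answer "yes" only if the probability exceeds the threshold."""
    return rain_probability(cloud_cover, humidity) > threshold


p_overcast = rain_probability(cloud_cover=0.9, humidity=0.8)  # score = 1.3
p_clear = rain_probability(cloud_cover=0.1, humidity=0.3)     # score = -2.1
```

The explainability comes from the weights themselves: each one shows the direction and relative strength of a feature’s influence on the prediction.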
Decision trees take a more visual approach. They resemble a flowchart, splitting data at each node with a yes/no question until they reach an outcome. For instance:
Is the sky cloudy? → Yes.
Is there rain in the forecast? → Yes.
Result: Bring an umbrella.
The algorithm chooses each split to separate the outcomes as cleanly as possible, so that it reaches the right answer in as few questions as it can.
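One common way to score a candidate split is Gini impurity, which measures how mixed the outcomes are on each side of a question. The sketch below assumes that criterion (the session did not name a specific one) and uses a handful of invented observations to compare the two umbrella questions.

```python
# Hypothetical observations: (is_cloudy, rain_forecast, needed_umbrella)
days = [
    (True, True, True),
    (True, True, True),
    (True, False, False),
    (False, True, True),
    (False, False, False),
    (False, False, False),
]


def gini(labels):
    """Gini impurity of a list of booleans; 0.0 means perfectly pure."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def split_impurity(rows, feature_index):
    """Weighted Gini impurity after asking a yes/no question on one feature."""
    yes = [row[-1] for row in rows if row[feature_index]]
    no = [row[-1] for row in rows if not row[feature_index]]
    n = len(rows)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)


questions = {"Is the sky cloudy?": 0, "Is there rain in the forecast?": 1}
impurities = {q: split_impurity(days, i) for q, i in questions.items()}

# The tree asks the question that leaves the least mixed groups behind.
best_question = min(impurities, key=impurities.get)
```

On this toy data the forecast question separates umbrella days perfectly, so it would be chosen as the first split.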
Then there’s the random forest, which expands on this concept. Instead of one tree, it builds many, each trained on a slightly different sample of the data. Every tree makes a prediction, and the final output is determined by a “majority vote” (or, for regression tasks, by averaging the trees’ predictions). As one of the team put it:
“It’s the wisdom of the crowd: lots of trees making slightly different predictions that, when combined, give a much stronger and more reliable result.”
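The voting step on its own can be sketched in a few lines. The three “trees” below are hand-written stand-ins rather than trained models, and the feature names are assumptions made for the example; a real random forest would train each tree on a different bootstrap sample of the data.

```python
from collections import Counter


# Stand-ins for trained trees: each is a hypothetical hand-written rule.
def tree_a(day):
    return day["cloudy"] and day["rain_forecast"]


def tree_b(day):
    return day["rain_forecast"]


def tree_c(day):
    return day["humidity"] > 0.7


forest = [tree_a, tree_b, tree_c]


def forest_predict(day):
    """Collect one vote per tree and return the most common answer."""
    votes = Counter(tree(day) for tree in forest)
    return votes.most_common(1)[0][0]


humid_forecast_day = {"cloudy": False, "rain_forecast": True, "humidity": 0.8}
humid_dry_day = {"cloudy": True, "rain_forecast": False, "humidity": 0.9}
```

For `humid_forecast_day`, two of the three trees vote for an umbrella, so the forest says yes even though one tree disagrees. An odd number of trees avoids ties in a two-class problem.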
Evaluating performance: Accuracy isn’t everything
Training a model is one thing. Knowing whether it actually works is another.
Supervised ML relies on a train/test split: training the model on one portion of the data and then testing it on unseen data to check whether it generalises well. That’s where the confusion matrix comes in, a simple but powerful way to visualise what the model got right and wrong.
The matrix shows:
- True Positives (TP): predicted correctly as positive
- False Positives (FP): predicted positive but actually negative
- True Negatives (TN): predicted correctly as negative
- False Negatives (FN): predicted negative but actually positive
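The four cells can be counted directly from two lists. This sketch assumes the model has already produced predictions for the held-out test rows; the labels themselves are invented.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN and FN for paired true/predicted boolean labels."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, guess in zip(actual, predicted):
        if guess and truth:
            counts["TP"] += 1   # predicted positive, actually positive
        elif guess and not truth:
            counts["FP"] += 1   # predicted positive, actually negative
        elif not guess and not truth:
            counts["TN"] += 1   # predicted negative, actually negative
        else:
            counts["FN"] += 1   # predicted negative, actually positive
    return counts


# Hypothetical test-set labels and the model's predictions for the same rows.
actual = [True, False, True, True, False, False, True, False]
predicted = [True, False, False, True, True, False, True, False]

counts = confusion_counts(actual, predicted)
```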
From here, we derive metrics like accuracy, precision, recall, and the F1 score. Accuracy is intuitive but not always useful. In fraud detection, for example, 99.9% of transactions might be legitimate, so a model that predicts “not fraud” every time could still claim 99.9% accuracy. Yet it would never catch a single fraudulent transaction.
That’s why precision and recall matter more in many business cases.
- Precision asks: When the model predicts “yes,” how often is it right?
- Recall asks: Of all the true “yes” cases, how many did we actually catch?
Balancing those two, for example with the F1 score (their harmonic mean), gives a more meaningful measure of real-world performance, particularly when missing a positive case carries real risk.
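The fraud example is easy to check with arithmetic. The sketch below computes the four metrics from confusion-matrix counts for two cases: the “always not fraud” model, and a hypothetical model that flags 200 transactions and catches 80 of 100 frauds. All counts are invented, and treating an undefined ratio as 0.0 is a simplification for this illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Treat an undefined ratio (0/0) as 0.0 for this illustration.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# 100,000 transactions, of which 100 (0.1%) are fraudulent.
always_not_fraud = classification_metrics(tp=0, fp=0, tn=99_900, fn=100)

# Hypothetical model: flags 200 transactions, catching 80 of the 100 frauds.
flagging_model = classification_metrics(tp=80, fp=120, tn=99_780, fn=20)
```

The first model scores 99.9% accuracy with zero recall. The second has slightly lower accuracy (99.86%) but catches 80% of the fraud, which is exactly the difference accuracy alone hides.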
The real-world challenges: Data quality, governance, and explainability
One of the most valuable parts of the discussion came from the team’s collective experience deploying models in real environments.
A Dufrainian added that real datasets are rarely clean. Category names are inconsistent, values are missing, and codes vary between systems.
As Isobel Daley explained:
“That translation from text to numbers is often a significant cleaning step. Inconsistent or duplicate labels can completely derail training. Having consistent reference data and validation at source makes an enormous difference.”
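A small sketch of that text-to-numbers step, assuming some invented product labels as they might arrive from different systems. The normalisation rules here are illustrative only; real reference data would define the canonical values.

```python
# Hypothetical raw category values from different source systems.
raw_categories = [
    "Mortgage", " mortgage", "MORTGAGE ", "Credit Card", "credit-card", None,
]


def normalise(value):
    """Map a raw label to a canonical form, or None if it is missing."""
    if value is None or not value.strip():
        return None
    return value.strip().lower().replace("-", " ")


cleaned = [normalise(value) for value in raw_categories]

# Give each distinct label a stable integer code (sorted for reproducibility).
distinct_labels = sorted({label for label in cleaned if label is not None})
codes = {label: index for index, label in enumerate(distinct_labels)}

# -1 marks a missing value so it is handled explicitly downstream.
encoded = [codes.get(label, -1) for label in cleaned]
```

Without the cleaning step, the five non-missing strings would each receive their own code, and the model would treat one product as several unrelated categories.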
Bruce Miller, one of our project managers, built on this with a real-world reminder:
“It’s one thing to build a model on cleaned data, it’s another to deploy it into production when live data doesn’t look the same. Good data governance & management and validation at source are critical so your operational data doesn’t undermine the model.”
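Validation at source can be as simple as checking each incoming record against the expectations captured at training time. The field names and rules below are hypothetical, and a production system would more likely use a dedicated schema or data-quality tool than a hand-rolled check like this.

```python
# Hypothetical expectations captured when the model was trained.
EXPECTED = {
    "product": {"type": str, "allowed": {"mortgage", "credit card"}},
    "balance": {"type": float, "min": 0.0},
}


def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, rule in EXPECTED.items():
        if field not in record or record[field] is None:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{field}: unexpected value {value!r}")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{field}: below minimum {rule['min']}")
    return problems


good = validate_record({"product": "mortgage", "balance": 1200.0})
bad = validate_record({"product": "Mortgage ", "balance": -5.0})
incomplete = validate_record({"balance": 10.0})
```

Records that fail can be rejected or quarantined before they reach the model, rather than quietly producing unreliable predictions.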
This touches on something Dufrain’s teams see often: machine learning only performs as well as the data foundation it sits on. Poor data hygiene and lack of governance lead to overfitting, bias, and brittle models that fail when exposed to the messiness of reality.
Explainability also matters. Not every model needs to justify itself, but in regulated sectors like banking and insurance, it’s non-negotiable. As Bruce noted, credit risk models must go through committees, testing, and audit to justify every decision.
“For something like Netflix recommendations, a false positive just annoys someone. For lending decisions, it can change lives, and that’s why you still need high levels of transparency and traceability.”
Why explainability and monitoring should never be optional
The discussion also covered the multi-model pipelines used in production:
“In practice, you rarely see a single model pipeline. Models feed into each other, segmentation informs recommendation, recommendation feeds into product selection, and all of that needs constant monitoring.” Dufrain Data Scientist.
That’s where tools like MLflow and strong ML governance come in, enabling teams to track performance, monitor drift, and maintain responsible oversight as models evolve.
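To give a flavour of what “monitoring drift” can mean in numbers, the sketch below computes a Population Stability Index (PSI) for one feature, comparing a training-time sample with live data. PSI is one commonly used drift measure and was not named in the session. The bin edges and sample values are invented, and this is a standalone illustration that does not use MLflow’s API.

```python
import math


def population_stability_index(expected, actual, bin_edges):
    """PSI between a training-time sample and a live sample of one feature."""

    def proportions(values):
        counts = [0] * (len(bin_edges) + 1)
        for value in values:
            # Bin index = number of edges the value meets or exceeds.
            index = sum(value >= edge for edge in bin_edges)
            counts[index] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(count / len(values), 1e-6) for count in counts]

    expected_share = proportions(expected)
    actual_share = proportions(actual)
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_share, actual_share)
    )


# Hypothetical feature values at training time and in production.
training = [10, 20, 30, 40, 50, 60, 70, 80]
live_shifted = [60, 70, 80, 90, 90, 95, 99, 100]
edges = [25, 50, 75]

psi_stable = population_stability_index(training, training, edges)
psi_shifted = population_stability_index(training, live_shifted, edges)
```

Identical distributions give a PSI of zero, and the value grows as live data moves away from what the model was trained on. The level that should trigger an alert or a retrain is a governance decision for each team.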
Three essentials for successful supervised ML
- Strong, labelled data foundations. Collect, label, and govern your data properly from the start. It’s the biggest predictor of success.
- Explainability first. Choose the simplest model that meets your goals. If you can’t explain it, you can’t trust it.
- Continuous monitoring and governance. Models aren’t static: keep validating, refining, and checking alignment with your organisation’s risk appetite.
Final thoughts
Supervised machine learning may not be as headline-grabbing as generative AI, but it’s still the engine room of intelligent decision-making. From predicting customer churn to detecting anomalies in financial transactions, it remains one of the most powerful, reliable, and explainable approaches in data science.
And like any high-performing system, it relies on getting your data house in order first. At Dufrain, that’s where we start: building the strong data foundations that make advanced analytics and AI not just possible, but practical and responsible!
If you want to see AI in practice, watch our webinar, which includes a demo of our Microsoft business solutions.
Frequently Asked Questions
What is supervised machine learning?
Supervised machine learning is an approach where models are trained on labelled data: examples where the correct answer is already known. It helps predict outcomes for new, unseen data using the patterns it has learned. At Dufrain, our AI & Advanced Analytics team use these techniques to help clients turn historical data into actionable intelligence.
What’s the difference between supervised and unsupervised learning?
Supervised learning uses labelled data to train models for specific predictions or classifications. Unsupervised learning, by contrast, finds hidden patterns or groupings in unlabelled data, for example by segmenting customers based on behaviour. Dufrain’s experts work with both, depending on whether an organisation needs targeted predictions or broader pattern discovery.
Why is data quality so important for supervised machine learning?
Poor-quality or inconsistent data leads to inaccurate predictions and bias. High-quality, well-labelled data allows models to learn effectively, perform accurately, and remain explainable. At Dufrain, we start by strengthening data foundations, so our clients’ AI initiatives deliver reliable, responsible results from day one.
