How machine learning works: Modelling data, training AI systems, and understanding their limits


This is a guest post by Dr Stephen Anning, Visiting Researcher in the Department of Web Science at the University of Southampton and online tutor for the MA in Artificial Intelligence. 

This blog post shares insights from student discussions in the ‘Introduction to AI’ module on the University of Southampton’s online MA in Artificial Intelligence, exploring how machine learning systems are created and how they model the world through data.  

The course is designed to give non-STEM students access to artificial intelligence and the technologies that underpin it. If you have never touched code or have no great love for maths but want to understand how AI works and how it is shaping society, this course is for you. 

How machine learning models use mathematical data to make predictions 

This week built on the thread of how to artificially produce intelligence with machine learning by delving into the underlying code. We lift the lid on machine learning by giving students access to Google Colab Notebooks, which contain complete machine learning workflows.

Students get first-hand insight into how machine learning models are created and trained. There is no expectation that students become coding experts, or even learn the basics of code. This is an interdisciplinary course, so we offer a range of options for students to explore as part of their research.

Students with or without coding knowledge can be equally successful. Access to the code this week shows first-hand that machine learning is, at its core, the mathematical modelling of data to make predictions.

The central takeaway is the fundamental question of how effectively machine learning can truly "understand" the world. As we have explored in the Google Colab notebooks, the process begins with an analogue-to-digital conversion, transforming real-world phenomena into machine-readable representations captured in images, recordings, or language. The machine then constructs a mathematical model of these representations to generate training data for an algorithm.

Ultimately, the "intelligence" produced by a machine learning algorithm is a statistical prediction of an input relative to that training data. Consequently, the extent to which a machine can be said to "understand" its environment depends entirely on the fidelity with which those phenomena can be mathematically modelled. 

There is no expectation for you to become a coding expert. Go as far into the code as you see fit. Since doing the predecessor to this course in 2017, I have become a self-taught coder. Nevertheless, people I studied with have been awarded PhDs without learning to code. Interdisciplinary courses like this one give you a range of options to choose from; my advice is to pick one that suits you and go for it.

Why data quality and implementation matter in machine learning systems 

We're beginning to understand that building a machine learning system involves far more than writing code. It requires a robust implementation plan, with work streams for data gathering, quality checks, governance, and testing.

A significant portion of our discussion focused on the origin of data, and we explored methods such as crowdsourcing and web scraping for data gathering. The quality and volume of this data are paramount, as the algorithm’s ability to "understand" or predict is entirely dependent on the diversity and integrity of its training set.  

For instance, a model trained solely on social media data will produce vastly different outputs from one trained on an authoritative academic corpus, which is often behind a paywall. When building machine learning systems, the majority of time is spent gathering, organising, and quality-assuring the data. The algorithm is the easy bit.

How AI image recognition works using mathematical models 

To explain how a machine "understands," the weekly exercise used the example of image recognition. The Colab notebooks contain code for an AI-powered flower recognition system. A computer does not see a "flower" in the human sense. Instead, it deconstructs a digital image into pixels, each represented by numerical values such as RGB or hex codes.
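To make that concrete, here is a toy sketch of an image as nothing more than numbers. The 2x2 "image" and the `looks_red` threshold are invented for illustration; they are not taken from the course notebooks:

```python
# A digital image is just numbers: each pixel is a red/green/blue
# triple between 0 and 255. This tiny 2x2 "image" is made up.
image = [
    [(255, 0, 0), (250, 10, 5)],   # two reddish pixels
    [(0, 128, 0), (10, 120, 15)],  # two greenish pixels
]

def looks_red(pixel):
    """'Seeing' red, for the machine, is an arithmetic test on numbers."""
    r, g, b = pixel
    return r > 200 and g < 50 and b < 50

red_count = sum(looks_red(p) for row in image for p in row)
print(red_count)  # 2
```

The machine never perceives "red"; it only evaluates inequalities over integers.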

These pixels are further modelled into shapes and proportions, such as the ratio of a petal's length to its width. Human or automated annotators then label each flower with its type, for example Iris or Rose. This mathematical representation allows the algorithm to compare new inputs against its training data to find the most similar composition, thereby "identifying" the object. An input with shapes, ratios, and colours similar to those of an Iris will generate an output of "Iris".
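The "find the most similar composition" step can be sketched as a nearest-neighbour comparison. The feature names, measurements, and labels below are invented for illustration, and the notebooks' actual model may work differently:

```python
import math

# Toy training set: each flower reduced to numbers. The petal
# measurements here are illustrative, not real botanical data.
training_data = [
    ({"petal_length": 1.4, "petal_width": 0.2}, "Iris"),
    ({"petal_length": 1.3, "petal_width": 0.2}, "Iris"),
    ({"petal_length": 4.5, "petal_width": 1.5}, "Rose"),
    ({"petal_length": 4.7, "petal_width": 1.4}, "Rose"),
]

def distance(a, b):
    """Euclidean distance between two feature dictionaries."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

def predict(flower):
    """Label a new flower with the label of its nearest training example."""
    nearest = min(training_data, key=lambda item: distance(item[0], flower))
    return nearest[1]

print(predict({"petal_length": 1.5, "petal_width": 0.3}))  # Iris
```

The output is a statistical match against the training data, not an act of recognition in the human sense.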

The same principle applies to language, where words are converted into vector embeddings: mathematical points in a multi-dimensional space. This allows the machine to predict relationships between terms based on their proximity. I will leave you to decide whether this approach constitutes intelligence.
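"Proximity" between embeddings is usually measured with something like cosine similarity. The three-dimensional vectors below are invented for illustration; real embeddings have hundreds of learned dimensions:

```python
import math

# Hypothetical toy embeddings. Real systems learn these from text.
embeddings = {
    "rose":   [0.9, 0.8, 0.1],
    "iris":   [0.8, 0.9, 0.2],
    "engine": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# In this toy space, "rose" sits closer to "iris" than to "engine".
print(cosine_similarity(embeddings["rose"], embeddings["iris"]) >
      cosine_similarity(embeddings["rose"], embeddings["engine"]))  # True
```

Relatedness, for the machine, is literally an angle between points in space.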

The limits of machine learning: subjectivity vs objectivity in AI models 

This point about mathematically modelling data to make predictions reveals a critical tension between subjectivity and objectivity. While a machine can objectively identify a rose from its physical features, such as petal shape, colour, and dimensions, the "meaning" we attach to that rose is inherently subjective.

For example, a human might annotate a rose with the tag "romance." When a user asks for something romantic, the machine reaches for the rose not because it understands the emotion, but because it has been programmed with a human's subjective interpretation. And while roses are a typical romantic gesture, they are not a romantic gesture for everyone. An algorithm whose training data does not include pink hydrangeas will not account for outlier tastes.

This distinction becomes even more complex when modelling human tone, sentiment, or personality. Essentially, these are qualitative interpretations, but we are attempting to mathematically model them with machine learning. While we can attempt to quantify tone through variables such as volume, speaking rate, or specific word choices, these numerical representations often miss the nuance of human experience.  
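One common way to quantify tone through word choice is a lexicon-based score, which makes the limitation easy to see. The word lists below are invented for illustration:

```python
import re

# Invented "tone" lexicons: counting word choices is one crude way
# to turn a qualitative impression into a number.
POSITIVE = {"great", "love", "wonderful"}
NEGATIVE = {"terrible", "hate", "awful"}

def tone_score(sentence):
    """Positive words minus negative words; punctuation is stripped."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(tone_score("What a wonderful day"))     # 1
print(tone_score("Oh great, another delay"))  # 1 -- sarcasm scores as positive
```

The sarcastic sentence gets the same score as the sincere one: the numbers capture the word choices but miss the nuance entirely.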

The webinar concluded that while we can model the "objective" world with increasing fidelity, the "subjective" world of emotions, sarcasm, and cultural values remains a challenge. People can have different emotional responses to the same data.  

Consider the phrase "Britain has left the European Union". This phrase will evoke positive, negative, or neutral emotions depending on how you feel about the event. As such, the subjectivity of data interpretation reveals the limits of machine learning.

The "intelligence" of the machine is ultimately limited by how well these subjective phenomena can be translated into the objective language of mathematics. Machine learning is transformative for many objective applications, but we have to be aware of the limits for more subjective questions. 

Find out more about the MA in Artificial Intelligence 

Machine learning is transforming how organisations analyse data, make predictions, and automate decisions across many sectors. Understanding how these systems are built, how they learn from data, and where their limitations lie is becoming an essential skill for professionals in a wide range of fields.  

The University of Southampton’s online MA in Artificial Intelligence is a conversion course designed for people from non-STEM backgrounds who want to understand how AI works and how it can be applied responsibly in real-world contexts. You don’t need prior coding experience to take part. 

Explore the course