Understanding Machine Learning – First Steps

Figure 13. Video demonstration showing how to use Excel to run a KNN classification model

Practical task - download the KNN Excel file and change the position of the unknown flower. [Link to Excel KNN file]

Working With Data

In machine learning, the machine has to learn from something. That something is data. Usually the data comes from files or streams – e.g. serial data. Data can also be generated by setting up competing algorithms in a game or adversarial scenario.

However it is generated, data needs to be processed by machine learning algorithms, and this means that the data needs to be prepared for ‘ingestion’.

Watch the video below to get an understanding of how data can be prepared for use in machine learning.

https://onedrive.live.com/?authkey=%21AH9ufP8gPMnMMLA&cid=2D07F5043DA09852&id=2D07F5043DA09852%2186577&parId=2D07F5043DA09852%2186339&o=OneUp

Figure 14. Preparing Data video

Let’s begin by learning how information can be broken down into data that machine learning can learn from.

Take a look at these zeros and ones in the matrix below.

It is a binary representation of something, but what does it represent?

Figure 15. Matrix of zeros and ones

The matrix of zeros and ones represents a pixelated face.

Figure 16. A rough depiction of a human face in large pixels

Let’s look at a different set of binary numbers in a matrix of the same size. What could this be?

Figure 17. Randomly generated zeros and ones

It could be a range of different things, but it’s certainly not a pixelated human face. It’s just a set of randomly generated zeros and ones in a 3 x 4 array.

Figure 18. Black = 1, White = 0
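As a minimal sketch of this encoding, the Python below turns a small grid of black and white pixels into a matrix of ones and zeros. The pattern and the function name to_binary_matrix are hypothetical and chosen for illustration; they are not taken from the figures.

```python
# Hypothetical image: '#' is a black pixel, '.' is a white pixel.
# This pattern is illustrative only; it is not the one shown in the figures.
IMAGE = [
    "#.#",
    "...",
    ".#.",
    "###",
]


def to_binary_matrix(rows):
    """Convert rows of '#'/'.' characters into rows of 1s and 0s (black = 1, white = 0)."""
    return [[1 if pixel == "#" else 0 for pixel in row] for row in rows]


matrix = to_binary_matrix(IMAGE)
for row in matrix:
    print(row)
```

Each printed row is a list of ones and zeros that a program can compare, count or sum, which is what makes the picture usable as data.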

In labelling one image a ‘face’, and the other ‘not a face’, we have used one of the machine learning methods that we explored in our combine harvester example.

But which one?

Figure 19. Which machine learning method should be used to categorise these images?

The answer is classification.

To be precise, it’s binary classification – classifying the elements of a given set into two groups by predicting which group each one belongs to, based on a rule.
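To make the definition concrete, here is a minimal sketch in Python that splits a set of items into two groups using a yes/no rule. The rule itself (the top row must show two 'eyes') and the sample matrices are assumptions made up for this example, not a real face detector.

```python
def binary_classify(items, rule):
    """Split items into two groups according to a yes/no rule."""
    positives, negatives = [], []
    for item in items:
        if rule(item):
            positives.append(item)
        else:
            negatives.append(item)
    return positives, negatives


def looks_like_face(matrix):
    """Hypothetical rule: the top row must show two 'eyes' (1, 0, 1)."""
    return matrix[0] == [1, 0, 1]


# Two illustrative matrices: the first follows the rule, the second does not.
images = [
    [[1, 0, 1], [0, 0, 0], [0, 1, 0], [1, 1, 1]],
    [[0, 1, 1], [1, 0, 0], [1, 1, 0], [0, 0, 1]],
]

faces, not_faces = binary_classify(images, looks_like_face)
print(len(faces), "classified as face;", len(not_faces), "classified as not a face")
```

The important point is that every item ends up in exactly one of two groups, and that the decision is made by the rule rather than by a person.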

Whilst we’ve used our human powers of pattern recognition and abstraction to do this, let’s now think about how we can apply this principle to a practical machine learning problem.

Let’s look at a high-level abstraction of the type of things that machine learning lets us do.

To start, where might the ability to extract a face from a background be useful? Examples include:

• Security

• Border control

• Photo enhancing apps

• Snapchat

• Facebook tagging

So, how do we turn this into a system for distinguishing faces from other features, such as separating the face from the yacht in the picture below?

Figure 20. Different degrees of shading turned into a matrix of 0s and 1s

We first sort the data out into rows.

We then call row (line) data ‘labels’ and the zeros and ones ‘features’.

Next, we add a ‘vector’ column to ‘award points’ for rows that match the shape we are looking for.

Then we set an algorithm for classifying the whole image based on the sum of the vectors.

Figure 21. Assigning vector values to a matrix
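The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference shape TEMPLATE, the cut-off THRESHOLD and the choice of one point per exactly matching row are all hypothetical, and the values in the figure may differ.

```python
# Hypothetical reference shape, written row by row (black = 1, white = 0).
TEMPLATE = [
    [1, 0, 1],
    [0, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
]
THRESHOLD = 3  # assumed cut-off: at least 3 matching rows counts as a face


def score_rows(image, template):
    """Build the 'vector' column: one point for each row that matches the template row."""
    return [1 if row == ref else 0 for row, ref in zip(image, template)]


def classify(image, template=TEMPLATE, threshold=THRESHOLD):
    """Classify the whole image from the sum of the row points."""
    points = score_rows(image, template)
    return "face" if sum(points) >= threshold else "not a face"


# An illustrative image whose last row differs from the template.
candidate = [
    [1, 0, 1],
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]

print(score_rows(candidate, TEMPLATE))
print(classify(candidate))
```

Lowering or raising the threshold changes how strict the classifier is, which is a design choice rather than something the data decides on its own.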

A key part of machine learning is learning, and this comes from ‘training’ the software.

So, we can now think of the digital representation of the faces as ‘training data’.

To train software to recognise faces, we feed large numbers of facial images to the machine learning system. In general, the more varied and representative the images we feed in, the more accurate the result tends to be.

After a while, the computer learns what should be categorised as a face.

Figure 22. Training data for facial recognition

When our machine learning system has learned what counts as a face, we can present it with images of faces and other items and it should correctly categorise the images.

Figure 23. Distinguishing between features in a picture
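To show what 'learning from training data' can mean in the simplest possible case, the sketch below learns a reference pattern from a few example faces by taking a per-pixel majority vote, then labels new images by how closely they agree with that pattern. This is a deliberately simple stand-in and not how real facial recognition systems are trained; the training images, the function names train and predict, and the 0.8 agreement level are all assumptions.

```python
def train(face_images):
    """Learn a template: a pixel is 1 if more than half of the training faces have a 1 there."""
    count = len(face_images)
    rows, cols = len(face_images[0]), len(face_images[0][0])
    return [
        [1 if 2 * sum(img[r][c] for img in face_images) > count else 0 for c in range(cols)]
        for r in range(rows)
    ]


def predict(image, template, min_agreement=0.8):
    """Label an image 'face' if enough of its pixels agree with the learned template."""
    total = sum(len(row) for row in template)
    agree = sum(
        1
        for img_row, tpl_row in zip(image, template)
        for pixel, ref in zip(img_row, tpl_row)
        if pixel == ref
    )
    return "face" if agree / total >= min_agreement else "not a face"


# Hypothetical training data: three slightly different pixelated faces.
training_faces = [
    [[1, 0, 1], [0, 0, 0], [0, 1, 0], [1, 1, 1]],
    [[1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 1, 0]],
    [[1, 0, 1], [0, 1, 0], [0, 0, 0], [1, 1, 1]],
]
template = train(training_faces)

unseen_face = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [1, 1, 0]]
unseen_noise = [[0, 1, 1], [1, 0, 0], [1, 1, 0], [0, 0, 1]]
print(predict(unseen_face, template))
print(predict(unseen_noise, template))
```

Neither of the two unseen images appears in the training data, so the labels come from the learned pattern rather than from memorised examples.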

Here we will learn how to handle different types of data to make them useful in a machine learning environment. The first question to ask is how a machine can learn from data.