Introduction
In this blog, we are going to talk about defining machine learning operations. What does that mean? Well, before diving into machine learning operations, I felt it was important to cover the basics of machine learning first. So we'll start off by talking about where machine learning is used and some basic terminology that you need to know. Then we'll follow that up by talking about machine learning operations: what it is, and some basic terminology that you need to know there as well.
Applications Of Machine Learning:
So let's start off by talking about where machine learning is used. To be fair, it's used in a ton of different applications, but I want to give you a few of the main categories so you get a good sense of where it shows up.
Image recognition: This is analyzing pictures. For instance, taking pictures of dogs and being able to say, "Yes, that's a dog." Even if it's dressed in a sweater, it's still a dog. This is a very large category of machine learning.
Speech recognition: Converting speech to text. Going the other way, from text to speech, is a closely related machine learning function.
Medical diagnostics: A huge area of machine learning, and a lot of it comes down to pattern recognition. For instance, we could take the continuous streaming feed from a heart monitor and use machine learning to pick out abnormalities that a doctor might want to look at.
Fraud detection: Another huge area. This means looking at all kinds of different data points coming in in real time and analyzing them for things that stand out.
Marketing: What's your likelihood to buy? We can look at customer funnels, figure out what customers like to do, and see if we can pick out patterns that tell us how likely someone is to buy a product or where they fall in the funnel.
Finance: Many large hedge funds have built prediction or analysis algorithms that look at fundamental and technical indicators to determine when to buy and sell.
Basic Terminology:
So now let's talk about some basic terminology that you need to know. We'll start off with the model. A machine learning model takes data and applies an algorithm to it to work out a formula that can predict patterns. So basically, we take some starting data, we do some processing on it, and at the end we should be able to predict something that we want to know. That is a machine learning model.
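Next is the data. Our data set is broken up into different pieces: training data, validation data, and test data. The training data is what gets run through our math formulas so the model can learn, the validation data is used to check and tune the model along the way, and the test data is held back to see how well the finished model predicts.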
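To make the three pieces of a data set concrete, here is a minimal sketch of one way to split it. The 70/15/15 proportions, the function name split_dataset, and the stand-in rows are assumptions chosen for illustration, not something the blog prescribes.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle rows, then cut them into training, validation, and test sets.

    The fractions are illustrative defaults; whatever is left over after
    the training and validation slices becomes the test set.
    """
    shuffled = list(rows)                      # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)      # seeded for repeatable splits
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test

rows = list(range(100))                        # stand-in for 100 real records
train, validation, test = split_dataset(rows)
print(len(train), len(validation), len(test))
```

Shuffling before cutting matters when the records arrive in some order (by date, say), because otherwise the test set could look systematically different from the training set.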
Next up, we have feature extraction. A feature is just an individual measurable property or characteristic. In practice, that means a column of data. Let's say, for instance, that we're trying to figure out how successful a basketball player is going to be. Columns of data that we might be interested in as features would be height, points scored per game, or rebounds. Those are the columns we would use as features in our model to determine how successful a basketball player is. Those features are then run through an algorithm.
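As a rough illustration of the basketball example, the sketch below pulls the feature columns out of a couple of player records. The records, the column names, and the extract_features helper are all made up for this example.

```python
# Hypothetical player records; every value here is invented for illustration.
players = [
    {"name": "Player A", "height_cm": 198, "points_per_game": 21.4,
     "rebounds_per_game": 7.1, "jersey_color": "red"},
    {"name": "Player B", "height_cm": 185, "points_per_game": 12.0,
     "rebounds_per_game": 3.2, "jersey_color": "blue"},
]

# The columns we believe are useful for predicting success.
FEATURE_COLUMNS = ["height_cm", "points_per_game", "rebounds_per_game"]

def extract_features(records, columns):
    """Keep only the chosen columns, as one list of floats per record."""
    return [[float(record[column]) for column in columns] for record in records]

feature_matrix = extract_features(players, FEATURE_COLUMNS)
print(feature_matrix)
```

Columns such as the name or jersey color are left out on purpose: they are part of the data, but we are not treating them as features.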
An algorithm is just a fancy math function or piece of logic that adjusts as it's exposed to more data. So basically, we take our data, we separate out the features (the columns of data that we're most interested in), we run those through a fancy math formula or algorithm, and that gives us predictions.
Now, once we get a prediction, we compare it with what actually happened. Then we go back through, slightly adjust our features and how the model weighs them, and run through this cycle multiple times. As we do that over and over again, our model should be getting better predictions, and that's how machine learning works.
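The predict, adjust, and repeat cycle can be sketched with a tiny example. This uses gradient descent on a straight-line model, which is one common way the adjustment is done and is an assumption here, not necessarily the algorithm a given project would use. The data points, learning rate, and function name are illustrative.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = weight * x + bias by repeating predict -> measure -> adjust."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    loss_history = []
    for _ in range(epochs):
        # 1. Predict with the current weight and bias.
        predictions = [weight * x + bias for x in xs]
        # 2. Measure how far off the predictions are (mean squared error).
        errors = [p - y for p, y in zip(predictions, ys)]
        loss_history.append(sum(e * e for e in errors) / n)
        # 3. Adjust weight and bias a little in the direction that lowers the error.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias, loss_history

# Made-up training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
weight, bias, loss_history = train_linear_model(xs, ys)
print(round(weight, 3), round(bias, 3))
print(loss_history[0], loss_history[-1])
```

The recorded loss shrinks from the first pass to the last, which is the "getting better predictions" part of the cycle in numeric form.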
Machine Learning Operations:
So, machine learning operations, then. We start off with a pipeline, which is just a collection of machine learning activities. For instance, ingesting data is an activity. Cleaning up that data or transforming it is an activity. Applying an algorithm is an activity. So we start off by ingesting data and moving it into storage, and then we take that data and transform it, which is also called data wrangling. Basically, that just means taking all of our data sources and making them all look the same.
If we start off with messy data or poor data or not enough data, we're not going to get a good response at the end. We're not going to get a good prediction. We have to have good clean data that all looks the same to be able to have a chance to get a good prediction. So then once that's done, we're going to go ahead and do the rest of our steps or our activities.
We're going to do our feature extraction. We're going to run our data through that algorithm. Then we're going to get our predictions. And tying all of it together we have our orchestration, which is automating the whole workflow that we've created. So machine learning operations is not as concerned with choosing an algorithm or why we need to choose a specific feature. It's more interested in this workflow, or this pipeline.
So how do we ingest the data?
How do we transform it?
How do we apply that algorithm that has been given to us by the data scientists?
How do we get those predictions?
And then also, how do we automate this entire thing?
How do we monitor our model to make sure it's working appropriately?
Those are all things that are a part of machine learning operations.
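The activities and questions above can be sketched as a very small pipeline in plain Python. Every function, field name, and number here is a hypothetical stand-in: a real pipeline would read from actual storage, use the model supplied by the data scientists, and typically run under a dedicated orchestration tool.

```python
def ingest():
    """Stand-in for pulling raw records out of storage."""
    return [
        {"height": "198", "points": "21.4"},
        {"height": " 185 ", "points": "12.0"},
        {"height": "", "points": "9.5"},       # incomplete record
    ]

def transform(records):
    """Data wrangling: drop incomplete rows and make every value a float."""
    cleaned = []
    for record in records:
        if all(value.strip() for value in record.values()):
            cleaned.append({key: float(value) for key, value in record.items()})
    return cleaned

def extract_features(records):
    """Turn each cleaned record into a list of feature values."""
    return [[record["height"], record["points"]] for record in records]

def predict(features):
    """Placeholder scoring formula standing in for the real algorithm."""
    return [0.01 * height + 0.1 * points for height, points in features]

def run_pipeline(source, steps):
    """Orchestration: run each activity in order, passing the output along."""
    log = [source.__name__]
    data = source()
    for step in steps:
        data = step(data)
        log.append(step.__name__)
    return data, log

predictions, log = run_pipeline(ingest, [transform, extract_features, predict])
print(log)
print(predictions)
```

Keeping each activity as its own function is what makes the workflow easy to automate: the orchestration step only needs to know the order, not what each activity does inside.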
Machine learning ops (or MLOps) involves creating pipelines and automating data movement and machine learning activities. It's all about that pipeline: moving data through it, automating your data movement and your workflows, and then setting up alerts and monitoring the activity of your pipelines.
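As a rough idea of what alerts and monitoring can look like, the sketch below compares the metrics from one pipeline run against limits and collects alert messages. The metric names, the limits, and the check_pipeline_run helper are assumptions for illustration. In practice this job is usually handled by a monitoring service rather than hand-written code.

```python
def check_pipeline_run(metrics, thresholds):
    """Return an alert message for every metric that is missing or over its limit."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is None:
            alerts.append(f"{name}: metric missing")
        elif value > limit:
            alerts.append(f"{name}: {value} exceeds limit {limit}")
    return alerts

# Hypothetical numbers reported by one pipeline run.
run_metrics = {"rows_rejected_pct": 12.0, "run_seconds": 40.0}

# Hypothetical limits agreed on by the team.
limits = {"rows_rejected_pct": 5.0, "run_seconds": 60.0, "prediction_error": 0.2}

alerts = check_pipeline_run(run_metrics, limits)
for alert in alerts:
    print("ALERT:", alert)
```

Treating a missing metric as an alert is a deliberate choice in this sketch: a pipeline that silently stops reporting is often a bigger problem than one that reports a bad number.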
Machine learning ops blends security, machine learning, data engineering, and traditional cloud operations. You have to be a little bit of a jack-of-all-trades to be a good machine learning operations engineer, because you have to understand security. You also have to understand a little bit of machine learning so that you know what you're doing and how to apply those algorithms. You have to understand data engineering and how data moves through a system. And then, of course, you have to understand traditional cloud operations, which is required for anything in the cloud.
Conclusion:
I hope this blog has been helpful. In the next one, we're going to jump in and explore all of these activities in more detail.
All right, I'll see you there.