Machine Learning Model Inside a Container

Training a Machine Learning Model inside a Docker Container

MishanRG

Published in

Nerd For Tech

7 min read · May 29, 2021

Greetings Everyone!!! Here I am back with another blog.

In this blog, we will build a machine learning model inside a Docker container and use it to make some predictions. For the task, we will create a basic machine learning model that predicts salary based on years of experience.

So let's get started with the blog!!!

Practical

We can use either a local system or an AWS instance for this task, whichever we are more comfortable with.

First, we need to have Docker installed on our Linux operating system. Here I am using RedHat Linux, so we need to configure yum before it can install Docker. To do that, we create a .repo file inside /etc/yum.repos.d/. I have created a file named docker.repo and configured the Docker repository inside it. You can see this in the image below.

Now we can install the docker using the command:

yum install docker-ce --nobest

Now check if docker is installed or not. We can use the command:

docker --version

After that, the docker service should be started. We can check the service status using the command systemctl status docker. If it is not running, we can use systemctl start docker to start it, and systemctl enable docker to enable it on every boot.

We can use "docker ps" to see all the running containers and "docker images" to see all the images we have on our system, which Docker can use to launch containers. In the image below, we can see that I have quite a few images but no container running.

We will be using the ubuntu image to launch our OS, and we will be using the latest version of ubuntu.

We can pull it with the command I have used in the image above:

docker pull ubuntu:latest

The above command pulls the image from the configured registry (Docker Hub by default). If we already have the image locally, Docker checks whether a newer version is available and downloads it if so.

Now that we have the ubuntu image, we can launch a container from it.

In the above image, we can see that we have used

docker run -it --name=mlContainer ubuntu:latest

This command runs a new container. The -it option gives us an interactive terminal in the container when it launches, --name sets the name we want for the container, and the last argument is the image we want to use to create the container (shown on the left).

On the right, I have opened another terminal, and there we can see that when we use the docker ps command, the new container we have created appears. So our container is created, and we are inside the container now. As you can see, the prompt has changed from root@localhost to root@0d260d5…, which is the container ID.
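To make the parts of the command easier to see, here is a small Python sketch that assembles the same argument list piece by piece. The helper name docker_run_command is made up for illustration; it only builds the command and prints it, and does not start a container.

```python
import shlex


def docker_run_command(name, image):
    """Build the argument list for an interactive `docker run` (hypothetical helper)."""
    return [
        "docker", "run",
        "-it",             # -i keeps stdin open, -t allocates a terminal
        f"--name={name}",  # the name we choose for the container
        image,             # the image the container is created from
    ]


command = docker_run_command("mlContainer", "ubuntu:latest")
print(shlex.join(command))  # docker run -it --name=mlContainer ubuntu:latest
```

If we wanted to actually launch the container from Python, the same list could be passed to subprocess.run, but typing the command in the terminal as shown above is simpler.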

Getting the Environment Ready

Now inside this container, we will be training our model.

First, we need some software to be present inside the container. We will need Python and some libraries for the machine learning model to be trained. We will also need the dataset we have in our Windows System, and we will transfer it to our Linux System and inside the container (we will discuss this).

First, let's see if we have Python available in our container or not. As we can see in the image below, Python is not there in our container, so we will install it using:

$ apt-get update #first update apt-get
$ apt-get install python3-pip -y

We have installed python with pip as we need to install other libraries too.

Now we can see in the image below that Python is installed and working.

When we are working with machine learning, we need to read the dataset files, which we can do with the pandas library. We also need numpy (it is installed automatically as a dependency of pandas) and sklearn to train a model on our data, so we install everything using pip.

We can check which modules are already there and install the missing ones using the commands:

$ pip list #shows all the modules available
$ pip install pandas
$ pip install scikit-learn #provides the sklearn module
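Besides pip list, we can also ask Python itself whether the libraries are importable. The following is a minimal sketch using only the standard library; the function name missing_modules is made up for illustration, and joblib is included on the assumption that it is the module behind the jb.dump call used later in this blog (it is normally installed together with scikit-learn).

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules this blog relies on (joblib is an assumption, see the text above).
required = ["pandas", "numpy", "sklearn", "joblib"]
missing = missing_modules(required)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All required modules are available.")
```

Note that the import name is sklearn even though the package is installed as scikit-learn.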

Docker Tips

When we are inside the Docker container, we can type "exit" to get out of it. Because the shell is the container's main process, exiting it also stops the container. To start the container again, we can use the command "docker start <container name/id>". To get inside the container again, we can use the command "docker attach <container name/id>".

Model Training

Now we need the dataset, and it first has to be on our Linux system. My dataset is on the Windows system, so I will bring it to my Linux system (RedHat) first.

I am going to use the WinSCP software to do so. We can also download the dataset from the internet directly on the Linux system using wget.

We need to open the software in Windows, enter the IP address of the Linux system and the username, and then provide the password or key to connect.

In the above image, we can see that I have transferred the "Salary_Data.csv" file from my Windows system (on the left) to my Linux system (on the right) using WinSCP, and on the terminal we can see the folder called "task1WS" where it is stored.

Remember to make the file fully readable; otherwise, we might face read and write issues inside the directory. I have shown how to do that in the above image.

Now in the file “Salary_Data.csv”, we have our dataset and in the file “salaryModel.py” we have our Python Code.
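After a transfer like this, it can be worth checking that the CSV arrived intact before training on it. Here is a minimal sketch that reads the header and the first few rows using only Python's standard library; the helper name preview_csv is made up for illustration.

```python
import csv


def preview_csv(path, rows=3):
    """Return the header and the first `rows` data rows of a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no lines
        sample = [row for _, row in zip(range(rows), reader)]
    return header, sample


# Example call, assuming the file sits in the current directory:
# header, sample = preview_csv("Salary_Data.csv")
# print(header, sample)
```

If the call raises a PermissionError, the file is probably not readable yet, which is the issue mentioned above.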

Here is the link for the dataset:

Salary_data (www.kaggle.com)

And here is the salaryModel.py code:

Above, I have imported all the required libraries, read the dataset, and stored it in the variable data. Then I have stored the training input (years of experience) in x and the target (salary) in y, and reshaped x into a 2D array, because the model expects its input in that shape. Finally, I have used LinearRegression() to create the model and model.fit(x, y) to train it.
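A minimal sketch of a script matching this description might look like the following. It is not the original salaryModel.py: the column names YearsExperience and Salary are an assumption about the Kaggle file, and the function names train_salary_model and predict_salary are made up for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression


def train_salary_model(csv_source):
    """Train a linear regression on a salary CSV (path or file-like object)."""
    data = pd.read_csv(csv_source)
    # Assumed column names; adjust them if the dataset uses different ones.
    x = data["YearsExperience"].values.reshape(-1, 1)  # 2D: one column of features
    y = data["Salary"].values                          # 1D: target values
    model = LinearRegression()
    model.fit(x, y)
    return model


def predict_salary(model, years):
    """Predict the salary for a single value of years of experience."""
    return float(model.predict([[years]])[0])
```

With these two functions, the interactive part would be something like calling train_salary_model("Salary_Data.csv"), reading the years of experience with input(), converting it to a float, and printing the result of predict_salary.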

Now when I run this program using the command:

python3 salaryModel.py

It asks me for my years of experience as input and gives me the predicted salary for that experience as output.

We can see that the model is working. To save the trained model so that we don't need to train it again, we have used
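To see what happens behind that prediction: with a single input column, linear regression fits a straight line, salary = intercept + slope × years, by ordinary least squares. The sketch below computes that line by hand with plain Python. The numbers are made up for illustration and are not taken from the dataset.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example data: salary grows by 10000 per year of experience.
years = [1.0, 2.0, 3.0, 4.0]
salaries = [40000.0, 50000.0, 60000.0, 70000.0]

slope, intercept = fit_line(years, salaries)
print(intercept + slope * 5.0)  # 80000.0
```

On real data the points will not lie exactly on a line, so the fitted line is the one that minimizes the squared errors rather than one that passes through every point.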

>> jb.dump(model, 'salaryPredictor.pk1')

This stores our trained model in the file "salaryPredictor.pk1", which can be loaded and used later.
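Using the saved file later could look like the sketch below. It assumes that jb is joblib imported as jb, trains a tiny stand-in model on made-up numbers, and writes the file into a temporary directory so that the example runs on its own; in our setup the file would simply be salaryPredictor.pk1 in the working directory.

```python
import os
import tempfile

import joblib as jb  # assumption: jb in the blog refers to joblib
from sklearn.linear_model import LinearRegression

# Stand-in model trained on made-up numbers, only so the example is runnable.
model = LinearRegression().fit([[1.0], [2.0], [3.0]], [40000.0, 50000.0, 60000.0])

# Save the model, same file name as in the blog but inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "salaryPredictor.pk1")
jb.dump(model, path)

# Later (for example in another script): load the model and predict without retraining.
restored = jb.load(path)
print(restored.predict([[4.0]])[0])
```

The loaded object behaves like the original model, so the training step can be skipped entirely as long as the same libraries are installed where the file is loaded.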

Conclusion

We have successfully created a machine learning model inside a Docker container and saved the model for later use. I hope I have explained everything, and if you have any doubts or suggestions, you can comment on this blog or contact me on my LinkedIn.

Mishan Regmi - Research Intern - SkillGeek | LinkedIn (www.linkedin.com)

Thank you for staying till the end of the blog, and please do suggest some ideas for improvement. Your suggestions will really motivate me.