Introduction:

This book is full of examples and patterns that data scientists use every day, and I would recommend it to any working data scientist interested in learning Docker. Using the Jupyter platform, you will master interactive development. You will learn how to design a linked system with Docker Compose, with Jupyter handling interaction and Python processing data in the background. Using the docker-compose tool and its docker-compose.yml file format, you will run and create Docker containers from scratch as well as from publicly available open-source images, and write infrastructure as code. You will see how a multi-service data science application can be deployed on a cloud-based system. Best practices are examined both for using existing images and for building your own images to implement cutting-edge machine learning and optimization techniques. I hope you will find this book worthwhile and useful.
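As a taste of what infrastructure as code looks like here, the following is a minimal sketch of a docker-compose.yml linking a Jupyter service to a background Python worker over a shared data directory. The service names, image tags, and paths are illustrative assumptions, not the book's exact configuration.

```yaml
# Illustrative docker-compose.yml: a Jupyter service alongside a
# Python worker sharing a ./data directory. Names and tags are
# examples only.
version: "3"
services:
  jupyter:
    image: jupyter/scipy-notebook
    ports:
      - "8888:8888"          # expose the notebook server
    volumes:
      - ./data:/home/jovyan/data
  worker:
    image: python:3.11-slim
    command: python /app/process.py   # hypothetical processing script
    volumes:
      - ./app:/app
      - ./data:/data
```

Running `docker-compose up` would start both containers together; later chapters develop this pattern in full.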

Why Docker?

Information Technology is always under immense pressure to increase agility and speed up delivery of new functionality to the business. It is essential for data scientists to be self-sufficient and to participate in continuous deployment activities. This happens many times in our work: a model, a piece of code, or an application works perfectly on your laptop, yet runs into problems when you try to run it in a testing or production environment. This happens because the developer's computing environment differs from the production platform. Docker is the world's leading software container platform. It is a tool that makes it simpler to create, deploy, and run applications using containers. In simpler terms, a developer or data scientist packages all the software, models, and components into a box called a container, and Docker takes care of shipping this container to different platforms. Developers and data scientists can focus on the code, models, software, and their dependencies, and put them into the container; they do not need to worry about deployment to the platform, which Docker takes care of. Machine learning algorithms have several dependencies, and Docker helps download and build them automatically.
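The "package everything into a box" idea can be sketched with a minimal Dockerfile. The file names here (requirements.txt, score.py) are placeholders for your own project files, and the base image tag is an assumption.

```dockerfile
# Hypothetical Dockerfile packaging a model-scoring script and its
# dependencies. requirements.txt and score.py stand in for your
# own project files.
FROM python:3.11-slim

WORKDIR /app

# Install the pinned dependencies the model needs.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the model code itself.
COPY score.py .

CMD ["python", "score.py"]
```

Built with `docker build -t my-model .`, the resulting image runs identically on a laptop, a test server, or a production host.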

Topics covered by this book:

  • Introduction is covered in Chapter 1. Beyond concerns arising from their system infrastructure, the typical data scientist continually has a number of incredibly complex challenges on their mind. However, there will inevitably be problems with the infrastructure. To put it simply, we may categorise the issues as either the “engineering problem” or the “modelling problem.” The data scientist is uniquely qualified to address the modelling problem, but frequently struggles with the engineering problem.

  • Chapter 2 is about Docker. Docker separates a process from the system on which it is running. It enables us to decouple the code that defines an application, and the resources needed to run it, from the hardware on which it runs.

  • Interactive programming is covered in Chapter 3. Interactive computing is a conversation between people and machines.

  • The Docker engine is described in Chapter 4. If I haven’t made this point clear enough: the Docker engine will always function in the same way, regardless of the operating system and hardware on which it runs. The Docker engine is utilised throughout the development, testing, and deployment phases.

  • The Dockerfile is discussed in Chapter 5. Each layer in a Docker image defines a stateless modification to the image. The first layer may be the operating system of the virtual machine, followed by the installation of any dependencies your application needs to function, all the way up to your application’s source code.

  • Docker Hub is discussed in Chapter 6. With tools for creating our own images, the ability to save and distribute the images we have created outside of our own system rapidly becomes crucial. Docker registries let us accomplish this. Although other registries exist, the public Docker registry, Docker Hub, will be more than adequate for your needs.
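The layer ordering that the Chapter 5 summary describes — operating system, then dependencies, then source code — can be sketched as a Dockerfile in which each instruction contributes one layer. The base image, package names, and file paths below are illustrative assumptions.

```dockerfile
# Each instruction below adds a layer, in the order described:
# base operating system, then dependencies, then application
# source. Package and file names are illustrative.

# Layer 1: the base operating system.
FROM ubuntu:22.04

# Layer 2: system packages the application needs.
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Layer 3: Python dependencies.
COPY requirements.txt /app/requirements.txt
RUN pip3 install --no-cache-dir -r /app/requirements.txt

# Final layers: your application's source code.
COPY src/ /app/src/
CMD ["python3", "/app/src/main.py"]
```

Because each layer is cached independently, editing only the source code in src/ rebuilds just the final layers, leaving the operating-system and dependency layers untouched.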