Day21 of #90DaysOfDevOps Challenge

Docker Interview Questions and Answers

Important interview questions and Answers for Docker:

1. What is the difference between an Image, Container and Engine?

Image - An image is a read-only template used to create a container, just like a machine image is used to create a VM. It is an executable package containing the application, libraries, dependencies, and everything else the container needs to run.

It is created from a set of instructions called a Dockerfile, which specifies how to build the image. Images can be stored in a registry, such as Docker Hub, and can be shared and reused across different environments.

Container - Containers are created from images and can be started, stopped, and managed independently. Multiple containers can run on the same host, and each container operates as if it has its isolated environment, including its own file system, network stack, and process space. Containers provide a lightweight and consistent runtime environment, encapsulating the dependencies and configuration required by the application.

Engine - Docker Engine is the underlying technology responsible for building and running containers. It includes a runtime, an image format, and a set of tools for managing containers and images.
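To make the image/container relationship concrete, a minimal sketch (the image and container names are illustrative): build an image once, then start independent containers from it:

# Build an image from the Dockerfile in the current directory
docker build -t myapp:1.0 .

# Start two independent containers from the same image
docker run -d --name app1 myapp:1.0
docker run -d --name app2 myapp:1.0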

2. What is the difference between the Docker command COPY vs ADD?

COPY is used for basic file copying, while ADD provides additional features like tar extraction and URL retrieval. It's generally recommended to use COPY for simple file-copying operations to ensure transparency and avoid unexpected behaviour.

COPY -

It takes two parameters: a source and a destination. The source can be a file or a directory on the host machine, and the destination is the path inside the container where the file or directory should be copied.

Example:

COPY file.txt /app/

ADD -

ADD: The ADD command also copies files and directories from the host machine to the container, but it has some additional features compared to COPY.

a. Tar Extraction: If the source is a local tar archive in a recognized compression format (such as .tar, .tar.gz, or .tar.bz2), Docker automatically extracts its contents into the destination directory. Note that this auto-extraction applies only to local archives; a tar file fetched from a URL is copied as-is.

ADD app.tar.gz /app/

b. URL Retrieval: If the source is a URL, Docker can fetch the file and place it in the container. This can be useful for downloading files directly from the internet.

ADD https://example.com/file.txt /app/

3. What is the difference between the Docker command CMD vs RUN?

CMD and RUN are both used in a Dockerfile to execute commands, but at different stages.

CMD: The CMD instruction specifies the default command and arguments to run when a container is started from the image. If multiple CMD instructions appear in a Dockerfile, only the last one is considered and executed. CMD can be overridden by arguments provided to the docker run command.

CMD ["systemctl", "start", "nginx"]

RUN: The RUN command is used while building an image from the dockerfile. It adds a layer to the image. It is mainly used to install dependencies, set up the environment, and perform any necessary actions to build the image.

RUN apt-get update
RUN apt-get install -y nginx

4. How Will you reduce the size of the Docker image?

Reducing the size of a Docker image can be beneficial for various reasons, including faster image builds, improved network transfer times, and reduced storage requirements. Here are some approaches you can take to reduce the size of a Docker image:

  1. Use a Smaller Base Image: Choose a minimal, lightweight base image for your Dockerfile. A popular option is Alpine Linux, which is known for its small size and provides only the essential runtime components, unlike feature-rich base images such as Ubuntu or CentOS.

  2. Minimize Installed Packages: Only install the necessary packages and dependencies required for your application. Make use of package managers with built-in capabilities for minimal installations, such as --no-install-recommends in apt or --no-cache in apk.

  3. Optimize Dockerfile Instructions:

  • Combine Commands: To reduce the number of layers created in the image, combine multiple commands into a single RUN instruction, using logical operators like && or ; to chain commands together.

  • Use Multi-Stage Builds: Utilize multi-stage builds to separate the build environment from the runtime environment. Build your application or dependencies in a separate intermediate image and then copy only the necessary artifacts into the final runtime image. This helps to exclude unnecessary build-time dependencies from the final image (see the sketch after this list).

  • Leverage .dockerignore: Create a .dockerignore file in your project directory to exclude unnecessary files and directories from being added to the image.

  4. Compress or Minimize File Sizes: Compress files and directories within the image to reduce their size. For example, you can use tools like gzip. Ensure that the files are decompressed and available during runtime as needed.

  5. Use Docker Image Build Caching: Take advantage of Docker's layer caching mechanism during the build process. Ensure that the frequently changing instructions are placed towards the end of the Dockerfile, allowing Docker to reuse cached layers for the earlier instructions.

  6. Clean Up Unnecessary Files: Remove temporary or intermediate files, caches, and artifacts generated during the build process within your Dockerfile. For example, delete downloaded package archives after they have been installed.
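To illustrate multi-stage builds and command chaining, here is a minimal Dockerfile sketch for a hypothetical Go application (the file paths, image tags, and user name are assumptions, not from the original):

# Build stage: carries the full Go toolchain, discarded from the final image
FROM golang:1.21-alpine AS builder
WORKDIR /src
COPY . .
RUN go build -o /bin/app .

# Runtime stage: starts from a small base and copies in only the binary
FROM alpine:3.19
# Chain commands in one RUN to keep the layer count down
RUN apk add --no-cache ca-certificates && adduser -D appuser
COPY --from=builder /bin/app /usr/local/bin/app
USER appuser
CMD ["app"]

The final image contains only Alpine plus the compiled binary; the Go toolchain from the build stage never reaches it.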

5. Why and when to use Docker?

Docker is a containerization tool that lets us create a consistent environment for our application and share it with our team, so that everyone works in the same environment.

Before Docker, a developer would build an application on a local machine or VM and create the whole environment for it by installing the required software, packages, dependencies, etc. This made the machine image so huge that it became difficult to share with others. If another team wanted to recreate the environment, that was a difficult task, since they needed the exact list of packages, at the correct versions, that the developer had used.

With Docker, it is very easy to create an environment with the bare minimum of libraries, pack it as an image, and share it through a private or public registry like Docker Hub. Docker images are small, which makes them portable and easy to work with. We can also simply write a script with a set of instructions to create an image, and Docker takes care of the rest.

Docker also offers resource management capabilities, allowing you to allocate specific CPU, memory, and network resources to containers, ensuring optimal performance and isolation.
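For example, a minimal sketch of capping a container's resources at run time (the limits shown are arbitrary):

# Cap the container at half a CPU core and 512 MB of RAM
docker run -d --cpus="0.5" --memory="512m" nginx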

Docker allows us to scale applications horizontally by running multiple containers, either on a single host or across a cluster of machines.

6. Explain the Docker components and how they interact with each other.

Docker Engine

It is the core part of the whole Docker system. Docker Engine is an application that follows a client-server architecture. It is installed on the host machine. There are three components in the Docker Engine:

  • Server: It is the docker daemon, called dockerd. It creates and manages docker images, containers, networks, etc.

  • REST API: It is used to instruct the docker daemon what to do.

  • Command Line Interface (CLI): It is a client which is used to enter docker commands.

Docker Client

Docker users interact with Docker through a client. When any docker command runs, the client sends it to the dockerd daemon, which carries it out. Docker commands use the Docker API. A Docker client can communicate with more than one daemon.

Docker Registries

It is the location where Docker images are stored. It can be a public or a private registry. Docker Hub is the default registry; it hosts public repositories of docker images. You can also create and run your own private registry.

When you execute docker pull or docker run commands, the required docker image is pulled from the configured registry. When you execute the docker push command, the docker image is stored on the configured registry.
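For example (the private registry hostname below is illustrative):

docker pull ubuntu:22.04                                        # pull from Docker Hub (the default)
docker tag ubuntu:22.04 registry.example.com/team/ubuntu:22.04  # retag for a private registry
docker push registry.example.com/team/ubuntu:22.04              # push to that registry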

7. Explain the terminology: Docker Compose, Dockerfile?

Docker Compose-

Using plain docker commands, we can run and manage only one container at a time, but there are scenarios where you need to create multiple containers and manage them together. What you need is a single-command solution to create the various containers, manage them, and then stop them all at once.

Docker Compose does precisely that: it allows Docker to set up a multi-container environment.

There is a three-step process to work with Docker Compose.

1. Define the application environment with a Dockerfile for each service.

2. Create a docker-compose.yml file defining all services under the application.

3. Run the docker-compose up command to run all services under applications.

You can run the docker-compose down command to destroy the above infrastructure.
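As a minimal sketch, a docker-compose.yml for a hypothetical web app with a database might look like this (service names, ports, and images are assumptions):

version: "3.8"
services:
  web:
    build: .              # built from the Dockerfile in the current directory
    ports:
      - "8080:80"         # host:container port mapping
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example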

Dockerfile-

Instead of manually creating a docker image by running multiple commands one by one, we can write a script that specifies everything required for our image, and run a single command to build the whole image.

Each instruction in a Dockerfile adds a layer to the resulting Docker image.

Here are some of the most commonly used commands in a Dockerfile:

  • FROM: Specifies the base image required for our new image.

  • RUN: Executes a command in the image during the building of the image, like installing packages.

  • COPY: Copies files from the host machine to the image.

  • ENV: Sets an environment variable in the image.

  • EXPOSE: Documents the ports the container listens on (publishing them still requires -p at run time).

  • CMD: Specifies the default command the container executes when launched from the image.
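Putting these together, a minimal Dockerfile sketch for a hypothetical Node.js app (file names and the port are assumptions):

FROM node:20-alpine             # base image
ENV NODE_ENV=production         # environment variable baked into the image
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev      # install only production dependencies
COPY . .
EXPOSE 3000                     # document the port the app listens on
CMD ["node", "server.js"]       # default command when the container starts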

8. In what real scenarios have you used Docker?

  1. Web Application Deployment: I have used Docker for the deployment of my application by packaging the application and its dependencies into Docker containers. The deployment becomes consistent and portable across different environments.

  2. Continuous Integration/Continuous Deployment (CI/CD): Docker containers can be easily integrated into CI/CD workflows, allowing for automated testing, staging, and production deployments.

I used it in my CI/CD pipeline: first, the CI pipeline builds a docker image containing my application and all the required dependencies; then I deploy it and run the container so that my application is up and running.

  3. Microservices Architecture: It is very interesting how we can create a microservices architecture with Docker. We can have different applications running in different containers and connect them so that they can communicate with each other. Each microservice can be containerized and independently managed, allowing for scalability, fault isolation, and easier development, testing, and deployment of individual components.

I created multiple containers for my database and application, keeping both in separate containers.

  4. Development Environments: Docker containers can be used to create isolated and reproducible development environments, enabling developers to quickly set up and share consistent environments across different machines.

Instead of using a whole VM and setting everything up there, it is much easier to spin up a container, test your scenario, and share the same environment with other teams.

9. Docker vs Hypervisor?

Docker containers virtualize at the operating-system level: all containers share the host kernel, so they are lightweight (megabytes), start in seconds, and run with near-native performance. A hypervisor virtualizes the hardware: each VM runs a full guest operating system, so VMs are heavyweight (gigabytes), take longer to boot, and carry more overhead, but they provide stronger isolation. Containers suit packaging and scaling applications; VMs suit running different operating systems or workloads that need strict isolation.

10. What are the advantages and disadvantages of using docker?

Disadvantages -

  • Docker networking can be complex, especially when dealing with more advanced networking configurations, such as connecting containers to external networks or orchestrating multi-container communication.

  • Containers are designed to be stateless and ephemeral, which means data is not persisted within the container by default. Managing persistent data, such as databases or file storage, requires additional considerations and configurations, such as using Docker volumes or external storage solutions.

  • While Docker provides isolation, improper configurations or vulnerabilities within containers can still pose security risks. It's crucial to ensure secure practices, such as regularly updating images, managing container permissions, and following security best practices.

  • Although Docker containers are lightweight compared to virtual machines, there is still a performance overhead due to containerization. Depending on the workload and configuration, this overhead may be negligible or more noticeable.

  • Docker may introduce compatibility challenges when running legacy or complex applications that have specific requirements or dependencies.

Advantages -

We have already talked about the advantages of docker in the above questions.

In short - It is a portable, scalable tool that helps with the isolation of the environment, rapid deployment, resource efficiency and more.

11. What is a Docker namespace?

Docker namespaces help create separate, isolated environments for different requirements. For example, if two teams are working on different features of the same application and deploy it with the same name, the environments can get mixed up; isolation keeps their work separate. (Limiting resource quotas, so that CPU and memory are divided as required and one container cannot consume all the available resources, is handled by a related kernel feature called cgroups, which Docker uses alongside namespaces.)

Namespaces allow Docker to create isolated environments, known as containers, where processes can run without interfering with each other or the host system. Each container has its own set of namespaces, which includes -

  1. PID Namespace: Isolates the process IDs (PIDs) of processes within a container.

  2. Network Namespace: Provides network isolation for containers.

  3. Mount Namespace: Controls the file system mount points seen by processes within a container.

  4. UTS Namespace: Isolates the hostname and domain name of a container.

  5. IPC Namespace: Provides isolation for inter-process communication (IPC) mechanisms, such as shared memory segments and message queues.

  6. User Namespace: Allows mapping of user and group IDs between the container and the host system.
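As a quick illustration of PID namespace isolation, a process inside a fresh container sees only the container's own processes and runs as PID 1 (a minimal sketch):

# ps inside the container lists only the container's processes;
# the command itself appears as PID 1 in the container's PID namespace
docker run --rm alpine ps aux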

12. What is a Docker registry?

A Docker registry is a place where you can store your custom images and share them with your team or the whole world. Docker has its own public registry called Docker Hub, an open platform where anyone can create and share images and use the images shared by others. This makes Docker more powerful: there are many ready-made images available that might match your requirement, so you do not need to build everything again; you can simply pull the relevant image and focus on the main work.

13. What is an entry point?

In Docker, the ENTRYPOINT instruction is used in a Dockerfile to specify the command that should be executed when a container is started from the corresponding image.

ENTRYPOINT command param1 param2
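The line above uses the shell form; the exec form is generally preferred, and it pairs naturally with CMD, which then supplies overridable default arguments. A minimal sketch (the ping example is illustrative):

# The container always runs ping; CMD provides a default, overridable target
ENTRYPOINT ["ping", "-c", "4"]
CMD ["localhost"]

# docker run <image>          -> ping -c 4 localhost
# docker run <image> 8.8.8.8  -> ping -c 4 8.8.8.8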

14. How to implement CI/CD in Docker?

At a high level, a CI/CD pipeline for Docker can look like the one below; however, it can differ a little depending on the requirements of your application.

1. VCS - The source code should be in a repository and should include the Dockerfile required to build your image.

Create a Dockerfile that defines the steps to build your Docker image. It should include the necessary instructions to install dependencies, copy your application code into the image, and specify the entry point for the container. Keep it in your repository.

2. CI pipeline - It has the steps required to build your docker image. It is triggered every time there is a commit to the repository, and performs the steps below -

  • Checkout: Check out the source code from the VCS.

  • Build: Then build the docker image.

  • Test: Run tests against the built image to ensure the application behaves as expected.

  • Publish: Push the built image to a Docker registry, such as Docker Hub or a private registry.
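A minimal sketch of these CI steps as shell commands (the image name, tag variable, and test command are assumptions):

# Build the image, tagging it with the commit SHA
docker build -t myorg/myapp:$GIT_COMMIT .

# Run the test suite inside the freshly built image
docker run --rm myorg/myapp:$GIT_COMMIT npm test

# Push the image to the registry
docker push myorg/myapp:$GIT_COMMIT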

3. CD pipeline - This pipeline can be triggered manually, or automatically after a successful CI build, and performs the steps below -

  • Provisioning: Set up the target environment where the application will be deployed, such as a staging or production server.

  • Deployment: Deploy the Docker image to the target environment. This can be done using tools like Docker Swarm, Kubernetes, or a custom deployment script.

  • Configuration: Apply any necessary configuration settings for the deployed application, such as environment variables or network configurations.

  • Testing (optional): Run additional tests or validations against the deployed application to ensure it functions correctly in the target environment.

  • Release: If all tests and validations pass, release the application to production or make it accessible to users.

4. Monitoring and Logging - Implement monitoring and logging solutions to track the performance, health, and logs of your Dockerized application. Tools like Prometheus, Grafana, and ELK Stack can help in this regard.

15. Will data on the container be lost when the docker container exits?

Data written to a container's writable layer survives a stop or restart of that same container, but it is lost when the container is removed or when a fresh container is started from the image. Docker containers are designed to be stateless and ephemeral, meaning they are not meant to retain data beyond their own lifecycle. This behaviour is intentional and aligns with the principle of immutability and reproducibility that Docker promotes.

As a solution, docker has the concept of volumes for persistent data. You can create named volumes, or map local directories from the host machine to a path inside a container, so that data stored at that path in the container is also saved on the host and is not lost when the container exits or is removed. Volumes are also helpful for creating shared storage among multiple docker containers.

Use -v or --mount options.
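A minimal sketch of both options (the volume name, paths, and images are illustrative):

# Named volume managed by Docker
docker run -d -v mydata:/var/lib/mysql mysql

# Bind mount: map a host directory into the container
docker run -d --mount type=bind,source=/home/user/app-data,target=/data nginx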

16. What is a Docker swarm?

Docker Swarm is an orchestration tool that works on a manager/worker node architecture and manages multiple containers running on multiple nodes at all times. It monitors the health of containers and makes sure that the required number of containers is up and running. It offers built-in features for load balancing, scaling, and high availability.
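A minimal sketch of getting a service running on a swarm (the service name and replica count are illustrative):

# Initialize a swarm on the current node (it becomes a manager)
docker swarm init

# Run nginx as a service with 3 replicas spread across the nodes
docker service create --name web --replicas 3 -p 80:80 nginx

# Scale the service up later
docker service scale web=5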

17. What are the docker commands for the following:

  • View running containers
docker ps
  • Command to run the container under a specific name
docker run -itd --name=C1 httpd
  • Command to export a docker image
docker save -o [image.tar] [image]
  • Command to import an already existing docker image
docker pull nginx               # To import from Docker Hub
docker load -i [image.tar]          # To load an image from a tar archive created with docker save
  • Commands to delete a container
docker rm [container]              # To delete a stopped container
docker rm -f [container]           # To delete a running container
docker rm $(docker ps -a -q)       # To delete all the containers
  • Command to remove all stopped containers, unused networks, build caches, and dangling images?
docker system prune                # Add -a to also remove all unused (not just dangling) images

Thank you for reading!📘