
Introduction to Docker and Containers

Since its emergence almost a decade ago, Docker has kept rising in popularity, and over the years containers have become one of the most popular choices for packaging and deploying applications, to the point that they are now an integral part of DevOps efforts at most companies.

This article will attempt to provide some background on Docker’s history and ecosystem, introduce the main concepts around it, and give an overview of some docker CLI commands.

Although primarily geared towards people beginning their journey into Docker and containers, we hope it also provides relevant context and information for those who already have some experience with the subject.

What Are Containers?

The docker.com website gives us a great starting point to answer the question:

“A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.”

Containers allow bundling applications with all their dependencies and related configuration into packages (container images) that can be deployed and run on any platform that supports them.

Similarly to running applications on a VM, these containerized applications run in a sandboxed environment and can have resource (CPU & memory) limits applied to them. Unlike VMs, however, containers are a form of OS virtualization and incur significantly less overhead because there is no need to run a guest OS.

Applications running in containers share the kernel with the host OS — notably, this means that Linux-based containers can only run on Linux hosts and Windows containers on Windows hosts.

Running applications in VMs vs. containers

In complex systems, multiple containers are usually deployed to one or more worker clusters and are managed by a container orchestrator such as Docker Swarm or Kubernetes — the latter of which has emerged as the de facto standard. We won’t dig deeper into orchestrators in this article; for now, we can quickly characterize them as the tools that perform the automatic deployment, management, scaling and networking of containers throughout their lifecycle.

Why Use Containers?

We talked about the benefits of using containers vs. running applications on VMs. The benefits of containers, however, extend well beyond that.

Deploying software using containers gives us full control of the deployed software stack, including the versions of required dependencies, as well as any relevant configuration.
Containers decouple packaged applications from the environment where they will run. This greatly simplifies deploying containerized applications and eliminates a whole range of problems associated with unintended differences in app behavior due to improper environment or dependency setup. Teams can rest easy knowing their apps will run as intended, regardless of where they are deployed.

Applications running in containers are inherently portable and can run anywhere that supports containers, without needing to install or configure any additional dependencies.
Nowadays, that means every major cloud provider — AWS, Microsoft Azure, GCP, IBM Cloud, Oracle Cloud Infrastructure, etc.; all of them have services that support containers and container orchestration — but also companies’ privately managed infrastructure, or even software developers’ workstations. This intrinsic portability reduces companies’ dependence on specific cloud providers, allowing for greater flexibility when choosing between them.

Because they greatly simplify packaging applications and, especially with the help of orchestration tools, managing their deployments, containers are a great fit for micro-service architectures. Systems consisting of multiple micro-services can be easily deployed and scaled using containers as building blocks.

Lastly, containers also make it trivially easy to quickly run pre-packaged software. A multitude of popular software packages are already bundled as images (for example, on Docker Hub — a public repository of container images) and can be run with minimal or no configuration at all.
Examples range from relational databases such as MySQL, PostgreSQL or even SQL Server, to NoSQL databases like MongoDB or Redis, DevOps tools such as Jenkins or GoCD, and CMS platforms like WordPress and Joomla. The list goes on and on.

Docker and the Containers Ecosystem

The term Docker can refer either to the software/technology or to the company that popularized it. To clear up some of the confusion, let’s provide some (very) brief historical context and paint a picture of how things look today.

Container development began almost 10 years ago on Linux. Although the features that support them (cgroups and namespaces) had already existed in the Linux kernel for a while, containers as we know them today were popularized and initially developed by Docker Inc., back when the company was still called dotCloud.

As the project grew in popularity and evolved, different bits of the stack were separated into distinct components and standardized, especially with the creation of the Open Container Initiative, an open governance structure created to define standards around container formats and runtimes. This allowed for better interplay between Docker and other tools, such as Kubernetes, that were being developed at the time and overlapped in functionality with other Docker Inc. offerings like Docker Swarm.

These days, the main components on the stack can be identified as:

  • The CLI client (docker), which interacts with a daemon component (dockerd) that actually creates and manages containers. Alternatives to the docker CLI exist, such as podman (although it should be noted that podman has a different design that does not require a running daemon).
  • The Container Runtime Interface (CRI) is the interface that Kubernetes and other orchestrators use to control the different runtimes that create and manage containers. containerd is Docker’s implementation, while CRI-O is an alternative lightweight implementation from other industry players, designed from the ground up to work with Kubernetes.
  • The OCI runtime layer provides all of the low-level functionality for containers, such as interacting with low-level kernel features like namespaces and control groups, and creating and running container processes. runc is the reference implementation and comes from Docker, but there are other implementations such as crun.
Containers Tech Stack

Container Images

When people talk about container images they are usually thinking of repository images, which refer to a bundle of container image layers plus metadata with additional information about them. Docker and other container-based tools use images based on a format defined by the Open Container Initiative (OCI). This format defines the layers and metadata within a container image as a set of tar files (one per layer) and a manifest.json file with additional metadata.
The standardization of the image format has led to unification across the ecosystem, wider adoption by cloud providers, and the emergence of tools for security scanning, signing, building, etc.
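
As a quick illustration, the docker CLI itself can show this layered structure. The commands below are a minimal sketch; the alpine:3.18 image and tag are just an example:

    # List the layers that make up an image, with their sizes and the
    # Dockerfile instructions that created them
    docker history alpine:3.18

    # Show the full image metadata, including layer digests and configuration
    docker image inspect alpine:3.18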

Implementation details aside, container images bundle base OS software (excluding the OS kernel) plus any required software, configuration and application code, resulting in portable, executable packages.
An image to run a Java application would, for example, usually package a base Linux distribution — lightweight distros such as Alpine Linux are generally preferred for this — alongside the required JRE version and the corresponding application code.

Images are defined and built from the instructions in a Dockerfile. A Dockerfile always starts with a FROM instruction specifying the image it will be based on (FROM scratch is used when defining root images).
This is usually followed by a set of instructions that modify its contents, such as COPY, ADD or RUN, typically used to install or configure additionally required software.
Lastly, it ends with ENTRYPOINT and/or CMD instructions that specify the program to run when the container is launched.
It's the execution of these instructions in the Dockerfile that results in new layers being created when the image is built.

A Dockerfile for the previously mentioned Java application may look like this:
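
The block below is a minimal sketch of what such a Dockerfile could look like; the base image tag (eclipse-temurin:17-jre-alpine) and the jar name (app.jar) are illustrative assumptions, not taken from any real project:

    # Base image: a JRE on top of Alpine Linux (tag chosen for illustration)
    FROM eclipse-temurin:17-jre-alpine

    # Copy the application jar from the build context (app.jar is a hypothetical name)
    COPY target/app.jar /opt/app/app.jar

    WORKDIR /opt/app

    # Program to run when the container is launched
    ENTRYPOINT ["java", "-jar", "app.jar"]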

A guide on creating Dockerfiles is outside the scope of this article, but if you're looking for additional details, the Dockerfile reference is probably the best place to start.

Working with Containers

Containers are created from images. You can think of containers as running instances of the images they are based on.
Many docker commands such as docker run, docker pull or docker push require specifying the image to work with. This image URI parameter can be formed from all or some of the following components:
REGISTRY/NAMESPACE[/SUB-NAMESPACE]/REPOSITORY:TAG.
Many of these can be omitted and in practice they often are:

  • When the REGISTRY is omitted, docker defaults to working with the Docker Hub Registry (docker.io)
  • If the NAMESPACE is omitted when working with the Docker Hub Registry, it defaults to the library namespace, which contains the Docker official images.
  • Omitting the TAG has the same effect as specifying latest. Although it may sometimes be useful to (implicitly or explicitly) use the latest tag, it is generally not recommended, especially in Dockerfiles, as the image it points to may change over time, breaking things that depend on it.
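
Putting those defaults together, the following references all point to the same official Ubuntu image on Docker Hub:

    docker pull docker.io/library/ubuntu:latest   # fully qualified reference
    docker pull library/ubuntu:latest             # registry omitted
    docker pull ubuntu                            # namespace and tag omitted as well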

The most important command for working with the Docker CLI is docker run which, as you probably guessed, is used to run a container from a specified image. Let’s review some example invocations:
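
As a sketch, here are two typical invocations; the container names, volume name, port and password value are placeholders:

    # Run an nginx web server in the background (-d) and give the container a name
    docker run -d --name my-nginx nginx

    # Run a MySQL server, mapping host port 3306 to container port 3306 (-p),
    # mounting a named volume to persist the database files (-v) and passing
    # required configuration through an environment variable (-e)
    docker run -d --name my-mysql \
      -p 3306:3306 \
      -v mysql-data:/var/lib/mysql \
      -e MYSQL_ROOT_PASSWORD=example \
      mysql:8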

In the latter example we introduced 2 additional concepts:

  • Port mapping is used to expose ports of the running container through the host. This way, requests made to the host port can be redirected to the corresponding container port.
  • Volumes are file systems mounted on the container, usually to preserve data generated by the running container. Volumes are usually stored on the host — although other storage drivers can be used to store the data, for example, on cloud services like AWS S3 or Azure Blob Storage — and their life cycle is not tied to that of the containers. This makes it easy to back up data and share it between containers.

The docker run command supports many additional parameters that modify container behavior, including enforcing constraints on system resources, defining networks to allow communication between different containers, setting environment variables, and much more. You can review the docker run command reference for additional details.
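
For instance, here is a sketch of a run that applies resource limits and attaches the container to a user-defined network (names and limits are illustrative):

    # Create a user-defined network so containers on it can reach each other by name
    docker network create my-net

    # Limit the container to 512 MB of RAM and one CPU, and attach it to the network
    docker run -d --name my-redis \
      --network my-net \
      --memory 512m \
      --cpus 1 \
      redis:7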

To list containers, the docker ps command can be used. If we were to execute it after starting the previous containers, we would get output similar to the following:
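
(The output below is an illustrative sketch based on the hypothetical nginx and MySQL containers started earlier; container IDs and timestamps will differ on your machine.)

    CONTAINER ID   IMAGE     COMMAND                  CREATED          STATUS          PORTS                    NAMES
    3f1c2a9b8d70   mysql:8   "docker-entrypoint.s…"   2 minutes ago    Up 2 minutes    0.0.0.0:3306->3306/tcp   my-mysql
    9b8e7d6c5a41   nginx     "/docker-entrypoint.…"   7 minutes ago    Up 7 minutes    80/tcp                   my-nginx

Note that docker ps only lists running containers; adding the -a switch also shows stopped ones.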

Another important command is docker exec. Unlike docker run, which creates new containers from a specified image, docker exec is used to execute commands in already running containers.
This is especially useful for troubleshooting issues with containers. It's not uncommon to use docker exec to open a terminal in a running container to review its state. Let's see some examples:
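
A couple of sketched examples, reusing the hypothetical container names from above:

    # Open an interactive shell (-it) inside the running MySQL container
    docker exec -it my-mysql bash

    # Run a one-off command in the nginx container without opening a shell
    docker exec my-nginx nginx -v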

Other frequently used commands are docker logs, docker stop and docker start. As their names suggest, these commands can be used to show logs, stop, and (re)start containers, respectively. Let's also have a quick look at them:
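
Again using the hypothetical my-nginx container as an example:

    # Show the container's logs; -f keeps following the output (like tail -f)
    docker logs -f my-nginx

    # Gracefully stop the container, then start it again later
    docker stop my-nginx
    docker start my-nginx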

Working with Images

Building an image requires running the docker build command providing a build context and an image definition file.

  • The build context refers to the source location to use when building the image — usually a local directory, but a Git repository can also be used.
    Any COPY or ADD instructions specified in the Dockerfile act relative to that source location. This means that anything you want to copy or add to the image must be inside the provided build context folder.
  • The Dockerfile can be specified using the -f/--file switch. When omitted, a file named “Dockerfile” must exist in the specified build context to be used instead.
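
For example, a sketch of both variants, assuming the commands are run from the directory that serves as the build context (the custom file name is illustrative):

    # Build an image using the current directory (.) as the build context;
    # a file named "Dockerfile" is expected there since -f is not given
    docker build .

    # Same build, but pointing at a Dockerfile with a non-default name
    docker build -f Dockerfile.prod .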

It is generally also useful, although not mandatory, to tag images — that is, to give them an identifier — when building them, to allow easily referencing them later.

Images without any tag are referred to as “dangling” images. The docker tag command can be used to add additional tags to images, whether they already had any preexisting tag(s) or not.
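
A sketch of both ways of tagging (the repository and tag names are illustrative):

    # Tag the image at build time using the -t switch
    docker build -t my-app:1.0 .

    # Add another tag to an existing image, e.g. to also mark it as "latest"
    docker tag my-app:1.0 my-app:latest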

When building the image, docker will execute all instructions specified in the Dockerfile (except for those already cached) and generate any new required layers.

Being able to share images is fundamental to support container-based application development and related DevOps processes. Container Registries are a special type of file server used for hosting container image repositories.

When using the docker CLI, Docker Hub (Docker Inc.'s own Container Registry service) is generally the default when none is specified (although some Linux distributions allow changing that through configuration).
Most Cloud providers offer their own Container Registry services and there are also several product choices for companies that need or prefer to host their own. Some of the better known options here are Nexus Repository, Artifactory, Harbor and Project Quay.

The docker push and docker pull commands are used to upload and download images to and from Container Registries, respectively. When pulling images, the docker daemon will download from the corresponding Container Registry any image layers not already present locally.
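
A sketch of both operations; the registry host and namespace in the push example are made up for illustration:

    # Download an image from Docker Hub (only the layers missing locally are fetched)
    docker pull postgres:16

    # Authenticate against a private registry, tag a local image with the
    # registry/namespace it should live under, and upload it
    docker login registry.example.com
    docker tag my-app:1.0 registry.example.com/my-team/my-app:1.0
    docker push registry.example.com/my-team/my-app:1.0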

Closing Notes

Containerized applications are not just the future of software development, but for many organizations they are already the present.

Over the last decade, the popularization of cloud computing and the replacement of monolithic systems with micro-service architectures, together with benefits inherent to containers, have resulted in containerized apps quickly becoming the norm for many software companies and dev/DevOps teams. Those benefits include simpler DevOps processes (building, deploying, security scanning, etc.) and reduced operational costs, thanks to more efficient allocation of computing resources and orchestrator support for auto-scaling with demand.

In this article we have barely scratched the surface, but hopefully you now have a firmer grasp on containers and the main concepts around them.

Thanks for staying with us this far. We hope to get back to the topic soon, this time focusing on container orchestrators and Kubernetes.

Would you like to know more? Do you need our help? Contact Us!
www.quadiontech.com

