Docker Kickstarter

Docker is one of the buzzwords that we commonly encounter nowadays. After much reluctance, I tried it out and found a lot of advantages for myself.

This article doesn’t dive deep into the architecture of Docker, but it helps you get a high-level idea and get started. I have also tried to explain the Docker jargon that you commonly encounter, so that you can skim through any other setup article you come across.

Before getting started,

If you are already a bit familiar with Docker but need a kickstart, this article is primarily for you.

If you already know the concept of virtual machines, that will help you. Docker is a similar technology, but more efficient and lightweight, because containers share the host's kernel rather than each running a full guest OS.

If you have never heard of such terms, Docker helps you run (virtual) machines with a different OS inside your *nix/Mac/Windows machine. Imagine being able to run CentOS, Ubuntu, and Fedora inside your machine as if they were running on separate machines. The major difference is that you can access them only through a terminal. Read up on OS Virtualization and Docker before reading further.


Why Docker?

  1. You can have Dev/QA/Staging environments that exactly mimic your production environment, so you avoid inconsistencies and issues that are not reproducible. On my Mac, I have a bunch of Linux-flavored Docker containers (we will see what that means) for various purposes.
  2. You can have an environment that is built from a plain text file (a Dockerfile), so sharing that file is enough to share the environment setup and recreate it in minutes.
  3. Since the complete environment setup comes from a plain text file, the environment can be easily reviewed and understood by anyone.
  4. Since Docker is available for all major OSes (though I have to mention that Docker on Windows is not as good), it is easy to get environments up and running, so you can focus on your task rather than spending time on environment setup.
  5. It can even be used in production with additional support such as OpenShift, and it suits a microservice architecture well.
  6. Environments are sandboxed. You can create as many as you want and throw them away without polluting your local machine.


We install Docker on our host.

The host is the machine on which you install Docker. On it you can create as many Docker containers as you want.

A Docker container is a running instance of a Docker image, and it is the only thing you use directly. You create a container by running an image, and you can create many independent containers from the same image.

A Docker image is the template of the environment that you wish to have. It can be a plain Linux distro or a customized one with additional applications such as Oracle DB or Tomcat. Lots of images are already available on Docker Hub.

Docker Hub is a registry containing tons of commonly used Docker images. You can simply pull any of these images and run it as a container, or you can write a Dockerfile that adds the customization you require.

A Dockerfile is a simple text file containing the Docker instructions that apply the additional customization your container requires (via an image). You build the Dockerfile to create an image on your local machine with that customization, and you then run containers from that image. By default, docker build looks for a file named Dockerfile; a different file name can be passed with the -f option.
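To make the Dockerfile idea concrete, here is a minimal sketch. The directory name my-image, the ubuntu:22.04 base image, and the choice of installing curl are assumptions made purely for illustration; any base image and any customization would follow the same shape.

```shell
# Create a directory for the build and write a small Dockerfile into it.
# (my-image is a hypothetical name used only for this sketch.)
mkdir -p my-image

cat > my-image/Dockerfile <<'EOF'
# Start from a plain Linux distro image pulled from Docker Hub.
FROM ubuntu:22.04
# Additional customization: install an extra tool into the image.
RUN apt-get update && apt-get install -y curl
# Default command for containers started from this image.
CMD ["bash"]
EOF
```

On a machine with Docker installed, the image can then be built from that directory and run interactively:

docker build -t my-image my-image
docker run -it my-image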

Apart from the above basics, you can run multi-container apps using Docker Compose, and you can map ports from a container to the host.
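As a rough sketch of what a multi-container setup with a port mapping can look like, the snippet below writes a small Compose file. The directory name my-app, the service names web and cache, and the redis image are hypothetical choices for illustration; the file also assumes a reasonably recent Docker Compose, where the top-level version key is optional.

```shell
# Write a Compose file describing two containers that start together.
# (my-app, web and cache are hypothetical names used only for this sketch.)
mkdir -p my-app

cat > my-app/docker-compose.yml <<'EOF'
services:
  web:
    image: tomcat
    ports:
      - "8888:8080"   # host port 8888 -> container port 8080
  cache:
    image: redis      # second service, only to show the multi-container part
EOF
```

From inside the my-app directory, both containers can then be started in the background with:

docker compose up -d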


Example – Having Tomcat server up and running:

  • Once you have installed Docker on your machine, you can pull the image with the command below:

docker pull tomcat

  • Once the above command completes, you will have a tomcat image on your machine, which you can check with the command:

docker images

  • Now that you have a tomcat image, you can run it. The command below runs the image in detached mode (-d), so that your terminal is not blocked, names the new container tomcat_container (--name), and maps port 8888 of your local machine to port 8080 of the container (-p).

docker run -d -p 8888:8080 --name tomcat_container tomcat

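Now when you hit http://localhost:8888, you will see Tomcat responding there. Recent official tomcat images ship with an empty webapps directory, so the response may be Tomcat's 404 page, which still confirms that the server is up.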

  • For a sample basic Dockerfile, you can refer to the usage of Dockerfile here. A gotcha when writing a Dockerfile is that it has to execute a command that runs forever: once that command stops, the container stops. In the sample, that ever-running command is what keeps the container running.
  • You can enter a running container, just as you would ssh into a remote machine, and browse through it using the command below. It asks Docker to execute (exec) bash in an interactive terminal (-it) inside the container with the given name.

docker exec -it tomcat_container bash
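The gotcha about the forever-running command can be illustrated with a tiny Dockerfile. This is only a sketch: the directory and image name keepalive-image are hypothetical, and tail -f /dev/null is just one common way to keep a container alive, not necessarily what the sample referred to above uses.

```shell
# Write a Dockerfile whose main command never exits.
# (keepalive-image is a hypothetical name used only for this sketch.)
mkdir -p keepalive-image

cat > keepalive-image/Dockerfile <<'EOF'
FROM ubuntu:22.04
# A container lives only as long as its main command.
# tail -f /dev/null never returns, so the container keeps running.
CMD ["tail", "-f", "/dev/null"]
EOF
```

After building and starting it, docker ps should list the container as running. If the CMD were a short-lived command such as echo, the container would exit right away and show up only in docker ps -a.

docker build -t keepalive-image keepalive-image
docker run -d --name keepalive keepalive-image
docker ps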


I strongly recommend this tutorial for a deep dive into Docker and Kubernetes.

System Integration – Design Options

This article is about a design choice that comes up while integrating multiple systems, a common problem that software engineers solve when their project spans more than one system.

While integrating multiple systems from scratch, there is a small design question: which system initiates a data transfer? Will the source system push data into the destination system? Will the destination system pull data from the source? Or should we build a mediator system that takes care of pulling the data from the source and pushing it into the destination?

Data Push

In this approach, the source system is tied to one or more destination systems and takes the responsibility of sending the required data to them.


Pros:

  • Data sync happens in real time, since the source sends data as soon as it changes.
  • No unnecessary polling is needed.

Cons:

  • The source can be connected only to a fixed set of systems. Depending on the design, including additional systems could be costly.
  • By taking up the responsibility of pushing data, the source system may also have to take care of concerns such as the quality of the data being pushed to each system.
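A minimal sketch of the push style, using directories as stand-ins for real systems. The destination names billing and reporting, the record format, and the push_record helper are all hypothetical; a real source would typically call each destination's API instead of writing files.

```shell
# Files stand in for systems: every destination is an "inbox" directory.
base=$(mktemp -d)
destinations="billing reporting"   # hypothetical, fixed set known to the source

for name in $destinations; do
  mkdir -p "$base/$name"
done

# push_record ID PAYLOAD: the source writes the record to every known destination.
push_record() {
  for name in $destinations; do
    printf '%s\n' "$2" > "$base/$name/$1.txt"
  done
}

# The source pushes each record at the moment it is created.
push_record order-1 "order-1,49.90"
push_record order-2 "order-2,15.00"
```

Note that the list of destinations lives inside the source: adding a third destination means changing and redeploying the source system, which is the coupling described above.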

Data Pull

In this approach, the system that requires the data pulls it from the source system as needed.


Pros:

  • It’s always better when someone asks for and gets only what they need. In this approach, the system that needs the data takes the responsibility of requesting and getting only what it needs.
  • Since the destination system knows best what it needs, a change in those needs means the data integration logic changes in that same system.
  • It is easy to set up any number of development/QA instances, since setting up the system itself takes care of the data pull.

Cons:

  • Real-time data pull requires continuous polling, which increases the number of requests to the data source.
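The polling cost can be seen in a small sketch where files again stand in for systems. The file names, the line-count watermark, and the pull_once helper are assumptions for illustration; a real destination would query the source's API on a timer.

```shell
base=$(mktemp -d)
source_log="$base/source.log"      # stands in for the source system
dest_log="$base/destination.log"   # stands in for the destination's own store
watermark="$base/watermark"        # how many source lines were already pulled

: > "$source_log"
: > "$dest_log"
echo 0 > "$watermark"

# pull_once: one polling cycle, run by the destination.
pull_once() {
  seen=$(cat "$watermark")
  total=$(wc -l < "$source_log" | tr -d ' ')
  if [ "$total" -gt "$seen" ]; then
    # Copy only the lines that appeared since the previous cycle.
    tail -n "+$((seen + 1))" "$source_log" >> "$dest_log"
    echo "$total" > "$watermark"
  fi
}

echo "order-1" >> "$source_log"
pull_once    # picks up order-1
pull_once    # nothing new: a request that returns no data
echo "order-2" >> "$source_log"
echo "order-3" >> "$source_log"
pull_once    # picks up order-2 and order-3
```

The second call is the kind of empty request that multiplies as the polling interval shrinks toward real time.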

A Dedicated Mediator

In this approach, we build a dedicated system that takes the responsibility of pulling data from the source system(s) and pushing it to the destination system(s). ETL/EAI systems are examples of this.


Pros:

  • The systems are very much decoupled, which provides more flexibility.
  • When the whole ecosystem involves multiple systems, there is more code reuse, since only one place manages the data push/pull for all of them.
  • When the data integration requires a lot of data translation/transformation, this solution keeps things simple, since the only job of the mediator is pull->transform->push.

Cons:

  • Since the mediator is integrated with two or more systems, it needs rigorous testing whenever any of the dependent systems changes.
  • For solutions involving simple data transfer, maintaining a separate system is overkill.
  • Setting up multiple development/QA environments might require a dedicated instance of the mediator for each.
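The pull->transform->push job of a mediator maps naturally onto a pipeline. The sketch below is an illustration only: the file layout, the id,amount record format, the uppercase-and-repipe transformation, and the function names are all hypothetical.

```shell
base=$(mktemp -d)
source_file="$base/source.csv"     # stands in for the source system (id,amount)
mkdir -p "$base/billing" "$base/reporting"

printf '%s\n' "order-1,4990" "order-2,1500" > "$source_file"

# pull: read everything the source currently offers.
pull() { cat "$source_file"; }

# transform: reshape each record into the format the destinations expect.
transform() { awk -F, '{ print toupper($1) "|" $2 }'; }

# push: deliver the transformed records to every destination.
push() { tee "$base/billing/orders.txt" > "$base/reporting/orders.txt"; }

# The mediator's whole job is this one pipeline.
pull | transform | push
```

Neither the source nor the destinations know about each other here; only the mediator changes when a format or a destination changes.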


In my opinion, whenever the data integration requirements are expected to grow, it’s better to go with a dedicated Mediator.

When the need is a simple data load, it is better to go with Data Pull.

Go with Data Push only if you have a strong reason for it.