Thursday, 15 August 2019

What is fishbucket in Splunk

Introduction

In this post we will learn what the fishbucket in Splunk is, but before that let us understand what Splunk is and what it is used for.
Splunk is used for monitoring, searching, and analyzing machine data in real time. The data source can range from application data to user-created data to sensors or devices.

Purpose of Splunk fishbucket

Before analyzing data, Splunk indexes it. The index is necessary to analyze the data. But here is the issue: what if the same data is indexed multiple times? In other words, how do we avoid indexing the same chunk of data twice?
Splunk keeps seek pointers and CRCs for the indexed files inside a directory called the fishbucket. Because the fishbucket records which data has already been indexed, splunkd can tell whether a file has been read already and avoid duplicate indexing.


How does fishbucket work?

Whenever the file monitor processor starts looking at a file, its first step is to search the fishbucket for the CRC computed from the beginning of the file. There are three possible scenarios:
Scenario 1: If the CRC is not present in the fishbucket, the file is indexed as new. This is the simple case: the file has never been indexed. After indexing, Splunk stores the CRC and seek pointer in the fishbucket.
Scenario 2: If the CRC is present in the fishbucket and the seek pointer is the same as the current end of file, the file has already been indexed and has not changed since. The seek pointer is what tells us whether the file has changed.
Scenario 3: If the CRC is present in the fishbucket and the seek pointer is beyond the current end of file, something in the part of the file we have already read has changed. Since we cannot know what has changed, the whole file is indexed again.
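
The three scenarios above can be sketched in a few lines of Python. This is only an illustration, not Splunk's actual implementation: the fishbucket dictionary, the check_file function and the HEAD_BYTES value are assumptions made for the example. The fourth branch (seek pointer before the end of file, read only the new tail) is not described in this post and is also an assumption.

```python
import zlib

HEAD_BYTES = 256  # assumption: how many leading bytes feed the CRC

# Hypothetical in-memory stand-in for the fishbucket:
# CRC of the file head -> seek pointer (number of bytes already indexed).
fishbucket = {}


def check_file(data: bytes) -> str:
    """Return what the monitor should do with a file's current content."""
    crc = zlib.crc32(data[:HEAD_BYTES])
    end_of_file = len(data)
    seek_pointer = fishbucket.get(crc)

    if seek_pointer is None:
        decision = "index_new"      # Scenario 1: CRC never seen before
    elif seek_pointer == end_of_file:
        decision = "skip"           # Scenario 2: nothing has changed
    elif seek_pointer > end_of_file:
        decision = "reindex_all"    # Scenario 3: pointer is beyond end of file
    else:
        decision = "read_tail"      # assumption: only new data was appended

    fishbucket[crc] = end_of_file   # remember how far we have read
    return decision


log = b"x" * 300                         # sample content, longer than HEAD_BYTES
print(check_file(log))                   # index_new
print(check_file(log))                   # skip
print(check_file(log + b"more data"))    # read_tail
print(check_file(log))                   # reindex_all
```

The sample content is deliberately longer than HEAD_BYTES so that appending data does not change the CRC of the file head.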

Location of fishbucket directory

By default, all these CRCs and seek pointers are stored under:
/opt/splunk/var/lib/splunk


Retention policy of fishbucket index

We can change the retention policy of the fishbucket index in indexes.conf. This may be needed if we are indexing a large number of files. But we need to be careful when changing the retention policy: if the CRCs and seek pointer of an already indexed file are deleted because of the new policy, there is a risk of the same file being indexed again.


Ways to track down a particular file when needed

If you need to know when a particular file was indexed and re-indexed, you can search the fishbucket for all the events associated with it by file or source name. The seek pointer and modification time give the required details.
We can also search the fishbucket through the GUI by searching for "index=_thefishbucket".

That's all for Splunk fishbucket. If you have any questions, please mention them in the comments section. Thanks.
Originally published at https://devopsrevisited.com


Tuesday, 13 August 2019

Introduction to DevOps on AWS

Introduction:

Amazon Web Services (AWS) is a cloud service from Amazon that provides services in the form of building blocks. These building blocks can be used to create and deploy any type of application in the cloud.

It is a comprehensive, easy-to-use computing platform. The platform combines infrastructure as a service (IaaS), platform as a service (PaaS), and packaged software as a service (SaaS) offerings.

Advantages of AWS for DevOps

There are many benefits of using AWS for DevOps:

Get Started Fast

Each AWS service is ready to use if you have an AWS account. There is
no setup required or software to install.

Fully Managed Services

These services can help you take advantage of AWS resources
quicker. You can worry less about setting up, installing, and operating infrastructure on your own. This lets you focus on your core product.

Built for scale

You can manage a single instance or scale to thousands using AWS
services. These services help you make the most of flexible compute resources by
simplifying provisioning, configuration, and scaling.

Programmable

You have the option to use each service via the AWS Command Line
Interface or through APIs and SDKs. You can also model and provision AWS resources
and your entire AWS infrastructure using declarative AWS CloudFormation templates.

Automation

AWS helps you use automation so you can build faster and more efficiently.
Using AWS services, you can automate manual tasks or processes such as deployments,
development & test workflows, container management, and configuration management.

Secure

Use AWS Identity and Access Management (IAM) to set user permissions and
policies. This gives you granular control over who can access your resources and how they
access those resources.

Buffer in Amazon Web Services

An Elastic Load Balancer ensures that incoming traffic is distributed optimally across
various AWS instances.

A buffer synchronizes different components and makes the setup more
elastic to a burst of load or traffic.

Components tend to receive and process requests at different, unsteady rates.
The buffer creates a balance between the components and makes them work at the same rate to deliver faster service.

Components of Amazon Web Services

Amazon S3

With this, one can retrieve the key information involved in creating the
cloud architecture design. The data produced as a result of the specified key
can also be stored in this component.

Amazon EC2 instance

Helpful for running a large distributed system, such as a Hadoop cluster. Automatic parallelization and job scheduling can be achieved with this component.

Amazon SQS

This component acts as a mediator between different controllers. It is also used
for buffering the requests received by the Amazon manager.

Amazon SimpleDB

Helps in storing the intermediate status logs and the tasks executed
by the users.

How is a Spot Instance different from an On-Demand Instance or a Reserved Instance?

Spot Instances, On-Demand Instances, and Reserved Instances are all pricing models.

Spot Instances let customers purchase compute
capacity with no upfront commitment, at hourly rates usually lower than the On-Demand
rate in each region.

Spot Instances work like bidding, and the going price is called the Spot Price. The Spot Price
fluctuates based on supply and demand for instances, but customers never pay more
than the maximum price they have specified.

If the Spot Price moves higher than a customer’s maximum price, the customer’s EC2
instance is shut down automatically.

The reverse is not true: if the Spot Price comes down again, the EC2 instance is not
launched automatically, and one must launch it manually.
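
The behaviour described above can be illustrated with a small simulation. This is a sketch only and does not call any AWS API: the simulate_spot function and the sample prices are made up for the example, and it assumes the instance is already running before the first price arrives.

```python
def simulate_spot(prices, max_price):
    """Follow one Spot Instance through a series of hourly Spot Prices.

    Returns a list of (price, state) tuples, where state is
    "running" or "terminated".
    """
    running = True  # assumption: the instance was launched beforehand
    history = []
    for price in prices:
        if running and price > max_price:
            running = False  # Spot Price exceeded our maximum: shut down
        # Note: a later drop in price does NOT set running back to True.
        history.append((price, "running" if running else "terminated"))
    return history


for price, state in simulate_spot([0.03, 0.05, 0.02], max_price=0.04):
    print(price, state)
```

With a maximum price of 0.04, the instance runs at 0.03, is shut down when the price reaches 0.05, and stays terminated at 0.02 until someone launches it again manually.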

With Spot and On-Demand Instances there is no commitment on duration from the user's
side, whereas with Reserved Instances one must stick to the chosen time period.

Amazon Elastic Container Service (ECS)

Amazon Elastic Container Service (ECS) is a highly scalable, high performance container
management service that supports Docker containers and allows us to easily run
applications on a managed cluster of Amazon EC2 instances.

AWS Lambda in AWS DevOps

AWS Lambda lets us run code without provisioning or managing servers. With Lambda,
we can run code for virtually any type of application or backend service, all with zero
administration.

Just upload your code and Lambda takes care of everything required to run and scale your
code with high availability.

Amazon EC2 security best practices:

There are several best practices to secure Amazon EC2. A few of them are given below:

  • Use AWS Identity and Access Management (IAM) to control access to your AWS resources.
  • Restrict access by only allowing trusted hosts or networks to access ports on your instance.
  • Review the rules in your security groups regularly, and ensure that you apply the principle of least privilege: only open up permissions that you require.
  • Disable password-based logins for instances launched from your AMI. Passwords can be found or cracked and are a security risk.

Friday, 26 July 2019

Top 100 Splunk Interview questions and answers

In this article, we will see important Splunk Interview Questions and Answers.

Here are the top 100 interview questions and answers on Splunk

What is Splunk and its uses?
Splunk is software used for monitoring, searching, and analyzing machine data in real time. The data source can be a web application, sensors, devices, or user-created data.

What are the components of Splunk?
The components of Splunk are:
(a) Search head - GUI for searching
(b) Forwarder - forwards data to the indexer
(c) Indexer - indexes machine data
(d) Deployment server - manages Splunk components in a distributed environment

Briefly describe how Splunk works.
Splunk works by collecting, parsing, indexing, and analyzing data. The forwarder collects data from the source and forwards it to the indexer. The search head then searches, visualizes, and analyzes the data stored in the indexer.

What is Splunk forwarder?
A Splunk forwarder is used to forward data to the indexer.

What are the advantages of Splunk forwarder?
A Splunk forwarder can throttle bandwidth and provide an encrypted SSL connection for transferring data from the forwarder to the indexer.

What are the types of forwarder?
There are two types:
Universal Forwarder
Heavy Weight Forwarder

What is Universal Forwarder?
With a Universal Forwarder, a Splunk agent is installed on a non-Splunk system to gather data locally, but it cannot parse or index data.

What is Heavy Weight Forwarder?
A Heavy Weight Forwarder is a full instance of Splunk with advanced functionality. It works as a remote collector as well as an intermediate forwarder and data filter.

What is the advantage of Splunk over other similar tools?
Splunk is a single integrated tool for machine data. It covers everything from IT operations and machine log analysis to business intelligence. There may be other tools on the market, but Splunk is the only one that provides end-to-end data operations. You might need 3-4 separate tools for what Splunk does as a single piece of software.

What are the configuration files of Splunk?
props.conf
indexes.conf
inputs.conf
transforms.conf
server.conf

What are the different licenses in Splunk?
There are 6 types of licenses in Splunk:
Enterprise license
Free license
Forwarder license
Beta license
Licenses for search heads 
Licenses for cluster members

What is the limitation of free license in Splunk?
With the free license we cannot use authentication, scheduled searches, distributed search, forwarding over TCP/HTTP, or deployment management.

What is the use of License Master in Splunk?
The License Master controls how much data we can index in a day. For example, with a 200 GB license we can index only 200 GB of data per day, so the license should cover the maximum daily data volume we receive.

Suppose the License Master is unreachable for some reason. Will indexing stop?
Searching will stop if the License Master is not reachable, but data will continue to be indexed. You will get a warning in the web UI or on the search head that you have exceeded the indexing volume. Indexing will not stop.

What is the use of DB Connect in Splunk?
It's a plugin to connect to and integrate with generic SQL databases.

What is the command for boot-start enable and disable?
To enable Splunk boot-start, the command is:
$SPLUNK_HOME/bin/splunk enable boot-start
To disable Splunk boot-start, the command is:
$SPLUNK_HOME/bin/splunk disable boot-start

What is summary index in Splunk?
Summary indexes are used to boost reporting efficiency. Basically, they enable users to generate reports after processing a huge volume of machine data.

What are the different types of Summary Index?
There are two types:
Default Summary Index - Used by Splunk Enterprise by default when no other summary index is specified.
Additional Summary Index - Used to enable running a variety of reports.

What are the default fields for events?
The five default fields are:
source
host
sourcetype
index
timestamp

How can we restart Splunk?
Splunk can be restarted from Splunk Web. The steps are:
1. Go to System and navigate to Server Controls.
2. Click on Restart Splunk.

How to search multiple IPs in Splunk?
Using lookup tables, we can search for multiple IP addresses.

What is the most efficient way to filter events in splunk?
The most efficient way to filter events in Splunk is by time / duration.

How can we reset Splunk password?
To reset the password, you need access to the file system of the server where Splunk is running. Then perform the following steps:
Move the $SPLUNK_HOME/etc/passwd file to $SPLUNK_HOME/etc/passwd.bak.
Restart Splunk and log in with the default username and password, i.e. admin/changeme.
Reset the password and merge the password file with the backup file.

What is sourcetype?
Sourcetype is a default data field in Splunk. It identifies the format of the data, which indicates its origin; for example, .evt files originate from the Event Viewer. Incoming data can be classified based on service, system, format, and character code. Common source types are apache_error, websphere_core, and cisco_syslog. Splunk uses the sourcetype to process incoming data and break it into separate events.

How to use two sourcetypes in Splunk?
Here is a use case showing how to search two sourcetypes against a lookup file:
sourcetype=X OR sourcetype=Y | lookup country.csv
Using this search, events of sourcetypes X and Y can be looked up in a lookup file.

What is the KV store in Splunk?
KV stands for key-value. The KV store lets us store and retrieve data inside Splunk. It is used for the following:
(a) Managing a job queue.
(b) Storing metadata by the user.
(c) Analysing the workflow.
(d) Storing the user application state required for handling a UI session.
(e) Storing the results of search queries in Splunk.
(f) Maintaining a list of environment assets and checkpoint data.

What is a deployer in Splunk?
A deployer is used to deploy configuration information and apps to the members of a search head cluster. The set of configuration details, such as updates, that the deployer sends is called the configuration bundle. The deployer also distributes this bundle when a new member joins the cluster. It handles the baseline app configurations and user configurations.
However, it cannot restore the latest state to the members of the cluster.

Which roles can create data models in Splunk?
Users with the admin or power role can create data models. Other users can create them only if they have write access to the application. Role-based permissions determine whether a user can view or edit a data model.

When to use auto_high_volume in Splunk?
auto_high_volume is used when the indexes are of very high volume. A high volume index can get over 10GB of data.

What are the Splunk alternatives?
Logstash
Loggly
LogLogic
Sumo Logic

How to restart the Splunk web server and daemon?
To restart the web server: splunk restart splunkweb
To restart the daemon: splunk restart splunkd

How to clear Splunk search history?
We need to delete searches.log from this path:
$SPLUNK_HOME/var/log/splunk/searches.log

What is fishbucket in Splunk?
It's a directory or index at the default location /opt/splunk/var/lib/splunk. It contains seek pointers and CRCs for the files you
are indexing, so splunkd can tell if it has read them already. We can access it through the GUI by searching for “index=_thefishbucket”.

Which commands are used in the reporting results category?
  • top
  • rare
  • stats
  • chart
  • timechart

What is the use of the stats command?
The stats command reports data in tabular format, and multiple fields can be used to build the table.

What is the use of the chart command?
As the name indicates, chart is used to display data as a bar, line, or area graph. It takes two fields to display the chart.

What is the use of timechart?
Timechart is used to display data on a timeline. It takes just one field, as the other field is the time field by default.

How to disable the Splunk boot start?
$SPLUNK_HOME/bin/splunk disable boot-start

How to disable the Splunk launch message?
We can disable the Splunk launch message by adding this to splunk-launch.conf:
OFFENSIVE=Less

What is the difference between a Splunk app and a Splunk Add-on?
A Splunk app has a GUI configuration, whereas a Splunk Add-on does not have one (command line only).

In a Splunk cluster, how to take a peer offline?
Using the splunk offline command, we can take a peer offline.

What are the different categories of SPL commands?
SPL commands fall into five major categories:
Sorting Results; Filtering Results; Grouping Results; Filtering, Modifying and Adding Fields; and Reporting Results.

How to specify minimum free disk space in Splunk?
Using the following command we can set the minimum free disk space (in MB):
/opt/splunk/bin/splunk set minfreemb 20000
It requires a restart, so:
/opt/splunk/bin/splunk restart

Do you know what SOS is in the context of Splunk?
Yes, SOS stands for Splunk on Splunk. It's a Splunk app that provides a graphical view of Splunk performance and issues.

What does the lookup command do?
It adds fields to an event by taking a value from the event, finding the matching rows in a lookup table, and adding the fields from those rows to the event.

What does the inputlookup command do?
It returns the whole lookup table as the search results.

What does the outputlookup command do?
It outputs the current search results to a lookup table on the disk.

What does the sort command do in Splunk?
As the name suggests, it sorts the search results by the specified fields.
Here is the syntax:
sort [<count>] <sort-by-clause>... [desc]

What is the transaction command and how does it work?
The transaction command is helpful in two specific scenarios:
1. When a unique ID (from one or more fields) alone is not enough to differentiate between two transactions. This is the case when the identifier is reused, for example web sessions identified by cookie or client IP. In this scenario, time spans or pauses are also used to segment the data into transactions. In other cases where an identifier is reused, say in DHCP logs, a particular message may identify the beginning or end of a transaction.
2. When it is desirable to see the raw text of the events combined, rather than an analysis of the constituent fields of the events.
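
To make the first scenario concrete, here is a minimal Python sketch of segmenting events into transactions by identifier and by pause length. It only illustrates the idea and is not how Splunk implements the transaction command; the group_transactions function, the max_pause parameter and the sample events are assumptions made for the example.

```python
def group_transactions(events, max_pause):
    """Group (timestamp, identifier, raw_text) events into transactions.

    Events with the same identifier belong to the same transaction until
    the gap between two consecutive events exceeds max_pause seconds.
    """
    open_tx = {}    # identifier -> events of the transaction still open
    finished = []   # transactions closed because of a long pause
    for ts, ident, raw in sorted(events):
        current = open_tx.get(ident)
        if current is not None and ts - current[-1][0] > max_pause:
            finished.append(current)   # pause too long: close it
            current = None
        if current is None:
            current = []
            open_tx[ident] = current   # start a new transaction
        current.append((ts, ident, raw))
    finished.extend(open_tx.values())
    return sorted(finished, key=lambda tx: tx[0][0])


events = [
    (0, "cookie-1", "login"),
    (10, "cookie-1", "view cart"),
    (5, "cookie-2", "login"),
    (500, "cookie-1", "login"),   # same cookie, reused much later
]
for tx in group_transactions(events, max_pause=60):
    print([raw for _, _, raw in tx])
```

Here the reused identifier cookie-1 produces two separate transactions because the second login arrives long after the 60-second pause limit.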

How to troubleshoot Splunk performance issues?
We can start here: first, check splunkd.log for any errors. If all is fine, check for server or VM performance issues (CPU, memory, storage I/O, etc.). Lastly, install Splunk on Splunk, which provides a GUI where we can check for performance issues.

How to create a new app from template in Splunk?
Go to the /opt/splunk/bin directory and run:
./splunk create app New_App -template app1

What is the dispatch directory?
$SPLUNK_HOME/var/run/splunk/dispatch contains a directory for each search that is running or has completed. For example, a directory named 1434308973.367 will contain a CSV file of its search results, a search.log with details about the search execution, and other files. Using the defaults (which you can override in limits.conf), these directories are deleted 10 minutes after the search completes, unless the user saves the search results, in which case the results are deleted after 7 days.

What is the difference between search head pooling and search head clustering?
Both are features provided by Splunk for high availability of the search head in case any one search head goes down. Search head clustering is the newer feature, and search head pooling will be removed in upcoming versions. A search head cluster is managed by a captain, which controls the other members. Search head clustering is more reliable and efficient than search head pooling.

What is null queue in Splunk?
The null queue is used to discard all unwanted data.

List different types of search modes supported in splunk?
There are three modes:
  • Fast mode
  • Smart mode
  • Verbose mode

What is btool in Splunk?
Splunk btool is a command-line tool for troubleshooting configuration file issues. It is also used to see which values your Splunk Enterprise installation is actually using in the existing environment.

How to use btool?
Command: /opt/splunk/bin/splunk btool inputs list

How to roll back your Splunk configuration bundle to the previous version?
Here is the command:
/opt/splunk/bin/splunk rollback cluster-bundle

How to change the web port in Splunk?
/opt/splunk/bin/splunk set web-port <port_number>

What is the MapReduce algorithm?
The MapReduce algorithm is inspired by the map and reduce functions and is used for batch-based, large-scale parallelization.

Where are Splunk's default configuration files located?
The default configuration files are located in $SPLUNK_HOME/etc/system/default

What is lookup command used for?
The lookup command is used to reference fields from an external CSV file that match fields in your event data.
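
The idea behind a lookup can be sketched in Python as follows. This is an illustration of the matching logic only, not Splunk's implementation; the load_lookup and apply_lookup functions, the ip and country field names and the sample data are all hypothetical. In this sketch, fields that already exist in the event are left untouched.

```python
import csv
import io


def load_lookup(csv_text, key_field):
    """Read CSV text into a dict keyed by the value of key_field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def apply_lookup(events, table, key_field):
    """Add the fields of the matching lookup row to each event."""
    enriched = []
    for event in events:
        row = table.get(event.get(key_field), {})
        merged = dict(event)
        for name, value in row.items():
            merged.setdefault(name, value)  # keep fields the event already has
        enriched.append(merged)
    return enriched


country_csv = "ip,country\n10.0.0.1,IN\n10.0.0.2,US\n"   # sample lookup file
events = [
    {"ip": "10.0.0.1", "status": "200"},
    {"ip": "10.9.9.9", "status": "404"},   # no matching row in the lookup
]
table = load_lookup(country_csv, key_field="ip")
for event in apply_lookup(events, table, key_field="ip"):
    print(event)
```

The first event gains a country field from the CSV, while the second is returned unchanged because its IP has no matching row.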

How does Splunk avoid duplicate indexing of logs?
Splunk keeps track of indexed files in a directory called the fishbucket, which contains seek pointers and CRCs for the indexed files. This way it can check whether a file has already been indexed and avoid indexing it twice.

That's all for Splunk Interview questions with answers. If you have any questions, please mention in comments. Thanks!!!


Saturday, 13 July 2019

Introduction to Docker Compose

In this post, we will see how to manage the lifecycle of multiple containers using docker-compose. Let us first understand what Docker Compose is.

Introduction

Docker Compose is a tool provided by Docker to manage multiple containers on a single host. If you want to create, start, stop, remove, scale up, or scale down multiple containers with a single command, Docker Compose comes in handy. It manages the container lifecycle within a single host only. If your containerized application is hosted on multiple nodes in non-clustered mode, you need another copy of docker-compose on each additional host. To manage containers across multiple hosts from a single point, consider Docker Swarm or Kubernetes instead.

What can Docker Compose do for you?

Docker Compose can automate your container deployment, redeployment, and undeployment. It is not a tool solely for creating Docker images (docker build is used for that).

Installation of Docker Compose
If you are using Windows or Mac, Docker Compose is already installed, as it comes with Docker Toolbox. On Linux, we need to install Docker Compose first.

YAML Configuration file
Docker Compose uses a configuration file, docker-compose.yml, in which we write YAML to manage the container lifecycle. Here is a simple example of docker-compose.yml with explanations:

Docker Compose Example

Let me explain each key:
version: The Compose file format version
services: Tells docker-compose that what follows is the list of services to be containerized
<service-name>: Name of the service, used for reference
build: Path to the build context (the directory containing the Dockerfile) from which the image for the container is built
ports: Port mapping from host to container
volumes: Volume mapping from host to container
image: Name of the image used to create the container

Docker compose commands:

So you now have a docker-compose.yml and want to manage the lifecycle of containers through it. We can create, start, stop, destroy, and scale containers using the same docker-compose.yml file.
Here is the list of commands:
  • docker-compose up - Creates the containers (if required) and runs them. Use the -d option to run them in the background.
  • docker-compose down - The opposite of up: it stops all the containers and also removes them.
  • docker-compose start - Starts the containers. Note that it does not create a container that does not exist; it only starts stopped containers of the services listed in docker-compose.yml.
  • docker-compose stop - Stops the running containers. It goes through each service in docker-compose.yml and tries to stop its started containers.
  • docker-compose rm - Deletes stopped containers. Use it with -f to remove them without asking for confirmation.
  • docker-compose scale - Sets the number of containers per service. We can scale both up and down with the same command.
  • docker-compose exec - Runs a command inside a running container. You need to pass the service name along with the command, e.g. docker-compose exec <service_name> ls /home
  • docker-compose pause - Pauses the services. This differs from stop: a paused container is suspended and, when resumed, continues from the point where it was paused, whereas stop kills the container's running process, so it starts from scratch. For example, take an application that prints two lines. If you pause it after the first line, unpause continues from there and prints the second line. If you stop it after the first line and start it again, it prints both the first and the second line again.
  • docker-compose unpause - Resumes the paused services.
  • docker-compose port - Prints the public port for a port binding.
  • docker-compose build - Builds or rebuilds services.
  • docker-compose bundle - Generates a Docker bundle from the Compose file.
  • docker-compose config - Validates and displays the Compose file.
  • docker-compose create - Creates the services.
  • docker-compose images - Lists the images.
  • docker-compose logs - Shows the container output.
  • docker-compose top - Displays the running processes.
  • docker-compose version - Displays the version of Docker Compose installed on the host.
  • docker-compose help - Gets help on a command.
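
The difference between pause/unpause and stop/start in the list above can be pictured with a small Python analogy. This is only an analogy and does not use Docker at all: a generator stands in for the two-line application, resuming the generator plays the role of unpause, and creating a fresh generator plays the role of starting a stopped container. The two_line_app name is made up for the example.

```python
def two_line_app():
    """Stand-in for an application that prints two lines."""
    yield "first line"
    yield "second line"


# pause / unpause: the same process is suspended and later resumed
app = two_line_app()
paused_output = [next(app)]       # first line is printed, then we "pause"
paused_output.append(next(app))   # "unpause": continues with the second line

# stop / start: the process is killed and started from scratch
app = two_line_app()
restarted_output = [next(app)]    # first line is printed, then we "stop"
app.close()                       # the running process is gone
app = two_line_app()              # "start": a fresh process
restarted_output.extend(app)      # prints both lines again

print(paused_output)
print(restarted_output)
```

After a pause the output contains each line once, while after a stop and start the first line appears twice.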
That's all for this short introduction to Docker Compose. If you are using Ansible, you can manage the container lifecycle using the docker_compose module. For any queries, please mention them in the comments section. Thanks!!

Sunday, 9 June 2019

Tagging a docker image

In this article we will see how to tag an image in docker.

Docker Image Tag background

Tagging gives a Docker image a version by which it can be referred to in the local repository or on Docker Hub. Though optional, it is highly recommended to tag an image. A Docker image name looks like this: username/repository:tag
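
To see how the parts of such a name fit together, here is a small Python sketch that splits an image name into repository and tag. The parse_image_name function is hypothetical and written only for this illustration; it relies on the fact that Docker uses the tag "latest" when none is given, and it does not handle digest references (names containing @sha256:...).

```python
def parse_image_name(reference):
    """Split 'username/repository:tag' into (repository, tag)."""
    # Only the part after the last "/" can carry a tag; a colon earlier
    # in the name belongs to a registry port such as localhost:4800.
    path, _, last = reference.rpartition("/")
    name, separator, tag = last.partition(":")
    if not separator:
        tag = "latest"  # Docker's default when no tag is given
    repository = f"{path}/{name}" if path else name
    return repository, tag


print(parse_image_name("tech693/myimage:1.0"))
print(parse_image_name("rabbitmq"))
print(parse_image_name("localhost:4800/test/rabbitmq:v1"))
```

Splitting at the last slash first is a deliberate choice, so that a registry port is not mistaken for a tag.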


Different ways to tag a docker image


Command to tag an image

There are multiple ways to tag a Docker image:

(a) Tagging an image during image creation:

We can create one or more tags during image creation. The only limitation is that the Docker version should be 1.10 or above. The command is:

docker build -t name1:tag1 -t name2:tag2 -t name2:tag3 .
    

(b) Tagging an existing image:

Using the tag command, we can tag an existing image. There are two ways to do that: using the image ID or using the image name.

docker tag myimage tech693/myimage:1.0


This creates another reference to the same image, named "tech693/myimage" with tag "1.0".

Deleting a docker Image

So you have created an image but now want to delete it. The docker rmi command is used to delete an image:

docker rmi <imagename:tag>

How to rename a Docker image

Renaming a Docker image is easy but a bit tricky. The idea is to create another tag for the image and delete the older one.

Example

docker tag <oldImage:tag> <newImage:tag>
docker rmi <oldImage:tag>

You will notice that if you tag the same Docker image with multiple names, its image ID remains the same. In other words, a tag is just a human-readable name that refers to the image ID.

That's all about tagging a Docker image. If you have any questions, please ask in the comments section. Thanks



Manage docker image in Ansible using docker_image module



Build docker image using Ansible


Docker, as you know, is the most popular containerization platform, and Ansible is a leading automation tool. In this post we will see how to automate and manage Docker images with Ansible.

Module:

Ansible provides the docker_image module to pull or create Docker images.

We will cover two areas:
(a) Pulling images from a repository using Ansible
(b) Creating images from a Dockerfile using Ansible

Pulling images from repository using Ansible:

Let's see how we can pull a Docker image using the docker_image module. Here I am pulling the RabbitMQ image.

Source Code:



[root@test ansible_example]# cat docker-pull.yml
---
- hosts: localhost
  tasks:
  - name: Pull RabbitMQ Image
    docker_image:
      name: rabbitmq
      source: pull



Let's run the above playbook.

Output:




[root@test ansible_example]# ansible-playbook docker-pull.yml

 [WARNING]: Could not match supplied host pattern, ignoring: all

 [WARNING]: provided hosts list is empty, only localhost is available

PLAY [localhost] ***************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************
ok: [localhost]

TASK [Pull RabbitMQ Image] *******************************************************************************************************
changed: [localhost]

PLAY RECAP *********************************************************************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0






Explanation:

This downloads the RabbitMQ image from the Docker repository. To verify, run the "docker images" command; the RabbitMQ image will be listed among the local images.

Creating images from dockerfile using Ansible:

Docker provides another way to create images: using a Dockerfile. Ansible also provides a way to automate building Docker images from a Dockerfile.

Creating the Dockerfile:


FROM ubuntu
RUN apt-get update
RUN apt-get install -y rabbitmq-server



Now let's create a playbook to build my RabbitMQ Docker image from the above Dockerfile.

Source Code:




[root@test ansible_example]# cat docker-build.yml
---
- hosts: localhost
  tasks:
  - name: Build RabbitMQ image
    docker_image:
      # build.path replaces the deprecated top-level "path" option
      build:
        path: .
      name: test/my-rabbitmq
      tag: v1
      source: build


In the above playbook, we build a Docker image from the Dockerfile located in the current path.

Now let's run the playbook.



[root@test ansible_example]# ansible-playbook docker-build.yml

 [WARNING]: Could not match supplied host pattern, ignoring: all

 [WARNING]: provided hosts list is empty, only localhost is available

PLAY [localhost] ***************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************
ok: [localhost]

TASK [Build RabbitMQ image] *******************************************************************************************************
changed: [localhost]

PLAY RECAP *********************************************************************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0


Note that you can also easily push a local image to a repository using Ansible. Here is an example playbook:




[root@test ansible_example]# cat docker-push.yml
---
- hosts: localhost
  tasks:
  - name: Push image
    docker_image:
      name: test/rabbitmq:v1
      repository: localhost:4800/test
      tag: v1
      push: yes
      source: local


Conclusion and Best practices:

Basically, you can easily replace your docker-compose YAML file with Ansible to manage containers even better. This way you automate containerized application creation and deployment in one place: when the build triggers and completes, an Ansible playbook can automatically deploy containers to multiple nodes.

For all the options of the docker_image module, please refer to the official Ansible documentation.

Kubernetes is a container orchestration engine, and most DevOps teams use Kubernetes to manage the Docker container lifecycle. This article can be helpful if you are looking to deploy a large number of containers using Ansible and Kubernetes.

That's all for building and managing Docker images using Ansible. If you have any questions, please mention them in the comments section. Thanks


Tuesday, 12 March 2019

Introduction to Docker

Definition
Docker is a platform that provides an isolated environment to run and deploy your application in the form of containers.

It separates the code from the infrastructure so that developers can concentrate on code and not worry about environment differences.
In Docker, we ship our containers (along with the code), so if the application worked in one deployment environment (say development or test), it will also work in another (say production). This is a huge advantage of Docker: the application release process becomes much faster.

Features / Advantages
Using Docker containers brings multiple advantages over traditional application deployment:
(a) Since Docker separates code from infrastructure, developers can concentrate on issues related to the application only.
(b) Since we ship containers, the issues we used to face because of different environments go away.
(c) It virtualizes the host operating system rather than the hardware, so it is much faster than traditional virtual machines.
(d) We can run more instances of an application using Docker than with virtual machines. In fact, we can also run containers inside virtual machines.
(e) One great advantage of Docker is that we can break a monolithic application into multiple microservices, so each component is independently deployable without affecting the others. If we have a bug in the UI, we don't need to build and deploy the whole application; only the container or microservice for the UI needs to be updated.
(f) We can spin up multiple containers in separate isolated environments, which provides a form of high availability. This comes in handy in a single-node environment when we need to update the application: with multiple containers for the same application, we can take one down, update it, bring it back up, and then repeat for the other containers, resulting in no downtime.

Docker is an important part of DevOps, where it handles code deployment in an intelligent way.

Conclusion
In short, Docker comes in handy to:
  • Isolate infrastructure and environment issues
  • Deploy code faster
  • Run applications faster on the same hardware resources
  • Divide a monolithic application into independently deployable microservices
  • Minimize application downtime to zero
  • Easily cluster multiple containers for both HA and load balancing