Friday 26 July 2019

Top 100 Splunk Interview questions and answers

In this article, we will see important Splunk Interview Questions and Answers.

Here are the top 100 interview questions and answers on Splunk

What is Splunk and its uses?
Splunk is software used for monitoring, searching, and analyzing machine data in real time. The data source can be a web application, sensors, devices, or user-created data.

What are the components of Splunk?
The components of Splunk are:
(a) Search head - GUI for searching
(b) Forwarder - forwards data to the indexer
(c) Indexer - indexes machine data
(d) Deployment server - manages Splunk components in a distributed environment

Briefly describe how Splunk works?
Splunk works by collecting, parsing, indexing and analyzing data. The forwarder collects data from the source and forwards it to the indexer. The search head then searches, analyzes and visualizes the data stored in the indexer.

What is Splunk forwarder?
Splunk forwarder is used to forward data to indexer.

What are the advantages of Splunk forwarder?
Splunk forwarder can throttle bandwidth and provide an encrypted SSL connection for transferring data from forwarder to indexer.

What are the types of forwarder?
There are two types:
Universal Forwarder
Heavy Weight Forwarder

What is Universal Forwarder?
A universal forwarder is a lightweight Splunk agent installed on a non-Splunk system to gather data locally. It cannot parse or index data.

What is Heavy Weight Forwarder?
A heavy weight forwarder is a full instance of Splunk with advanced functionality. It works as a remote collector, an intermediate forwarder and a data filter.

What is the advantage of Splunk over other similar tools?
Splunk is a single integrated tool for machine data. It covers everything from IT operations and machine log analysis to business intelligence. There are other tools in the market, but Splunk provides end-to-end data operations in one product. You might need 3-4 separate tools to do what Splunk does as a single piece of software.

What are the configuration files of Splunk?

What are the different licenses in Splunk?
There are 6 types of licenses in Splunk:
Enterprise license
Free license
Forwarder license
Beta license
Licenses for search heads 
Licenses for cluster members

What is the limitation of free license in Splunk?
With the free license we cannot use authentication, scheduled searches, distributed search, forwarding over TCP/HTTP, or deployment management.

What is the use of License Master in Splunk?
The License Master controls how much data we can index in a day. For example, if we have a 200 GB license, then we can only index 200 GB of data in a day. So the license should cover the maximum data volume we expect to receive in a day.
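
The daily quota idea can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not how the License Master is implemented, and the function names are made up for this example.

```python
def remaining_quota_gb(daily_quota_gb, indexed_today_gb):
    """Return how many GB can still be indexed today (never negative)."""
    return max(daily_quota_gb - indexed_today_gb, 0)


def exceeds_quota(daily_quota_gb, indexed_today_gb):
    """True when today's indexed volume is above the licensed volume."""
    return indexed_today_gb > daily_quota_gb


# With the 200 GB license from the example above:
print(remaining_quota_gb(200, 150))
print(exceeds_quota(200, 230))
```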

Suppose due to some reason License Master is unreachable, will the indexing stop?
Searching will stop if the License Master is not reachable, but data will continue to be indexed, so indexing does not stop. You will get a warning on the web UI or search head that you have exceeded the indexing volume.

What is the use of DB Connect in Splunk?
It's a plugin used to connect to a generic SQL database and integrate with it.

What is the command for boot-start enable and disable?
To enable Splunk boot-start, the command is:
$SPLUNK_HOME/bin/splunk enable boot-start
To disable Splunk boot-start, the command is:
$SPLUNK_HOME/bin/splunk disable boot-start

What is summary index in Splunk?
Summary indexes are used to boost reporting efficiency. They let users generate reports quickly even after processing a huge volume of machine data.

What are the different types of Summary Index?
There are two types:
Default Summary Index - It is used by Splunk Enterprise by default when no other summary index is specified.
Additional Summary Index - It is used to run a wider variety of reports.

What is the default field for events?
The five default fields are 
source type, 

How can we restart Splunk?
Splunk can be restarted from Splunk Web. The steps are:
1. Go to Settings and navigate to Server controls.
2. Click on Restart Splunk.

How to search multiple ips in splunk?
Using lookup tables, we can search multiple IP addresses 

What is the most efficient way to filter events in splunk?
The most efficient way to filter events in Splunk is by time / duration.

How can we reset Splunk password?
To reset the password, you need access to the server where Splunk is running. Then perform the following steps:
Move the $SPLUNK_HOME/etc/passwd file to $SPLUNK_HOME/etc/passwd.bak
Restart Splunk and log in with the default username and password, i.e. admin/changeme.
Reset the password and combine the password file with the backup file.

What is sourcetype?
Sourcetype in Splunk is a default data field. It identifies the format of the data, which indicates its origin. For example, .evt files originate from the Event Viewer. Incoming data can be classified based on service, system, format and character code. Common source types are apache_error, websphere_core and cisco_syslog. Splunk uses the sourcetype to decide how to process incoming data and break it into separate events.

How to use two sourcetypes in splunk? 
Here is a use case showing how to search two sourcetypes against a lookup file:
sourcetype=X OR sourcetype=Y | lookup country.csv
Using this search, sourcetypes X and Y can be searched in a lookup file.

What is kv store in splunk?
KV stands for key value. The KV store lets you store and retrieve data inside Splunk. It can be used for the following:
(a) Managing a job queue.
(b) Storing metadata provided by the user.
(c) Analysing the workflow.
(d) Storing the user application state required for handling a UI session.
(e) Storing the results of search queries in Splunk.
(f) Maintaining a list of environment assets and checkpoint data.

What is deployer in Splunk? 
A deployer is used to distribute apps and configuration updates to the members of a search head cluster. The set of configuration updates that the deployer sends is called the configuration bundle. The deployer also shares this bundle when a new member joins the cluster. It handles the basic app configurations and user configurations.
However, it cannot restore the latest runtime state to the members of the cluster.

Which roles can create data models in Splunk?
Users with the admin or power role can create data models. Other users can create them only if they have write access to the application. Role-based permissions determine whether a user can view or edit a data model.

When to use auto_high_volume in Splunk?
auto_high_volume is a value for the maxDataSize setting in indexes.conf and is used for very high volume indexes. A high volume index is typically one that gets over 10GB of data a day.

What are the Splunk alternatives?
Sumo Logic

How to restart splunk webserver and daemon?
To restart the webserver: splunk restart splunkweb
To restart the daemon: splunk restart splunkd

How to clear Splunk search history?
We need to delete searches.log from this path

What is fishbucket in Splunk?
It's a directory or index at the default location /opt/splunk/var/lib/splunk. It contains seek pointers and CRCs for the files you are indexing, so splunkd can tell if it has read them already. We can access it through the GUI by searching for “index=_thefishbucket”.

Which commands are used in the reporting results category?
  • top
  • rare
  • stats
  • chart
  • timechart

What is the use of the stats command?
The stats command reports data in tabular format. Multiple fields can be used to build the table.

What is the use of chart command?
As the name indicates, chart is used to display data as a bar, line or area graph. It takes 2 fields to build the chart.

What is the use of timechart?
Timechart is used to display data on a timeline. It takes just 1 field, as the other field is the time field by default.
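
To see why only one field is needed, here is a minimal Python sketch of the bucketing idea behind a count over time. It is an illustration only, not Splunk's implementation; the function name and the 60 second span are assumptions chosen for the example.

```python
from collections import Counter


def timechart_count(timestamps, span_seconds):
    """Count events per time bucket; time is always the implicit x-axis."""
    buckets = Counter((ts // span_seconds) * span_seconds for ts in timestamps)
    # Return (bucket_start, count) pairs in time order.
    return sorted(buckets.items())


# Event times in seconds, grouped into 60 second buckets.
for bucket_start, count in timechart_count([0, 30, 61, 125, 130], 60):
    print(bucket_start, count)
```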

How to disable the Splunk boot start?
$SPLUNK_HOME/bin/splunk disable boot-start

How to disable the Splunk launch message?
We can disable the Splunk launch message by setting this value in splunk-launch.conf:
OFFENSIVE=Less

What is difference between Splunk app and Splunk Add-on?
A Splunk app has GUI configuration, whereas a Splunk add-on doesn't have it (only command line).

In Splunk cluster, how to offline a peer?
Using the command splunk offline, we can take a peer offline.

What are the different categories in SPL command?
SPL commands fall into five major categories:
Sorting Results; Filtering Results; Grouping Results; Filtering, Modifying and Adding Fields; and Reporting Results.

How to specify minimum disk usage in Splunk?
Using the following command we can set the minimum free disk space (in MB):
/opt/splunk/bin/splunk set minfreemb 20000
It requires restart, so
/opt/splunk/bin/splunk restart

Do you know what is SOS in context of Splunk?
Yes, SOS stands for Splunk on Splunk. It's a Splunk app that provides a graphical view of Splunk performance and issues.

What does the lookup command do?
It adds fields to an event by matching a value in the event against a lookup table and then adding the fields from the matching rows of the lookup table to the event.
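
The matching step can be sketched in Python as follows. This is a simplified illustration of the idea and not Splunk's implementation; the lookup table contents, the field names and the helper functions are all hypothetical.

```python
import csv
import io

# Hypothetical lookup table (not shipped with Splunk): status code -> description.
LOOKUP_CSV = "status,description\n200,OK\n404,Not Found\n"


def load_lookup(csv_text, key_field):
    """Index the lookup rows by the value of key_field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def apply_lookup(events, table, key_field):
    """Return copies of the events with the matching row's fields added."""
    enriched = []
    for event in events:
        row = table.get(event.get(key_field), {})
        enriched.append({**event, **row})
    return enriched


table = load_lookup(LOOKUP_CSV, "status")
events = [{"status": "404", "uri": "/x"}, {"status": "500", "uri": "/y"}]
for event in apply_lookup(events, table, "status"):
    print(event)
```

Events with no matching row (status 500 here) pass through unchanged.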

What does the inputlookup command do?
It returns the whole lookup table as the search results.

What does the outputlookup command do?
It outputs the current search results to a lookup table on the disk.

What does the sort command do in Splunk?
As the name explains, it sorts the search results by the specified fields.
Here is the syntax:
sort [<count>] <sort-by-clause>... [desc]

What is the transaction command and how does it work?
The transaction command is helpful in two specific scenarios:
(a) When a unique ID (from one or more fields) alone is not enough to differentiate between two transactions. This is the case when the identifier is reused, for example web sessions identified by cookie or client IP. In this scenario, time spans or pauses are also used to segment the data into transactions. In other cases when an identifier is reused, say in DHCP logs, a particular message may identify the beginning or end of a transaction.
(b) When it is desirable to see the raw text of the events combined rather than an analysis of the constituent fields of the events.
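
The first scenario, splitting a reused identifier by pauses, can be sketched in Python. This is a simplified illustration of the idea, not how Splunk implements the command; the function name, the event tuples and the 60 second pause are assumptions made for the example.

```python
def group_transactions(events, max_pause):
    """Group (timestamp, session_id, text) events into transactions.

    A new transaction starts for a session_id when the gap since its
    previous event is longer than max_pause seconds.
    """
    open_tx = {}   # session_id -> transaction currently being built
    finished = []
    for event in sorted(events, key=lambda e: e[0]):
        ts, session_id, _ = event
        current = open_tx.get(session_id)
        if current is not None and ts - current[-1][0] > max_pause:
            finished.append(current)   # pause too long: close the transaction
            current = None
        if current is None:
            current = []
            open_tx[session_id] = current
        current.append(event)
    finished.extend(open_tx.values())  # close whatever is still open
    return finished


events = [
    (0, "a", "login"), (10, "a", "view"),      # first session for id "a"
    (500, "a", "login"), (505, "a", "logout"), # id "a" reused much later
    (5, "b", "view"),
]
for tx in group_transactions(events, max_pause=60):
    print([text for _, _, text in tx])
```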

How To Troubleshoot Splunk Performance Issues ?
We can start like this: first, check splunkd.log for any errors. If all is fine there, check for server / VM performance issues (i.e. CPU, memory, storage IO etc). Lastly, install Splunk on Splunk, which provides a GUI where we can check for performance issues.

How to create a new app from template in Splunk?
Go to the directory /opt/splunk/bin and run:
./splunk create app New_App -template app1

What Is Dispatch Directory ?
$SPLUNK_HOME/var/run/splunk/dispatch contains a directory for each search that is running or has completed. For example, a directory named 1434308973.367 will contain a CSV file of its search results, a search.log with details about the search execution, and other artifacts. Using the defaults (which you can override in limits.conf), these directories are deleted 10 minutes after the search completes, unless the user saves the search results, in which case the results are deleted after 7 days.
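
The two retention periods can be expressed as a small Python check. This is only a sketch of the rule described above, using the default values from the answer; the constant and function names are invented for this example and are not Splunk settings.

```python
DEFAULT_TTL_SECONDS = 10 * 60           # 10 minutes for ordinary searches
SAVED_TTL_SECONDS = 7 * 24 * 60 * 60    # 7 days when results are saved


def is_expired(completed_at, now, saved=False):
    """True when a dispatch directory is older than its retention period."""
    ttl = SAVED_TTL_SECONDS if saved else DEFAULT_TTL_SECONDS
    return now - completed_at > ttl


# A search that completed 601 seconds ago:
print(is_expired(0, 601))              # ordinary search
print(is_expired(0, 601, saved=True))  # saved results
```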

What Is Difference Between Search Head Pooling And Search Head Clustering?
Both are features provided by Splunk for high availability of the search head in case any one search head goes down. Search head clustering is the newer feature, and search head pooling will be removed in upcoming versions. A search head cluster is managed by a captain, which controls the other members. Search head clustering is more reliable and efficient than search head pooling.

What is null queue in Splunk?
The null queue is used to discard all the data that is unwanted.
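
As a rough illustration of the idea, the Python sketch below routes events that match a pattern to a "null queue" and keeps the rest. It is not Splunk code; the function name and the DEBUG pattern are assumptions for the example, and in Splunk the events sent to the null queue are simply dropped rather than returned.

```python
import re


def route_events(events, drop_pattern):
    """Split events into (index_queue, null_queue) by a regex match."""
    regex = re.compile(drop_pattern)
    index_queue, null_queue = [], []
    for event in events:
        # Matching events are unwanted and go to the null queue.
        (null_queue if regex.search(event) else index_queue).append(event)
    return index_queue, null_queue


kept, dropped = route_events(["INFO ok", "DEBUG noisy", "ERROR bad"], r"^DEBUG")
print(kept)
print(dropped)
```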

List different types of search modes supported in splunk?
There are three modes:
  • Fast mode
  • Smart mode
  • Verbose mode

What is btool in Splunk?
Splunk btool is a command line tool for troubleshooting configuration file issues. It is also used to see which values are being used by your Splunk Enterprise installation in the existing environment.

How to use btool?
Command: /opt/splunk/bin/splunk btool inputs list

How to rollback your splunk web configuration bundle to last version?
Here is the command
/opt/splunk/bin/splunk rollback cluster-bundle

How to change port in Splunk?
/opt/splunk/bin/splunk set web-port <port_number>

What Is Map-reduce Algorithm?
The map-reduce algorithm is inspired by the map and reduce functions and is used for batch-based, large-scale parallelization.
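
A classic way to picture it is a word count. The Python sketch below runs the map, shuffle and reduce phases in a single process just to show the data flow; a real system would run the map and reduce steps in parallel on many machines. The function names and sample log chunks are made up for this example.

```python
from collections import defaultdict
from functools import reduce


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into a single total."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in grouped.items()}


def count_words(chunks):
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))


print(count_words(["error warn error", "warn info"]))
```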

Where is the Splunk default configuration file located?
The default configuration files are located in $SPLUNK_HOME/etc/system/default

What is lookup command used for?
The lookup command is used for referencing fields from an external CSV file that match fields in your event data.

How Splunk Avoids Duplicate Indexing Of Logs ?
Splunk keeps track of indexed files in a directory called the fishbucket, which contains seek pointers and CRCs for the indexed files. This way it can check whether a file has already been indexed and avoid indexing it twice.
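
The CRC plus seek pointer idea can be sketched in Python like this. It is a simplified illustration and not Splunk's actual implementation: the class name and the head_bytes parameter are hypothetical, and a file shorter than head_bytes would be treated as new when it grows.

```python
import zlib


class FishbucketSketch:
    """Remembers how far each file was read, keyed by a CRC of its first bytes."""

    def __init__(self, head_bytes=256):
        self.head_bytes = head_bytes
        self.seek_pointers = {}  # CRC of file head -> number of bytes already read

    def read_new_data(self, content):
        """Return only the part of content (bytes) that was not read before."""
        crc = zlib.crc32(content[:self.head_bytes])
        offset = self.seek_pointers.get(crc, 0)
        new_data = content[offset:]
        self.seek_pointers[crc] = len(content)  # move the seek pointer forward
        return new_data


bucket = FishbucketSketch(head_bytes=8)
log = b"line1\nline2\n"
print(bucket.read_new_data(log))               # first read: whole file
print(bucket.read_new_data(log))               # unchanged file: nothing new
print(bucket.read_new_data(log + b"line3\n"))  # appended data only
```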

That's all for Splunk Interview questions with answers. If you have any questions, please mention in comments. Thanks!!!
