Splunk Search

How to create a search for delay of data in Splunk?

ursfischer
Engager

Hello all

As a Splunk newcomer 😀 I currently have the following challenge:
We have many indexes and we want to analyze, across all of them, how quickly log data becomes available in Splunk. The delay should be measured from the time the log was written (_time) to the time it was indexed (_indextime). We also want to exclude outliers (e.g. we currently have hosts with a wrong time configuration), i.e. assume something like a Gaussian/normal distribution.
Here is an example query, which is probably wrong or could be done much better by you:

| tstats latest(_time) AS logTime latest(_indextime) AS IndexTime WHERE index=bv* BY _time span=1h
| eval delta=IndexTime - logTime
| where delta>0 AND delta<1800
| table _time delta

Is the query approximately correct, so that we can answer what kind of delay we have overall? And how could one use a Gaussian/normal distribution instead of restricting the search manually?


PickleRick
SplunkTrust

On top of @ITWhisperer's suggestion: I'd rather not use tstats to produce just one value per hour bin, but instead calculate the average of that delta over hourly or shorter periods. If you have lots of data, you could use sampling to run this on only a small subset of events.
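A sketch of that approach (the index filter index=bv* is taken from the question; the 15-minute span is just an example, and event sampling is configured via the search job's sample ratio in the UI rather than in the SPL itself):

```
index=bv*
| eval delta=_indextime - _time
| bin _time span=15m
| stats avg(delta) AS avgDelay perc95(delta) AS p95Delay count BY _time
```

The 95th percentile alongside the average makes it easier to spot bins where a few badly delayed (or badly clocked) hosts skew the picture.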


ursfischer
Engager

Well, we do have a lot of data (currently approx. 10 billion events per day, and increasing). tstats is probably not the best idea to use here, but it is faster than a normal search. I will try sampling and have a look at how I can use it.
Another idea is to set up a saved search per index, store the results (_time, _indextime, index) in a summary index, and then use that to build the statistics. But with more than 100 indexes this will take some time, effort, and Splunk resources, and I am not sure it will make things easier for me.
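A minimal sketch of such a scheduled search, pre-aggregating per index before writing to a summary index (the summary index name delay_summary is hypothetical, and the target index must already exist for collect to work):

```
index=bv* earliest=-1h@h latest=@h
| eval delta=_indextime - _time
| bin _time span=1h
| stats avg(delta) AS avgDelay count BY _time index
| collect index=delay_summary
```

Because only one aggregated row per index and hour is stored, later reporting over the summary index stays cheap even across 100+ indexes.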


ITWhisperer
SplunkTrust

You could consider using the Machine Learning Toolkit (MLTK), which is a free add-on from Splunkbase.

You can fit models of your data, e.g. Gaussian/normal distributions, and then look for anomalies.
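A rough sketch using MLTK's DensityFunction algorithm (assuming MLTK is installed; the model name delay_model and the threshold value are illustrative):

```
index=bv*
| eval delta=_indextime - _time
| fit DensityFunction delta dist=norm threshold=0.01 into delay_model
```

Once fitted, running | apply delay_model on new data should flag the events whose delta falls in the tails of the fitted normal distribution, so the misconfigured hosts can be filtered out instead of hard-coding a delta<1800 cutoff.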
