Splunk Search

Where do I put my fillnull value?

ebs
Communicator

This is my base search:

| datamodel Test summariesonly=true search
| search "TEST.date"=2021-05-23 | rename "TEST.date" as date
| bin _time span=1s
| eventstats count("TEST.exchangeId") by _time
| rename "count(TEST.exchangeId)" as avg_TPS
| stats avg(avg_TPS) as averageTps by date
| eval averageTps=round(averageTps,3)
| eval _time=$date$
| fields averageTps _time

 

Where do I put my fillnull value for averageTps? With the multiple stats commands, I'm unsure, and every time I've tried putting it in I get no results, whereas I want the averageTps field to be 0.


yuanliu
SplunkTrust
SplunkTrust

Take a look at this old thread and @somesoni2's solution: https://community.splunk.com/t5/Deployment-Architecture/How-to-produce-empty-time-buckets/m-p/172328....  The idea is to use a cheap subsearch to force missing buckets.  There is also a recent comment saying that the fillnull command is not needed if timechart is used, although I haven't tested that. (I can no longer find my old use case.)
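For reference, a minimal sketch of that idea (my paraphrase, not necessarily the exact code from that thread; the 1-second span and the TPS field name are illustrative): appendpipe generates a zero-count row for every second between the first and last event, and a final stats collapses each second back to a single row.

| bin _time span=1s
| stats count as TPS by _time
| appendpipe
    [ stats min(_time) as start max(_time) as end
    | eval _time=mvrange(start, end+1, 1)
    | mvexpand _time
    | eval TPS=0
    | fields _time TPS ]
| stats max(TPS) as TPS by _time

Because the appended placeholder rows are all 0, taking max(TPS) per second keeps every real count unchanged while filling the missing seconds with 0.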


ebs
Communicator

I have tried that appendpipe and it kind of works, but it changes the original value when the value is not 0. For example, one value before the appendpipe was 85; after, it was 140. How do I fix this?


yuanliu
SplunkTrust
SplunkTrust

@ebs wrote:

I have tried that appendpipe and it kind of works, but it changes the original value when the value is not 0. For example, one value before the appendpipe was 85; after, it was 140. How do I fix this?


Without knowing what code you tried and the data used in the test, it is rather difficult to cut hair over the telephone.


yuanliu
SplunkTrust
SplunkTrust

Just noticed that timechart seems to fill non-existent buckets with 0 for data like this:

_time                 exchangeId
2016-08-30 05:27:01   60302
2016-08-30 05:28:03   60302
2016-08-30 05:28:04   60130
2016-08-30 05:28:05   60302
2016-08-30 05:28:08   60130
2016-08-30 05:29:18   60130
2016-08-30 05:29:19   60302
2016-08-30 05:29:31   60302
2016-08-30 05:31:09   60130
2016-08-30 05:31:09   60130
2016-08-30 05:31:21   60130
2016-08-30 05:31:46   60302
2016-08-30 05:31:52   60130
2016-08-30 05:32:35   60302
2016-08-30 05:32:55   60302
2016-08-30 05:36:14   60302
2016-08-30 05:36:50   60302
2016-08-30 05:37:12   60302
2016-08-30 05:37:25   60302
2016-08-30 05:37:41   60302

 

| timechart span=1m count(exchangeId) as TPM

 

(If exchangeId is available in every event, as the sample data suggests, a plain count should suffice.) The output for this is

_time                 TPM
2016-08-30 05:27:00   1
2016-08-30 05:28:00   4
2016-08-30 05:29:00   3
2016-08-30 05:30:00   0
2016-08-30 05:31:00   5
2016-08-30 05:32:00   2
2016-08-30 05:33:00   0
2016-08-30 05:34:00   0
2016-08-30 05:35:00   0
2016-08-30 05:36:00   2
2016-08-30 05:37:00   3

If this result is acceptable as TPM, your daily average can easily be calculated as

| timechart span=1m count(exchangeId) as TPM
| timechart span=1d avg(TPM) as average_TPM

You can calculate TPS, of course, by using span=1s in the first timechart.
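Spelled out, that per-second variant would look like this (same pattern, just a tighter span; note that span=1s over a full day produces 86,400 buckets, which can be heavy):

| timechart span=1s count(exchangeId) as TPS
| timechart span=1d avg(TPS) as average_TPS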

(I no longer have the data that triggered the "gap" behavior observed in that old post.  It could also be that in later versions of SPL, timechart automatically fills empty buckets.)


bowesmana
SplunkTrust
SplunkTrust

@ebs 

Some questions and observations:

  • In your example, date has a single value, 2021-05-23, so the 'by date' part of "| stats avg(avg_TPS) as averageTps by date" is redundant, as there will be a single result
  • You are doing count("TEST.exchangeId") by _time
    • This will just return a count of the events that contain that field
      • if you know that all events contain that field, then a plain count by _time is sufficient
  • Try to use the rename part of a stats aggregation - it makes your SPL easier to read, e.g.
    • | eventstats count("TEST.exchangeId") as avg_TPS by _time
    • so you can avoid the subsequent rename
  • What is the $date$ value? In Splunk, _time is an epoch time, not a formatted date string, so your $date$ token would have to be a numeric epoch
  • What are your expected results?
    • a single value showing the average TPS for a single day?
      • if so, what null values are you trying to fill?

Are you trying to include seconds where there are 0 TPS? In that case you should use timechart instead:

| datamodel Test summariesonly=true search
| search "TEST.date"=2021-05-23
| rename "TEST.date" as date
| timechart span=1s count as TPS 
| stats avg(TPS) as averageTps 
| eval averageTps=round(averageTps,3)
| eval date=$date|s$
| fields averageTps date

Note also that in your example you use eventstats followed by stats. Using eventstats count will end up giving you the wrong result compared to stats count. For example, if your eventstats data is

_time                 TPS
2021-05-23 08:00:00   5
2021-05-23 08:00:00   5
2021-05-23 08:00:00   5
2021-05-23 08:00:00   5
2021-05-23 08:00:00   5
2021-05-23 08:00:01   2
2021-05-23 08:00:01   2

then stats avg(TPS) will give you 29/7 = 4.143, whereas if you do stats count by _time your data will be

2021-05-23 08:00:00 5
2021-05-23 08:00:01 2

and the stats avg(TPS) will be 3.5
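To see the difference end to end, here is a self-contained sketch using generated events (makeresults stands in for the real data; the timestamps match the table above):

| makeresults count=7
| streamstats count as n
| eval _time=strptime("2021-05-23 08:00:00", "%Y-%m-%d %H:%M:%S") + if(n<=5, 0, 1)
| eventstats count as TPS by _time
| stats avg(TPS) as averageTps

This returns 4.143. Replace the eventstats line with | stats count as TPS by _time and the same final stats returns 3.5, because the events are first collapsed to one row per second.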

Hope this helps

ebs
Communicator

Hi @bowesmana 

You've helped me improve my search; however, I still don't know where to place my fillnull value in case there are no transactions, which was my initial question.


bowesmana
SplunkTrust
SplunkTrust

Can't really answer your fillnull question without more information. What null values are present in your data that you want to fill?

From your original search there are no null values.

If my comment about using timechart vs stats did not answer your question about what _might_ be null values, perhaps you can clarify.

ebs
Communicator

This specific line could result in a null value if there were no transactions today (unlikely, but I want to handle it just in case).
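For that case, fillnull alone typically cannot help: if the search returns no events, the final stats produces no row at all, so there is no averageTps field to fill. One common guard (a sketch, reusing the field names from the searches above) appends a single zero row only when the pipeline is empty:

| timechart span=1s count as TPS
| stats avg(TPS) as averageTps
| appendpipe
    [ stats count as rows
    | where rows=0
    | eval averageTps=0
    | fields averageTps ]

When results do exist, rows is non-zero, the where clause drops the appended row, and the real average passes through unchanged.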


bowesmana
SplunkTrust
SplunkTrust

@ebs 

So what was wrong with my original suggestion to use timechart? That will give you 0 values for a period that does not have data.


ebs
Communicator

I had to revert to my original search because timechart was getting overwhelmed. All exchangeIds are only 1 value each, so it works for the TPS search.
