Splunk Search

How to search the difference between the stats output per field and fill "0" values in the stats output if an event does not exist?

inbroker
New Member

I have two source files, SourceA and SourceB, representing different months e.g. logs from June and July

Each source contains logs, and among other fields, it has:
ip, log_level, month

Doing a stats count by ip, log_level per source, I get results like:

(sourceA)

IP Address Log_Level Count

    ipA                 0                   X1
    ipA                 1                   X2
    ipB                 0                   X3
    ipC                 0                   X4
    ipC                 1                   X5

(sourceB)

IP Address Log_Level Count

    ipA                 0                   Y1
    ipA                 1                   Y2
    ipB                 0                   Y3
    ipB                 1                   Y4
    ipC                 0                   Y5

Note that it may happen that a certain IP doesn't have logs of a certain log_level on some months.

I would like to compare the difference between the two sources something like

(result)

IP Address Log_Level (current-previous count)

    ipA                 0                   Y1 - X1
    ipA                 1                   Y2 - X2
    ipB                 0                   Y3 - X3
    **ipB                 1                   Y4 - 0**
    ipC                 0                   Y5 - X4
    **ipC                 1                    0 - X5**

Notice that I want to calculate the difference per IP and per log_level (so I need to 'introduce' the count value of 0 when such events are non-existing.

Appendcols didn't work because it combines the resulting rows in sequential order so it mixes e.g. (ipC,0) from source A with (ipB,1) from source B

Append didn't work because it just produces a table with all the rows from both sources like

IP Address Log_Level Count

    ipA                 0                   X1
    ipA                 1                   X2
    ipB                 0                   X3
    ipC                 0                   X4
    ipC                 1                   X5
    ipA                 0                   Y1
    ipA                 1                   Y2
    ipB                 0                   Y3
    ipB                 1                   Y4
    ipC                 0                   Y5

Join seems to be the best candidate, but both left and outer joins filter out the rows of each source that doesn't have a corresponding one on the other table, so in the end result, I am getting something like:

IP Address Log_Level (current-previous count)

    ipA                 0                   Y1 - X1
    ipA                 1                   Y2 - X2
    ipB                 0                   Y3 - X3
    **ipB                 1                   Y4 - 0** (outer) or  **ipC                 1                    0 - X5** (left)
    ipC                 0                   Y5 - X4

Is there any way to 'inject' the '0' values in the stats output if such logs are non-existent or any other possible way?

Any hints? Thanks in advance

0 Karma
1 Solution

lguinn2
Legend

Try this

yoursearchhere (source=A OR source=B)
| stats count(eval(source="A")) as sourceA count(eval(source="B")) as sourceB by ip, log_level, month
| fillnull 
| eval difference=sourceA-sourceB

Still, this won't deliver a row if both sourceA and sourceB don't have a certain log_level for a particular month. But - no need for a subsearch!

View solution in original post

0 Karma

somesoni2
Revered Legend

Try something like this (assuming sourceA (previous month) and sourceB (current month) are hardcoded values)

your base search to include data from both sources | table source ip log_level | eval ip=ip."#".log_level | chart count over ip by source | rex field=ip "<?<ip>.*)#(?<log_level>.*)" | eval different=sourceB-sourceA | table ip log_level difference
0 Karma

lguinn2
Legend

Try this

yoursearchhere (source=A OR source=B)
| stats count(eval(source="A")) as sourceA count(eval(source="B")) as sourceB by ip, log_level, month
| fillnull 
| eval difference=sourceA-sourceB

Still, this won't deliver a row if both sourceA and sourceB don't have a certain log_level for a particular month. But - no need for a subsearch!

0 Karma
Get Updates on the Splunk Community!

Now Playing: Splunk Education Summer Learning Premieres

It’s premiere season, and Splunk Education is rolling out new releases you won’t want to miss. Whether you’re ...

The Visibility Gap: Hybrid Networks and IT Services

The most forward thinking enterprises among us see their network as much more than infrastructure – it's their ...

Get Operational Insights Quickly with Natural Language on the Splunk Platform

In today’s fast-paced digital world, turning data into actionable insights is essential for success. With ...