Splunk Search

How to search the difference between the stats output per field and fill "0" values in the stats output if an event does not exist?

inbroker
New Member

I have two source files, SourceA and SourceB, representing different months e.g. logs from June and July

Each source contains logs, and among other fields, it has:
ip, log_level, month

Doing a stats count by ip, log_level per source, I get results like:

(sourceA)

IP Address Log_Level Count

    ipA                 0                   X1
    ipA                 1                   X2
    ipB                 0                   X3
    ipC                 0                   X4
    ipC                 1                   X5

(sourceB)

IP Address Log_Level Count

    ipA                 0                   Y1
    ipA                 1                   Y2
    ipB                 0                   Y3
    ipB                 1                   Y4
    ipC                 0                   Y5

Note that it may happen that a certain IP doesn't have logs of a certain log_level on some months.

I would like to compare the difference between the two sources something like

(result)

IP Address Log_Level (current-previous count)

    ipA                 0                   Y1 - X1
    ipA                 1                   Y2 - X2
    ipB                 0                   Y3 - X3
    **ipB                 1                   Y4 - 0**
    ipC                 0                   Y5 - X4
    **ipC                 1                    0 - X5**

Notice that I want to calculate the difference per IP and per log_level (so I need to 'introduce' the count value of 0 when such events are non-existing.

Appendcols didn't work because it combines the resulting rows in sequential order so it mixes e.g. (ipC,0) from source A with (ipB,1) from source B

Append didn't work because it just produces a table with all the rows from both sources like

IP Address Log_Level Count

    ipA                 0                   X1
    ipA                 1                   X2
    ipB                 0                   X3
    ipC                 0                   X4
    ipC                 1                   X5
    ipA                 0                   Y1
    ipA                 1                   Y2
    ipB                 0                   Y3
    ipB                 1                   Y4
    ipC                 0                   Y5

Join seems to be the best candidate, but both left and outer joins filter out the rows of each source that doesn't have a corresponding one on the other table, so in the end result, I am getting something like:

IP Address Log_Level (current-previous count)

    ipA                 0                   Y1 - X1
    ipA                 1                   Y2 - X2
    ipB                 0                   Y3 - X3
    **ipB                 1                   Y4 - 0** (outer) or  **ipC                 1                    0 - X5** (left)
    ipC                 0                   Y5 - X4

Is there any way to 'inject' the '0' values in the stats output if such logs are non-existent or any other possible way?

Any hints? Thanks in advance

0 Karma
1 Solution

lguinn2
Legend

Try this

yoursearchhere (source=A OR source=B)
| stats count(eval(source="A")) as sourceA count(eval(source="B")) as sourceB by ip, log_level, month
| fillnull 
| eval difference=sourceA-sourceB

Still, this won't deliver a row if both sourceA and sourceB don't have a certain log_level for a particular month. But - no need for a subsearch!

View solution in original post

0 Karma

somesoni2
Revered Legend

Try something like this (assuming sourceA (previous month) and sourceB (current month) are hardcoded values)

your base search to include data from both sources | table source ip log_level | eval ip=ip."#".log_level | chart count over ip by source | rex field=ip "<?<ip>.*)#(?<log_level>.*)" | eval different=sourceB-sourceA | table ip log_level difference
0 Karma

lguinn2
Legend

Try this

yoursearchhere (source=A OR source=B)
| stats count(eval(source="A")) as sourceA count(eval(source="B")) as sourceB by ip, log_level, month
| fillnull 
| eval difference=sourceA-sourceB

Still, this won't deliver a row if both sourceA and sourceB don't have a certain log_level for a particular month. But - no need for a subsearch!

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...