Splunk Search

Compare values at two different timestamps and output diffs into a table

ch_payroc
Loves-to-Learn Lots

Hi Folks,

I am trying to figure out how to compare a single field based off another field called timestamp.

I pull in data into Splunk via a JSON file that looks like the following:

{"table": "Route", "timestamp": "2023-11-07T12:25:43.208903", "dst": "10.240.0.0/30"}
{"table": "Route", "timestamp": "2023-11-07T12:25:43.208903", "dst": "10.241.0.0/30"}
{"table": "Route", "timestamp": "2023-11-07T12:25:43.208903", "dst": "10.242.0.0/30"}

{"table": "Route", "timestamp": "2023-11-10T13:12:17.529455", "dst": "10.240.0.0/30"}
{"table": "Route", "timestamp": "2023-11-10T13:12:17.529455", "dst": "10.241.0.0/31"}
{"table": "Route", "timestamp": "2023-11-10T13:12:17.529455", "dst": "10.245.0.0/30"}

There will be tens or hundreds of unique dst values, all with the same timestamp value.

What I'd like to be able to do is compare all dst values based off the timestamp value and compare that against a different set of dst values based off a different timestamp value.

So far, I've been able to do an appendcols + simple eval function to compare stats values from one timestamp to another:

index=<index> host=<host> sourcetype=_json timestamp=2023-11-07T12:25:43.208903
| stats values(dst) as old_prefix
| appendcols [searchindex=<index> host=<host> sourcetype=_json timestamp=2023-11-10T13:12:17.529455 | stats values(dst) as new_prefix]
| eval result=if(old_prefix=new_prefix, "pass","fail")
| table old_prefix new_prefix result

 And these are the results I get:

old_prefixnew_prefixresult
10.240.0.0/30
10.241.0.0/30
10.242.0.0/30
10.240.0.0/30
10.241.0.0/31
10.245.0.0/30
fail

 

But what I'd really want to see is something along the lines of this:

old_prefixnew_prefixresultpresent_in_old_tablepresent_in_new_table
10.240.0.0/3010.240.0.0/30pass  
10.241.0.0/30 fail10.241.0.0/30 
 10.241.0.0/31fail 10.241.0.0/31
10.242.0.0/30 fail10.242.0.0/30 
 10.245.0.0/30fail  10.245.0.0/30

 

Or this:

old_prefixnew_prefixresultpresent_in_old_tablepresent_in_new_table
10.240.0.0/30
10.241.0.0/30
10.242.0.0/30
10.240.0.0/30
10.241.0.0/31
10.245.0.0/30
fail10.241.0.0/30
10.242.0.0/30
10.241.0.0/31
10.245.0.0/30

 

Is this something that could be reasonably done inside splunk?

Please let me know if you have any further questions from me.

Labels (3)
0 Karma

tscroggins
Influencer

Hi @ch_payroc,

The diff search command can quickly identify differences between fields, just as the diff program does for files:

| stats values(dst) as dst by _time
``` convert multivalued dst field to multiline field for diff comparison ```
| eval dst=mvjoin(dst, urldecode("%0a"))
| diff attribute=dst

 

2023-11-07 07:25:43.20810.240.0.0/30
10.241.0.0/30
10.242.0.0/30
@@ -1,3 +1,3 @@
10.240.0.0/30
-10.241.0.0/30
-10.242.0.0/30
+10.241.0.0/31
+10.245.0.0/30
6

 

Using diff context=true will provide slightly different output:

 

2023-11-07 07:25:43.20810.240.0.0/30
10.241.0.0/30
10.242.0.0/30
*** 1,3 ****
  10.240.0.0/30
! 10.241.0.0/30
! 10.242.0.0/30
--- 1,3 ----
10.240.0.0/30
! 10.241.0.0/31
! 10.245.0.0/30
8

 

0 Karma

ch_payroc
Loves-to-Learn Lots

Oh, that diff may just work for my immediate needs!

I do want something a bit more visually pleasing, so I've been looking at multisearches as well, and was able to put this together:

| multisearch 
[ search index=<INDEX> host=<HOST> timestamp=<Timestamp 1> | fields timestamp dst ]
[ search index=<INDEX> host=<HOST> timestamp=<Timestamp 2> | fields timestamp dst ]
| stats values(timestamp) as timestamp by dst
| where mvcount(timestamp) = 1
| eval diff=if(timestamp=<Timestamp 1>, "No longer present in latest snapshot", "New route in latest snapshot")
| stats values(dst) by diff

I think I might find a way to use both somehow in a dashboard as long as I can keep the search from getting too complex. 

0 Karma

PickleRick
SplunkTrust
SplunkTrust

That's the right way to go. But it can be improved!

You don't have to do two separate searches you can search for

index=<INDEX> host=<HOST> timestamp=t1 OR timestamp=t2

or

index=<INDEX> host=<HOST> timestamp IN (t1,t2)

Also while your solution with values(timestamp) is generally right, I prefer to clasify the events with a nummeric classifier (either 1 or 2) and do sum on them. This way you're doing a bit more performant operations. But your search is still OK.

0 Karma
Get Updates on the Splunk Community!

Splunk Edge Processor | Popular Use Cases to Get Started with Edge Processor

Splunk Edge Processor offers more efficient, flexible data transformation – helping you reduce noise, control ...

Introducing New Splunkbase Governance!

Splunk apps are essential for maximizing the value of your Splunk Experience. Whether you’re using the default ...

3 Ways to Make OpenTelemetry Even Better

My role as an Observability Specialist at Splunk provides me with the opportunity to work with customers of ...