Hi Folks,
I am trying to figure out how to compare a single field based off another field called timestamp.
I pull in data into Splunk via a JSON file that looks like the following:
{"table": "Route", "timestamp": "2023-11-07T12:25:43.208903", "dst": "10.240.0.0/30"}
{"table": "Route", "timestamp": "2023-11-07T12:25:43.208903", "dst": "10.241.0.0/30"}
{"table": "Route", "timestamp": "2023-11-07T12:25:43.208903", "dst": "10.242.0.0/30"}
{"table": "Route", "timestamp": "2023-11-10T13:12:17.529455", "dst": "10.240.0.0/30"}
{"table": "Route", "timestamp": "2023-11-10T13:12:17.529455", "dst": "10.241.0.0/31"}
{"table": "Route", "timestamp": "2023-11-10T13:12:17.529455", "dst": "10.245.0.0/30"}
There will be tens or hundreds of unique dst values, all with the same timestamp value.
What I'd like to be able to do is compare all dst values based off the timestamp value and compare that against a different set of dst values based off a different timestamp value.
So far, I've been able to do an appendcols + simple eval function to compare stats values from one timestamp to another:
index=<index> host=<host> sourcetype=_json timestamp=2023-11-07T12:25:43.208903
| stats values(dst) as old_prefix
| appendcols [searchindex=<index> host=<host> sourcetype=_json timestamp=2023-11-10T13:12:17.529455 | stats values(dst) as new_prefix]
| eval result=if(old_prefix=new_prefix, "pass","fail")
| table old_prefix new_prefix result
And these are the results I get:
old_prefix | new_prefix | result |
10.240.0.0/30 10.241.0.0/30 10.242.0.0/30 | 10.240.0.0/30 10.241.0.0/31 10.245.0.0/30 | fail |
But what I'd really want to see is something along the lines of this:
old_prefix | new_prefix | result | present_in_old_table | present_in_new_table |
10.240.0.0/30 | 10.240.0.0/30 | pass | ||
10.241.0.0/30 | fail | 10.241.0.0/30 | ||
10.241.0.0/31 | fail | 10.241.0.0/31 | ||
10.242.0.0/30 | fail | 10.242.0.0/30 | ||
10.245.0.0/30 | fail | 10.245.0.0/30 |
Or this:
old_prefix | new_prefix | result | present_in_old_table | present_in_new_table |
10.240.0.0/30 10.241.0.0/30 10.242.0.0/30 | 10.240.0.0/30 10.241.0.0/31 10.245.0.0/30 | fail | 10.241.0.0/30 10.242.0.0/30 | 10.241.0.0/31 10.245.0.0/30 |
Is this something that could be reasonably done inside splunk?
Please let me know if you have any further questions from me.
Hi @ch_payroc,
The diff search command can quickly identify differences between fields, just as the diff program does for files:
| stats values(dst) as dst by _time
``` convert multivalued dst field to multiline field for diff comparison ```
| eval dst=mvjoin(dst, urldecode("%0a"))
| diff attribute=dst
2023-11-07 07:25:43.208 | 10.240.0.0/30 10.241.0.0/30 10.242.0.0/30 | @@ -1,3 +1,3 @@ | 6 |
Using diff context=true will provide slightly different output:
2023-11-07 07:25:43.208 | 10.240.0.0/30 10.241.0.0/30 10.242.0.0/30 | *** 1,3 **** | 8 |
Oh, that diff may just work for my immediate needs!
I do want something a bit more visually pleasing, so I've been looking at multisearches as well, and was able to put this together:
| multisearch
[ search index=<INDEX> host=<HOST> timestamp=<Timestamp 1> | fields timestamp dst ]
[ search index=<INDEX> host=<HOST> timestamp=<Timestamp 2> | fields timestamp dst ]
| stats values(timestamp) as timestamp by dst
| where mvcount(timestamp) = 1
| eval diff=if(timestamp=<Timestamp 1>, "No longer present in latest snapshot", "New route in latest snapshot")
| stats values(dst) by diff
I think I might find a way to use both somehow in a dashboard as long as I can keep the search from getting too complex.
That's the right way to go. But it can be improved!
You don't have to do two separate searches you can search for
index=<INDEX> host=<HOST> timestamp=t1 OR timestamp=t2
or
index=<INDEX> host=<HOST> timestamp IN (t1,t2)
Also while your solution with values(timestamp) is generally right, I prefer to clasify the events with a nummeric classifier (either 1 or 2) and do sum on them. This way you're doing a bit more performant operations. But your search is still OK.