Getting Data In

Comparing CSVs day over day in Splunk?

daniel333
Builder

All,

I am reading in a CSV daily into index=main. It will have about 100k items in it. I want an alert for any added, removed file_name. OR an MD5 has changed. I have them brought in as their own event each.

CSV format is easy -
file_name, md5_hash

Looking for a snappy search to compare these files? Any samples or commands or advice?

0 Karma

woodcock
Esteemed Legend

Like this:

Your search here that has both data sets | stats dc(md5_hash) AS md5_hash_count count by file_name | search count<2 OR md5_hash_count>1
0 Karma

HiroshiSatoh
Champion

How about this?

index="main" sourcetype="csv" earliest=@d latest=+1d@d|join type=left file_name [search index="main" sourcetype="csv" earliest=-1d@d latest=@d|rename md5_hash as old_md5_hash]
|table file_name md5_hash old_md5_hash
|eval status=case(isnull(old_md5_hash),"ADD",md5_hash!=old_md5_hash,"UPDATE",md5_hash=old_md5_hash,"-")
| append [search index="main" sourcetype="csv" earliest=-1d@d latest=@d NOT [search index="main" sourcetype="csv" earliest=@d latest=+1d@d | fields file_name ]|table file_name md5_hash|eval status="DELETE"]

0 Karma
Get Updates on the Splunk Community!

Enhance Security Visibility with Splunk Enterprise Security 7.1 through Threat ...

(view in My Videos)Struggling with alert fatigue, lack of context, and prioritization around security ...

Troubleshooting the OpenTelemetry Collector

  In this tech talk, you’ll learn how to troubleshoot the OpenTelemetry collector - from checking the ...

Adoption of Infrastructure Monitoring at Splunk

  Splunk's Growth Engineering team showcases one of their first Splunk product adoption-Splunk Infrastructure ...