Getting Data In

Comparing CSVs day over day in Splunk?

daniel333
Builder

All,

I am reading in a CSV daily into index=main. It will have about 100k items in it. I want an alert for any added, removed file_name. OR an MD5 has changed. I have them brought in as their own event each.

CSV format is easy -
file_name, md5_hash

Looking for a snappy search to compare these files? Any samples or commands or advice?

0 Karma

woodcock
Esteemed Legend

Like this:

Your search here that has both data sets | stats dc(md5_hash) AS md5_hash_count count by file_name | search count<2 OR md5_hash_count>1
0 Karma

HiroshiSatoh
Champion

How about this?

index="main" sourcetype="csv" earliest=@d latest=+1d@d|join type=left file_name [search index="main" sourcetype="csv" earliest=-1d@d latest=@d|rename md5_hash as old_md5_hash]
|table file_name md5_hash old_md5_hash
|eval status=case(isnull(old_md5_hash),"ADD",md5_hash!=old_md5_hash,"UPDATE",md5_hash=old_md5_hash,"-")
| append [search index="main" sourcetype="csv" earliest=-1d@d latest=@d NOT [search index="main" sourcetype="csv" earliest=@d latest=+1d@d | fields file_name ]|table file_name md5_hash|eval status="DELETE"]

0 Karma
Get Updates on the Splunk Community!

Building Reliable Asset and Identity Frameworks in Splunk ES

 Accurate asset and identity resolution is the backbone of security operations. Without it, alerts are ...

Cloud Monitoring Console - Unlocking Greater Visibility in SVC Usage Reporting

For Splunk Cloud customers, understanding and optimizing Splunk Virtual Compute (SVC) usage and resource ...

Automatic Discovery Part 3: Practical Use Cases

If you’ve enabled Automatic Discovery in your install of the Splunk Distribution of the OpenTelemetry ...