Currently we are having issues with our scan data comming in to out indexer, so we have to use CSV's for scan data 😐. The data from CSV's we are uploading into Splunk look like this:
Scan Date Vuln Blah
Feb 11, 2021 11:30:29 EST 4 15
Feb 18, 2021 11:30:29 EST 10 15
I want to pull only the newest scan data, in this case " Feb 18, 2021 11:30:29 EST"? It doesnt appear "strp time" can run on this date format because of the EST at the end. I know "substr" exists, but it appears it only works on field names, not field values.
Any ideas?
Thanks
The tail command does not examine fields - it should return the 1 line from the end of the CSV, but that's a moot point now that you've clarified the requirements.
To get the latest date for each IP address we need to convert the timestamps into epoch form so we can find the most recent one. Here's how we do that.
| inputcsv my.csv
| eval ts=strptime(SCAN_DATE, "%b %d, %Y %H:%M:%S %Z")
| eventstats max(ts) as maxts by IP
| where ts=maxts
| fields - ts maxts
The strptime function does indeed recognize the EST time zone. Try this
| eval ts=strptime('Scan Date', "%b %d, %Y %H:%M:%S %Z")
How do I pull all data from the most recent date? I only want to see field values from the most recent date, Feb 18, 2021 11:30:29 EST
To get the most recent date from the CSV, use inputcsv followed by tail.
| inputcsv my.csv
| tail 1
My apologies, I should have been more specific and added more data to the table to better understand the challenge I'm having:
IP SCAN_DATE Data
192.168.1.1 FEB 16, 2021 11:30:29 EST 1
192.168.1.1 FEB 16, 2021 11:30:29 EST 2
192.168.1.1 FEB 13, 2021 11:30:29 EST 5
192.168.1.2 FEB 11, 2021 11:30:29 EST 3
192.168.1.2 FEB 14, 2021 11:30:29 EST 4
192.168.1.2 FEB 14, 2021 11:30:29 EST 6
192.168.1.3 FEB 12, 2021 11:30:29 EST 8
192.168.1.3 FEB 18, 2021 11:30:29 EST 20
192.168.1.3 FEB 18, 2021 11:30:29 EST 21
If I run:
| inputcsv my.csv
| tail 1
I'll get:
IP Scan_Date Data
192.168.1.1 FEB 16, 2021 11:30:29 EST 1
192.168.1.2 FEB 14, 2021 11:30:29 EST 4
192.168.1.3 FEB 18, 2021 11:30:29 EST 20
Tail only pulls the first field value for the latest scan_date it sees for each IP. I need all field values (all data) for each IP for the newest date. The search would return something like this (it would ignore the older SCAN_DATE for each IP):
IP SCAN_DATE Data
192.168.1.1 FEB 16, 2021 11:30:29 EST 1
192.168.1.1 FEB 16, 2021 11:30:29 EST 2
192.168.1.2 FEB 14, 2021 11:30:29 EST 4
192.168.1.2 FEB 14, 2021 11:30:29 EST 6
192.168.1.3 FEB 18, 2021 11:30:29 EST 20
192.168.1.3 FEB 18, 2021 11:30:29 EST 21
Any ideas?
The tail command does not examine fields - it should return the 1 line from the end of the CSV, but that's a moot point now that you've clarified the requirements.
To get the latest date for each IP address we need to convert the timestamps into epoch form so we can find the most recent one. Here's how we do that.
| inputcsv my.csv
| eval ts=strptime(SCAN_DATE, "%b %d, %Y %H:%M:%S %Z")
| eventstats max(ts) as maxts by IP
| where ts=maxts
| fields - ts maxts
That works! I just changed max to min to retrieve most recent scan data.
| inputcsv my.csv
| eval ts=strptime(SCAN_DATE, "%b %d, %Y %H:%M:%S %Z")
| eventstats min(ts) as maxts by IP
| where ts=maxts
| fields - ts maxts
Thanks!
It should be max() since the most recent timestamp will convert into the biggest number, but whatever works for you.