Splunk Search

How do I search for outdated data in the index and delete it, retaining only the latest information?

wvpony
Engager

Hello,

I'm working on IOC but unfortunately, keeping them in a lookup table is already getting messy and we have to index them now, so I have this query running every hour to check if our Threat Intel source has an updated IOC or if any of those IOC that has been recorded already is now active again based on the information collected from Threat Intel and then writing them now in one of our indexes using the "collect" command.

Keeping them in the lookup table and deleting the old or duplicate IOC based on their "last_updated_date" is quite easy, however since we have to move them now or write them now in an index, we cannot avoid having duplicate IOC being written in that index like, for example, an IOC has been active in the wild for a week or for a month and the Threat Intel source will keep us posted whenever there's a new update, so if updates happened for an IOC 5 times for a week, then we have 5 duplicate IOCs in our index having 5 different "last_updated_date" including the time when the Threat Intel gives update like for example within 24 hours.

So my plan is to have an SPL query that will run once or twice a day to delete those old IOCs with old "last_updated_date" but my question is, how I can run a search that will only show up those old data (not showing the IOC with the latest data in the result) and then I will add in the end the "delete" command.

the example below "4/18/2023" is the latest

Indicator last_updated_date
abc123 4/18/2023
abc123 4/17/2023
abc123 4/16/2023
abc123 4/15/2023
abc123 4/14/2023

 

requirement: to have a query that will give a result like the one below (knowing that the latest one is not included but still it is somewhere in the index and to make sure not to be deleted when we add the "delete" command at the end of the possible query) 

Indicator last_updated_date
abc123 4/17/2023
abc123 4/16/2023
abc123 4/15/2023
abc123 4/14/2023

 

The one below is an example of the current query I have

index=threatintel indicator="0.tcp.ap.ngrok.io"
| stats latest(indicator) latest(last_updated_date) by indicator

But this one is showing only the latest one even though the indicator has been written in my index 12 times already, so what I'm trying to do is to list out all of the 11 indicators and just retain the 1 with the latest "last_updated_date"

------------------------------------------------------------------------------
I have another sample but this one is already good as my target here is to delete the IOC with more than 90 days of "last_updated_date" - you will see at the bottom that I have written the "delete" command as this query is showing me the events that are more 90 days old and not active anymore in the wild and also good to be deleted in our index

index=threatintel
| eval ninety_days_ago = relative_time(now(), "-90d@d")
| eval last_updated_date = strptime(last_updated_date,"%Y-%m-%d %H:%M:%S")
| where last_updated_date < ninety_days_ago
| eval last_updated_date = strftime(last_updated_date, "%Y-%m-%d %H:%M:%S")
| table indicator ioc_type last_updated_date
| delete

Hope someone can help 

Best regards

wvpony

Labels (1)
0 Karma

wvpony
Engager

Hi @somesoni2, I apologize for needing to return to this issue and modify my response. After obtaining the necessary permissions to execute the 'delete' command in our testing and production environments, I encountered an error stating:

"Error in 'delete' command: This command cannot be invoked after the command 'eventstats', which is not distributable streaming."

Are there any other possible workarounds you can suggest? The query you provided indeed gives me the desired result. For instance, if there are 100 duplicates, running the SPL will only return 99 results. However, when I add the '| delete' command, that's when the issue arises.

Thank you for your assistance.

0 Karma

somesoni2
Revered Legend

Give this a try (run it without delete command first to validate result)

index=threatintel 
| eventstats latest(last_updated_date)  as latest_update_date by indicator
| where NOT last_updated_date=latest_update_date
| delete

 

Get Updates on the Splunk Community!

Video | Welcome Back to Smartness, Pedro

Remember Splunk Community member, Pedro Borges? If you tuned into Episode 2 of our Smartness interview series, ...

Detector Best Practices: Static Thresholds

Introduction In observability monitoring, static thresholds are used to monitor fixed, known values within ...

Expert Tips from Splunk Education, Observability in Action, Plus More New Articles on ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...