- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How do I search for outdated data in the index and delete it, retaining only the latest information?
Hello,
I'm working on IOC but unfortunately, keeping them in a lookup table is already getting messy and we have to index them now, so I have this query running every hour to check if our Threat Intel source has an updated IOC or if any of those IOC that has been recorded already is now active again based on the information collected from Threat Intel and then writing them now in one of our indexes using the "collect" command.
Keeping them in the lookup table and deleting the old or duplicate IOC based on their "last_updated_date" is quite easy, however since we have to move them now or write them now in an index, we cannot avoid having duplicate IOC being written in that index like, for example, an IOC has been active in the wild for a week or for a month and the Threat Intel source will keep us posted whenever there's a new update, so if updates happened for an IOC 5 times for a week, then we have 5 duplicate IOCs in our index having 5 different "last_updated_date" including the time when the Threat Intel gives update like for example within 24 hours.
So my plan is to have an SPL query that will run once or twice a day to delete those old IOCs with old "last_updated_date" but my question is, how I can run a search that will only show up those old data (not showing the IOC with the latest data in the result) and then I will add in the end the "delete" command.
the example below "4/18/2023" is the latest
Indicator | last_updated_date |
abc123 | 4/18/2023 |
abc123 | 4/17/2023 |
abc123 | 4/16/2023 |
abc123 | 4/15/2023 |
abc123 | 4/14/2023 |
requirement: to have a query that will give a result like the one below (knowing that the latest one is not included but still it is somewhere in the index and to make sure not to be deleted when we add the "delete" command at the end of the possible query)
Indicator | last_updated_date |
abc123 | 4/17/2023 |
abc123 | 4/16/2023 |
abc123 | 4/15/2023 |
abc123 | 4/14/2023 |
The one below is an example of the current query I have
index=threatintel indicator="0.tcp.ap.ngrok.io"
| stats latest(indicator) latest(last_updated_date) by indicator
But this one is showing only the latest one even though the indicator has been written in my index 12 times already, so what I'm trying to do is to list out all of the 11 indicators and just retain the 1 with the latest "last_updated_date"
------------------------------------------------------------------------------
I have another sample but this one is already good as my target here is to delete the IOC with more than 90 days of "last_updated_date" - you will see at the bottom that I have written the "delete" command as this query is showing me the events that are more 90 days old and not active anymore in the wild and also good to be deleted in our index
index=threatintel
| eval ninety_days_ago = relative_time(now(), "-90d@d")
| eval last_updated_date = strptime(last_updated_date,"%Y-%m-%d %H:%M:%S")
| where last_updated_date < ninety_days_ago
| eval last_updated_date = strftime(last_updated_date, "%Y-%m-%d %H:%M:%S")
| table indicator ioc_type last_updated_date
| delete
Hope someone can help
Best regards
wvpony
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @somesoni2, I apologize for needing to return to this issue and modify my response. After obtaining the necessary permissions to execute the 'delete' command in our testing and production environments, I encountered an error stating:
"Error in 'delete' command: This command cannot be invoked after the command 'eventstats', which is not distributable streaming."
Are there any other possible workarounds you can suggest? The query you provided indeed gives me the desired result. For instance, if there are 100 duplicates, running the SPL will only return 99 results. However, when I add the '| delete' command, that's when the issue arises.
Thank you for your assistance.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Give this a try (run it without delete command first to validate result)
index=threatintel
| eventstats latest(last_updated_date) as latest_update_date by indicator
| where NOT last_updated_date=latest_update_date
| delete