Python: Keeping a Rolling Window of Events

trenin
Explorer

For our solution, we need to index a number of events and delete them once they get too old; in our implementation, that means somewhere between 6 months and a year. Each event references a file on the file system, which must also be deleted when the event is deleted. The indexed events are JSON files.

I've implemented something that works using a 'one-shot' search, but I would like some feedback to see if there are alternatives that might work better for my use case.

I have a Python script that connects to my Splunk instance. A cron job runs it once a day, and it executes the following with the Splunk API:

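import json

import splunklib.client as client
import splunklib.results as results

# Init connection to Splunk
service = client.connect(
    host=SPLUNK_HOST,
    port=SPLUNK_PORT,
    username=SPLUNK_USER,
    password=SPLUNK_PASSWD)

# Get all the events older than a year.
kwargs_searcholder = {"latest_time": "-1y"}
# Options for the per-event delete (assumed: same time bound as the search)
kwargs_deleteone = {"latest_time": "-1y"}
search = "search index=custom"
search_results = service.jobs.oneshot(search, **kwargs_searcholder)
# ResultsReader lives in splunklib.results
search_reader = results.ResultsReader(search_results)
for item in search_reader:
    # The reader can also yield diagnostic Message objects; skip those
    if not isinstance(item, dict):
        continue
    jsondata = json.loads(item['_raw'])
    event_id = jsondata['id']
    # Delete the file on the file system (delete_file is defined elsewhere)
    delete_file(event_id)
    # Delete the event in Splunk
    service.jobs.oneshot('{} id="{}" | delete'.format(search, event_id), **kwargs_deleteone)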
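
The delete_file helper is not shown above. Because the loop deletes the file first and the event second, a run that stops partway can leave an event whose file is already gone, so the next run will call the helper again for the same id. A minimal sketch of a helper that tolerates that is below. The directory FILE_ROOT and the "<id>.json" naming are hypothetical; the real mapping from id to path is not stated here.

```python
import os

# Hypothetical location of the files referenced by the events.
FILE_ROOT = "/var/data/events"


def delete_file(event_id, root=FILE_ROOT):
    """Delete the file for event_id; return True if a file was removed.

    A missing file is not an error, so a repeated run is harmless.
    Assumes (hypothetically) that files are named "<id>.json" under root.
    """
    name = str(event_id)
    # Refuse ids that would point outside root (e.g. "../x" or "a/b").
    if name in ("", ".", "..") or os.path.basename(name) != name:
        raise ValueError("unsafe event id: {!r}".format(event_id))
    path = os.path.join(root, name + ".json")
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        # Already deleted by an earlier, interrupted run.
        return False
```

With this shape, the worst case after a crash is an event that lingers in Splunk for one more day, rather than a file that lingers on disk with no event pointing at it.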

I have a few questions about this.

Q1: Is there a way to do this within Splunk instead of creating a cron job? I need to do more than just run a search; I also need to delete a file on the file system.

Q2: If the search returns a lot of records, I invoke the API once per record to perform the delete. Is there some way to delete exactly the records returned by the search without specifying them all individually? I know I could just pipe the search through delete, but the procedure takes a non-trivial amount of time, so some events that were not old enough at the time of the first search will be old enough by the time of the second. Those events would be deleted in Splunk without their files being removed, leaving orphaned files on disk.

Q3: Is this a good candidate for a saved search?
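
For Q2, two ideas might narrow the problem, sketched here under assumptions rather than as a tested solution. First, compute the cutoff once as an absolute epoch time and pass it as latest_time to both searches, so the bound does not slide forward the way "-1y" does between calls. Second, delete in batches by putting several ids into one search with the SPL IN operator. The function names below are made up for illustration, and the sketch assumes ids contain no double quotes. An event that is indexed late with an old timestamp could still appear between the two searches.

```python
import time

SECONDS_PER_DAY = 86400


def cutoff_kwargs(retention_days, now=None):
    """Build search kwargs with a fixed, absolute latest_time (epoch seconds)."""
    if now is None:
        now = time.time()
    cutoff = int(now) - retention_days * SECONDS_PER_DAY
    return {"latest_time": str(cutoff)}


def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_query(base_search, ids):
    """Build one search that deletes every event whose id is in `ids`.

    Assumes ids contain no double quotes.
    """
    quoted = ", ".join('"{}"'.format(i) for i in ids)
    return "{} id IN ({}) | delete".format(base_search, quoted)


# Example: one fixed cutoff, then one delete search per batch of ids.
kwargs_fixed = cutoff_kwargs(365)
for batch in chunked(["a1", "a2", "a3"], 2):
    query = delete_query("search index=custom", batch)
    # service.jobs.oneshot(query, **kwargs_fixed) would run it.
```

The same kwargs_fixed would be passed to the initial search and to each delete, so both are bounded by the same point in time.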
