job status polling via external API

mitag · ‎10-12-2020

What are the best practices for collecting job statuses in Splunk via an external API?

(I am not sure I am asking the right question, or asking the question correctly - so please bear with me.)

With a log file, Splunk ingests only what has been appended since the last ingest, not the entire file. API polling is trickier: even if the last record is unchanged, earlier records (job statuses) may still refer to jobs that are in progress, and their updated statuses need to be ingested into Splunk. My initial impulse is to write the Python polling script (as part of a "Scripted Input") as follows:

1. Poll the API, capture the statuses of all jobs, and write them to a file.
2. On the next poll, call the API again, read the "states" file, determine what has changed, and send only the updated records to Splunk.
3. Update the "states" file with the new data.
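
To make the steps above concrete, here is a minimal sketch of what I have in mind. It assumes the script runs as a scripted input (so whatever it prints to stdout gets indexed) and that each record has a stable "id". The names run_once, fetch_jobs and the state file path are placeholders of mine, not anything Splunk or the API provides; fetch_jobs stands for whatever function calls the API and returns the parsed list of job records.

```python
import json
import os
import sys
import tempfile


def load_state(path):
    """Return the previously saved {job id: record} map, or {} on the first run."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_state(path, state):
    """Write to a temp file and rename it, so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def changed_records(previous, jobs):
    """Return the jobs that are new or differ from the saved copy."""
    return [job for job in jobs if previous.get(str(job["id"])) != job]


def run_once(fetch_jobs, state_path, out=sys.stdout):
    """One polling cycle: fetch, emit only changed records, save the new state."""
    previous = load_state(state_path)
    jobs = fetch_jobs()  # hypothetical callable that wraps the API request
    for job in changed_records(previous, jobs):
        # One JSON object per line, so each change becomes its own event.
        out.write(json.dumps(job, sort_keys=True) + "\n")
    save_state(state_path, {str(job["id"]): job for job in jobs})
```

With this, a second poll that returns identical data prints nothing, and a job whose "correct" or "progress" field changed is printed again in full.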

Is there a simpler way?

Thanks!

P.S. Sample data that a Python script collects via an API call:

[{"id":"1","fileName":"257158727.mpg","scheduledAt":"Jul 31, 2020 6:51:17 AM","status":"Finished","result":"Failure","correct":"Run correction|10058","progress":"0|00000173a5242","startTime":"Jul 31, 2020 6:51:20 AM","completionTime":"Jul 31, 2020 7:07:45 AM"},
{"id":"2","fileName":"257164625.ts","scheduledAt":"Jul 31, 2020 6:11:50 AM","status":"Finished","result":"Failure","correct":"Correction in Progress||00000173a5000","progress":"86|843|00000173a5000","startTime":"Jul 31, 2020 6:11:53 AM","completionTime":"Jul 31, 2020 6:53:35 AM"},
{"id":"3","fileName":"257166304.ts","scheduledAt":"Jul 31, 2020 5:03:05 AM","status":"Finished","result":"Failure","correct":"correction completed|00000173a4c11","progress":"100|00000173a4c11","startTime":"Jul 31, 2020 5:03:07 AM","completionTime":"Jul 31, 2020 6:44:23 AM"}]

Note that the "status" and "result" fields are rather meaningless when determining whether the job has finished. Instead I must extract the first pipe-delimited segment of the "correct" field and make the determination based on its value: if it contains "Correction in Progress", the job is in progress; anything else means it is done.
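
In Python, that check could look roughly like the following. This is only a sketch of the rule as I described it; is_in_progress is a name I made up, and I am assuming the match is case-sensitive and that a missing "correct" field means the job is not in progress.

```python
def is_in_progress(job):
    """True if the first pipe-delimited segment of "correct" says the job is running."""
    first_segment = job.get("correct", "").split("|", 1)[0]
    return "Correction in Progress" in first_segment


# Values taken from the "correct" fields in the sample data above.
samples = [
    {"id": "1", "correct": "Run correction|10058"},
    {"id": "2", "correct": "Correction in Progress||00000173a5000"},
    {"id": "3", "correct": "correction completed|00000173a4c11"},
]
in_progress_ids = [job["id"] for job in samples if is_in_progress(job)]
```

For the three sample records only the job with id "2" counts as in progress.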

P.P.S. The sample data is from Interra Systems' Baton Content Corrector. The data format (job or task UUID, status, timestamps, other metadata) is very common across most job and session tracking systems (transcoding farms, file transfer platforms, etc.), where the goal is to detect anomalies, issues, and stuck jobs.

P.P.P.S. I am assuming the best practice is to follow the "Example script that polls a database" except modify it for my purposes; my hope is that there's yet another "best practice" on top of it as polling job statuses is conceptually different from "tailing" a database.
