I am trying to write some SPL that identifies jobs that have a "Starting" event but no "Completed" event. The job name is extracted from each log event, and all the events are in the same index and sourcetype.
A job is still 'running' if it only has a "Starting" event with no "Completed" event.
My starting query is:
index=anIndex sourcetype=aSourcetype (jobName1 OR jobName2 OR jobName3) AND "Starting"
| rex field=_raw "Batch::(?<aJobName1>[^\s]*)"
| stats count AS aCount1 by aJobName1
Then I only want to keep log events that have no "Completed" event from the same index and sourcetype:
index=anIndex sourcetype=aSourcetype (jobName1 OR jobName2 OR jobName3) AND "Completed"
| rex field=_raw "Batch::(?<aJobName2>[^\s]*)"
| stats count AS aCount2 by aJobName2
I have tried combining the two searches with appendcols and then filtering with:
where isnull(aCount2)
but stats removes the _raw data that I need for the rest of my code.
How would I go about getting just those log events (_raw) for jobs that are only "Started"?
I might be overthinking this, but I am struggling...
Yep. You're overthinking it a bit. Either you have a field containing the job state (Starting/Completed) or you can create one by
| eval state=case(searchmatch("Starting"),"Starting",searchmatch("Completed"),"Completed",1=1,null())
Then you need to check the state for each separate job
| stats values(state) as states by whatever_id_you_have_for_each_job
(If you want to retain the job name, which I assume is a more general classifier than a single job identifier, add values(aJobName) to that stats command.)
Then you can filter to see only non-finished jobs by
| where NOT states="Completed"
Keep in mind that matching multivalued fields can be a bit unintuitive at first.
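Putting the steps above together, a minimal sketch could look like the following. The index, sourcetype, job names and the Batch:: extraction are taken from the question. Using aJobName as the per-job key is an assumption (substitute whatever identifier distinguishes individual job runs), and the mvfind() filter is shown as an explicit alternative to comparing the multivalued field directly: it returns null when no value of states matches the regex.

```spl
index=anIndex sourcetype=aSourcetype (jobName1 OR jobName2 OR jobName3) ("Starting" OR "Completed")
| rex field=_raw "Batch::(?<aJobName>[^\s]*)"
| eval state=case(searchmatch("Starting"),"Starting",searchmatch("Completed"),"Completed")
| stats values(state) as states by aJobName
| where isnull(mvfind(states,"^Completed$"))
```

This returns one row per job that has no "Completed" state. The original events are not retained, because stats only keeps the fields it is asked to compute.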
That makes sense. I wanted to create some additional fields for the output and was getting hung up on the usage of | stats. I had to switch it to | eventstats to retain the _raw data for the rest of the code after the stats/eventstats.
You have helped me before PickleRick and always provide good info !!!
Works like a charm, thanks again !
| rex field=_raw "Batch::(?<aJobName>[^\s]*)"
| eval aStatus=case(
    searchmatch("START of script"), "Start",
    searchmatch("COMPLETED OK"), "End",
    searchmatch("ABORTED, exiting with status"), "End",
    true(), null()
)
| eventstats values(aStatus) as aStateList by aJobName
| where aStateList != "End"
|........
Yes, eventstats can sometimes be used when you need to retain the original events, but remember that eventstats is a "heavier" command than a plain stats. It has to keep all the events and add the summarized data to each of them, so it potentially needs far more resources than a simple stats. It is also not as well distributable.
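If eventstats turns out to be too expensive, one possible alternative is to stay with stats and carry the raw text along as a multivalue field. This is only a sketch under assumptions: rawEvents is a hypothetical field name, the status strings are the ones from the query above, and the result is one row per job rather than the original events.

```spl
| rex field=_raw "Batch::(?<aJobName>[^\s]*)"
| eval aStatus=case(searchmatch("START of script"),"Start",searchmatch("COMPLETED OK"),"End",searchmatch("ABORTED, exiting with status"),"End")
| stats values(aStatus) as aStateList values(_raw) as rawEvents by aJobName
| where isnull(mvfind(aStateList,"^End$"))
```

Note that values() de-duplicates and sorts its output, so identical raw lines collapse into one and the original event order is lost. Appending | mvexpand rawEvents would turn the result back into one row per raw line if that is needed downstream.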