Solved: How to get events that have only a starting 'string' with no ending 'string' ?

sjringo · ‎02-29-2024

I am trying to write SPL that identifies jobs that have a "Starting" event but no "Completed" event. The jobs are identified by a job name extracted from each log event, and all events are in the same index and sourcetype.

A job is still 'running' if it has only a "Starting" event and no "Completed" event.

My starting query is:

index=anIndex sourcetype=aSourcetype (jobName1 OR jobName2 OR jobName3) AND "Starting"
| rex field=_raw "Batch::(?<aJobName1>[^\s]*)"
| stats count AS aCount1 by aJobName1

Then I only want to keep log events that have no "Completed" event from the same index and sourcetype:

index=anIndex sourcetype=aSourcetype (jobName1 OR jobName2 OR jobName3) AND "Completed"
| rex field=_raw "Batch::(?<aJobName2>[^\s]*)"
| stats count AS aCount2 by aJobName2

I have tried combining the two searches with appendcols and then filtering with where isnull(aCount2), but stats removes the _raw data that I need for the rest of my code.

How would I go about getting just the log events (_raw) for jobs that are only "Started"?
I might be overthinking this, but I am struggling...

PickleRick · ‎02-29-2024

Yep. You're overthinking it a bit. Either you already have a field containing the job state (Starting/Completed), or you can create one with:

| eval state=case(searchmatch("Starting"),"Starting",searchmatch("Completed"),"Completed",1=1,null())

Then you need to check which states were seen for each separate job:

| stats values(state) as states by whatever_id_you_have_for_each_job

(If you want to retain the job name, which I assume is a more general classifier than a single job identifier, add values(aJobName) to that stats command.)

Then you can filter to see only non-finished jobs with:

| where NOT states="Completed"

Keep in mind that matching multivalued fields can be a bit unintuitive at first.
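
Putting the steps above together, a minimal sketch might look like the following. It assumes the job name extracted by the rex from the question is what identifies each job (the field name aJobName is illustrative), and it counts the states per job instead of comparing against the multivalued field, which sidesteps the matching question. The names started and completed are made up for this example.

```
index=anIndex sourcetype=aSourcetype (jobName1 OR jobName2 OR jobName3) ("Starting" OR "Completed")
| rex field=_raw "Batch::(?<aJobName>[^\s]*)"
| eval state=case(searchmatch("Starting"),"Starting",searchmatch("Completed"),"Completed")
| stats count(eval(state="Starting")) as started count(eval(state="Completed")) as completed by aJobName
| where started>0 AND completed=0
```

If you would rather keep the values(state) approach, a check such as where isnull(mvfind(states, "^Completed$")) is one explicit way to say "none of the values is Completed".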

sjringo · ‎02-29-2024

That makes sense. I wanted to create some additional fields for the output and was getting hung up on the usage of | stats. I had to switch it to | eventstats to retain the _raw data for the rest of the code after the stats/eventstats.

You have helped me before, PickleRick, and you always provide good info!

Works like a charm, thanks again!

| rex field=_raw "Batch::(?<aJobName>[^\s]*)"
| eval aStatus=case(
    searchmatch("START of script"), "Start",
    searchmatch("COMPLETED OK"), "End",
    searchmatch("ABORTED, exiting with status"), "End",
    true(), null()
  )
| eventstats values(aStatus) as aStateList by aJobName
| where aStateList != "End"

|........

PickleRick · ‎03-01-2024

Yes, eventstats can sometimes be used when you need to retain the original events. But remember that eventstats is a "heavier" command than a plain stats: it has to keep all the events and add the summarized data to each of them, so it potentially needs far more resources than a simple stats. It also does not distribute as well.
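
One way to stay with stats and still carry the original event text is to copy _raw into a field before aggregating. This is only a sketch, assuming the aJobName and aStatus fields from the search above. The names startEvent, startEvents and endCount are made up for the example, and whether this is actually cheaper than eventstats in your environment is something to measure rather than assume.

```
| rex field=_raw "Batch::(?<aJobName>[^\s]*)"
| eval aStatus=case(
    searchmatch("START of script"), "Start",
    searchmatch("COMPLETED OK"), "End",
    searchmatch("ABORTED, exiting with status"), "End")
| eval startEvent=if(aStatus="Start", _raw, null())
| stats count(eval(aStatus="End")) as endCount values(startEvent) as startEvents by aJobName
| where endCount=0
```

The result is one row per unfinished job, with the raw "Start" events collected in startEvents, rather than the original events themselves. If you need one row per event again, | mvexpand startEvents is one option.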
