Our tool has root, parent and child jobs, which we monitor using Splunk. A short example:
Job JobId="1" ParentJob="0"
Job JobId="2" ParentJob="1"
Job JobId="3" ParentJob="1"
Job JobId="4" ParentJob="2"
Job JobId="5" ParentJob="3"
Job JobId="6" ParentJob="2"
So here, the only child jobs are the jobs with JobId = 4, 5, 6.
I want to get events only from child jobs, and the only way I can see to do that is to execute the following queries (pseudo SQL):
1. ParentJobIds = SELECT DISTINCT ParentJob FROM JobStatuses
2. ChildJobs = SELECT * FROM JobStatuses WHERE JobId NOT IN [ParentJobIds]
Then I want to run some stats on all events that came from child jobs. Is that possible?
I believe this will return what you're looking for:
index=<someindex> sourcetype=JobStatuses NOT [search index=<someindex> sourcetype=JobStatuses | stats count by ParentJob | rename ParentJob AS JobId | fields JobId]
How do you identify a child job?
A child job is a job whose ID is not used as the ParentJobId of any other job. I have defined this (more or less) with the pseudo SQL query. So I need to do something like this: collect the parent job IDs into a kind of dictionary, then, when filtering the events, do something like: .... | where JobId != ParentJobsIds | . In other words, if a job's ID appears as a parent job ID somewhere, that job is a parent job, so its events should be ignored.
The structure cannot be deeper than three levels: Root -> Parent -> Child, or Parent -> Child.
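To make the definition concrete, here is a minimal Python sketch of the same filter applied to the sample events from the question. It is only an illustration of the logic, not Splunk code; the `events` list and the `child_events` helper are hypothetical names introduced for this example.

```python
# Sample events from the question, as (JobId, ParentJob) pairs.
# The list and the helper below are illustrative assumptions, not part of Splunk.
events = [
    {"JobId": "1", "ParentJob": "0"},
    {"JobId": "2", "ParentJob": "1"},
    {"JobId": "3", "ParentJob": "1"},
    {"JobId": "4", "ParentJob": "2"},
    {"JobId": "5", "ParentJob": "3"},
    {"JobId": "6", "ParentJob": "2"},
]


def child_events(events):
    """Keep only events whose JobId is never referenced as a ParentJob."""
    # Step 1: collect every ID that some job names as its parent.
    parent_ids = {e["ParentJob"] for e in events}
    # Step 2: a child job is one whose own ID is absent from that set.
    return [e for e in events if e["JobId"] not in parent_ids]


child_ids = sorted(e["JobId"] for e in child_events(events))
print(child_ids)  # ['4', '5', '6']
```

The two steps mirror a subsearch-style approach: first build the set of parent IDs, then exclude every event whose JobId is in that set.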
In Splunk's search language, try this:
index=......... Child* |stats count by JobId
Hi, the problem is that I do not have the names ChildJob/ParentJob/Root available. I only have a generic entry for root, parent and child jobs. I have updated the question to avoid confusion. Or did I misunderstand your suggestion?