Splunk Search

How to perform calculations based on rex extracted fields?

athorat
Communicator
index=aap_prod sourcetype="HDP:PROD:OOZIE"  | rex "TOKEN\[\] APP\[(?<JobName>[^\]]*)"   | rex "ACTION\[[^\@]*(?<Action>[^\d\]]*)" 

Extracting two fields JobName and Action.
The field called Action has values 1>@:start: 2> @end and 3> @Incremental

I have to calculate total number of JobNames which have values start and end which will be considered as successful jobs, and the count of JobNames which have different Start and End.

0 Karma
1 Solution

dcarmack_splunk
Splunk Employee
Splunk Employee

try this:

index=aap_prod sourcetype="HDP:PROD:OOZIE"  
| rex "APP\[(?<Job_name>[^\]]+)\]\s*JOB\[(?<job_id>[^\]]+)\]\s*ACTION\[[^:]+:(?<action>[^:]+)" 
| fields action job_id 
| stats values(action) AS action by job_id 
| where mvfilter(match(action,"start") AND match(action,"end"))

View solution in original post

0 Karma

dcarmack_splunk
Splunk Employee
Splunk Employee

try this:

index=aap_prod sourcetype="HDP:PROD:OOZIE"  
| rex "APP\[(?<Job_name>[^\]]+)\]\s*JOB\[(?<job_id>[^\]]+)\]\s*ACTION\[[^:]+:(?<action>[^:]+)" 
| fields action job_id 
| stats values(action) AS action by job_id 
| where mvfilter(match(action,"start") AND match(action,"end"))
0 Karma

athorat
Communicator

I modified the Field Names but that works (Y).

Thank You so much.

index=aap_prod sourcetype="HDP:PROD:OOZIE"  ("@:start:" OR "@end") | rex "TOKEN\[\] APP\[(?<JobName>[^\]]*)"   | rex "ACTION\[[^\@]*(?<Action>[^\d\]]*)" 
 | fields Action JobName
| stats values(Action) AS Action by JobName
 | where isnotnull(mvfind(Action,"@:start:")) AND isnotnull(mvfind(Action,"@end"))
0 Karma

athorat
Communicator

@dcarmack one followup question
When I do a stats count of @:start: and @end I get a count of 360+ for both of them

index=aap_prod sourcetype="HDP:PROD:OOZIE" | rex "TOKEN\[\] APP\[(?<JobName>[^\]]*)"  | rex "ACTION\[[^\@]*(?<Action>[^\d\]]*)" | search  Action="@:start:"| stats count(Action)

but when I use the query which you suggested it returns around 118.

also when I try to use stats values(JobName) and group by Action it does not return anything...

0 Karma

dcarmack_splunk
Splunk Employee
Splunk Employee

The search I provided you is giving a count of job_id's where the action of both start and end exists. A stats count is simply going to give you a count of events where either exist.

... | stats values(JobName) AS JobName by Action

^^ that doesn't return anything?

0 Karma

athorat
Communicator

@dcarmack thanks for the reply

I get an error : Error in 'where' command: The expression is invalid. The result of a 'where' expression must be boolean.

0 Karma

dcarmack_splunk
Splunk Employee
Splunk Employee

oops! sorry! replace the where with this

| where isnotnull(mvfind(action,"start")) AND isnotnull(mvfind(action,"end")) 

jkat54
SplunkTrust
SplunkTrust

"I have to calculate total number of job names which has values start and end , which will be considered as successful jobs
and the count of jobnames which has difference of Start and End"

If you break each of these events into separate events correctly:

Event 1: 2015-12-17 23:25:17,487 INFO ActionStartXCommand:543 - SERVER[p01bdl841.aap.csaa.com] USER[hdfs] GROUP[-] TOKEN[] APP[WF_danlaw_journey_hive] JOB[0015832-150810162217214-oozie-oozi-W] ACTION[0015832-150810162217214-oozie-oozi-W@:start:] [***0015832-150810162217214-oozie-oozi-W@:start:***]Action updated in DB!

Event 2: 2015-12-18 23:25:17,487 INFO ActionStartXCommand:543 - SERVER[p01bdl841.aap.csaa.com] USER[hdfs] GROUP[-] TOKEN[] APP[WF_danlaw_journey_hive] JOB[0015832-150810162217214-oozie-oozi-W] ACTION[0015832-150810162217214-oozie-oozi-W@:start:] [***0015832-150810162217214-oozie-oozi-W@:start:***]Action updated in DB!

Then it makes sense that each event is 1 job name. So all you need to do is search the events looking for start & stop.

 index=aap_prod sourcetype="HDP:PROD:OOZIE" start stop | rex field=_raw "(?<app>(?<=APP\[).*(?=\] JOB))" | rex field=_raw "(?<job>(?<=JOB\[).*(?=\] ACTION))" | stats c by app job

and

 index=aap_prod sourcetype="HDP:PROD:OOZIE" start NOT stop | rex field=_raw "(?<app>(?<=APP\[).*(?=\] JOB))" | rex field=_raw "(?<job>(?<=JOB\[).*(?=\] ACTION))" | stats c by app job
0 Karma

athorat
Communicator

I think I was not clear in explaining the requirement.

This is my regex

index=aap_prod sourcetype="HDP:PROD:OOZIE" ("@:start:" OR "@end") | rex "TOKEN\[\] APP\[(?[^\]]*)"   | rex "ACTION\[[^\@]*(?[^\d\]]*)" | stats count(JobName) by Action

And I want to calculate the JobNames which has Action-Start AND Action=END

0 Karma

sundareshr
Legend

The regex you have posted here doesn't appear to be correct. Are you getting the right extraction? Or are your needing help with that as well? If you need help with the regex, you should include some sample data so someone can help with valid regex. Also, what do you mean by difference of start and end? Do mean number of events that have an Action=Start by do not have Action=End?

0 Karma

athorat
Communicator

This is the event

2015-12-17 23:25:17,487  INFO ActionStartXCommand:543 - SERVER[p01bdl841.aap.csaa.com] USER[hdfs] GROUP[-] TOKEN[] APP[WF_danlaw_journey_hive] JOB[0015832-150810162217214-oozie-oozi-W] ACTION[0015832-150810162217214-oozie-oozi-W@:start:] [***0015832-150810162217214-oozie-oozi-W@:start:***]Action updated in DB!

I get the Start Action, End Action and the Incremantal Action

So just need to get the jobnames which has action as started and End.

0 Karma

sundareshr
Legend

I'm still a bit unclear as to what it is that you are looking for. I have modified your regex where it will not correctly extract JobName and Action. However, start appears twice in your data, which one is the action? Looking at your regex, it appears you attempting to get the one that starts with ACTION. Based on these assumption, here's the rex for you to try. Hopefully, this is what you were looking for

    .... | rex "APP\[(?<JobName>[^\]]+)" | rex "ACTION[^:]+:(?<Action>[^:]+)"
0 Karma

Richfez
SplunkTrust
SplunkTrust

athorat,

Once we make sure those extractions and rexes are correct, is what you want something like

A list of successful JobNames where the "start" and "end" count per JobName are equal, and another list of unsuccessful JobNames where the count of start and end don't match. Does that sound right?

So, you might have as successful ones,
JobName, Count of Start, Count of End
BillyBob, 92, 92
JimmyJohn, 43, 43
...

Unsuccessful could be things like
JobName, Count of Start, Count of End
RalphyRalph, 47, 42
MyBrokenJob, 52, 18
MyImpossibleJob, 32, 0

(This can be combined into one list probably pretty well, so if it's a separate list or part of the same list isn't really an issue, what I think we need to make sure of is that we're thinking about this the right way for now).

Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...