Splunk Search
Highlighted

How to perform calculations based on rex extracted fields?

Communicator
index=aap_prod sourcetype="HDP:PROD:OOZIE"  | rex "TOKEN\[\] APP\[(?<JobName>[^\]]*)"   | rex "ACTION\[[^\@]*(?<Action>[^\d\]]*)" 

Extracting two fields JobName and Action.
The field called Action has values 1>@:start: 2> @end and 3> @Incremental

I have to calculate total number of JobNames which have values start and end which will be considered as successful jobs, and the count of JobNames which have different Start and End.

0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

Legend

The regex you have posted here doesn't appear to be correct. Are you getting the right extraction? Or are your needing help with that as well? If you need help with the regex, you should include some sample data so someone can help with valid regex. Also, what do you mean by difference of start and end? Do mean number of events that have an Action=Start by do not have Action=End?

0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

Communicator

This is the event

2015-12-17 23:25:17,487  INFO ActionStartXCommand:543 - SERVER[p01bdl841.aap.csaa.com] USER[hdfs] GROUP[-] TOKEN[] APP[WF_danlaw_journey_hive] JOB[0015832-150810162217214-oozie-oozi-W] ACTION[0015832-150810162217214-oozie-oozi-W@:start:] [***0015832-150810162217214-oozie-oozi-W@:start:***]Action updated in DB!

I get the Start Action, End Action and the Incremantal Action

So just need to get the jobnames which has action as started and End.

0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

Legend

I'm still a bit unclear as to what it is that you are looking for. I have modified your regex where it will not correctly extract JobName and Action. However, start appears twice in your data, which one is the action? Looking at your regex, it appears you attempting to get the one that starts with ACTION. Based on these assumption, here's the rex for you to try. Hopefully, this is what you were looking for

    .... | rex "APP\[(?<JobName>[^\]]+)" | rex "ACTION[^:]+:(?<Action>[^:]+)"
0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

SplunkTrust
SplunkTrust

athorat,

Once we make sure those extractions and rexes are correct, is what you want something like

A list of successful JobNames where the "start" and "end" count per JobName are equal, and another list of unsuccessful JobNames where the count of start and end don't match. Does that sound right?

So, you might have as successful ones,
JobName, Count of Start, Count of End
BillyBob, 92, 92
JimmyJohn, 43, 43
...

Unsuccessful could be things like
JobName, Count of Start, Count of End
RalphyRalph, 47, 42
MyBrokenJob, 52, 18
MyImpossibleJob, 32, 0

(This can be combined into one list probably pretty well, so if it's a separate list or part of the same list isn't really an issue, what I think we need to make sure of is that we're thinking about this the right way for now).

Highlighted

Re: How to perform calculations based on rex extracted fields?

SplunkTrust
SplunkTrust

"I have to calculate total number of job names which has values start and end , which will be considered as successful jobs
and the count of jobnames which has difference of Start and End"

If you break each of these events into separate events correctly:

Event 1: 2015-12-17 23:25:17,487 INFO ActionStartXCommand:543 - SERVER[p01bdl841.aap.csaa.com] USER[hdfs] GROUP[-] TOKEN[] APP[WF_danlaw_journey_hive] JOB[0015832-150810162217214-oozie-oozi-W] ACTION[0015832-150810162217214-oozie-oozi-W@:start:] [***0015832-150810162217214-oozie-oozi-W@:start:***]Action updated in DB!

Event 2: 2015-12-18 23:25:17,487 INFO ActionStartXCommand:543 - SERVER[p01bdl841.aap.csaa.com] USER[hdfs] GROUP[-] TOKEN[] APP[WF_danlaw_journey_hive] JOB[0015832-150810162217214-oozie-oozi-W] ACTION[0015832-150810162217214-oozie-oozi-W@:start:] [***0015832-150810162217214-oozie-oozi-W@:start:***]Action updated in DB!

Then it makes sense that each event is 1 job name. So all you need to do is search the events looking for start & stop.

 index=aap_prod sourcetype="HDP:PROD:OOZIE" start stop | rex field=_raw "(?<app>(?<=APP\[).*(?=\] JOB))" | rex field=_raw "(?<job>(?<=JOB\[).*(?=\] ACTION))" | stats c by app job

and

 index=aap_prod sourcetype="HDP:PROD:OOZIE" start NOT stop | rex field=_raw "(?<app>(?<=APP\[).*(?=\] JOB))" | rex field=_raw "(?<job>(?<=JOB\[).*(?=\] ACTION))" | stats c by app job
0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

Communicator

I think I was not clear in explaining the requirement.

This is my regex

index=aap_prod sourcetype="HDP:PROD:OOZIE" ("@:start:" OR "@end") | rex "TOKEN\[\] APP\[(?[^\]]*)"   | rex "ACTION\[[^\@]*(?[^\d\]]*)" | stats count(JobName) by Action

And I want to calculate the JobNames which has Action-Start AND Action=END

0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

Communicator

try this:

index=aap_prod sourcetype="HDP:PROD:OOZIE"  
| rex "APP\[(?<Job_name>[^\]]+)\]\s*JOB\[(?<job_id>[^\]]+)\]\s*ACTION\[[^:]+:(?<action>[^:]+)" 
| fields action job_id 
| stats values(action) AS action by job_id 
| where mvfilter(match(action,"start") AND match(action,"end"))

View solution in original post

0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

Communicator

@dcarmack thanks for the reply

I get an error : Error in 'where' command: The expression is invalid. The result of a 'where' expression must be boolean.

0 Karma
Highlighted

Re: How to perform calculations based on rex extracted fields?

Communicator

oops! sorry! replace the where with this

| where isnotnull(mvfind(action,"start")) AND isnotnull(mvfind(action,"end"))