Getting Data In

How to allocate identical timestamps from multiple lines to separate columns using dedup?

Fergal111
Path Finder

Hi Splunk heads,

I'm hoping someone might have the answer to this little issue I am facing.

I have a problem with the below search when the "Job Queued" and "Job Started" DateTime values are matching. If this is the case, the matching DateTime value will only be allocated to the "Job Queued" column, instead of slotting the timestamp into "Job Queued" and "Job Started". As a result of this, the "Variant Started" columns timestamp will be allocated to "Job Started" and the timestamp from "Variant Completed" will be allocated to "Variant Started".

Event="JobQueued" OR Event="JobProcessStarted" OR Event="VariantProcessStarted" OR Event="VariantProcessCompleted" | rename ProductionId as "Production ID" | rename JobProcessStarted as "Job Started" | rename JobQueued as "Job Queued" | rename VariantProcessStarted as "Variant Started" | rename VariantProcessCompleted as "Variant Completed" | fields DateTime JobId Event "Production ID" | stats values() as by JobId | where mvcount(Event)>1 | eval "Job Queued"=mvindex(DateTime,0) | eval "Job Started"=mvindex(DateTime,1) | eval "Variant Started"=mvindex(DateTime,2) | eval "Variant Completed"=mvindex(DateTime,3) | eval "P&P Duration"=strptime('Variant Completed',"%Y-%m-%dT%H:%M:%S") -strptime('Job Started',"%Y-%m-%dT%H:%M:%S") | eval "P&P Duration"=strftime('P&P Duration',"%H:%M:%S") | table "Production ID" JobId "Job Queued" "Job Started" "Variant Started" "Variant Completed" "P&P Duration" | sort by "Job Queued","Job Started" desc

Example result:

Production ID JobId Job Queued Job Started Variant Started Variant Completed P&P Duration
2/4080/0001#001 1B116C49-A75D-441F-89C6-B592D50BCF9A 2015-08-01T20:20:42 2015-08-01T21:10:01 2015-08-01T21:10:36 
2/3995/0007#001 9B0736F4-4DAF-497B-98CC-66201C09E864 2015-08-01T13:57:04 2015-08-01T14:24:17 2015-08-01T14:24:46

Any advice would be very well appreciated.

Cheers,
F

0 Karma
1 Solution

woodcock
Esteemed Legend

Let's do it directly and it won't be a problem; try this:

Event="JobQueued" OR Event="JobProcessStarted" OR Event="VariantProcessStarted" OR Event="VariantProcessCompleted" | eval "Job Queued Time" = if(Event="JobQueued", DateTime, null()) | eval "Job Started Time" = if(Event="JobProcessStarted", DateTime, null()) | eval "Variant Started Time" = if(Event="VariantProcessStarted", DateTime, null()) | eval "Variant Completed Time" = if(Event="VariantProcessCompleted", DateTime, null()) | rename ProductionId as "Production ID" | stats values(*) as * by JobId | where mvcount(Event)>1 | eval "P&P Duration"=strptime('Variant Completed Time',"%Y-%m-%dT%H:%M:%S") -strptime('Job Started Time',"%Y-%m-%dT%H:%M:%S") | eval "P&P Duration"=strftime('P&P Duration', "%H:%M:%S") | table "Production ID" JobId "Job Queued Time" "Job Started Time" "Variant Started Time" "Variant Completed Time" "P&P Duration" | sort by "Job Queued Time","Job Started Time" desc

View solution in original post

woodcock
Esteemed Legend

Let's do it directly and it won't be a problem; try this:

Event="JobQueued" OR Event="JobProcessStarted" OR Event="VariantProcessStarted" OR Event="VariantProcessCompleted" | eval "Job Queued Time" = if(Event="JobQueued", DateTime, null()) | eval "Job Started Time" = if(Event="JobProcessStarted", DateTime, null()) | eval "Variant Started Time" = if(Event="VariantProcessStarted", DateTime, null()) | eval "Variant Completed Time" = if(Event="VariantProcessCompleted", DateTime, null()) | rename ProductionId as "Production ID" | stats values(*) as * by JobId | where mvcount(Event)>1 | eval "P&P Duration"=strptime('Variant Completed Time',"%Y-%m-%dT%H:%M:%S") -strptime('Job Started Time',"%Y-%m-%dT%H:%M:%S") | eval "P&P Duration"=strftime('P&P Duration', "%H:%M:%S") | table "Production ID" JobId "Job Queued Time" "Job Started Time" "Variant Started Time" "Variant Completed Time" "P&P Duration" | sort by "Job Queued Time","Job Started Time" desc

woodcock
Esteemed Legend

Try replacing the dobule-quotes with dollar-signs like this:

now() - $Job Queued Time$

Fergal111
Path Finder

Thank you. That removed the error but iwhat im seeing now is new fields/columns and blank spaces.

Im not sure but i hope this paste will give you an idea of what im seeing.

JobId BundleCode BundleCodeDescription BundleUrl DateTime Event Job Queued Time Job Started Time P&P Duration Production ID PublishedLocation Variant Completed Time Variant Started Time VariantDescription VariantId date_hour date_mday date_minute date_month date_second date_wday date_year date_zone host index linecount source sourcetype splunk_server statusValue timeendpos timestartpos
0104523B-2144-404A-85B0-E342C023F60C

2015-08-04T20:53:18
2015-08-04T20:53:19
JobProcessStarted
JobQueued
2015-08-04T20:53:18 2015-08-04T20:53:19 20 4 53 august

18
19
tuesday 2015 local

LL-000-00 main 1

JSON Completed

52
60
33
41

0 Karma

Fergal111
Path Finder

Thanks Woodcock.

I tried both but there was thrown an error each time.

Error in 'eval' command: Typechecking failed. '-' only takes numbers.
I wonder if this is pointing to a bug here: (now() - "Job Queued Time")

0 Karma

woodcock
Esteemed Legend

Please note that there was a typo in one of the field names and I fixed it by re-editing my solution. Be sure to update your search.

0 Karma

Fergal111
Path Finder

Thank you Woodcock. That works perfectly! 🙂

just added this at the end for for values that have yet to appear in the logs.

| fillnull value=Processing "Production ID","Job Queued","Job Started","Variant Started","Variant Completed","P&P Duration"

Im trying to find out now if it is possible to write something along the lines of:
value=pending if DateTime value of "Job Queued" is less than 24 hours ago. If it is more than 24 hours ago value=failed

Thanks for all your help:)

Cheers,
F

0 Karma

woodcock
Esteemed Legend

What you asked can be done like this:

Event="JobQueued" OR Event="JobProcessStarted" OR Event="VariantProcessStarted" OR Event="VariantProcessCompleted" | eval "Job Queued Time" = if(Event="JobQueued", DateTime, null()) | eval "Job Started Time" = if(Event="JobProcessStarted", DateTime, null()) | eval "Variant Started Time" = if(Event="VariantProcessStarted", DateTime, null()) | eval "Variant Completed Time" = if(Event="VariantProcessCompleted", DateTime, null()) | rename ProductionId as "Production ID" | stats values(*) as * by JobId | where mvcount(Event)>1 | eval "P&P Duration"=strptime('Variant Completed Time',"%Y-%m-%dT%H:%M:%S") -strptime('Job Started Time',"%Y-%m-%dT%H:%M:%S") | eval "P&P Duration"=strftime('P&P Duration', "%H:%M:%S") | eval statusValue=if(((now() - "Job Queued Time") < (24*60*60)), "pending", "failed")

But I think what you really desire is this:

Event="JobQueued" OR Event="JobProcessStarted" OR Event="VariantProcessStarted" OR Event="VariantProcessCompleted" | eval "Job Queued Time" = if(Event="JobQueued", DateTime, null()) | eval "Job Started Time" = if(Event="JobProcessStarted", DateTime, null()) | eval "Variant Started Time" = if(Event="VariantProcessStarted", DateTime, null()) | eval "Variant Completed Time" = if(Event="VariantProcessCompleted", DateTime, null()) | rename ProductionId as "Production ID" | stats values(*) as * by JobId | where mvcount(Event)>1 | eval "P&P Duration"=strptime('Variant Completed Time',"%Y-%m-%dT%H:%M:%S") -strptime('Job Started Time',"%Y-%m-%dT%H:%M:%S") | eval "P&P Duration"=strftime('P&P Duration', "%H:%M:%S") | eval QueuedJobAgeSecs= (now() - "Job Queued Time") | eval statusValue=case(isnotnull("Variant Completed Time"), "Completed", isnotnull("Variant Started Time"), "In-Flight", isnull("Job Started") AND QueuedJobAgeSecs < (24*60*60), "Pending", isnull("Job Started"), "Failed", 1==1, "???ERROR???")

woodcock
Esteemed Legend

The values function drops duplicates and sorts alphabetically (which in your case is also sorting by increasing time) but the list function keeps duplicates and does not sort at all (which in your case leaves them sorted by decreasing time, which is backwards for your case) so, presuming that this is your only problem (which is a big presumption), you should be able to get it working by switching from values to list, and reversing the order of events (to make the earliest ones show up first), like this:

Event="JobQueued" OR Event="JobProcessStarted" OR Event="VariantProcessStarted" OR Event="VariantProcessCompleted" | reverse | rename ProductionId as "Production ID" | rename JobProcessStarted as "Job Started" | rename JobQueued as "Job Queued" | rename VariantProcessStarted as "Variant Started" | rename VariantProcessCompleted as "Variant Completed" | stats values(*) as * list(DateTime) AS DateTimeList by JobId | where mvcount(Event)>1 | eval "Job Queued"=mvindex(DateTimeList,0) | eval "Job Started"=mvindex(DateTimeList,1) | eval "Variant Started"=mvindex(DateTimeList,2) | eval "Variant Completed"=mvindex(DateTimeList,3) | eval "P&P Duration"=strptime('Variant Completed',"%Y-%m-%dT%H:%M:%S") -strptime('Job Started',"%Y-%m-%dT%H:%M:%S") | eval "P&P Duration"=strftime('P&P Duration',"%H:%M:%S") | table "Production ID" JobId "Job Queued" "Job Started" "Variant Started" "Variant Completed" "P&P Duration" | sort by "Job Queued","Job Started" desc

Fergal111
Path Finder

Hi Woodcock,

Thanks for jumping on to this post!

Event="JobQueued" OR Event="JobProcessStarted" OR Event="VariantProcessStarted" OR Event="VariantProcessCompleted" | reverse | rename ProductionId as "Production ID" | rename JobProcessStarted as "Job Started" | rename JobQueued as "Job Queued" | rename VariantProcessStarted as "Variant Started" | rename VariantProcessCompleted as "Variant Completed" | stats values(*) as * list(DateTime) AS DateTimeList by JobId | where mvcount(Event)>1 | eval "Job Queued"=mvindex(DateTimeList,0) | eval "Job Started"=mvindex(DateTimeList,1) | eval "Variant Started"=mvindex(DateTimeList,2) | eval "Variant Completed"=mvindex(DateTimeList,3) | eval "P&P Duration"=strptime('Variant Completed',"%Y-%m-%dT%H:%M:%S") -strptime('Job Started',"%Y-%m-%dT%H:%M:%S") | eval "P&P Duration"=strftime('P&P Duration',"%H:%M:%S") | table "Production ID" JobId "Job Queued" "Job Started" "Variant Started" "Variant Completed" "P&P Duration" | sort by "Job Queued","Job Started" desc

When I try this it assigns the DateTime from the JobQueued to all time fields. Any ideas?

Example:
Production ID JobId Job Queued Job Started Variant Started Variant Completed P&P Duration
BVT/Test/002#001 25D80E91-1599-4534-B741-4A05C4351384 2015-08-03T11:56:16 2015-08-03T11:56:16 2015-08-03T11:56:16 2015-08-03T11:56:16 00:00:00

Below are the four raw line events.

Event

{ [-]
BundleCode: 20
BundleCodeDescription: CTV01
BundleUrl: test.xml
DateTime: 2015-08-03T10:05:10
Event: VariantProcessCompleted
JobId: E3953235-84DC-4B3A-B0AF-1D3A09047933
ProductionId: BVT/Test/002#001
PublishedLocation: VAR001/BVT-Test-002-001_20_4_VAR001.ism
VariantDescription: dvbdash-nodrm-noad-nosubs
VariantId: VAR001
}
Show as raw text
{ [-]
BundleCode: 20
BundleCodeDescription: CTV01
BundleUrl: test.xml

DateTime: 2015-08-03T10:04:54
Event: VariantProcessStarted
JobId: E3953235-84DC-4B3A-B0AF-1D3A09047933
ProductionId: BVT/Test/002#001
VariantDescription: dvbdash-nodrm-noad-nosubs
VariantId: VAR001
}
Show as raw text
{ [-]
DateTime: 2015-08-03T10:03:32
Event: JobProcessStarted
JobId: E3953235-84DC-4B3A-B0AF-1D3A09047933
}
Show as raw text
{ [-]
DateTime: 2015-08-03T10:03:31
Event: JobQueued
JobId: E3953235-84DC-4B3A-B0AF-1D3A09047933
}
Show as raw text

0 Karma

Fergal111
Path Finder

I was thinking | dedup 3 DateTime |
...but doesn't make as difference:(

0 Karma
Get Updates on the Splunk Community!

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud  In today’s fast-paced digital ...

Observability protocols to know about

Observability protocols define the specifications or formats for collecting, encoding, transporting, and ...

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...