Is it possible to dynamically calculate the RHS of a search comparison?
I'm looking to use Splunk to do latency measurements across various segments of a processing pipeline, e.g.:
A -> B -> C
I have a log that looks like:
<conversationId> <timestamp> <segment (e.g. A, B or C)>
Where conversationId is used to correlate messages related to a single 'conversation' as they flow through the pipeline.
I can calculate end-to-end latency like so:
sourcetype="source" segment="C" |
eval endTime=timestamp |
fields conversationId, endTime |
join type=outer conversationId [
search sourcetype="source" segment="A" |
eval startTime=timestamp |
fields conversationId, startTime
] |
eval latency=(endTime-startTime) |
fields conversationId, latency
This works, but I have to explicitly identify the start and end segments. I'd like to generalize it so that I can calculate latency across each of the subsegments without naming each one (this becomes a pain as the number of segments increases or changes).
My idea was to include info about the previous segment in the log messages:
<conversationId> <timestamp> <segment (e.g. A, B or C)> <previousSegment>
And then have a search like:
sourcetype="source" |
eval prev=previousSegment |
eval endTime=timestamp |
fields conversationId, prev, endTime |
join type=outer conversationId [
search sourcetype="source" segment=***prev*** |
eval startTime=timestamp |
fields conversationId, startTime
] |
eval latency=(endTime-startTime) |
fields conversationId, latency
I can't get this to work however. Is there some way to be able to use a calculated field in the RHS of a search comparison?
Thanks, Edwin

I don't think you need to (nor should you) do what you seem to be trying.
Seems this search could much more easily and efficiently be done with:
sourcetype=source
| stats
first(_time) as latest
last(_time) as earliest
by conversationId
| eval latency = latest-earliest
Alternatively, if for some reason _time isn't the same as timestamp:
sourcetype=source
| stats
max(timestamp) as latest
min(timestamp) as earliest
by conversationId
| eval latency = latest-earliest
Update: Oh I see, you want the diffs between each stage. Then you'd need:
sourcetype=source
| streamstats global=f window=2 current=t
max(_time) as currenttime
min(_time) as prevtime
by conversationId
| eval latency=currenttime-prevtime
I think. I may have it off-by-one, so the latency is for the next stage instead of the previous stage.
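A variant that avoids naming the window fields at all (a sketch only, assuming events can be sorted by time within each conversationId) uses streamstats with the range function:

```
sourcetype=source
| sort 0 conversationId _time
| streamstats global=f window=2 range(_time) as latency by conversationId
| table conversationId segment latency
```

Each event then carries the latency from the previous event in the same conversation, and the first event of each conversation gets a latency of 0.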

This is something we've built a search command to do in the Splunk App for Transaction Profiling. Look in the menu for Samples -> Steps. The current version on Splunkbase, Preview 2, still requires you to identify each segment; but we're looking at ways to more generally define when a new segment starts.
Esp, the product team would like to engage with you offline. Can you please email transactionprofiling@splunk.com?
Thanks!

I believe that your approach is more complicated and less efficient than necessary. Instead of your specific question about variable substitution, I have answered with what I think is a better way to get the results you seem to be asking for.


Hello esp, have you considered using the transaction command to accomplish this? It will automatically group events across segments (A -> B -> C) whose conversationId field has the same value. As a bonus, you also get the latency between the earliest event and the latest event in each transaction, computed as the duration field.
Since the sample data didn't come through, I'll just sketch the search:
sourcetype=source (segment=A OR segment=B OR segment=C) | transaction conversationId
There are lots of options for defining transactions, including how far apart the events can be from each other, the maximum time range for a group of events, which event marks the start/end of the transaction, etc. Details on the transaction command are in the Command Reference.
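As a sketch only (the maxspan value and the startswith/endswith markers are placeholder assumptions to adapt to your pipeline), a constrained transaction search might look like:

```
sourcetype=source (segment=A OR segment=B OR segment=C)
| transaction conversationId maxspan=10m startswith="segment=A" endswith="segment=C"
| table conversationId duration eventcount
```

Here duration is the end-to-end latency per conversation, and eventcount lets you spot conversations that never completed the pipeline.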
Hey, your sample log didn't show up in the question...
