Splunk Search

sum field until other associated field was changed.

arjitgoswami
Explorer

Hi All,

I have a scenario where I need to find total time taken by a particular servlet only until other servlet is not called.

for instance, my log file looks like
time Servlet Timetaken
24-05-2017 07:00 Servlet1 Time taken: 3
24-05-2017 07:15 Servlet1 Time taken: 5
24-05-2017 07:30 Servlet2 Time taken: 6
24-05-2017 07:45 Servlet3 Time taken: 7
24-05-2017 08:00 Servlet1 Time taken: 9

so my output should look like:

Servlet name TimeTaken

Servlet1 8 ---> this is sum of all time taken until Servlet2 appear.
Servlet2 6
Servlet3 7
Servlet1 9

using stats sum(TimeTaken) by Servlet is giving me sum of entire Servlet instance (here for Servlet1, I am getting 17) which I don't want. I want how much time did servlet1 took before Servlet2 came.

Can somebody please help me in this scenario?

Thanks and regards,
Arjit Goswami.

Tags (2)
1 Solution

DalJeanis
SplunkTrust
SplunkTrust

Start with this ...

your search that gets the records with _time, ServletName and TimeTaken
| sort 0 _time 

This substitutes for the above to give a run-anywhere test sample

| makeresults 
| eval ServletName="A B A A A B B C C C A C"  
| makemv ServletName 
| mvexpand ServletName 
|streamstats count as addtime 
| eval _time=_time+addtime 
| table _time ServletName 
| eval TimeTaken=1

... then this calculates the time

| streamstats current=f last(ServletName) as PriorName
| eval newServlet=if(PriorName==ServletName,0,1)
| streamstats sum(newServlet) as ServletNumber
| stats sum(TimeTaken) as TimeTaken by ServletNumber ServletName

I'm pretty sure that @somesoni2 or @woodcock has a more elegant way, but this works.

View solution in original post

arjitgoswami
Explorer

@DalJeanis @somesoni2 @woodcock @niketnilay ! Thanks for your responses but the issue here is we don't know the count of the records here. My Bad I should have mentioned it earlier !!! the thing is its a log file so these are just not only 5 records but there would be 1000's of records (and we don't know which all servlets are called in log file) . So records are like:

24-05-2017 07:00 Servlet1 Time taken: 3
24-05-2017 07:15 Servlet1 Time taken: 5
24-05-2017 07:30 Servlet2 Time taken: 6
24-05-2017 07:45 Servlet3 Time taken: 7
24-05-2017 08:00 Servlet1 Time taken: 9
.....
....
....
...
blah blah blah

Can you please suggest approach to this problem which can address this issue?

Thanks in advance

Kind regards,
Arjit.

0 Karma

woodcock
Esteemed Legend

As I understand it, each of the solutions provided should work just fine. Have you tested them? If all fail, you will need to VERY clearly explain the nature of the failure/breakdown.

0 Karma

arjitgoswami
Explorer

@woodcock! Thanks. Issue was related to configuration. It all worked now !!!

@DalJeanis @somesoni2 @woodcock @niketnilay! Thanks for all your help.

0 Karma

niketn
Legend

@DalJeanis @somesoni2 @woodcock, here is another approach using reset_on_change.

Following query creates some time dependent test data using gentimes and makeresults.

| gentimes start=-9
| eval _time=starttime
| table _time
| appendcols [| makeresults  
| eval ServletName="Servlet1"
| eval TimeTaken="3"
| append [| makeresults  
| eval ServletName="Servlet1"
| eval TimeTaken="6"]
| append [| makeresults  
| eval ServletName="Servlet2"
| eval TimeTaken="3"]
| append [| makeresults  
| eval ServletName="Servlet2"
| eval TimeTaken="1"]
| append [| makeresults  
| eval ServletName="Servlet2"
| eval TimeTaken="3"]
| append [| makeresults  
| eval ServletName="Servlet3"
| eval TimeTaken="7"]
| append [| makeresults  
| eval ServletName="Servlet4"
| eval TimeTaken="4"]
| append [| makeresults  
| eval ServletName="Servlet5"
| eval TimeTaken="3"]
| append [| makeresults  
| eval ServletName="Servlet5"
| eval TimeTaken="9"]
]

Following query gets the expected results:
The first streamstats calculates the sum of Time Taken by a Servlet until the ID changes.
The subsequent streamstats filters records where same Servlet is repeated.

| streamstats sum(TimeTaken) as Duration BY ServletName reset_on_change=true
| reverse
| streamstats current=f window=2 last(ServletName) as NextServletName
| where ServletName!=NextServletName OR isnull(NextServletName)
| table _time ServletName Duration
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

woodcock
Esteemed Legend

Like this:

| makeresults 
| eval raw="1,3 1,5 2,6 3,7 1,9" 
| makemv delim=" " raw 
| mvexpand raw 
| rename raw AS _raw 
| rex "^(?<Servlet>[^,]+),(?<Timetaken>.*)$" 
| streamstats count as addtime 
| eval _time=_time+addtime 
| table _time Servlet Timetaken

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| streamstats count AS myPos
| streamstats count AS myServletPos BY Servlet
| eventstats first(myPos) AS firstServletPos BY Servlet
| eval normaizedServletPos = myPos - firstServletPos + 1
| eval goodTimetaken=if((myServletPos = normaizedServletPos), Timetaken, 0)
| stats sum(goodTimetaken) BY Servlet
0 Karma

DalJeanis
SplunkTrust
SplunkTrust

Start with this ...

your search that gets the records with _time, ServletName and TimeTaken
| sort 0 _time 

This substitutes for the above to give a run-anywhere test sample

| makeresults 
| eval ServletName="A B A A A B B C C C A C"  
| makemv ServletName 
| mvexpand ServletName 
|streamstats count as addtime 
| eval _time=_time+addtime 
| table _time ServletName 
| eval TimeTaken=1

... then this calculates the time

| streamstats current=f last(ServletName) as PriorName
| eval newServlet=if(PriorName==ServletName,0,1)
| streamstats sum(newServlet) as ServletNumber
| stats sum(TimeTaken) as TimeTaken by ServletNumber ServletName

I'm pretty sure that @somesoni2 or @woodcock has a more elegant way, but this works.

somesoni2
SplunkTrust
SplunkTrust

My answer would've been same as this (with only diff of using | accum newServlet instead of | streamstats sum(newServlet) as ServletNumber.

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

@woodcock and @somesoni2 - dang, I was assuming there was some clever way to use an eval in the streamstats to save a step, but I couldn't figure it out.

I guess it looks like this, but it isn't any more efficient that I can see...

 | streamstats sum(eval(if(PriorName==ServletName,0,1))) as ServletNumber

... and it would be more confusing to beginners who are likely to be reading the answer for years.

0 Karma

woodcock
Esteemed Legend

I wrote my answer the way that I did mostly for educational purposes; it could have been done more efficiently but would lose clarity.

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

@woodcock - Yep. We're teaching people how to use the tool belt, more than just how to make a "chair".

Still, it's cool sometimes to see you masters do something sideways... and I go, "Wait... what was THAT?"

0 Karma

woodcock
Esteemed Legend

When I see that @DalJeanis has answered already, I typically move on assuming it is answered well (same for @somesoni2).

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...