Splunk Search

sum field until other associated field was changed.

arjitgoswami
Explorer

Hi All,

I have a scenario where I need to find total time taken by a particular servlet only until other servlet is not called.

for instance, my log file looks like
time Servlet Timetaken
24-05-2017 07:00 Servlet1 Time taken: 3
24-05-2017 07:15 Servlet1 Time taken: 5
24-05-2017 07:30 Servlet2 Time taken: 6
24-05-2017 07:45 Servlet3 Time taken: 7
24-05-2017 08:00 Servlet1 Time taken: 9

so my output should look like:

Servlet name TimeTaken

Servlet1 8 ---> this is sum of all time taken until Servlet2 appear.
Servlet2 6
Servlet3 7
Servlet1 9

using stats sum(TimeTaken) by Servlet is giving me sum of entire Servlet instance (here for Servlet1, I am getting 17) which I don't want. I want how much time did servlet1 took before Servlet2 came.

Can somebody please help me in this scenario?

Thanks and regards,
Arjit Goswami.

Tags (2)
1 Solution

DalJeanis
Legend

Start with this ...

your search that gets the records with _time, ServletName and TimeTaken
| sort 0 _time 

This substitutes for the above to give a run-anywhere test sample

| makeresults 
| eval ServletName="A B A A A B B C C C A C"  
| makemv ServletName 
| mvexpand ServletName 
|streamstats count as addtime 
| eval _time=_time+addtime 
| table _time ServletName 
| eval TimeTaken=1

... then this calculates the time

| streamstats current=f last(ServletName) as PriorName
| eval newServlet=if(PriorName==ServletName,0,1)
| streamstats sum(newServlet) as ServletNumber
| stats sum(TimeTaken) as TimeTaken by ServletNumber ServletName

I'm pretty sure that @somesoni2 or @woodcock has a more elegant way, but this works.

View solution in original post

arjitgoswami
Explorer

@DalJeanis @somesoni2 @woodcock @niketnilay ! Thanks for your responses but the issue here is we don't know the count of the records here. My Bad I should have mentioned it earlier !!! the thing is its a log file so these are just not only 5 records but there would be 1000's of records (and we don't know which all servlets are called in log file) . So records are like:

24-05-2017 07:00 Servlet1 Time taken: 3
24-05-2017 07:15 Servlet1 Time taken: 5
24-05-2017 07:30 Servlet2 Time taken: 6
24-05-2017 07:45 Servlet3 Time taken: 7
24-05-2017 08:00 Servlet1 Time taken: 9
.....
....
....
...
blah blah blah

Can you please suggest approach to this problem which can address this issue?

Thanks in advance

Kind regards,
Arjit.

0 Karma

woodcock
Esteemed Legend

As I understand it, each of the solutions provided should work just fine. Have you tested them? If all fail, you will need to VERY clearly explain the nature of the failure/breakdown.

0 Karma

arjitgoswami
Explorer

@woodcock! Thanks. Issue was related to configuration. It all worked now !!!

@DalJeanis @somesoni2 @woodcock @niketnilay! Thanks for all your help.

0 Karma

niketn
Legend

@DalJeanis @somesoni2 @woodcock, here is another approach using reset_on_change.

Following query creates some time dependent test data using gentimes and makeresults.

| gentimes start=-9
| eval _time=starttime
| table _time
| appendcols [| makeresults  
| eval ServletName="Servlet1"
| eval TimeTaken="3"
| append [| makeresults  
| eval ServletName="Servlet1"
| eval TimeTaken="6"]
| append [| makeresults  
| eval ServletName="Servlet2"
| eval TimeTaken="3"]
| append [| makeresults  
| eval ServletName="Servlet2"
| eval TimeTaken="1"]
| append [| makeresults  
| eval ServletName="Servlet2"
| eval TimeTaken="3"]
| append [| makeresults  
| eval ServletName="Servlet3"
| eval TimeTaken="7"]
| append [| makeresults  
| eval ServletName="Servlet4"
| eval TimeTaken="4"]
| append [| makeresults  
| eval ServletName="Servlet5"
| eval TimeTaken="3"]
| append [| makeresults  
| eval ServletName="Servlet5"
| eval TimeTaken="9"]
]

Following query gets the expected results:
The first streamstats calculates the sum of Time Taken by a Servlet until the ID changes.
The subsequent streamstats filters records where same Servlet is repeated.

| streamstats sum(TimeTaken) as Duration BY ServletName reset_on_change=true
| reverse
| streamstats current=f window=2 last(ServletName) as NextServletName
| where ServletName!=NextServletName OR isnull(NextServletName)
| table _time ServletName Duration
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

woodcock
Esteemed Legend

Like this:

| makeresults 
| eval raw="1,3 1,5 2,6 3,7 1,9" 
| makemv delim=" " raw 
| mvexpand raw 
| rename raw AS _raw 
| rex "^(?<Servlet>[^,]+),(?<Timetaken>.*)$" 
| streamstats count as addtime 
| eval _time=_time+addtime 
| table _time Servlet Timetaken

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| streamstats count AS myPos
| streamstats count AS myServletPos BY Servlet
| eventstats first(myPos) AS firstServletPos BY Servlet
| eval normaizedServletPos = myPos - firstServletPos + 1
| eval goodTimetaken=if((myServletPos = normaizedServletPos), Timetaken, 0)
| stats sum(goodTimetaken) BY Servlet
0 Karma

DalJeanis
Legend

Start with this ...

your search that gets the records with _time, ServletName and TimeTaken
| sort 0 _time 

This substitutes for the above to give a run-anywhere test sample

| makeresults 
| eval ServletName="A B A A A B B C C C A C"  
| makemv ServletName 
| mvexpand ServletName 
|streamstats count as addtime 
| eval _time=_time+addtime 
| table _time ServletName 
| eval TimeTaken=1

... then this calculates the time

| streamstats current=f last(ServletName) as PriorName
| eval newServlet=if(PriorName==ServletName,0,1)
| streamstats sum(newServlet) as ServletNumber
| stats sum(TimeTaken) as TimeTaken by ServletNumber ServletName

I'm pretty sure that @somesoni2 or @woodcock has a more elegant way, but this works.

somesoni2
Revered Legend

My answer would've been same as this (with only diff of using | accum newServlet instead of | streamstats sum(newServlet) as ServletNumber.

0 Karma

DalJeanis
Legend

@woodcock and @somesoni2 - dang, I was assuming there was some clever way to use an eval in the streamstats to save a step, but I couldn't figure it out.

I guess it looks like this, but it isn't any more efficient that I can see...

 | streamstats sum(eval(if(PriorName==ServletName,0,1))) as ServletNumber

... and it would be more confusing to beginners who are likely to be reading the answer for years.

0 Karma

woodcock
Esteemed Legend

I wrote my answer the way that I did mostly for educational purposes; it could have been done more efficiently but would lose clarity.

0 Karma

DalJeanis
Legend

@woodcock - Yep. We're teaching people how to use the tool belt, more than just how to make a "chair".

Still, it's cool sometimes to see you masters do something sideways... and I go, "Wait... what was THAT?"

0 Karma

woodcock
Esteemed Legend

When I see that @DalJeanis has answered already, I typically move on assuming it is answered well (same for @somesoni2).

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer at Splunk .conf24 ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...