Hi All,
What I'm trying to do is build a chart with time on the x-axis and percentages by ResponseStatus on the y-axis.
To do that I came up with the Splunk search query below:
match some http requests
| fields _time,ResponseStatus,RequestName
| eval Date=strftime(_time, "%m/%d/%Y")
| eval ResponseStatus=if(isnull(ResponseStatus), 504, ResponseStatus)
| eventstats count as "totalCount" by Date
| eventstats count as "codeCount" by Date,ResponseStatus
| eval percent=round((codeCount/totalCount)*100)
| chart values(percent) by Date,ResponseStatus
But it is hitting the disk usage limit (500MB, which I can't increase) for a 10-day interval, and I'd like to be able to run this over a 3-4 month interval.
What I have noticed is that if I only run the match part of the query, I get all the events without hitting any disk limit, which makes me think the problem is with the counting and group-by part of the query.
My guess is that Splunk is making the computation by keeping in-memory (or, trying to do so and eventually swapping to disk) the full event message even if I specified the useful fields via the fields command.
Is there any way to either effectively have Splunk ignore all the remaining part of the message or obtain the same result via a different path?
Thanks a lot!
Reducing the use of eventstats is always good. Secondly, you can use table to reduce event (row) size; fields doesn't quite do that.
In your example, you can eliminate one of the eventstats commands like this:
match some http requests
| table _time,ResponseStatus,RequestName ``` fields does not reduce row size ```
| eval Date=strftime(_time, "%m/%d/%Y")
| eval ResponseStatus=if(isnull(ResponseStatus), 504, ResponseStatus)
| stats count as "codeCount" by Date,ResponseStatus
| eventstats sum(codeCount) as "totalCount" by Date
| eval percent=round((codeCount/totalCount)*100)
| chart values(percent) by Date,ResponseStatus
You might explicitly remove _raw early in the process so that you operate only on the set of fields you need without dragging more data around.
| fields - _raw
Furthermore, why do you render _time as a string? You might use bin/bucket to align the data to full days (or whatever time unit you need) without creating additional fields.
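For example, a sketch using the fields from your query (bucket is just an alias of bin, and the span value is whatever granularity you want):

| bin _time span=1d ``` align every event's _time to the start of its day ```
| stats count as codeCount by _time, ResponseStatus

That way _time stays a real timestamp, so timechart can consume it directly instead of a Date string.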
You might also rethink your eventstats. If you already calculate by date and response status, why calculate by date again? You might reuse the counts you already have and sum them over date.
Thank you! Your message made me realize I could write the query in another way. This is where I landed after the changes suggested by @yuanliu + your message:
match some stuff
| fields _time,ResponseStatus,RequestName
| fields - _raw
| bucket _time span=1d
| eval ResponseStatus=if(isnull(ResponseStatus), 504, ResponseStatus)
| eventstats count as "total" by _time
| stats count first(total) as "total" by _time, ResponseStatus
| eval percent=(count/total)*100
| timechart span=1d first(percent) by ResponseStatus
Most likely further improvements can be made here as well, but as long as it is not hitting the disk threshold I'm happy with it 🙂
It could be simplified even further:
<base_search>
| table _time ResponseStatus | fields - _raw
| bucket _time span=1d
| fillnull value=504 ResponseStatus
| top 100 ResponseStatus by _time showcount=f ``` top generates the percent field automatically; showcount=f drops the count column ```
| timechart limit=30 span=1d first(percent) by ResponseStatus
No. You don't want to do that. Firstly, removing _raw is not needed. More importantly, table is a transforming command and moves further processing to the search head, which kills parallelization.
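To illustrate, the same pipeline with fields instead of table keeps everything distributable (just a sketch built from the query above; only the second line changes):

<base_search>
| fields _time ResponseStatus ``` fields is a distributable streaming command, so the indexers keep doing the work ```
| bucket _time span=1d
| fillnull value=504 ResponseStatus
| top 100 ResponseStatus by _time showcount=f
| timechart limit=30 span=1d first(percent) by ResponseStatus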
You know, adding table and removing _raw was actually slightly slower in my testing and took up slightly more disk, but the difference was too small for me to be sure. You were the one who first recommended removing _raw, and someone else here recommended table.
| table _time ResponseStatus | fields - _raw
Thank you! Converting that eventstats into stats made it work. The use of table didn't seem to have an impact on performance overall.
Where did you find out fields does not reduce row size? This contradicts what we've been told over many years.
@richgalloway You and @isoutamo are correct. I didn't realize that events and fields are two separate spaces. fields allows events (_raw) to carry on, but search buffers are not burdened by them. As isoutamo points out, table may actually carry a higher performance penalty.