Splunk Search

Summary / accelerate query counting disjoint indexed events

shawnce
Engager

I have a relatively large number of events (millions a week) being indexed and routed into a dedicated index based on source and source type. This stream of events contains information about user activity in one of our products. We want to summarize user activity on a daily basis, then build a dashboard that visualizes this summary information in various ways (often on longer timescales). We will likely use an accelerated search (we prefer the simplicity) but may decide to use a summary search.

Note that we are currently still using Splunk 5.0.5.

The following is an example of a summary query that I am experimenting with and I am looking for any suggestions on how to improve it. It seems a little wrong to use if/match like I am.

index=myproduct build_type=prod (event_type="creating shape" OR event_type="Selecting tool" OR event_type="Undoing shape" OR event_type="Redoing shape")
| eval DrawEvent=if(match(event_type,"creating shape"),"1","0")
| eval ToolEvent=if(match(event_type,"Selecting tool"),"1","0")
| eval UndoEvent=if(match(event_type,"Undoing shape"),"1","0")
| eval RedoEvent=if(match(event_type,"Redoing shape"),"1","0")
| bucket _time span=1day
| stats sum(DrawEvent) AS UserDrawCount sum(ToolEvent) AS UserToolCount sum(UndoEvent) AS UserUndoCount sum(RedoEvent) AS UserRedoCount by _time,logged_user_id

...which produces a table like the following...

_time                    logged_user_id  UserDrawCount  UserToolCount  UserUndoCount  UserRedoCount
3/16/14 12:00:00.000 AM  AAAAA           59             7              0              0
3/16/14 12:00:00.000 AM  BBBBBB          135            35             42             2
3/16/14 12:00:00.000 AM  CCCCC           139            3              0              0
3/16/14 12:00:00.000 AM  DDDDD           895            65             54             1

Note that in a future version of the product we are reworking the naming conventions so that a wildcard can be used in the search (instead of such specific text) to narrow the event stream down to the family of user actions we wish to summarize in the same query.


martin_mueller
SplunkTrust

All in all - yeah, seems reasonable to me.

Consider moving the categorizing-eval-chain out into a macro for easy reuse and maintenance.
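
As a rough sketch of that suggestion, the eval chain from the question could live in a search macro. The macro name `categorize_shape_events` is hypothetical, and the stanza assumes the macro is defined in macros.conf of the app that owns the search:

```ini
# macros.conf (the macro name is an assumption, not from the original post)
[categorize_shape_events]
definition = eval DrawEvent=if(match(event_type,"creating shape"),"1","0") | eval ToolEvent=if(match(event_type,"Selecting tool"),"1","0") | eval UndoEvent=if(match(event_type,"Undoing shape"),"1","0") | eval RedoEvent=if(match(event_type,"Redoing shape"),"1","0")
```

The search would then invoke the macro between backticks, right after the base search:

```spl
index=myproduct build_type=prod (event_type="creating shape" OR event_type="Selecting tool" OR event_type="Undoing shape" OR event_type="Redoing shape")
| `categorize_shape_events`
| bucket _time span=1day
| stats sum(DrawEvent) AS UserDrawCount sum(ToolEvent) AS UserToolCount sum(UndoEvent) AS UserUndoCount sum(RedoEvent) AS UserRedoCount by _time,logged_user_id
```

If the event naming changes in a later product version, only the macro definition should need updating.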

martin_mueller
SplunkTrust

You could merge the match into the stats like this:

... | stats count(eval(match(event_type, "creating shape"))) as UserDrawCount ...

But that's not necessarily better to read and maintain. From a performance point of view it's not going to matter much.
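
Spelled out for all four event types, that variant might look roughly like the following. This is a sketch that reuses the field names and output column names from the question, and it has not been checked against 5.0.5 specifically:

```spl
index=myproduct build_type=prod (event_type="creating shape" OR event_type="Selecting tool" OR event_type="Undoing shape" OR event_type="Redoing shape")
| bucket _time span=1day
| stats count(eval(match(event_type, "creating shape"))) AS UserDrawCount
        count(eval(match(event_type, "Selecting tool"))) AS UserToolCount
        count(eval(match(event_type, "Undoing shape"))) AS UserUndoCount
        count(eval(match(event_type, "Redoing shape"))) AS UserRedoCount
        by _time, logged_user_id
```

The four intermediate eval fields disappear, at the cost of a longer stats clause.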


shawnce
Engager

Basically, does it make sense to search on event_type to narrow the number of events looked at, then use eval with if(match(...)) to tally each matched event_type, then bucket by day, then summarize using stats? Or does a better way exist to do the daily summary, one that avoids the eval/if(match(...)) steps and uses features of stats more directly?

Again it needs to be grouped by day and logged in user.
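
For comparison, one way to lean on stats more directly is to group by event_type itself. This is only a sketch, and it assumes the event_type values are exactly the four strings in the base search, since it groups on the literal field value rather than on a regex match:

```spl
index=myproduct build_type=prod (event_type="creating shape" OR event_type="Selecting tool" OR event_type="Undoing shape" OR event_type="Redoing shape")
| bucket _time span=1day
| stats count by _time, logged_user_id, event_type
```

Note that the shape of the result differs: this yields one row per day, user, and event type, rather than one row per day and user with a column for each event type.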


martin_mueller
SplunkTrust

Yeah, feeding that into a summary index will give you great long-term statistics performance.
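
A minimal sketch of how that could be wired up, assuming a summary index named `myproduct_summary` already exists (the name is hypothetical) and the populating search is scheduled once a day over the previous day:

```spl
index=myproduct build_type=prod (event_type="creating shape" OR event_type="Selecting tool" OR event_type="Undoing shape" OR event_type="Redoing shape") earliest=-1d@d latest=@d
| eval DrawEvent=if(match(event_type,"creating shape"),"1","0")
| eval ToolEvent=if(match(event_type,"Selecting tool"),"1","0")
| eval UndoEvent=if(match(event_type,"Undoing shape"),"1","0")
| eval RedoEvent=if(match(event_type,"Redoing shape"),"1","0")
| bucket _time span=1day
| stats sum(DrawEvent) AS UserDrawCount sum(ToolEvent) AS UserToolCount sum(UndoEvent) AS UserUndoCount sum(RedoEvent) AS UserRedoCount by _time,logged_user_id
| collect index=myproduct_summary
```

Dashboard searches over longer timescales would then read the much smaller summary index instead of the raw events, for example:

```spl
index=myproduct_summary
| stats sum(UserDrawCount) AS TotalDraws sum(UserToolCount) AS TotalTools sum(UserUndoCount) AS TotalUndos sum(UserRedoCount) AS TotalRedos by logged_user_id
```

Enabling summary indexing on the saved search itself is an alternative to the explicit collect command.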


shawnce
Engager

I am basically looking to see if what I am doing above is reasonable or if a better way exists.

I have a stream of events like the following coming in from users using our app...

logged_user_id="AAAAA" event_type="creating shape" ...
logged_user_id="BBBBBB" event_type="Selecting tool" ...
logged_user_id="AAAAA" event_type="creating shape" ...
logged_user_id="CCCCC" event_type="Redoing shape" ...

I want to summarize this into a daily tally of each type of event by user, so turning multiple events into a single event for each user on each day. This will then be used to feed sub searches.


martin_mueller
SplunkTrust

Maybe it's just me, but what is your question?
