
Hi,
I have a requirement to create dashboards around user activity. Best practice suggests I use summary indexes but I am having no joy.
The requirement is to view user activity by day and month.
- Distinct count of users over a day or month range.
- Number of days each user logs on per month.
- Distinct count of users logging on per month, grouped by number of days logged on, e.g. 1-2 days: 236 users, 2-8 days: 453 users, etc.
- Hits per user per day or month, and again a distinct count of users per Hit range.
- Ability to use the timerange picker to populate the date range in the dashboard.
I need to set up the summary index so it contains all past data and indexes new data moving forward.
I thought I could just do the below once and schedule it moving forward.
index=iis
| sistats count by cs_username date
| collect index=sumuserdate
and run all the searches off that index but the results are odd and also I lose the ability to use the time range picker.
For example, I can run the search below against the normal index, but this results in Splunk going over 60M events:
index=iis | stats count as Hits dc(date) by cs_username date_month date_year | rangemap field=dc(date) "1"=1-1 "2-11"=2-11 "12-19"=12-19 "20"=20-50 | stats count dc(cs_username) by range date_month date_year
Once complete, I plan to enrich the users with extra information from a lookup table: geo-location, user rights, business area, etc.
Do I need to add a bucket & span? strptime the date? timechart & span the query? Have separate indexes for the day and month?
I know I'm well off track so any help would be great.
Hello,
Don't include the time anywhere while doing the summary. Once you start aggregating data, the search automatically becomes faster. The collect command will populate all the values, but it has a limitation: it records the local machine's name rather than the actual host. So do something like this.
Choose a bucket span to set the granularity of the summary:
index=iis | bucket _time span=30m | stats count dc(cs_username) by cs_username
Create a saved search and schedule it as a one-time load into the target summary index. Define the earliest and latest parameters according to your requirement. Avoid running it repeatedly, or you will get duplicate entries.
With the above you preserve the event timing, and you can use the time range picker because the summary data keeps the time trend.
Do the calculations in your dashboard. Users should then be able to select time ranges.
Thanks,
L

I don't know if this is an error on my part but would the below bug be causing a problem?
I know it causes issues with REST as well as the UI.
(From another post) - To set the time for summary index events, Splunk uses the following information, in this order of precedence:
- The _time value of the event being summarized
- The earliest (or minimum) time of the search
- The current system time (in the case of an "all time" search, where no "earliest" value is specified)
Could this bug be causing _time to be skipped over?

Thanks for the advice, so I use stats instead of sistats?
You can use either, as long as you keep meaningful data. It is not mandatory to use sistats for a summary. Build your summary set and run the searches from there, which will work fine. Choose the bucket span you need.

Thanks for the advice, but I can't get the above to work.
Searching against the summary only works over all time, as every event has the same timestamp after being added to the summary index. All events default to 01/06/2014.
In my test env I used the following query
index=iis | bucket _time span=30m | stats count dc(cs_username) by cs_username
Start time -3mon@mon Finish Time -20d@d
Scheduled the search via cron 15 17 * * *
Enabled summary indexing and assigned it to "summarytest2"
Any ideas?
Delete the summary index completely and index the data afresh; it should work.

Same issue as before: all the timestamps are the start time of the time range. I thought I had the query wrong, but it was as above.
I don't know if it makes any difference, but for the index / source type I query I had to adjust props.conf to get the correct timestamp when indexing. Would this cause any issues?
[iis-prod]
TIME_FORMAT = %Y-%m-%d %H:%M:%S

sample event in the summary index
05/01/2014 00:00:00 +0100, search_name="TEST-BL2", search_now=1411493100.000, info_min_time=1398898800.000, info_max_time=1410130800.000, info_search_time=1411493101.076, count=190, cs_username=PSI8xxxxxxxx, dc(cs_username)=1
host = NLDxxxxxxP source = TEST-BL2 sourcetype = stash
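One thing that stands out in this sample: info_min_time=1398898800 is 2014-05-01 00:00:00 +0100, which is exactly the timestamp on the event. That is consistent with the summary event having no _time of its own and falling back to the earliest time of the search. A rough way to check this across the whole summary, assuming the summarytest2 index and TEST-BL2 source used in this thread (the field name stamped_at_range_start is made up for illustration):

```
index=summarytest2 source="TEST-BL2"
| eval stamped_at_range_start=if(_time == tonumber(info_min_time), "yes", "no")
| stats count by stamped_at_range_start
```

If every row comes back as "yes", the populating search is probably not passing _time through to the summary.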
It seems like your summarized events don't carry any _time of their own. Did you delete the existing index and try to re-index it?

index=iis | bucket _time span=1d | timechart span=1d count dc(cs_username)
The above works and the correct time is passed to the summary index.
I can query on _time and the time range picker works. There is no date field, but there are date_zone, date_year, date_wday, date_second, date_month, date_minute, date_mday and date_hour.
The below query still will not work, but at least I've got something. I'll try to adjust it so it's by cs_username.
index=iis | bucket _time span=30m | stats count dc(cs_username) by cs_username
Yeah, timechart should also do. Yesterday I asked you to check with
| stats count by _time, cs_username
My main intention was to make the timerange option available in summary. Happy that you worked it out 🙂
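For anyone landing here later, a minimal sketch of a populating search along those lines. It assumes the iis index, the summarytest2 summary index and the -3mon@mon to -20d@d backfill window mentioned earlier in the thread, and a 1d span chosen only as an example. The key difference from the earlier attempt is that _time stays in the by clause, so each summary row keeps its own bucket time:

```
index=iis earliest=-3mon@mon latest=-20d@d
| bucket _time span=1d
| stats count by _time, cs_username
| collect index=summarytest2
```

Instead of the final collect, the same search can be saved as a scheduled report with summary indexing enabled, as was done above. Either way, run the backfill once only, otherwise the summary ends up with duplicate rows.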

That works as well. 🙂 thanks
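A rough sketch of how the "days logged on per month" report could then run off the summary. It assumes the summary index (summarytest2 here) holds one row per user per day with a count field, as produced by stats count by _time, cs_username over 1d buckets. The names day, hits, days_active and day_range are made up for illustration, and the ranges are the ones from the original question:

```
index=summarytest2
| eval day=strftime(_time, "%Y-%m-%d")
| bucket _time span=1mon
| stats sum(count) as hits dc(day) as days_active by _time, cs_username
| eval day_range=case(days_active == 1, "1", days_active <= 11, "2-11", days_active <= 19, "12-19", true(), "20+")
| stats dc(cs_username) as users by _time, day_range
```

Because the summary rows carry their own _time, a dashboard panel built on this search should pick up whatever range is chosen in the time range picker. Keeping cs_username in every summary row is what lets dc(cs_username) be computed over any span afterwards.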

Yes, I deleted the index and created a new index, new reports, etc. I'll keep plugging away at this. Thanks so much for your help so far. I'll drop an update if there is any meaningful progress.
When you run the search manually with the time range selected, don't you see the trend? The test env search should give you the count for every 30 minutes. If you want daily distinct values, you need to use a different summary index; this one will not give you correct values after summarizing, as the underlying values are no longer there to look at.

I do. In Verbose Mode, in the Events tab, the time is grouped into the 30-minute buckets with the correct date stamp. In the Statistics tab I only see cs_username, count and dc(cs_username).
