
summary indexing - search with a 2-hour transaction every 5 minutes ?

hm, my question seems very similar to this one: http://bit.ly/M4yZl2 , but differs in the details.

i have an extant regular search i'd like to convert to a summary index search.
it looks more or less like this:

search foo | transaction maxspan=2h my_key | timechart count by bar

the trick is that we'd like to run this every 5 minutes, while maintaining the transaction maxspan of 2 hours.
so i'm not sure what the schedule for the summary index should be.

the clear choice is to schedule it every five minutes with earliest = -125m and latest = -5m,
(thanks to lguinn for the earliest/latest tip, here: http://bit.ly/L2Q4yS).
my concern is that each 5 minute span is now being searched 24 times, and presumably indexed that way as well,
and i don't know how this may affect the timechart on the summary index.

when i scheduled the summary for every 5 minutes with earliest = -10m and latest = -5m,
i got distinctly different results than from the non-summarized search, which makes sense if the transactions are being limited to 5 minutes.
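
To see why a 5-minute window changes the numbers, here is a minimal Python sketch. It is a toy model, not Splunk code. It assumes time in whole minutes and that each scheduled run only sees events inside its own 5-minute window. The event times and the `fragments` helper are made up for illustration.

```python
def fragments(event_minutes, step=5):
    """Number of separate windows (hence partial transactions) the events fall into."""
    return len({minute // step for minute in event_minutes})

# One hypothetical transaction for a single my_key: four events over 40 minutes.
events = [300, 303, 308, 340]

print(fragments(events))            # pieces seen by 5-minute searches
print(fragments(events, step=120))  # pieces seen by one aligned 2-hour search
```

Under these assumptions the 5-minute searches see three separate pieces where one 2-hour search sees a single transaction, so the counts cannot match.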

naturally i'll just try a 2-hour search every 5 minutes and compare the summarized search to the non-summarized one,
but it would be great to hear any theory or best-practices around this situation.

tia,
orion


Re: summary indexing - search with a 2-hour transaction every 5 minutes ?

well,
my experiment came out as i feared:
the overlapped searches yield an inflated number of results,
which http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Configuresummaryindexes is pretty clear about avoiding.
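
The inflation can be reproduced with a small Python sketch. This is a toy model under stated assumptions, not a description of Splunk internals: time is in whole minutes, a run at minute `t` searches the half-open window from `t - 125` to `t - 5`, and a transaction is summarized by every run whose window fully contains it. The name `runs_containing` and the sample transactions are hypothetical.

```python
WINDOW = 120  # minutes searched per run (earliest=-125m, latest=-5m)
LAG = 5       # minutes between the end of the window and the run time
STEP = 5      # minutes between scheduled runs

def runs_containing(start, end, first_run=0, last_run=600):
    """Count scheduled runs whose window [t-125, t-5) fully contains the transaction."""
    count = 0
    for t in range(first_run, last_run + 1, STEP):
        lo, hi = t - WINDOW - LAG, t - LAG
        if lo <= start and end < hi:
            count += 1
    return count

print(runs_containing(300, 300))  # single-event transaction
print(runs_containing(300, 310))  # 10-minute transaction
print(runs_containing(300, 419))  # transaction lasting almost 2 hours
```

In this model the single-event transaction is written to the summary by 24 different runs, the 10-minute one by 22, and the near-2-hour one by only 1, so the inflation is not even a constant factor that could be divided out afterwards.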

hm, well i guess each summary search could emit little mini-transactions and the search against the summary index could do the full transaction on those, but i'm not sure i'm getting the win from SI then.


Re: summary indexing - search with a 2-hour transaction every 5 minutes ?

I have an idea - only summarize the transactions that completed in the last 5 minutes.

search foo | 
transaction maxspan=2h my_key  | 
eval endTime = _time + duration |
where endTime > relative_time(now(), "-5m") |
timechart count by bar

This calculates the ending time of each transaction. now() returns the time the search started, not the moment each event is processed, so the cutoff is the same for every transaction. The 4th line compares endTime to that cutoff: if the transaction ended within the 5 minutes before the search started, it is kept and charted.
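
A rough Python sketch of why this avoids double counting follows. It is a toy model, not Splunk code, and it makes assumptions the post does not state: time is in whole minutes, the search runs every 5 minutes over the previous 2 hours up to the run time, and a run can only keep a transaction it sees in full. The name `runs_keeping` and the sample transactions are hypothetical.

```python
WINDOW = 120  # minutes searched per run (assumed earliest=-2h, latest=now)
STEP = 5      # minutes between scheduled runs

def runs_keeping(start, end, first_run=0, last_run=600):
    """Count runs that see the whole transaction and keep it after the endTime filter."""
    kept = 0
    for now in range(first_run, last_run + 1, STEP):
        visible = now - WINDOW <= start and end <= now
        ended_recently = end > now - STEP  # endTime > relative_time(now(), "-5m")
        if visible and ended_recently:
            kept += 1
    return kept

for start, end in [(300, 300), (300, 310), (300, 312), (300, 419)]:
    print((start, end), runs_keeping(start, end))
```

Each sample transaction is kept by exactly one run in this model, because its end time falls into exactly one 5-minute slice, so the overlapping 2-hour windows no longer inflate the summary.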

HTH