Splunk Search

si versus collect

mansel_scheffel
Explorer

Hi,

Is there any benefit to using the old method for summary indexing? Basically, I would like to know the differences in performance, or any advantage one way may have over the other, between using the si commands (the new way) and | collect (the old way).

Thanks

ckp123
Explorer

I see only one difference.
Summary indexes (SI) can be created only from existing reports, whereas with collect we append the command "| collect index=" to the end of a search.
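To make the comparison concrete, here is a rough sketch of the two approaches. The index names (proxy, summary), the field host, the report name hourly_proxy_by_host, and the source value proxy_by_host are all made up for illustration and do not come from this thread.

With an si command, the search is saved as a scheduled report (here assumed to be named hourly_proxy_by_host) with summary indexing enabled in the report settings:

```spl
index=proxy | sistats count by host
```

The reporting search then runs the matching non-si command against the summary index:

```spl
index=summary search_name="hourly_proxy_by_host" | stats count by host
```

With collect, the aggregation is done first and the command is appended to the end of the search, scheduled or ad hoc:

```spl
index=proxy | stats count by host | collect index=summary source="proxy_by_host"
```

Here the summary holds ordinary stats rows, so the reporting search has to re-aggregate them itself:

```spl
index=summary source="proxy_by_host" | stats sum(count) AS count by host
```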

somesoni2
Revered Legend

I don't think I've compared the performance of the two, but I always prefer selecting Summary indexing in the saved search options over using the collect command inline. The former does the summary indexing work (processing results, generating raw events, etc.) in the background and should be better than the inline option. I'd love to hear thoughts from others.

gcusello
SplunkTrust

Using summary indexing, I saw a large performance benefit (searches even ten times faster!), for example with BlueCoat logs that produce billions of events every day.
The problem is that you cannot delete part of the summarized data, only the full tsidx file (I asked Splunk to consider adding delete functionality). This means you cannot afford mistakes when summarizing, because you cannot remove them afterwards.
In addition, summary indexing delays access to the data, because logs are indexed twice and you have to wait until the logs are indexed before summarizing them.
In my case the delay is larger, because I have to be sure that all the logs have really arrived before summarizing.
There aren't many checks on the summary indexing operation, so there is a risk of indexing an event twice or losing it.
Finally, it uses more disk space, but in my case that is less of a problem.
In conclusion: use it, because it is really useful, but be careful!
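One common way to limit the duplicate and missing-event risk described above is to give the scheduled summarizing search a fixed, snapped time window that lags behind real time, so that consecutive runs neither overlap nor leave gaps. A minimal sketch, assuming an hourly schedule and a one-hour lag; the index name proxy and the fields host and status are placeholders, the namespace is the one mentioned later in this thread, and the right lag depends on how late your data arrives:

```spl
index=proxy earliest=-2h@h latest=-1h@h
| table _time, host, status
| tscollect namespace=blueacoat_stats
```

Each run covers exactly one whole hour, ending one hour before the current hour boundary. This does not help if a scheduled run is skipped or fails; that window would still have to be backfilled by hand.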
Bye.
Giuseppe

mansel_scheffel
Explorer

Hi Giuseppe,

Thanks for the reply. However, I am interested in the difference between the two methods of implementing summary indexes (si commands vs | collect), rather than in summary indexing itself.

gcusello
SplunkTrust

I used tscollect.
your_search | table fields... | tscollect namespace=blueacoat_stats

and in searches I used
| tstats count FROM blueacoat_stats | ...

Bye.
Giuseppe

gcusello
SplunkTrust

If you're satisfied with the answer, please accept it.
Bye.
Giuseppe
