Splunk Search

Search to generate 5 sample events for many index/sourcetype pairs

esalesapns2
Communicator

I need to provide feedback on ways logging formats could be improved.

To that end, I'm trying to create a search that ends with:

| stats values(source) values(_raw) by index sourcetype

so I get some examples of logs, but I only want to see a max of 5 values in the source and _raw columns.

I tried using foreach with append, but append isn't streaming, so I manually created 204 lines like this:

index=index1 sourcetype=sourcetype1 | head 5
| append [ search index=index1 sourcetype=sourcetype2 | head 5 ]
| append [ search index=index2 sourcetype=sourcetype1 | head 5 ]
...

It took a long time in "Parsing job...", but eventually produced the results I wanted.
What are some different ways of getting this result?


PickleRick
SplunkTrust

Ouch. So many subsearches. No wonder it takes forever to run (and might produce wrong/incomplete results).

This is actually one of the relatively few legitimate uses of the dedup command:

index IN (index1, index2, ...) sourcetype IN (sourcetype1, sourcetype2,...)
| dedup 5 index sourcetype
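
To get back to the table shape from the question, the dedup output could be piped into the original stats. This is only a sketch: index1, index2, sourcetype1, and sourcetype2 are the placeholder names from the question, and the sources and samples column names are made up for illustration.

```spl
index IN (index1, index2) sourcetype IN (sourcetype1, sourcetype2)
| dedup 5 index sourcetype
| stats values(source) as sources values(_raw) as samples by index sourcetype
```

Because dedup has already capped each pair at 5 events, neither column can hold more than 5 values. values() also drops exact duplicates and sorts what is left, so list(_raw) may be preferable if the original event order matters.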

 


esalesapns2
Communicator

This works great, but across lots of indexes and a span of hours (some are infrequent log sources) it takes a long time, so I shortened the time range to 15 minutes and it ran in a few minutes. Thank you!


PickleRick
SplunkTrust

That's true. Dedup works on the results, but first Splunk has to fetch those results, so over a long time span it will be a relatively "heavy" command. If you can safely assume that all the events you need fall within a certain time range of the latest event, you could cheat a little by generating the search conditions dynamically.

Normally you could do something like this:

| tstats max(_time) as latest where index IN (...) sourcetype IN (...) by index sourcetype

to find the latest event time for each index/sourcetype pair.
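
For a quick sanity check of what tstats found, the epoch value can be rendered as a readable timestamp. A sketch using the placeholder index and sourcetype names from the question; latest_readable is a hypothetical field name.

```spl
| tstats max(_time) as latest where index IN (index1, index2) sourcetype IN (sourcetype1, sourcetype2) by index sourcetype
| eval latest_readable=strftime(latest, "%Y-%m-%d %H:%M:%S")
```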

Now, if you can safely assume that all interesting events are within a certain range of the latest event (say, within 5 minutes), you could use this as a subsearch to narrow down your initial search criteria (but be aware of the subsearch limitations: it might return incomplete results in some cases!):

[ | tstats max(_time) as latest where index IN (...) sourcetype IN (...) by index sourcetype
  | eval earliest=latest-300 ]
| dedup 5 index sourcetype

This trick makes Splunk look only at the 5 minutes leading up to the latest event for each index/sourcetype combination.
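
Before relying on this, it may help to run the subsearch on its own and append format, which shows the search string the subsearch would hand to the outer search. The sketch below uses the placeholder names from the question. It also adds one second to latest, on the assumption that the latest time modifier is an exclusive bound and would otherwise leave out the newest event; drop that line if that does not hold in your environment.

```spl
| tstats max(_time) as latest where index IN (index1, index2) sourcetype IN (sourcetype1, sourcetype2) by index sourcetype
| eval earliest=latest-300
| eval latest=latest+1
| format
```

Each index/sourcetype pair should come back as its own group of conditions joined by OR, with its own earliest and latest values.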

As far as I remember, this does not work if the outer search uses tstats; it only works with a normal search.
