Search to generate 5 sample events for many index/sourcetype pairs

esalesapns2

I need to provide feedback on ways logging formats could be improved.

To that end, I'm trying to create a search that ends with:

| stats values(source) values(_raw) by index sourcetype

so I get some examples of logs, but I only want to see a max of 5 values in the source and _raw columns.

I tried using foreach with append, but append isn't streaming, so I manually created 204 lines like this:

index=index1 sourcetype=sourcetype1 | head 5
| append [ search index=index1 sourcetype=sourcetype2 | head 5 ]
| append [ search index=index2 sourcetype=sourcetype1 | head 5 ]
...

It took a long time in "Parsing job...", but eventually produced the results I wanted.
What are some different ways of getting this result?


PickleRick

Ouch. So many subsearches. No wonder it takes forever to run (and might produce wrong/incomplete results).

This is actually one of the relatively few legitimate uses of the dedup command:

index IN (index1, index2, ...) sourcetype IN (sourcetype1, sourcetype2,...)
| dedup 5 index sourcetype
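
To get back to the table from the question, the deduplicated events can be fed into the original stats. This is a minimal sketch, not a tested search: index1, index2, sourcetype1 and sourcetype2 are placeholders, and the output names sources and sample_events are made up for illustration.

```spl
index IN (index1, index2) sourcetype IN (sourcetype1, sourcetype2)
| dedup 5 index sourcetype
| stats values(source) as sources values(_raw) as sample_events by index sourcetype
```

Since dedup keeps at most 5 events per index/sourcetype combination, each cell holds at most 5 values. values() also drops repeated values and sorts them, so a cell may show fewer than 5 entries; list() could be swapped in if the original event order matters.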

 


esalesapns2

This works great, but across many indexes over several hours (some are infrequent log sources) it takes a long time, so I shortened the time range to 15 minutes and it ran in a few minutes. Thank you!


PickleRick

That's true. Dedup works on the results, but first it has to get those results, so over a long time span it will be a relatively "heavy" command. If you can safely assume that all the events you care about fall within a certain time range from the latest event, you could cheat a little by generating the search conditions dynamically.

Normally you could do something like this:

| tstats max(_time) as latest where index IN (...) sourcetype IN (...) by index sourcetype

to find the latest event time for each index/sourcetype pair.

Now, if you can safely assume that all interesting events are within a certain range (let's say within 5 minutes of the latest event), you could use this as a subsearch to narrow down your initial search criteria. Be aware of the subsearch limitations, though: it might return incomplete results in some cases.

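[ | tstats max(_time) as latest where index IN (...) sourcetype IN (...) by index sourcetype
  | eval earliest=latest-300, latest=latest+1 ]
| dedup 5 index sourcetype

This trick makes Splunk look only at the 5 minutes leading up to the latest event for each index/sourcetype combination, because the subsearch hands back earliest and latest together with each index/sourcetype pair. The latest bound is exclusive, so one second is added to keep the newest event in range.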
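
To see what the subsearch actually contributes before relying on it, one option is to run its body on its own with the format command appended; format collapses the result rows into the search string the outer search would receive. A sketch only, with index1, index2, sourcetype1 and sourcetype2 as placeholder names (the latest+1 is there because latest is an exclusive bound):

```spl
| tstats max(_time) as latest where index IN (index1, index2) sourcetype IN (sourcetype1, sourcetype2) by index sourcetype
| eval earliest=latest-300, latest=latest+1
| format
```

The result is a single search field with one parenthesized clause per index/sourcetype pair, each carrying its own earliest and latest values, joined with OR. If a pair is missing there, the outer search will not return events for it either.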

As far as I remember, this does not work if you want the outer search to use tstats. It only works with a normal search.
