Removing fields vs. search performance

PickleRick
SplunkTrust

I'm watching the Fundamentals 2 course (finally XD) and I've come across a search ending with something like: | sort -field | rename field2 as something_else | fields - field3
And the question is whether it would be a bit faster to first remove the field and then sort? Or is it the other way around? On the one hand - removing fields should give you less data to manipulate when sorting. On the other hand - I don't expect Splunk to physically rewrite each and every event on each pipe so it might not really matter at all.

Side question - let's assume we rewrite it into | search field2=something | fields - field3

In this case - is it better to first trim the event set and then remove the field, or first remove the field and then trim?

Of course I know that probably it's completely insignificant compared to the time it takes to get the data from the indexes. But that's just me digging into the internals 😉

codebuilder
Influencer

You use "fields +" and "fields -" to include/exclude fields. And yes, depending on the size of your events and fields, it can have a significant impact on performance, as it can reduce the amount of data. By default, a general search returns the _raw field, which contains the entire unparsed event. You can use "fields - _raw", for example, to eliminate that data and increase performance. There are some tradeoffs of course.

Worth noting, "fields +" also excludes fields: in that case you are telling Splunk to return only the fields listed.
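
To make the two forms concrete, here is a minimal sketch. The index, sourcetype, and field names (web, access_combined, clientip, status, bytes) are hypothetical placeholders, not taken from the question. The first search keeps only the listed fields; the second removes only _raw and leaves every other field in place:

```spl
index=web sourcetype=access_combined
| fields + clientip, status, bytes

index=web sourcetype=access_combined
| fields - _raw
```

One of the tradeoffs: once _raw is gone, anything later in the search that reads the raw event text (for example a rex without a field argument, which defaults to _raw) has nothing to work on.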

----
An upvote would be appreciated and Accept Solution if it helps!


PickleRick
SplunkTrust

So, in general, it would be best to remove unneeded fields as soon as possible, right?

(of course it's always a trade-off between performance now and - for example - flexibility to modify your search later)


codebuilder
Influencer

Yes, that's correct. It's always best to eliminate data as early as possible, especially in events with many fields.
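
Applied to the searches from the question, that could look like the sketch below. The index and sourcetype are hypothetical, and field, field2, and field3 are the placeholder names from the original post. The filter sits in the base search and the unneeded field is dropped before the sort, so the later commands see fewer events and fewer fields:

```spl
index=web sourcetype=access_combined field2=something
| fields - field3
| sort - field
| rename field2 as something_else
```

Whether the difference is measurable depends on the data; running both orderings and comparing them in the Job Inspector is one way to check.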
