Splunk Search

Removing fields vs. search performance

PickleRick
SplunkTrust

I'm watching the Fundamentals 2 course (finally XD) and I've come across a search ending with something like: | sort -field | rename field2 as something_else | fields - field3
The question is whether it would be a bit faster to remove the field first and then sort, or the other way around. On the one hand, removing fields should leave less data to manipulate when sorting. On the other hand, I don't expect Splunk to physically rewrite each and every event at each pipe, so it might not really matter at all.

Side question - let's assume we rewrite it into | search field2=something | fields - field3

In this case, is it better to first trim the event set and then remove the field, or to first remove the field and then trim?

Of course I know that probably it's completely insignificant compared to the time it takes to get the data from the indexes. But that's just me digging into the internals 😉

codebuilder
Influencer

You use "fields +" and "fields -" to include/exclude fields. And yes, depending on the size of your events and fields, it can have a significant impact on performance, as it can reduce the amount of data. By default, a general search returns the _raw field, which contains the entire unparsed event. You can use "fields - _raw", for example, to eliminate that data and increase performance. There are some tradeoffs, of course.

Worth noting, "fields +" also excludes fields: in that case you are telling Splunk to return only the fields listed.
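
As a rough sketch of the difference (the index, sourcetype and field names here are made up for illustration, not taken from the course):

```
index=web sourcetype=access_combined
| fields + clientip, status, bytes
| fields - _raw
| stats sum(bytes) as total_bytes by clientip, status
```

The "fields +" line keeps only the three named fields. As far as I know, internal fields such as _raw and _time still come along unless you remove them explicitly, which is what the "fields - _raw" line does. Only drop _raw if nothing later in the search needs the original event text.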

----
An upvote would be appreciated and Accept Solution if it helps!

PickleRick
SplunkTrust

So, in general, it would be best to remove unneeded fields as soon as possible, right?

(of course it's always a trade-off between performance now and - for example - flexibility to modify your search later)

codebuilder
Influencer

Yes, that's correct. It's always best to eliminate data as early as possible, especially in events with many fields.
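
Applied to the search from the original question, a minimal sketch would be to move the removal ahead of the sort (the leading "..." stands for whatever base search the course used):

```
... | fields - field3 | sort -field | rename field2 as something_else
```

For the side question, the order as written already trims the event set before dropping the field, which looks reasonable:

```
... | search field2=something | fields - field3
```

Whether either ordering makes a measurable difference depends on the event size and count, so treat this as an assumption to verify: run both variants and compare the execution costs in the Job Inspector.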
