Splunk Search

Using chained eval or separate eval statements, any performance gains?

Contributor

Is there any performance benefit to:

using one eval with several chained statements

vs.

using separate eval statements (which may be split out to improve readability in extremely large SPL searches)?

| eval A = "OM"
| eval B = " NOM"
| eval C = " NOM"
| eval D = " NOM"
| eval E = " NOM"

or 

| eval A = "OM", B = " NOM", C = " NOM", D = " NOM", E = " NOM"

Re: Using chained eval or separate eval statements, any performance gains?

SplunkTrust

@stanwin,

I'm not sure whether chained eval has any performance benefit.
Chained eval is supported in Splunk 6.4 and later. My suggestion is to go with separate eval statements for backward compatibility and readability.


Re: Using chained eval or separate eval statements, any performance gains?

Legend

@stanwin I also think chaining is only there to make eval more readable, and the same goes for the rename command. Perhaps someone from Splunk can confirm whether it is just for readability or offers something more!




| eval message="Happy Splunking!!!"



Re: Using chained eval or separate eval statements, any performance gains?

Contributor

Yes, but consider cases such as data normalization with a huge number of eval statements, e.g. 150+. Would chaining be more efficient/performant there? Only Splunk can comment on that.


Re: Using chained eval or separate eval statements, any performance gains?

Builder

TL;DR: It appears chained evals are slightly faster than separated evals.

Methodology:
We can test this ourselves using the Job Inspector.
The following is a run-anywhere example, scaled up and run in verbose mode so that differences can be seen.

The gentimes command generates one event per second, each with a unique timestamp, so we get unique events every time we run the search. The command below generates 8,640,000 events (the number of seconds in 100 days).

Chained Command:

| gentimes start=-100 end=0 increment=1s
| eval A = "OM" , B = "NOM" , C = "NOM" , D = "NOM" , E = "NOM" , F = "OM" , G = "NOM" , H = "NOM" , I = "NOM" , J = "NOM" , K = "OM" , L = "NOM" , M = "NOM" , N = "NOM" , O = "NOM" , P = "OM" , Q = "NOM" , R = "NOM" , S = "NOM" , T = "NOM" , U = "OM" , V = "NOM" , W = "NOM" , X = "NOM" , Y = "NOM" , Z = "OM"

Separated Command:

| gentimes start=-100 end=0 increment=1s
| eval A = "OM" 
| eval B = "NOM" 
| eval C = "NOM" 
| eval D = "NOM" 
| eval E = "NOM" 
| eval F = "OM" 
| eval G = "NOM" 
| eval H = "NOM" 
| eval I = "NOM" 
| eval J = "NOM" 
| eval K = "OM" 
| eval L = "NOM" 
| eval M = "NOM" 
| eval N = "NOM" 
| eval O = "NOM" 
| eval P = "OM" 
| eval Q = "NOM" 
| eval R = "NOM" 
| eval S = "NOM" 
| eval T = "NOM" 
| eval U = "OM" 
| eval V = "NOM" 
| eval W = "NOM" 
| eval X = "NOM" 
| eval Y = "NOM" 
| eval Z = "OM"

Results:

Number of events = 8,640,000

Chained Evals Search Time = 325.429 seconds (80.13 seconds for command.eval)
Separated Evals Search Time = 348.053 seconds (98.77 seconds for command.eval)

I seem to recall, though I could not locate a reference, that every pipe costs something. In this example of 8.64 million events, the separated evals spent about 18.6 seconds more in command.eval than the chained evals (the rest of the roughly 22.6-second difference in total search time falls outside command.eval, for example in command.gentimes). YMMV based on your needs and your infrastructure, and the readability of separate evals may well be worth the extra time.
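To make the arithmetic behind that comparison explicit, here is a small Python sketch that breaks down the gap using only the four timings reported above. The variable names are illustrative, and the figures come from a single run, so treat the percentages as a rough indication rather than a benchmark.

```python
# Timings reported in the post above (seconds, as read from the Job Inspector).
chained_total, chained_eval = 325.429, 80.13
separated_total, separated_eval = 348.053, 98.77

eval_gap = separated_eval - chained_eval      # extra time spent in command.eval
total_gap = separated_total - chained_total   # extra total search time
other_gap = total_gap - eval_gap              # part of the gap outside command.eval

# Relative slowdown of the separated evals, as a percentage of the chained run.
eval_pct = 100 * eval_gap / chained_eval
total_pct = 100 * total_gap / chained_total

print(f"command.eval gap: {eval_gap:.2f} s ({eval_pct:.1f}% slower)")
print(f"total search gap: {total_gap:.2f} s ({total_pct:.1f}% slower)")
print(f"gap outside command.eval: {other_gap:.2f} s")
```

On these numbers the separated evals cost roughly 23% more time in command.eval, but only about 7% more in total search time, because most of the search time is spent elsewhere.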

###

If this reply helps you, an upvote would be appreciated.

Re: Using chained eval or separate eval statements, any performance gains?

Contributor

Thanks for the check, efavreau. I did something similar myself.

But the test numbers vary too much from run to run because of environmental factors.


Re: Using chained eval or separate eval statements, any performance gains?

Builder

You can use this methodology to see the impact of the different eval constructions. Run it multiple times and look at the trend between the two. That is what I am providing here. There is an observable difference on large data sets.
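One way to look at the trend across repeated runs is to compare the average gap with the run-to-run spread. The Python sketch below assumes you have copied the command.eval timings from the Job Inspector by hand. The function name `compare_runs` and the "gap greater than twice the noise" rule are illustrative assumptions, not a Splunk feature or a formal statistical test, and the sample timings are placeholders rather than measurements.

```python
from statistics import mean, stdev

def compare_runs(chained, separated):
    """Summarize repeated command.eval timings (seconds) for both variants."""
    if len(chained) < 2 or len(separated) < 2:
        raise ValueError("need at least two runs per variant to estimate spread")
    gap = mean(separated) - mean(chained)
    # Run-to-run noise: the larger of the two sample standard deviations.
    noise = max(stdev(chained), stdev(separated))
    return {
        "mean_chained": mean(chained),
        "mean_separated": mean(separated),
        "gap": gap,
        "noise": noise,
        # Crude rule of thumb (an assumption): call it a trend only if the
        # average gap clearly exceeds the run-to-run noise.
        "gap_exceeds_noise": gap > 2 * noise,
    }

# Placeholder timings for illustration only; replace with your own values.
summary = compare_runs(chained=[80.0, 82.0, 81.0], separated=[98.0, 99.0, 100.0])
print(summary)
```

If the gap stays well above the noise over several runs, environmental factors alone are an unlikely explanation for the difference.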


Re: Using chained eval or separate eval statements, any performance gains?

Contributor

Right, thanks efavreau. Yes, it makes sense that it would probably take longer on larger datasets.
