Can an eval, foreach, streamstats or eventstats ever INCREASE the number of search results?

actionabledata
Path Finder

My long set of SPL starts with the typical filtering on the primary search line. It then uses various eval, foreach, streamstats and eventstats commands to process the data for a big stats aggregation command.

Here is the problem, or at least a gap in my understanding:

Early in the SPL I use a "| where" command to eliminate events not containing a specific value. This works great. The results filter down to 351,513 events.

However, between this where command and the line just before the big stats command, I only use eval, foreach, streamstats and eventstats commands ... and the search results increase by 29. I thought each of these commands merely modified / created fields within the events.

There are now 351,532 events instead of the 351,513 events.

So the question is: can an eval, foreach, streamstats or eventstats ever INCREASE the number of search results, or am I just misinterpreting the results?

actionabledata
Path Finder

Update 8/9/21

I reran one of the tests just to see if adding the | stats count at the bottom made any difference in the resulting count. It did. Nothing is different except for the | stats count line.

Here are some images showing the results ... which I cannot explain.

Without the | stats count

actionabledata_0-1628535101294.png

 

With the | stats count

actionabledata_1-1628535121102.png

 


ITWhisperer
SplunkTrust

Your time periods are different. What are the extra events that occur in those 8 minutes and 55 seconds, and how do they interact with the rest of the events returned? What do you get if you force the time periods to be identical? There are 118 lines in your search, so I would guess it is quite complex. Without an understanding of those, it is difficult to determine why you are getting the results you are getting.
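
One way to force both runs onto the same window is to put explicit time modifiers in the base search. A minimal sketch, assuming hypothetical dates and using index=_internal only as a stand-in for the real index and filters:

```spl
index=_internal earliest="08/01/2021:00:00:00" latest="08/08/2021:00:00:00"
| stats count
```

Running both variants of the search with the same earliest and latest values removes the time range as a variable when comparing counts.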


actionabledata
Path Finder

@ITWhisperer Good questions. Let me clarify.

The dataset this algorithm is operating on is fixed. The data is indexed and does not change. This is why my algorithm should be very deterministic.

Consequently, the time periods you see in the images are really just the different times of the day that I executed the code.  

The code is quite complex: not rocket science, but sequentially arduous ... creating correlations and then sub-correlations in some places and then aggregating those pre-results for summary level dashboard graphics.

I need to obfuscate the code sufficiently while leaving the command sets in place such that you can better follow the functionality I am performing.

Truly appreciate the help.

 


ITWhisperer
SplunkTrust

How are you calculating the counts?

actionabledata
Path Finder

ITWhisperer, I apologize for the delay in responding. Had deadlines to meet.

Your question made me think.


I reran the tests this morning with ... | stats count ... at the bottom of each set of code and the results were identical ... 351,426

I have to admit that I am puzzled because I thought this was the process I used recently; however, at this time, it looks like a false alarm.

Thank you again.


bowesmana
SplunkTrust

Interesting...

I don't believe those specific commands can add event rows to the pipeline, but it may depend on what you are doing inside the foreach. Things like the standard append* commands and mvexpand will obviously add events.
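
A minimal sketch of that difference, using generated events rather than real data (the field names tags, row, rows_before and rows_after are made up for illustration):

```spl
| makeresults count=3
| eval tags=split("a,b", ",")
| streamstats count as row
| eventstats count as rows_before
| mvexpand tags
| eventstats count as rows_after
| stats values(rows_before) as rows_before values(rows_after) as rows_after
```

Here eval, streamstats and eventstats only add fields to the three generated events, so rows_before should come back as 3. The mvexpand on the two-value tags field is the only step that creates rows, so rows_after should come back as 6.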

Obvious question - is the time range the same, and is the data stable within that time range when you run the search?

Diagnostics - after the where clause, add an eval dummy=1 and then before the big stats, do a " | search NOT dummy=*" to see if that finds new events - although that's not a guaranteed way to find additions.
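
That diagnostic can be sketched on generated events as follows. This is an illustration only: makeresults stands in for the real base search and where clause, and seq and total are hypothetical fields standing in for the real processing.

```spl
| makeresults count=5
| eval dummy=1
| streamstats count as seq
| eventstats max(seq) as total
| search NOT dummy=*
| stats count as rows_without_marker
```

With only eval, streamstats and eventstats between the marker and the check, rows_without_marker should be 0. The check is not guaranteed because a command such as mvexpand copies every existing field, dummy included, onto the rows it creates, so those extra rows would not be caught; rows brought in by an append-style command would lack the marker and would show up.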

I assume you meant an increase of 19, not 29, given "There are now 351,532 events instead of the 351,513 events."

Can you share what you are doing between the where and the stats?

 

actionabledata
Path Finder

@bowesmana , I apologize for the delay in responding. Had deadlines to meet.

Good catch on my bad math.

I also responded to @ITWhisperer  and the bottom line is that after rerunning the tests this morning, the results were identical. I must have been doing something odd the other day. I do not know what the difference in my tests was and this bothers me, but for now, all is well.

My foreach is merely appending a numerical prefix to two (2) separate fields to retain the time ordered sequence through the big stats.

The time frame is the same for both tests.

I did use your suggestion of adding a | eval dummy=1 and then the | search NOT dummy=* and at least today, the answers are consistent.

Unfortunately, I cannot show the algorithm, at least at this juncture.

Thank you for responding. The community supporting Splunk Answers is phenomenal!!
