Solved: Can an eval, foreach, streamstats or eventstats ev...

actionabledata · ‎08-04-2021

My long set of SPL starts with the typical filtering on the primary search line. It then uses various eval, foreach, streamstats and eventstats commands to process the data for a big stats aggregation command.

Here is the problem, or at least a gap in my understanding:

Early in the SPL I use a "| where" command to eliminate events not containing a specific value. This works great. The results filter down to 351,513 events.

However, between this where command and the line just before the big stats command, I only use eval, foreach, streamstats and eventstats commands ... and the search results increase by 29. I thought each of these commands merely modified / created fields within the events.

There are now 351,532 events instead of the 351,513 events.

So the question is ... Can an eval, foreach, streamstats or eventstats ever INCREASE the number of search results, or am I just misinterpreting the results?

actionabledata · ‎08-09-2021

ITWhisperer, I apologize for the delay in responding. Had deadlines to meet.

Your question made me think.

I reran the tests this morning with ... | stats count ... at the bottom of each set of code and the results were identical ... 351,426

I have to admit that I am puzzled because I thought this was the process I used recently; however, at this time, it looks like a false alarm.

Thank you again.

actionabledata · ‎08-09-2021

Update 8/9/21

I reran one of the tests just to see if adding the | stats count at the bottom made any difference in the resulting count. It did. Nothing is different except for the | stats count line.

Here are some images showing the results ... which I cannot explain.

[Screenshot: results without the | stats count]

[Screenshot: results with the | stats count]

ITWhisperer · ‎08-09-2021

Your time periods are different - what are the extra events that occur in those 8 minutes and 55 seconds, and how do they interact with the rest of the events returned? What do you get if you force the time periods to be identical? There are 118 lines in your search, so I would guess it is quite complex. Without an understanding of those lines, it is difficult to determine why you are getting these results.
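One way to force identical time periods is to pin both runs to explicit earliest and latest values in the base search instead of relying on the time picker. The following is only a sketch: the index name, field name, filter value and dates are hypothetical placeholders, not taken from the original search.

```spl
index=hypothetical_index earliest="08/01/2021:00:00:00" latest="08/04/2021:00:00:00"
| where hypothetical_field="hypothetical_value"
| stats count
```

With the window fixed this way, two runs at different times of day should scan the same events, so any remaining difference in the count would have to come from the pipeline itself.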

actionabledata · ‎08-09-2021

@ITWhisperer Good questions. Let me clarify.

The dataset this algorithm is operating on is fixed. The data is indexed and does not change. This is why my algorithm should be very deterministic.

Consequently, the time periods you see in the images are really just the different times of the day that I executed the code.

The code is quite complex, not rocket science, but sequentially arduous ... creating correlations and then sub-correlations in some places and then aggregating those pre-results for summary level dashboard graphics.

I need to obfuscate the code sufficiently while leaving the command sets in place such that you can better follow the functionality I am performing.

Truly appreciate the help.

ITWhisperer · ‎08-04-2021

How are you calculating the counts?

bowesmana · ‎08-04-2021

Interesting...

I don't believe those specific commands can add event rows to the pipeline, but it may depend on what you are doing inside the foreach. Things like the standard append* commands and mvexpand will obviously add events.
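As a small illustration of the difference, the sketch below uses generated events (makeresults) rather than real data; the field names tags, rows_before and rows_after are made up for the example. The eval and eventstats steps only add fields, while mvexpand splits each event into one row per value.

```spl
| makeresults count=3
| eval tags=split("a,b", ",")
| eventstats count AS rows_before
| mvexpand tags
| eventstats count AS rows_after
| table tags rows_before rows_after
```

Here rows_before should be 3 and rows_after should be 6, since each of the three events carries two values in tags before the mvexpand.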

Obvious question - is the time range the same, and is the data stable within that time range when you run the search?

Diagnostics - after the where clause, add an eval dummy=1 and then before the big stats, do a " | search NOT dummy=*" to see if that finds new events - although that's not a guaranteed way to find additions.
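A minimal sketch of that diagnostic, using generated events in place of the real search. The fields n, max_n and seq are placeholders, and the eventstats and streamstats lines simply stand in for whatever processing sits between the where and the big stats.

```spl
| makeresults count=5
| streamstats count AS n
| where n > 2
| eval dummy=1
| eventstats max(n) AS max_n
| streamstats count AS seq
| search NOT dummy=*
| stats count AS rows_without_marker
```

If nothing was added after the where, rows_without_marker should come back as 0. The check is not conclusive because a row created by duplicating an existing event (for example via mvexpand) would carry the dummy field along with it and so would not be caught.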

I assume you meant an increase of 19, not 29

There are now 351,532 events instead of the 351,513 events.

Can you share what you are doing between the where and the stats?

actionabledata · ‎08-09-2021

@bowesmana , I apologize for the delay in responding. Had deadlines to meet.

Good catch on my bad math.

I also responded to @ITWhisperer and the bottom line is that after rerunning the tests this morning, the results were identical. I must have been doing something odd the other day. I do not know what the difference in my tests was and this bothers me, but for now, all is well.

My foreach is merely appending a numerical prefix to two (2) separate fields to retain the time ordered sequence through the big stats.
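Roughly, the pattern looks like the sketch below. This is not the actual code: the field names field_a and field_b, the seq counter and the three-digit zero padding are all assumptions made for illustration.

```spl
| makeresults count=3
| streamstats count AS seq
| eval field_a="login", field_b="ok"
| foreach field_a field_b
    [ eval <<FIELD>> = printf("%03d", seq) . "_" . '<<FIELD>>' ]
| table seq field_a field_b
```

The foreach only rewrites the two named fields in place on each event, so the number of rows going in and coming out is the same.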

The time frame is the same for both tests.

I did use your suggestion of adding a | eval dummy=1 and then the | search NOT dummy=* and at least today, the answers are consistent.

Unfortunately, I cannot show the algorithm, at least at this juncture.

Thank you for responding. The community supporting Splunk Answers is phenomenal!!
