Splunk Search

why is Streamstats not working ?

adityaanand
Explorer

Hi,

I am trying to find cumulative sum of unique IPAddress by IsManuallyInstalled monthly. IsManuallyInstalled has two values: true and false. So I want cumulative sum of true and false monthly.
I did it this way:

source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time |timechart span=1mon count(IPAddress) As "MachineCount" by IsManuallyInstalled |streamstats global=f  sum(MachineCount)

But it is not giving the expected result as I supposed streamstats not working. It is giving the same result if I remove:
|streamstats global=f sum(MachineCount).

Please help how can i achieve this?

Regards,
Aditya

Tags (2)
0 Karma

jacobwilkins
Communicator

It looks like you want a distinct count, the dc() method...

source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log" 
|spath 
| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\""  
| timechart span=1mon dc(IPAddress) As "MachineCount" by IsManuallyInstalled 

Using dc() lets you get rid of that dedup.

I question the necessity of using both spath and rex, but you know your dataset better than I do.

0 Karma

jacobwilkins
Communicator

Oh, and you were looking for accumulations. Yeah. Listen to jeffland

0 Karma

jeffland
SplunkTrust
SplunkTrust

I also noticed the count/distinct count issue, but then saw the dedup ahead of it which makes dc redundant.

0 Karma

jeffland
SplunkTrust
SplunkTrust

This could be done with a simple stats, such as this:

 source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time | stats count by IsManuallyInstalled

If you want these results per month:

source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time | eval month=strftime(_time, "%B") | stats count by IsManuallyInstalled month

Be careful not to run this search over more than a year, or include the year in strftime as well.

0 Karma

adityaanand
Explorer

It is not giving cumulative sum.
Suppose in April month false count is 3 and in May false count is 6 then
Result should be like this:
April - 3
May - 9

0 Karma

jeffland
SplunkTrust
SplunkTrust

Hm, on second thought: your timechart has a by-clause. That means your columns are titled "true" and "false" (or how the results of IsManuallyInstalled are precisely), and you need to work your streamstats on those titles - i.e.

source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time |timechart span=1mon count(IPAddress) by IsManuallyInstalled |streamstats sum(true) sum(false)
0 Karma

jeffland
SplunkTrust
SplunkTrust

Edit: this is now obsolete.

Ah! I see. Sorry, somehow overlooked the "cumulative" in your question and wondered why you were using streamstats...

In your case, a way to do it would be via accum, something like

source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time |timechart span=1mon count(IPAddress) As "MachineCount" by IsManuallyInstalled | accum MachineCount as AccumulatedMachineCount
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Tech Talk Recap | Mastering Threat Hunting

Mastering Threat HuntingDive into the world of threat hunting, exploring the key differences between ...

Observability for AI Applications: Troubleshooting Latency

If you’re working with proprietary company data, you’re probably going to have a locally hosted LLM or many ...

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...