Hi,
I am trying to find cumulative sum of unique IPAddress
by IsManuallyInstalled
monthly. IsManuallyInstalled
has two values: true and false. So I want cumulative sum of true and false monthly.
I did it this way:
source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time |timechart span=1mon count(IPAddress) As "MachineCount" by IsManuallyInstalled |streamstats global=f sum(MachineCount)
But it is not giving the expected result as I supposed streamstats not working. It is giving the same result if I remove:
|streamstats global=f sum(MachineCount).
Please help how can i achieve this?
Regards,
Aditya
It looks like you want a distinct count, the dc()
method...
source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"
|spath
| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\""
| timechart span=1mon dc(IPAddress) As "MachineCount" by IsManuallyInstalled
Using dc()
lets you get rid of that dedup
.
I question the necessity of using both spath
and rex
, but you know your dataset better than I do.
Oh, and you were looking for accumulations. Yeah. Listen to jeffland
I also noticed the count/distinct count issue, but then saw the dedup ahead of it which makes dc redundant.
This could be done with a simple stats
, such as this:
source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time | stats count by IsManuallyInstalled
If you want these results per month:
source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time | eval month=strftime(_time, "%B") | stats count by IsManuallyInstalled month
Be careful not to run this search over more than a year, or include the year in strftime as well.
It is not giving cumulative sum.
Suppose in April month false count is 3 and in May false count is 6 then
Result should be like this:
April - 3
May - 9
Hm, on second thought: your timechart has a by-clause. That means your columns are titled "true" and "false" (or how the results of IsManuallyInstalled are precisely), and you need to work your streamstats on those titles - i.e.
source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time |timechart span=1mon count(IPAddress) by IsManuallyInstalled |streamstats sum(true) sum(false)
Edit: this is now obsolete.
Ah! I see. Sorry, somehow overlooked the "cumulative" in your question and wondered why you were using streamstats
...
In your case, a way to do it would be via accum
, something like
source="D:\\Splunk Data\\RSD Demo\\*" host="PU4D9W0ND02" index="test" sourcetype="RSD_Log"|spath| rex "IPAddress=\"(?<IPAddress>[^\"]*)\" IsManuallyInstalled=\"(?<IsManuallyInstalled>[^\"]*)\"" | dedup IPAddress sortby -_time |timechart span=1mon count(IPAddress) As "MachineCount" by IsManuallyInstalled | accum MachineCount as AccumulatedMachineCount