Splunk Search

How to generate statistical data without knowing field names in advance?

decenior
Engager

Honored Splunkodes,

I am trying to keep track of the manpower in each of my legions, so that if any legion loses too many troops at once, I know which one to reinforce.
However, I have many legions, and thus I track all of their manpower without knowing which ones will be important each day. I can't leave my myrmidons without reinforcements!

I'd like to generate statistical information about them at the time of graph generation.

Currently I'm doing this, it's dirty but it works.

I get my legion manpower by querying that index, dropping any that don't fall in the top 50.
index=legions LegionName=*
| timechart span=1d limit=50 count by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| outputlookup append=f mediterranean_legions.csv

Then I load up my lookup:
| inputlookup mediterranean_legions.csv
| convert timeformat="%Y-%m-%dT%H:%M:%S" mktime(_time) as _time
| bucket _time span=1d
| timechart avg(ManPower) by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| streamstats global=f window=10 avg(ManPower) AS avg_value by LegionName
| eval lowerBound=(-avg_value*1.25)
| eval upperBound=(avg_value*1.25)
| eval isOutlier=if('ManPower' < lowerBound OR 'ManPower' > upperBound, "XXX".ManPower, ManPower)
| search isOutlier="XXX*"
| table _time, LegionName, ManPower, *

This gives me a quick idea which legions have lost (or gained) a lot of manpower each day.

Now ideally, I'd like to generate standard deviation and determine if they are outliers based on z score rather than just guessing with the lower and upper bound values.

If this worked, I'd get what I want. Is there a way to accomplish this?

| streamstats global=f window=10 avg(ManPower) AS mp_avg by LegionName, stdev(ManPower) as mp_stdev by LegionName, max(ManPower) as mp_max by LegionName, min(ManPower) as mp_min by LegionName

Labels (3)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

I am not sure what it is you are trying to accomplish but does something like this work?

| gentimes start=-28 increment=1m
| rename starttime as _time 
| streamstats count as row 
| eval LegionName=mvindex(split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",""),random()%4).mvindex(split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",""),random()%25)
| timechart span=1d limit=50 count by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| timechart avg(ManPower) by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| streamstats global=f window=10 avg(ManPower) AS mp_avg stdev(ManPower) as mp_stdev max(ManPower) as mp_max min(ManPower) as mp_min by LegionName
| eval lowerBound=mp_avg-mp_stdev
| eval upperBound=mp_avg+mp_stdev
| eval isOutlier=if('ManPower' < lowerBound OR 'ManPower' > upperBound, "XXX".ManPower, ManPower)
| search isOutlier="XXX*"
| table _time, LegionName, ManPower, *

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust

I am not sure what it is you are trying to accomplish but does something like this work?

| gentimes start=-28 increment=1m
| rename starttime as _time 
| streamstats count as row 
| eval LegionName=mvindex(split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",""),random()%4).mvindex(split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",""),random()%25)
| timechart span=1d limit=50 count by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| timechart avg(ManPower) by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| streamstats global=f window=10 avg(ManPower) AS mp_avg stdev(ManPower) as mp_stdev max(ManPower) as mp_max min(ManPower) as mp_min by LegionName
| eval lowerBound=mp_avg-mp_stdev
| eval upperBound=mp_avg+mp_stdev
| eval isOutlier=if('ManPower' < lowerBound OR 'ManPower' > upperBound, "XXX".ManPower, ManPower)
| search isOutlier="XXX*"
| table _time, LegionName, ManPower, *
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas     Cisco Live 2026 is almost here, and this ...

What Is the Name of the USB Key Inserted by Bob Smith? (BOTS Hint, Not the Answer)

Hello Splunkers,   So you searched, “what is the name of the usb key inserted by bob smith?”  Not gonna lie… ...

Automating Threat Operations and Threat Hunting with Recorded Future

    Automating Threat Operations and Threat Hunting with Recorded Future June 29, 2026 | Register   Is your ...