<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to generate statistical data without knowing field names in advance? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-generate-statistical-data-without-knowing-field-names-in/m-p/585644#M204047</link>
    <description>&lt;P&gt;Honored Splunkodes,&lt;/P&gt;
&lt;P&gt;I am trying to keep track of the manpower in each of my legions, so that if any legion loses too many troops at once, I know which one to reinforce.&lt;BR /&gt;However, I have many legions, and thus I track all of their manpower without knowing which ones will be important each day. I can't leave my myrmidons without reinforcements!&lt;/P&gt;
&lt;P&gt;I'd like to generate statistical information about them at the time of graph generation.&lt;/P&gt;
&lt;P&gt;Currently I'm doing this; it's dirty, but it works.&lt;/P&gt;
&lt;P&gt;I get each legion's daily manpower by querying the index, dropping any legion outside the top 50, and saving the result to a lookup:&lt;BR /&gt;index=legions LegionName=*&lt;BR /&gt;| timechart span=1d limit=50 count by LegionName&lt;BR /&gt;| fields - OTHER&lt;BR /&gt;| untable _time LegionName ManPower&lt;BR /&gt;| outputlookup append=f mediterranean_legions.csv&lt;/P&gt;
&lt;P&gt;Then I load the lookup and flag anything more than 25% away from a 10-row rolling average:&lt;BR /&gt;| inputlookup mediterranean_legions.csv&lt;BR /&gt;| convert timeformat="%Y-%m-%dT%H:%M:%S" mktime(_time) as _time&lt;BR /&gt;| bucket _time span=1d&lt;BR /&gt;| timechart avg(ManPower) by LegionName&lt;BR /&gt;| fields - OTHER&lt;BR /&gt;| untable _time LegionName ManPower&lt;BR /&gt;| streamstats global=f window=10 avg(ManPower) AS avg_value by LegionName&lt;BR /&gt;| eval lowerBound=(avg_value*0.75)&lt;BR /&gt;| eval upperBound=(avg_value*1.25)&lt;BR /&gt;| eval isOutlier=if('ManPower' &amp;lt; lowerBound OR 'ManPower' &amp;gt; upperBound, "XXX".ManPower, ManPower)&lt;BR /&gt;| search isOutlier="XXX*"&lt;BR /&gt;| table _time, LegionName, ManPower, *&lt;/P&gt;
&lt;P&gt;This gives me a quick idea which legions have lost (or gained) a lot of manpower each day.&lt;/P&gt;
&lt;P&gt;Ideally, I'd like to compute the standard deviation for each legion and flag outliers by z-score, rather than guessing at lower and upper bounds.&lt;/P&gt;
&lt;P&gt;If something like the following worked, I'd get what I want. Is there a way to accomplish this?&lt;/P&gt;
&lt;P&gt;| streamstats global=f window=10 avg(ManPower) AS mp_avg by LegionName, stdev(ManPower) as mp_stdev by LegionName, max(ManPower) as mp_max by LegionName, min(ManPower) as mp_min by LegionName&lt;/P&gt;</description>
    <pubDate>Fri, 18 Feb 2022 01:09:06 GMT</pubDate>
    <dc:creator>decenior</dc:creator>
    <dc:date>2022-02-18T01:09:06Z</dc:date>
    <item>
      <title>How to generate statistical data without knowing field names in advance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-generate-statistical-data-without-knowing-field-names-in/m-p/585644#M204047</link>
      <description>&lt;P&gt;Honored Splunkodes,&lt;/P&gt;
&lt;P&gt;I am trying to keep track of the manpower in each of my legions, so that if any legion loses too many troops at once, I know which one to reinforce.&lt;BR /&gt;However, I have many legions, and thus I track all of their manpower without knowing which ones will be important each day. I can't leave my myrmidons without reinforcements!&lt;/P&gt;
&lt;P&gt;I'd like to generate statistical information about them at the time of graph generation.&lt;/P&gt;
&lt;P&gt;Currently I'm doing this; it's dirty, but it works.&lt;/P&gt;
&lt;P&gt;I get each legion's daily manpower by querying the index, dropping any legion outside the top 50, and saving the result to a lookup:&lt;BR /&gt;index=legions LegionName=*&lt;BR /&gt;| timechart span=1d limit=50 count by LegionName&lt;BR /&gt;| fields - OTHER&lt;BR /&gt;| untable _time LegionName ManPower&lt;BR /&gt;| outputlookup append=f mediterranean_legions.csv&lt;/P&gt;
&lt;P&gt;Then I load the lookup and flag anything more than 25% away from a 10-row rolling average:&lt;BR /&gt;| inputlookup mediterranean_legions.csv&lt;BR /&gt;| convert timeformat="%Y-%m-%dT%H:%M:%S" mktime(_time) as _time&lt;BR /&gt;| bucket _time span=1d&lt;BR /&gt;| timechart avg(ManPower) by LegionName&lt;BR /&gt;| fields - OTHER&lt;BR /&gt;| untable _time LegionName ManPower&lt;BR /&gt;| streamstats global=f window=10 avg(ManPower) AS avg_value by LegionName&lt;BR /&gt;| eval lowerBound=(avg_value*0.75)&lt;BR /&gt;| eval upperBound=(avg_value*1.25)&lt;BR /&gt;| eval isOutlier=if('ManPower' &amp;lt; lowerBound OR 'ManPower' &amp;gt; upperBound, "XXX".ManPower, ManPower)&lt;BR /&gt;| search isOutlier="XXX*"&lt;BR /&gt;| table _time, LegionName, ManPower, *&lt;/P&gt;
&lt;P&gt;This gives me a quick idea which legions have lost (or gained) a lot of manpower each day.&lt;/P&gt;
&lt;P&gt;Ideally, I'd like to compute the standard deviation for each legion and flag outliers by z-score, rather than guessing at lower and upper bounds.&lt;/P&gt;
&lt;P&gt;If something like the following worked, I'd get what I want. Is there a way to accomplish this?&lt;/P&gt;
&lt;P&gt;| streamstats global=f window=10 avg(ManPower) AS mp_avg by LegionName, stdev(ManPower) as mp_stdev by LegionName, max(ManPower) as mp_max by LegionName, min(ManPower) as mp_min by LegionName&lt;/P&gt;</description>
      <pubDate>Fri, 18 Feb 2022 01:09:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-generate-statistical-data-without-knowing-field-names-in/m-p/585644#M204047</guid>
      <dc:creator>decenior</dc:creator>
      <dc:date>2022-02-18T01:09:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate statistical data without knowing field names in advance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-generate-statistical-data-without-knowing-field-names-in/m-p/585673#M204057</link>
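      <description>&lt;P&gt;I am not sure exactly what you are trying to accomplish, but does something like this work? streamstats accepts several aggregations in one call, followed by a single by clause, so there is no need to repeat "by LegionName" after each function. The first four commands only generate sample data. The bounds sit one standard deviation either side of the rolling average, which is the same as flagging a z-score beyond plus or minus 1.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| gentimes start=-28 increment=1m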
| rename starttime as _time 
| streamstats count as row 
| eval LegionName=mvindex(split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",""),random()%4).mvindex(split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",""),random()%25)
| timechart span=1d limit=50 count by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| timechart avg(ManPower) by LegionName
| fields - OTHER
| untable _time LegionName ManPower
| streamstats global=f window=10 avg(ManPower) AS mp_avg stdev(ManPower) as mp_stdev max(ManPower) as mp_max min(ManPower) as mp_min by LegionName
| eval lowerBound=mp_avg-mp_stdev
| eval upperBound=mp_avg+mp_stdev
| eval isOutlier=if('ManPower' &amp;lt; lowerBound OR 'ManPower' &amp;gt; upperBound, "XXX".ManPower, ManPower)
| search isOutlier="XXX*"
| table _time, LegionName, ManPower, *&lt;/LI-CODE&gt;</description>
      <pubDate>Fri, 18 Feb 2022 08:06:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-generate-statistical-data-without-knowing-field-names-in/m-p/585673#M204057</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2022-02-18T08:06:40Z</dc:date>
    </item>
  </channel>
</rss>

