<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: report on disk usage spikes? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485001#M193503</link>
    <description>&lt;P&gt;I see the pics.&lt;/P&gt;

&lt;P&gt;This is because of  &lt;CODE&gt;|where delta &amp;gt; 20&lt;/CODE&gt; &lt;/P&gt;

&lt;P&gt;My answer is updated.&lt;/P&gt;</description>
    <pubDate>Tue, 05 May 2020 23:24:03 GMT</pubDate>
    <dc:creator>to4kawa</dc:creator>
    <dc:date>2020-05-05T23:24:03Z</dc:date>
    <item>
      <title>report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484995#M193497</link>
      <description>&lt;P&gt;Need a report that:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Lists volumes with significant disk usage spikes over a given timeframe.&lt;/LI&gt;
&lt;LI&gt;Plots those disk usage spikes over time.&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;P.S. Not interested in volumes with high percentage of used disk space - only in those that had a &lt;STRONG&gt;spike&lt;/STRONG&gt; of say more than 20%.&lt;/P&gt;

&lt;P&gt;I am assuming I'd need to:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;List volumes that had such a spike by calculating max and average values for e.g. &lt;CODE&gt;UsePct&lt;/CODE&gt; for a volume and then leaving only those with the delta &amp;gt; 20;&lt;/LI&gt;
&lt;LI&gt;Run a &lt;CODE&gt;timechart&lt;/CODE&gt; or something similar on those volumes.&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Blanking out on how to do that and would appreciate your help - thanks!&lt;/P&gt;

&lt;P&gt;P.P.S. This is as far as I've gotten - and it seems to correctly ID volumes with usage spikes (updated May 5):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=WinHostMon source=disk FileSystem!="SNFS"
| stats min(storage_used_percent) as min,
        avg(storage_used_percent) as avg,
        max(storage_used_percent) as max,
        by host, Name FileSystem DriveType
| eval delta = max - avg
| where delta&amp;gt;20
| sort - max delta avg
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The above produces the full stats table for all hosts and their volumes that had a spike; adding &lt;CODE&gt;| fields host Name&lt;/CODE&gt; to it would produce just the hosts and volume names; the question remains: what is the best way to plot &lt;CODE&gt;storage_used_percent&lt;/CODE&gt; on those volumes over the timeframe of the search?&lt;/P&gt;

&lt;P&gt;P.P.P.S. Bonus points for streamlining the above search and making it faster; generally a streamlined mechanism for pinpointing anomalies (spikes, unusual deviations or volatility) on any available metrics - such as CPU, memory, disk and network utilization. (I have yet to properly configure Splunk infrastructure apps - perhaps such mechanisms are included in those.)&lt;/P&gt;</description>
      <pubDate>Thu, 30 Apr 2020 01:11:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484995#M193497</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-04-30T01:11:49Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484996#M193498</link>
      <description>&lt;P&gt;Hi @mitag&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Your query has no &lt;CODE&gt;_time&lt;/CODE&gt; field in its output, so nobody can build a &lt;CODE&gt;timechart&lt;/CODE&gt; from it.&lt;/LI&gt;
&lt;LI&gt;You didn't provide sample logs. You may be able to write SPL without them, but others can't.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;stats&lt;/CODE&gt; discards the original events, so you can't compare against the original values; &lt;CODE&gt;eventstats&lt;/CODE&gt; is better.&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Thu, 30 Apr 2020 01:48:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484996#M193498</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-04-30T01:48:59Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484997#M193499</link>
      <description>&lt;P&gt;Sorry for the delay! The sourcetype is the standard &lt;CODE&gt;sourcetype=WinHostMon&lt;/CODE&gt;. Searching for &lt;CODE&gt;Type=Disk&lt;/CODE&gt; or &lt;CODE&gt;source=disk&lt;/CODE&gt; would give you disk stats. Events look like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Type=Disk
Name="C:"
DriveType="fixed"
TotalSpaceKB=116859900
FreeSpaceKB=62318744
FileSystem="NTFS"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(host = &lt;CODE&gt;ws2016_016&lt;/CODE&gt; source = &lt;CODE&gt;disk&lt;/CODE&gt; sourcetype = &lt;CODE&gt;WinHostMon&lt;/CODE&gt;)&lt;/P&gt;

&lt;P&gt;(If you'd like, I can send you a sample of raw events.)&lt;/P&gt;

&lt;P&gt;They are sampled every 5-15 minutes. Some additional fields are calculated - e.g. for the above single event these fields are:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; storage                114120.99609375
 storage_free            60858.1484375
 storage_free_percent       53.32774031126161
 storage_used            53262.84765625
 storage_used_percent       46.67225968873839
&lt;/CODE&gt;&lt;/PRE&gt;
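
&lt;P&gt;For illustration, those values are consistent with the following arithmetic on the raw fields. This is a sketch only: the actual field definitions are an assumption inferred from the sample event and may differ, e.g. in units or rounding. (The triple-backtick line is an inline comment; drop it if your Splunk version does not support them.)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=WinHostMon source=disk
```Assumed derivation, inferred from the sample event above; KB converted to MB```
| eval storage = TotalSpaceKB / 1024
| eval storage_free = FreeSpaceKB / 1024
| eval storage_used = storage - storage_free
| eval storage_used_percent = storage_used / storage * 100
| eval storage_free_percent = storage_free / storage * 100
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For the event above: 116859900 / 1024 = 114120.99609375 and 62318744 / 1024 = 60858.1484375, which leaves 53262.84765625 used, i.e. about 46.67%.&lt;/P&gt;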

&lt;P&gt;My specific case is this: on several of our hosts, the boot disk ("C:") went full (from about 45% to 100% within minutes, then after 15-45 minutes - back to normal). I need to do a report that only shows those hosts and volumes that had a spike, and plot those spikes over time.&lt;/P&gt;

&lt;P&gt;We could of course just search for all hosts with volumes close to full (say, over 90%) - but that does not isolate the spikes correctly as some volumes have been close to full for a while.&lt;/P&gt;

&lt;P&gt;So I am thinking:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;calculate min, average and max &lt;CODE&gt;storage_used_percent&lt;/CODE&gt; for each volume,&lt;/LI&gt;
&lt;LI&gt;calculate the delta (difference) between max and avg for each volume / host;&lt;/LI&gt;
&lt;LI&gt;List hosts and volumes where that delta is over a threshold (say, 20%)&lt;/LI&gt;
&lt;LI&gt;run a &lt;CODE&gt;timechart&lt;/CODE&gt; command just on those volumes and hosts.&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;With the following search I am getting a &lt;STRONG&gt;list&lt;/STRONG&gt; of hosts and volumes that had a spike:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=WinHostMon source=disk FileSystem!="SNFS"
| stats min(storage_used_percent) as min
        avg(storage_used_percent) as avg
        max(storage_used_percent) as max
        by host, Name FileSystem DriveType
| eval delta = max - avg
| where delta&amp;gt;20
| sort - max delta avg
| fields Name host
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Now, how do I pipe the results into a timechart (or any other plotting mechanism)?&lt;/P&gt;

&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 19:08:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484997#M193499</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-05-05T19:08:17Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484998#M193500</link>
      <description>&lt;P&gt;Does this look right? (Feels weird - as if I am doing two very similar transforms one after another - i.e. doesn't feel efficient.)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=WinHostMon source="disk" 
    [ search sourcetype=WinHostMon source="disk" 
      | stats min(storage_used_percent) as min,
              avg(storage_used_percent) as avg,
              max(storage_used_percent) as max,
              by host, Name FileSystem DriveType
      | eval delta = max - avg
      | where delta&amp;gt;20
      | sort - max delta avg
      | table host Name 
      ]
| timechart max(storage_used_percent) by host
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 05 May 2020 20:34:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484998#M193500</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-05-05T20:34:00Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484999#M193501</link>
      <description>&lt;P&gt;UPDATE:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=WinHostMon source=disk FileSystem!="SNFS"
| eval description=host."|".Name."|".FileSystem."|".DriveType
| bin _time span=1h
| stats min(storage_used_percent) as min,
           avg(storage_used_percent) as avg,
           max(storage_used_percent) as max by _time description
| eval delta = max - avg
| eval host=mvindex(split(description,"|"),0)
| eval flag = if(delta &amp;gt; 20,1,0)
| eventstats sum(flag) as flag by host
| where flag &amp;gt; 0
| sort 0 _time
| fields _time host max
| xyseries _time host max
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I see, thanks for providing the details. &lt;BR /&gt;
Questions would be &lt;EM&gt;very&lt;/EM&gt; easy to understand if other people wrote them like &lt;EM&gt;this&lt;/EM&gt; too.&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 22:18:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/484999#M193501</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-05-05T22:18:16Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485000#M193502</link>
      <description>&lt;P&gt;Thank you - appreciate the kind words. Doesn't seem to be working though (probably something simple).&lt;/P&gt;

&lt;P&gt;(Can't seem to post an image... Here is the &lt;A href="https://photos.app.goo.gl/NJXAGw6ABcUPrD9s5"&gt;link to the two screenshots&lt;/A&gt;. Hopefully this works.)&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 22:49:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485000#M193502</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-05-05T22:49:25Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485001#M193503</link>
      <description>&lt;P&gt;I see the pics.&lt;/P&gt;

&lt;P&gt;This is because of  &lt;CODE&gt;|where delta &amp;gt; 20&lt;/CODE&gt; &lt;/P&gt;

&lt;P&gt;My answer is updated.&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 23:24:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485001#M193503</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-05-05T23:24:03Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485002#M193504</link>
      <description>&lt;P&gt;As is, it still doesn't work. See the same link above for two more screenshots. If I replace the last line with:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| timechart max(max) by host
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;... then it works.&lt;/P&gt;</description>
      <pubDate>Sat, 09 May 2020 01:13:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485002#M193504</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-05-09T01:13:35Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485003#M193505</link>
      <description>&lt;P&gt;Good news.&lt;/P&gt;

&lt;P&gt;Please post the corrected query as an answer and accept it.&lt;/P&gt;</description>
      <pubDate>Sat, 09 May 2020 01:22:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485003#M193505</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-05-09T01:22:22Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485004#M193506</link>
      <description>&lt;P&gt;I don't understand how yours works yet... &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;The one I've been battling with is this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(sourcetype=WinHostMon source=disk FileSystem!="SNFS") OR (sourcetype=df source="df" Type!="cvfs")
    [ search ((sourcetype=WinHostMon source=disk FileSystem!="SNFS") OR (sourcetype=df source="df" Type!="cvfs"))
      | eval Name     = if (isnull (Name),       mount, Name)
      | eval FileSystem = if (isnull (FileSystem), Type, FileSystem)

      | stats min(storage_used_percent) as min,
              avg(storage_used_percent) as avg,
              max(storage_used_percent) as max,
              by host, Name FileSystem DriveType
      | eval delta = max - avg
      | where delta&amp;gt;20
      | table host Name
      ]
| timechart max(storage_used_percent) by host
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;... it works but only for Windows hosts ( &lt;CODE&gt;sourcetype=WinHostMon source=disk&lt;/CODE&gt;). For Linux hosts - not yet... ( &lt;CODE&gt;sourcetype=df source="df"&lt;/CODE&gt;)&lt;/P&gt;
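
&lt;P&gt;A possible explanation, not verified here: if the &lt;CODE&gt;df&lt;/CODE&gt; events carry no &lt;CODE&gt;DriveType&lt;/CODE&gt; field, &lt;CODE&gt;stats ... by&lt;/CODE&gt; drops them, because events with a null split-by field are excluded. In addition, the &lt;CODE&gt;Name&lt;/CODE&gt; created by &lt;CODE&gt;eval&lt;/CODE&gt; exists only inside the subsearch, so the returned &lt;CODE&gt;Name=...&lt;/CODE&gt; filter would not match raw &lt;CODE&gt;df&lt;/CODE&gt; events. Below is a single-pass sketch that sidesteps both by using &lt;CODE&gt;eventstats&lt;/CODE&gt; instead of a subsearch. It assumes both sourcetypes expose &lt;CODE&gt;storage_used_percent&lt;/CODE&gt;; &lt;CODE&gt;series&lt;/CODE&gt; is just an illustrative field name. (The triple-backtick line is an inline comment; drop it if your Splunk version does not support them.)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(sourcetype=WinHostMon source=disk FileSystem!="SNFS") OR (sourcetype=df source="df" Type!="cvfs")
```Sketch only; assumes df events use "mount" where WinHostMon uses "Name"```
| eval Name = coalesce(Name, mount)
| eventstats avg(storage_used_percent) as avg,
             max(storage_used_percent) as max
             by host, Name
| where max - avg &amp;gt; 20
| eval series = host . ":" . Name
| timechart limit=0 max(storage_used_percent) by series
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Note that &lt;CODE&gt;eventstats&lt;/CODE&gt; keeps every event in the pipeline, so this may not be faster than the subsearch on large time ranges.&lt;/P&gt;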

&lt;P&gt;P.S. Thank you for all your help with this.&lt;/P&gt;</description>
      <pubDate>Sat, 09 May 2020 04:39:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485004#M193506</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-05-09T04:39:12Z</dc:date>
    </item>
    <item>
      <title>Re: report on disk usage spikes?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485005#M193507</link>
      <description>&lt;P&gt;Hi @mitag&lt;BR /&gt;
&lt;CODE&gt;timechart&lt;/CODE&gt; builds its time axis from the time range picker. &lt;BR /&gt;
&lt;CODE&gt;xyseries&lt;/CODE&gt;, however, only pivots the existing rows into columns.&lt;/P&gt;
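
&lt;P&gt;A minimal sketch of the difference, assuming the upstream search already yields rows with &lt;CODE&gt;_time&lt;/CODE&gt;, &lt;CODE&gt;host&lt;/CODE&gt; and &lt;CODE&gt;max&lt;/CODE&gt; fields, as the hourly &lt;CODE&gt;stats&lt;/CODE&gt; search earlier in this thread does (the &lt;CODE&gt;span=1h&lt;/CODE&gt; is an assumption chosen to match that search). As the last line of such a search,&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| xyseries _time host max
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;only turns the existing rows into one column per host, whereas&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| timechart span=1h max(max) by host
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;re-buckets the results by time and aggregates them again per host.&lt;/P&gt;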

&lt;P&gt;As a reference.&lt;/P&gt;</description>
      <pubDate>Sat, 09 May 2020 08:10:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/report-on-disk-usage-spikes/m-p/485005#M193507</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-05-09T08:10:47Z</dc:date>
    </item>
  </channel>
</rss>

