Monitor CPU, RAM, DISK, INBOUND and OUTBOUND NETWORK TRAFFIC of forwarders

israbenbr
Explorer

Hello,

I am posting here to ask whether anyone knows which searches I should save in order to build a single dashboard to monitor my forwarders.

I need queries to:

- show the maximum CPU usage (in percent) per monitored machine, and the maximum CPU usage (in percent) across all these machines

- another one exactly like the previous one, but for the average CPU usage (in percent)

- a third one with the same concept, but for RAM instead of CPU (again in percent)

- the same thing for disk usage (in percent)

- two more for the inbound and outbound network traffic (as a percentage of a 1 Gbps link)

 

The data is collected from the monitored machines via the 'Splunk Add-on for Unix and Linux' and stored in an index called Linux.

Thank you!

0 Karma

johnhuang
Motivator

Here are a couple of searches I've saved for indexers, forwarders, etc.

-- Indexer and Search Head System and Hardware Utilization
| rest /services/server/status/resource-usage/hostwide
| table splunk_server cpu_arch cpu_count virtual_cpu_count cpu_idle_pct cpu_system_pct cpu_user_pct mem mem_used pg_paged_out pg_swapped_out normalized_load_avg_1min runnable_process_count os_name os_name_ext os_version splunk_version
| append
    [| rest /services/server/status/partitions-space
    | rename available AS disk_available capacity AS disk_capacity free AS disk_free
    | table splunk_server disk_available disk_capacity disk_free fs_type mount_point]
| append
    [| rest /services/server/info
    | eval server_role=mvindex(server_roles, 0)
    | table splunk_server, server_role, cluster_label]
| stats values(*) AS * by splunk_server
| table splunk_server server_role cpu_arch cpu_count virtual_cpu_count cpu_idle_pct cpu_system_pct cpu_user_pct mem mem_used pg_paged_out pg_swapped_out disk_available disk_capacity disk_free fs_type mount_point os_name os_version splunk_version cluster_label


-- TCP Input Stats to Indexer by Forwarder
index=_internal sourcetype=splunkd group=tcpin_connections (connectionType=cooked OR connectionType=cookedSSL) fwdType=full guid=* 
| rename fwdType AS forwarder_type version AS splunk_ver arch AS os_arch os AS os_type
| stats max(_time) as _time, sum(kb) as tcp_kb_total, sparkline(avg(tcp_KBps), 1m) as tcp_kbps_avg_sparkline, avg(tcp_KBps) as tcp_kbps_avg, avg(tcp_eps) as tcp_eps_avg, max(tcp_eps) as tcp_eps_max by hostname forwarder_type splunk_ver os_arch os_type
| foreach tcp_kbps_avg tcp_kb_total tcp_eps_avg tcp_eps_max  [| eval <<FIELD>>=ROUND(<<FIELD>>, 0)]
| eval hostname=UPPER(hostname)
| table _time hostname forwarder_type splunk_ver os_arch os_type tcp_kbps_avg_sparkline tcp_kbps_avg tcp_kb_total tcp_eps_avg tcp_eps_max


-- Average 24 Hourly Event Throughput in MB (Forwarder)
index=_internal source=*metrics.log group=per_sourcetype_thruput earliest=-2d@d 
    [search index=_internal sourcetype=splunkd group=tcpin_connections fwdType=full | dedup hostname | rename hostname AS host | table host]
| bucket _time span=1h
| eval series=host
| stats sum(kb) AS size_kb BY _time series
| eval size_mb=size_kb/1024
| eval event_hour=strftime(_time, "%H:%M")
| rename series AS data_source
| chart limit=24 avg(size_mb) AS size_mb by data_source event_hour
| fillnull value="0.00" 
| addtotals fieldname="hourly_avg"
| eval hourly_avg=ROUND(hourly_avg/24, 2)
| foreach *:* hourly_avg [| eval <<FIELD>>=ROUND('<<FIELD>>', 2)]

 

0 Karma

gcusello
SplunkTrust

Hi @israbenbr,

The easiest way is to look at the Splunk App for Unix and Linux (https://splunkbase.splunk.com/app/273/), where you can find all the requested searches, and also at the Splunk Monitoring Console.

Ciao.

Giuseppe

0 Karma

israbenbr
Explorer

Hello,

The problem is that Splunk Support told me to avoid that solution because it will soon no longer be supported by Splunk.

 

 

0 Karma

gcusello
SplunkTrust

Hi @israbenbr,

I don't know why Splunk Support said this, but in any case you could use that app as a guide to find the searches to put in your own app.
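
As a rough illustration of what such a search might look like for the CPU part of the question, here is a minimal sketch. It assumes the add-on's cpu.sh input is enabled and writes to index=linux with sourcetype=cpu, that each event is a table with CPU and pctIdle columns, and that multikv is needed to split that table into rows. None of this is confirmed in this thread, so check the field names in your own data first. The "ALL HOSTS" label and the output field names are only placeholders.

```
index=linux sourcetype=cpu earliest=-24h
| multikv
| search CPU="all"
| eval cpu_used_pct = 100 - pctIdle
| stats max(cpu_used_pct) AS max_cpu_pct, avg(cpu_used_pct) AS avg_cpu_pct BY host
| appendpipe
    [ stats max(max_cpu_pct) AS max_cpu_pct, avg(avg_cpu_pct) AS avg_cpu_pct
    | eval host="ALL HOSTS" ]
| eval avg_cpu_pct = round(avg_cpu_pct, 2)
```

The extra row added by appendpipe gives the highest per-host maximum and the mean of the per-host averages, which is not the same as an average over all raw samples if the hosts report at different rates. If your events already arrive as one row per event, the multikv line can probably be dropped.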

Anyway, I developed this dashboard; see if it helps you.

<form>
  <label>Hardware and Software Details: Linux Servers</label>
  <fieldset submitButton="false">
    <input type="dropdown" token="host">
      <label>Server</label>
      <prefix>host="</prefix>
      <suffix>"</suffix>
      <fieldForLabel>host</fieldForLabel>
      <fieldForValue>host</fieldForValue>
      <search>
        <query>index=os sourcetype=hardware [ | inputlookup Server | fields host ] 
          | eval host=upper(host) 
          | dedup host 
          | sort host 
          | table host</query>
      </search>
    </input>
  </fieldset>
  <row>
    <panel>
      <title>HostName</title>
      <html>
      <h3 align="center">
        <strong> <font size="10">Server<img src="/static/app/infrastructure_monitoring/Linux_logo.png" style="height:100px;border:0;"/>
            </font>
          </strong>
        </h3>
    </html>
      <single>
        <search>
          <query>index=os sourcetype=hardware $host$ 
            | dedup host 
            | table host</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
      </single>
    </panel>
  </row>
  <row>
    <panel>
      <title>Hardware</title>
      <table>
        <search>
          <query>index=os sourcetype=hardware $host$
            | dedup host 
            | eval MEMORY_REAL=MEMORY_REAL/1024/1024, MEMORY_SWAP=MEMORY_SWAP/1024/1024, host=upper(host)
            | lookup Server host OUTPUT IP Tipologia
            | table IP Tipologia CPU_TYPE CPU_COUNT CPU_CACHE MEMORY_REAL MEMORY_SWAP fd0 hdc sda 
            | rename CPU_TYPE AS CPU CPU_COUNT AS "Number of CPUs" CPU_CACHE AS Cache MEMORY_REAL AS RAM MEMORY_SWAP AS Swap HARD_DRIVES AS "Hard Disks" fd0 AS "Floppy Disk" hdc AS "Hard Disk" sda AS "Virtual disk"</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">100</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
    <panel>
      <title>df</title>
      <table>
        <search>
          <query>index=os  sourcetype=df $host$ 
            | dedup host 
            | multikv 
            | table Filesystem Type Size Used Avail UsePct MountedOn</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">100</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <title>Processes</title>
      <table>
        <search>
          <query>index=os sourcetype=ps $host$ 
            | multikv 
            | table USER PID PSR pctCPU CPUTIME pctMEM RSZ_KB VSZ_KB TTY S ELAPSED COMMAND ARGS</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
    <panel>
      <title>top command</title>
      <table>
        <search>
          <query>index=os sourcetype=top $host$ 
            | dedup host 
            | multikv 
            | table PID USER PR NI VIRT RES SHR S pctCPU pctMEM cpuTIME COMMAND</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <title>netstat</title>
      <table>
        <search>
          <query>index=os sourcetype=netstat $host$ 
            | dedup host 
            | multikv 
            | table Proto Recv-Q Send-Q LocalAddress ForeignAddress State</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
    <panel>
      <title>packages</title>
      <table>
        <search>
          <query>index=os sourcetype=package $host$ 
            | multikv 
            | dedup host NAME 
            | table NAME VERSION RELEASE ARCH VENDOR GROUP 
            | sort NAME</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <title>openPorts</title>
      <table>
        <search>
          <query>index=os sourcetype=openPorts $host$ 
            | dedup host 
            | multikv 
            | table Proto Port</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
    <panel>
      <title>protocol</title>
      <table>
        <search>
          <query>index=os sourcetype=protocol $host$ 
            | dedup host 
            | multikv 
            | table IPdropped TCPrexmits TCPreorder TCPpktRecv TCPpktSent UDPpktLost UDPunkPort UDPpktRecv UDPpktSent</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">10</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <title>Users with login privileges</title>
      <table>
        <search>
          <query>index=os sourcetype=usersWithLoginPrivs $host$ 
            | dedup host 
            | multikv 
            | table USERNAME HOME_DIR USER_INFO</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">100</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">cell</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
        <format type="number" field="Floppy Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Hard Disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Virtual disk">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="RAM">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Swap">
          <option name="unit">GB</option>
        </format>
        <format type="number" field="Cache">
          <option name="unit">kB</option>
        </format>
      </table>
    </panel>
  </row>
  <row>
    <panel ref="Footer" app="infrastructure_monitoring"></panel>
  </row>
</form>

Ciao.

Giuseppe

0 Karma

israbenbr
Explorer

Thank you for sharing your code.

I am editing it to create my own customized dashboard, but I am really struggling:

This is the query I made :

index=linux sourcetype=df host="the_host_name" Filesystem=/dev/s* earliest=-7d
| dedup host Filesystem
| stats avg(UsePct) AS Utilisation_Moyenne, max(UsePct) AS Utilisation_Maximale
| table host Filesystem UsePct Utilisation_Moyenne Utilisation_Maximale

 

It doesn't work: it only shows the field "Utilisation_Maximale", in a single row.

For a given host, I want it to show the maximum and the average usage (in percent) of its 2 disks over the last week.

I think it doesn't work because max and avg need numeric values, but that's strange because it does show a value in the field "Utilisation_Maximale".

 

Any ideas?

 

Thanks.

0 Karma

gcusello
SplunkTrust

Hi @israbenbr,

After a stats command, only the fields produced by stats remain (the aggregations and the BY fields), not all the original fields, so please try this:

index=linux sourcetype=df host="the_host_name" Filesystem=/dev/s* earliest=-7d
| stats avg(UsePct) AS Utilisation_Moyenne, max(UsePct) AS Utilisation_Maximale BY host Filesystem
| table host Filesystem Utilisation_Moyenne Utilisation_Maximale

Ciao.

Giuseppe

0 Karma

israbenbr
Explorer

Oh, thank you.

But it worked only for the max field; the avg field is still empty.

That's very strange.

0 Karma

gcusello
SplunkTrust

Hi @israbenbr,

Check whether your "UsePct" values also contain the percent sign (%). If they do, avg cannot be calculated on them, so you have to extract the numeric part using a regex.
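
For example, a sketch along these lines pulls out the numeric part with rex before aggregating. It assumes UsePct looks like "45%"; use_pct is only an illustrative field name. This would also explain why max returned something while avg stayed empty: on non-numeric values max falls back to lexicographical ordering, whereas avg needs numbers.

```
index=linux sourcetype=df host="the_host_name" Filesystem=/dev/s* earliest=-7d
| rex field=UsePct "(?<use_pct>\d+(?:\.\d+)?)"
| stats avg(use_pct) AS Utilisation_Moyenne, max(use_pct) AS Utilisation_Maximale BY host Filesystem
| eval Utilisation_Moyenne = round(Utilisation_Moyenne, 2)
| table host Filesystem Utilisation_Moyenne Utilisation_Maximale
```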

Ciao.

Giuseppe

0 Karma

israbenbr
Explorer

Hey,

It finally worked.

Now I am struggling with the query that shows RAM utilisation.

I followed this tutorial: https://www.youtube.com/watch?v=nsC4YytjRCY&ab_channel=SplunkHow-To

The problem is that none of the searches I ran finds the vmstat.sh data.

Any ideas?

 

Many thanks

0 Karma
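
A possible starting point for the vmstat question, offered as a sketch rather than a confirmed fix: first list what is actually arriving in the index, then, if a vmstat sourcetype shows up, aggregate its memory field. The sourcetype name vmstat and the field memUsedPct are assumptions about what the add-on's vmstat.sh input produces, so verify them against your own events.

```
| tstats count WHERE index=linux BY host sourcetype source
```

If that list contains no vmstat entry, the scripted input may simply not be enabled on the forwarder. If it does, something like the following could give the average and maximum RAM usage per host:

```
index=linux sourcetype=vmstat earliest=-7d
| stats avg(memUsedPct) AS avg_mem_pct, max(memUsedPct) AS max_mem_pct BY host
| eval avg_mem_pct = round(avg_mem_pct, 2)
```

If memUsedPct is not extracted automatically, adding | multikv after the first line, as in the dashboard posted earlier in this thread, may help.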