Quickest way to ensure data is coming in for a sourcetype across dozens of servers...

a212830
Champion

Hi,

I have an app that creates lots of files (they roll over at 50 MB, about every 2-3 minutes during business hours) and runs on lots of servers (50+). I've had occasional complaints that data and/or files are missing, so I'm looking for a quick, efficient way to confirm that these servers are continuously sending data for this sourcetype into Splunk and, if they aren't, to identify the server that has the issue.

Any suggestions?

1 Solution

MuS
SplunkTrust

Hi a212830,

If you are on Splunk 6.2.x or newer, you can use the Distributed Management Console (http://docs.splunk.com/Documentation/Splunk/latest/DMC/WhatcanDMCdo), which has predefined searches that cover this.

Another option is the metadata command (http://docs.splunk.com/Documentation/Splunk/6.4.0/SearchReference/Metadata), which gives you a quick overview. Use it like this:

 | metadata type=sourcetypes index=*

Hope this helps ...

cheers, MuS


jplumsdaine22
Influencer

metasearch is another useful command for this sort of thing if you don't have a DMC running:

| metasearch sourcetype=<your_sourcetype>  | timechart useother=f values(source) by host

This search should show you which sources are indexed per host (adjust span= on the timechart to suit). If you don't have many different filenames, I would probably swap the aggregated field and the split-by field:

| metasearch sourcetype=<your_sourcetype>  | timechart useother=f values(host) by source

You can also just do a distinct count and set an alert based on how many hosts there should be.

| metasearch sourcetype=<your_sourcetype>  | timechart useother=f dc(host) by source
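
To turn that distinct count into an alert, something like the following might work. It is only a sketch: the threshold of 50 is a placeholder taken from the "50+" servers in the question, and the time range (say, the last 15 minutes) is assumed to be set on the alert itself.

```
| metasearch sourcetype=<your_sourcetype>
| stats dc(host) AS reporting_hosts
| where reporting_hosts < 50
```

The search returns a row only when fewer hosts than expected reported in, so the alert can trigger on "number of results is greater than 0".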

Check out the metasearch documentation: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Metasearch

a212830
Champion

Good stuff! Thanks!


martin_mueller
SplunkTrust

In any imaginable scenario except "I don't know of tstats", use |tstats instead of |metadata:

| tstats count where index=* by host sourcetype source

| tstats count where index=* host=suspicious sourcetype=weird by _time source | timechart sum(count) by source

Run the latter over a suspicious time range and see if a stacked bar chart has gaps.
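
If the goal is to name the quiet servers rather than eyeball a chart, a variation along these lines may help. It is a sketch, not a tested alert: the 15-minute threshold and the field names last_seen and minutes_silent are arbitrary choices for illustration.

```
| tstats max(_time) AS last_seen where index=* sourcetype=<your_sourcetype> by host
| eval minutes_silent = round((now() - last_seen) / 60)
| where minutes_silent > 15
| sort - minutes_silent
```

Run it over a window longer than the threshold (for example, the last 24 hours). A host that sent nothing at all within the search window will not appear in the results.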

jplumsdaine22
Influencer

MIND BLOWN

tstats 4 ever


martin_mueller
SplunkTrust

Additionally, make sure your forwarders also monitor the rolled .1, .2, etc. files, so that a file which rolls during a brief congestion is still read completely. Don't monitor rolled .gz files, though.
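
As a rough illustration, a monitor stanza along these lines would pick up the live file and its rolled copies while skipping compressed ones. The path /var/log/myapp/app.log is hypothetical; adjust it and the sourcetype to match the real inputs.conf.

```
[monitor:///var/log/myapp/app.log*]
sourcetype = <your_sourcetype>
# skip compressed rolled files
blacklist = \.gz$
```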


a212830
Champion

metadata could be useful, but how does one track it across sources and hosts? That doesn't seem to be possible.


sloshburch
Splunk Employee

Correct, I can't think of a way to show unique tuples with metadata. You can only list one type at a time (sourcetypes, sources, or hosts) and append the results.
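
For illustration, the append approach would look roughly like this. The name field is a made-up helper that puts the three lists into one column; the result is still three separate lists, not host/sourcetype/source combinations.

```
| metadata type=hosts index=*
| append [| metadata type=sources index=*]
| append [| metadata type=sourcetypes index=*]
| eval name = coalesce(host, source, sourcetype)
| table type name lastTime totalCount
```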


MuS
SplunkTrust

As @martin_mueller suggested, use | tstats count where index=* by host sourcetype source instead.
