Quickest way to ensure data is coming in for a sourcetype across dozens of servers...

a212830
Champion

Hi,

I have an app that creates lots of files (they roll over at 50 MB, about every 2-3 minutes during business hours) and runs on lots of servers (50+). I've had complaints that data and/or files are occasionally missing, so I'm looking for a quick, efficient way to confirm that these servers and this sourcetype are continuously sending data into Splunk and, if they aren't, to identify the server having the issue.

Any suggestions?

MuS
SplunkTrust

Hi a212830,

If you are on Splunk 6.2.x or newer, you can use the Distributed Management Console (http://docs.splunk.com/Documentation/Splunk/latest/DMC/WhatcanDMCdo), which has predefined searches that cover this.

Another option is the metadata command (http://docs.splunk.com/Documentation/Splunk/6.4.0/SearchReference/Metadata), which gives you a quick overview. Use it like this:

 | metadata type=sourcetypes index=*
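
To flag hosts that have gone quiet instead of eyeballing the list, something like the following might work. It is only a sketch: the 10-minute threshold is an assumption to tune against your rollover rate, and with type=hosts the result is per host across the index, not per sourcetype.

```
| metadata type=hosts index=*
| eval minutes_silent = round((now() - recentTime) / 60)
| where minutes_silent > 10
| convert ctime(recentTime) AS last_indexed
| table host last_indexed minutes_silent totalCount
```

recentTime is the index time of the most recent event seen for the host, so a large minutes_silent means nothing has arrived from that host lately.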

Hope this helps ...

cheers, MuS

jplumsdaine22
Influencer

metasearch is another useful command for this sort of thing if you don't have a DMC running:

| metasearch sourcetype=<your_sourcetype>  | timechart useother=f values(source) by host

This search should show you which sources are indexed per host (adjust your span= for timechart to suit). If you don't have many different filenames I would probably swap the aggregator:

| metasearch sourcetype=<your_sourcetype>  | timechart useother=f values(host) by source

You can also just do a distinct count and set an alert based on how many hosts there should be.

| metasearch sourcetype=<your_sourcetype>  | timechart useother=f dc(host) by source
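
For the alert itself, a version without timechart could look roughly like this. It is a sketch only: the 50 is an assumed expected host count, and the idea is to schedule it over a short window (say the last 15 minutes) and trigger when it returns a result.

```
| metasearch sourcetype=<your_sourcetype>
| stats dc(host) AS reporting_hosts
| where reporting_hosts < 50
```

This tells you that a host is missing but not which one. For that, compare the hosts that did report against your list of expected servers.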

Check out http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Metasearch

a212830
Champion

Good stuff! Thanks!

martin_mueller
SplunkTrust

In any imaginable scenario except "I don't know of tstats", use |tstats instead of |metadata:

| tstats count where index=* by host sourcetype source

| tstats count where index=* host=suspicious sourcetype=weird by _time source | timechart sum(count) by source

Run the latter over a suspicious time range and see if a stacked bar chart has gaps.
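
To get straight to the host that stopped sending, a variation along these lines may help. It is a sketch, assuming a 10-minute silence threshold and a search window long enough (for example, the last 24 hours) that every healthy host appears at least once.

```
| tstats latest(_time) AS last_event where index=* sourcetype=<your_sourcetype> by host
| eval minutes_silent = round((now() - last_event) / 60)
| where minutes_silent > 10
| sort - minutes_silent
```

Note that last_event is the event timestamp, not the index time, so hosts with clock or timestamp-parsing problems can show up here too.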

jplumsdaine22
Influencer

MIND BLOWN

tstats 4 ever

martin_mueller
SplunkTrust

Additionally, make sure your forwarders also monitor the rolled .1, .2, etc. files, so that a file which rolls before it has been fully read during brief congestion is still picked up. Don't monitor rolled .gz files, though.
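
As a rough illustration, an inputs.conf stanza on the forwarder could look like this. The path and file name are hypothetical, so substitute your app's real log location and sourcetype.

```
[monitor:///var/log/myapp/app.log*]
sourcetype = <your_sourcetype>
blacklist = \.gz$
```

The trailing wildcard picks up app.log, app.log.1, app.log.2 and so on, while the blacklist regex skips anything ending in .gz.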

a212830
Champion

metadata could be useful, but how does one track data across sources and hosts together? That doesn't seem to be possible.

sloshburch
Ultra Champion

Correct, I can't think of a way to show unique tuples with metadata. You can only show a list of one or the other (or hosts) and append them.

MuS
SplunkTrust

As @martin_mueller suggested, use | tstats count where index=* by host sourcetype source instead.
