I have a set of hosts (hosts.csv) and I want to find all the sourcetypes and indexes that are reporting for those hosts. Assuming all hosts are configured properly for all sourcetypes or indexes, I tried a metadata query, but the index field is missing from the results.
| metadata type=hosts index=* | search [|inputlookup hosts.csv| fields host]
The query returns results that do not include the index:
firstTime host lastTime recentTime totalCount type
1491509278 1.1.1.1 1492203261 1492203261 22 hosts
1491509277 1.1.1.2 1492203241 1492203241 29 hosts
1491509276 1.1.1.3 1492203241 1492203241 14 hosts
1491509292 1.1.1.4 1492203228 1492203228 20 hosts
Does anyone know of another way to achieve this?
Thanks.
tstats is super fast and an ideal choice for generating the list. But if you are also looking for indexing volume, the internal metric data is the best place to search.
Ex:
index="_internal" idx=* sourcetype=splunkd source="*license_usage.log"|stats min(_time) AS firstTime max(_time) AS lastTime,sum(b) as bytes_indexed by h idx st |convert ctime(*Time)
Of all the answers, I like @mmodestino's best - tstats is fast and accurate. metadata can return incomplete results in a larger/busy environment, and the normal index=* search is slow.
All of these solutions use the host that is supplied by the parsing process. Imagine that you have a server that is a log repository - perhaps it is a file server with log files from a bunch of different machines. When you set up the forwarder, you should properly set the host to the name of the originating machine, and that name is what tstats (or any of the other searches) will report.
That's great if you want to measure how much data / which sourcetypes / etc. is coming from each original host. But if you want to measure forwarder traffic, look at the Monitoring Console. Or make your own search, perhaps something like this:
index=_internal sourcetype=splunkd component=LicenseUsage
| stats sum(b) as b by idx host h s st
| rename host as sourceServer
| rename h as host s as source st as sourcetype idx as index
| eval MB = round(b/1024/1024,1) | fields - b
Hey SplunkRocks2014!
When reporting on what's in Splunk, you will want to use the tstats command.
Check out the Meta Woot! app and the scheduled search it runs. It provides some great info to help you account for what is in your Splunk instance!
https://splunkbase.splunk.com/app/2949/#/details
| tstats count min(_time) as firstTime, max(_time) as lastTime, max(_indextime) as recentTime by host, sourcetype, index
Hi mmodestino, thank you for your information.
The Meta Woot! app doesn't work in my environment, even though the Splunk version meets the requirement. However, I captured some searches from the URL. Basically, it uses "... index=&form.sourcetype=&form.host=*&form.filter=where%20recentTime>(now()-86400)&form.latency=latency> ... ", which will be very slow. Also, the tstats query always returns 15 records from the os and main indexes, regardless of the time range.
| tstats count min(_time) as firstTime, max(_time) as lastTime, max(_indextime) as recentTime by host, sourcetype, index
What version of Splunk are you running?
Not sure why you are looking at the URL. You can see the searches under saved searches. The search they use is not slow; it's | tstats:
| tstats count min(_time) as firstTime, max(_time) as lastTime, max(_indextime) as recentTime where _index_earliest=-10m@m _index_latest=-5m@m (`meta_woot_host_filter`) (`meta_woot_sourcetype_filter`) (`meta_woot_index_filter`) by host, sourcetype, index
| eval host=lower(host)
| eval _time=now()-300
| collect sourcetype=meta_woot index=`meta_woot_summary`
| fields - _time count
| eval hash=md5(host.sourcetype.index)
| lookup meta_woot hash OUTPUTNEW _key as _key, firstTime as firstTimekv
| eval firstTime=if((isnotnull(firstTimekv) AND firstTimekv
I don't think you should re-invent the wheel here, especially if your reason for not using Meta Woot! is that you are on an old version, so just start simple with this:
| tstats count WHERE index=* by index, host, sourcetype
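To tie this back to the hosts.csv list from the question, here is a minimal sketch that filters the tstats output with the lookup command instead of a subsearch. It assumes hosts.csv is uploaded as a lookup table file with a column named host; in_list is a hypothetical helper field used only for the filter. Matching against a CSV file referenced directly like this is case-sensitive by default, so the host values need to agree in case.

```spl
| tstats count WHERE index=* by index, host, sourcetype
| lookup hosts.csv host OUTPUT host AS in_list
| where isnotnull(in_list)
| fields - in_list
```

Filtering after tstats avoids subsearch result limits, which may matter when the lookup holds a very large number of hosts.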
Hi splunkrocks2014,
I'm not sure I have understood your need: do you want to know all the indexes and all the sourcetypes of each host?
If this is what you need, try something like this:
index=* | stats values(index) AS index values(sourcetype) AS sourcetype count by host
If instead you want the count for each combination of host, index, and sourcetype, you can try something like this:
index=* | stats count by host, index, sourcetype
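If index=* turns out to be too slow, the same per-host rollup can be sketched on top of tstats, which reads indexed fields rather than raw events. This is only a sketch under that assumption, not a tested replacement: tstats produces one row per host, index, and sourcetype combination, and stats then collapses those rows to one row per host.

```spl
| tstats count where index=* by host, index, sourcetype
| stats values(index) AS index values(sourcetype) AS sourcetype sum(count) AS count by host
```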
Bye.
Giuseppe
Thanks Giuseppe. Initially, I used the same approach to pull the hosts per index or sourcetype; however, the performance is terribly slow because the environment is huge. Also, hosts.csv can contain more than 200k records.
You can use wildcard characters in the index argument and specify multiple indexes. The following example is from the Splunk documentation: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Metadata#Optional_arguments
| metadata type=hosts index=cs* index=na* index=ap* index=eu
Hi Niketnilay, this is the same query I am using, | metadata type=hosts index=*, but the result set doesn't include the index. Thanks.