
How to "Join" REST with metastats to get sourcetypes by index?

SeanBatt
Explorer

I have been wondering how to produce a table of indexes and the sourcetypes used in them, something like:

| rest splunk_server=localhost /services/data/indexes
| fields + title
| rename title AS Index
| MAGIC-COMMAND [ | metadata type=sourcetypes index=Index ]

MAGIC-COMMAND can't be join (there is no common field); appendcols, appendpipe and append don't do the right thing; and multisearch wants streaming data. If I swap the search and the subsearch, I think I almost get what I want, except that I no longer have an index/title field.

| metadata type=sourcetypes index=Index [
  | rest splunk_server=localhost /services/data/indexes 
  | fields + title
  | rename title AS Index ]

I don't see how to use a REST call for /services/saved/sourcetypes instead of metadata because there's no index reference.
I'm stumped. Any advice?


MuS
Legend

Hi there,

The tstats command performs queries on indexed fields in tsidx files. The indexed fields can come from indexed data, metadata or accelerated data models. This means it does not scan the _raw events and should normally be very fast, unless you have bloated tsidx files due to the above-mentioned cases.

So usually the answer using `tstats` would be the best option and, like you said, `metadata` does not provide an index name ... but I tried something like this:

| dbinspect index=* 
| rename index AS baz 
| table baz 
| map maxsearches=100 search="| metadata type=sourcetypes index=$baz$ | eval foo=$baz$"

which almost worked, but once the search is done something removes the field `foo` from the result 😞

I'm not too good at web/UI debugging, but I suspect it's related to some UI cleanup of `_fieldname`.

cheers, MuS


0 Karma

SeanBatt
Explorer

I am still interested in why my tstats search takes an hour before returning only partial results (some of the indexers disconnect from the query). Does that indicate a problem with the tsidx files that I should get fixed?

0 Karma

SeanBatt
Explorer

Tstats reported more than 15 billion events, took an hour to complete and then told me the results may yet be incomplete because some indexers stopped responding. Summing the events from the metadata command came to more than 17 billion.
We may well have bloated tsidx files; what are the reasons for that? Perhaps a post highlighting those was deleted as I can't see anything now.
Thanks MuS for your code with dbinspect! I'll try it out now.

0 Karma

yuanliu
SplunkTrust
SplunkTrust

To produce a table of indexes and the sourcetypes used in them, why not simply:

| tstats values(sourcetype) as sourcetype where splunk_server=localhost by index
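
If one row per index/sourcetype pair is easier to work with than a multivalue field, a variant along these lines may help. This is only a sketch: the `index=*` filter is an assumption (without an explicit index filter, `tstats` may only cover your role's default indexes), the `Index` field name is just borrowed from the question, and the time range picker still applies.

```spl
| tstats count where index=* by index, sourcetype
| rename index AS Index
| sort Index, sourcetype
```

The `count` column gives the number of events per pair within the selected time range, which can be compared against the `totalCount` values that `metadata` reports.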
0 Karma

SeanBatt
Explorer

I was hoping not to have to scan every event in Splunk, and metadata seemed to be a vastly more efficient way of getting the data. I can't really trust the results from such a process, as the indexers keep reporting:

The search process with search_id="..." may have returned partial results. Try running your search again.

It also doesn't seem to examine the last few billion events, though maybe it's right about the number of events and it's the metadata totalCounts that's wrong?

0 Karma


SeanBatt
Explorer

Thank you, MuS. Using map was just what I was after. I ended up with:

| rest splunk_server=local /services/data/indexes 
| table title
| map maxsearches=250 search="| metadata type=sourcetypes index=$title$ | eval Index=\"$title$\" "

and except for the rest call not quite returning all indexes it does what I want.
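
A note on why this version keeps its extra field while the earlier `eval foo=$baz$` attempt lost it: the difference is most likely the quoting. After `map` substitutes the token, an unquoted `eval foo=$baz$` becomes something like `eval foo=main`, and `eval` reads a bare word as a field name. No field called `main` exists in the `metadata` output, so the expression evaluates to null and `foo` is never created. With the escaped quotes, `eval` assigns the index name as a string literal. A minimal sketch of the same search, with a trailing `table` added purely for illustration (`sourcetype` and `totalCount` are fields that `metadata type=sourcetypes` returns):

```spl
| rest splunk_server=local /services/data/indexes
| table title
| dedup title
| map maxsearches=250 search="| metadata type=sourcetypes index=$title$ | eval Index=\"$title$\" | table Index sourcetype totalCount"
```

As for the REST call not returning every index, one possible cause (an assumption, not verified here) is that `splunk_server=local` only asks the search head, which may not define all the indexes that exist on the indexers. Keep `maxsearches` at or above the number of indexes, since `map` stops once that limit is reached.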

Thanks again.

Sean

0 Karma