Is there an easy way to correlate per_index_thruput with per_host_thruput in the internal logs?
Essentially, I have 1000 hosts sending data to a single index, and I am trying to understand the volume per host for that index.
Right now I have a search that looks at the log events coming into that index over the past 24 hours and dedups on hostname. I then feed these hostnames into a search of the _internal index to get the volume by host. This works, but the search of the log events for hostnames takes a long time to run, as there are millions of events.
Thanks
I'm afraid the Splunk metadata regarding source, sourcetype and host is not cross-referenced at all.
However IF it is true that a given host only appears in a single index, then you could generate a large lookup table from host to index. And then you could pipe your search results from metadata through the lookup command.
You'd have to create the lookup in Manager first, and then the search to generate the lookup CSV would look something like this:
* | stats count by host, index | outputlookup hostToIndex
And once it exists the other search would look roughly like this:
index=_internal source="*metrics.log" group="per_host_thruput" | rename series as host | lookup hostToIndex host OUTPUT index | search index="myIndex"
That is essentially what I am doing, except I put the results into a summary index and then use the summary results against the _internal index.
The problem is that the initial query takes a very long time to run, as it has to look back through all the data to get the hostnames. It sounds like there is no simple way. Thanks for the answer.
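One thing that may be worth trying for the slow first step is the metadata command, which reads the host list from the index's metadata rather than scanning the raw events. This is only a sketch: "myIndex" and the hostToIndex lookup are the placeholder names used above, the index field is added by hand with eval because metadata does not return one, and the time range is only honored approximately, so the host list may include hosts slightly outside the last 24 hours.
| metadata type=hosts index=myIndex | fields host | eval index="myIndex" | outputlookup hostToIndex
If this returns the expected hosts, the per_host_thruput search against _internal can use the lookup unchanged.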