Hi Splunkers, we are currently managing a Splunk Enterprise environment that was previously managed by another company. As sadly often happens, no documentation was handed over, so we had to discover almost all the information about the architecture by ourselves. We have successfully handled many of the tasks related to this big problem, but a few remain; in particular, the one I am opening this discussion for.
The point is this: almost all ingested data is collected by, and flows through, a couple of HFs. This means that the data flow is typically:
Log sources -> On prem HF -> Cloud HF (on IaaS VM) -> Cloud Indexer (on IaaS VM).
With a search I found here on the community, based on internal logs, I worked out how to see which Splunk component sends data to which other Splunk component. I mean: suppose I have
HF on prem 1 -> HF on cloud 2
I know how to discover this by analyzing the internal logs. But what if I want to discover which on-prem HF collects the data sent to a specific index? Let me give an example. Suppose I have this set of hosts:
Log sources (with NO UF installed on them)
Log source 1
Log source 2
Log source 3
On prem HF
HF on prem 1
HF on prem 2
HF on prem 3
On cloud HF (IaaS VM)
HF on Cloud 1
On cloud indexer
Indexer on cloud 1 (IaaS VM)
Indexes
index1
index2
index3
At the starting point, I know only that all 3 on-prem HFs collect data and send it to the HF on cloud; the data is then sent to the indexer. I don't know which on-prem HF collects data from which log source, or which index the data ends up in once it arrives on the indexer. Of course, I could ask the system owners what configuration has been applied on the log sources, but the idea is to discover this with a Splunk search. Is this possible?
The idea is to have a search where I can specify the exact flow. For example, suppose that one of the above flows is:
Log source 1 -> On Prem HF 2 -> On Cloud HF -> On Cloud Indexer -> index3
I must be able to discover it.
HFs process data transparently so there's no way to track the flow of events. Many customers work around that by having the HF add a field to every event where the value of the field is the HF's name.
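As a rough sketch of that workaround (one possible way to do it, not necessarily what other customers use): the `_meta` setting in inputs.conf on each on-prem HF can attach an index-time field to everything that HF ingests through its own inputs. The field name `hf_name` and the value `hf-onprem-1` are made up for illustration; use whatever naming fits your environment, with a different value on each HF.

```
# inputs.conf on "HF on prem 1" (hypothetical value; set a different one per HF)
# Note: a _meta set on an individual input stanza replaces this default
# rather than merging with it, so check existing stanzas first.
[default]
_meta = hf_name::hf-onprem-1
```

Once events carrying the field have been indexed, something like `| tstats count where index=* by index, host, hf_name` should show which log source (host) went through which on-prem HF into which index. This only covers data ingested after the change, and I have not verified it for every input type, so test it on one HF first.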