We are receiving data via a diode. However, event logs are from multiple hosts. How can we parse data from different hosts and direct it to the indexers?
Hi @radparik
Yes, the hostname is good enough. Just for clarity: do you want to assign a different sourcetype to each host, or are you creating a different index per server (which is not a good idea)?
What's your use case here?
Splunk can route events to a specific index based on the hostname. Below are a props.conf snippet and the matching transforms.conf stanza where you can perform the index routing that you described.
https://docs.splunk.com/Documentation/Splunk/8.2.6/Admin/Transformsconf
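props.conf
# Routes matching data to the index where it is to be ingested.
# Place this under the stanza that matches your data: [<sourcetype>], [source::<path>] or [host::<pattern>]
[<stanza>]
TRANSFORMS-theindex = theindexbyhost
transforms.conf
[theindexbyhost]
SOURCE_KEY = MetaData:Host
# The host key is stored as host::<hostname>, so the regex has to allow for that prefix
REGEX = <regex to match>
DEST_KEY = _MetaData:Index
FORMAT = <target index name>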
If this helps, karma points are appreciated. If it resolves your issue, accepting it as the solution is appreciated.
Please explain your use case. Why are you parsing the data instead of letting Splunk do it?
Hello,
We have a diode between two Splunk environments (that is the design). Right now, the data flow is:
HF->Diode->HF->IDX
Unfortunately, the diode forwards data from all hosts into one file in raw format. Is there a way to parse data from that one file?
Heavy forwarders communicate using the Splunk-to-Splunk protocol. Is that the "raw format" to which you refer? If so, no action is needed on your part. The receiving HF understands the protocol and will process the data as necessary.
If, however, the diode is modifying the data then it must be prevented from doing that.
P.S. Why is the second HF there? What value does it add in this environment?
It simply depends on what is in your events. If the events in the stream are indistinguishable between hosts, and the diode itself isn't able to add any kind of metadata, how would you decide which events to send where?
Raw logs include the host name at the beginning; there is no metadata from the diode itself. Would that be enough to send to the indexers for parsing?
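To make the question concrete, here is a minimal Python sketch of the decision being asked about: take the leading host name from each raw line and pick an index from it. It is an illustration of the logic only, not Splunk configuration. The host names, index names, the `route` helper, and the assumption that the host name is the first whitespace-delimited token are all hypothetical.

```python
import re

# Hypothetical mapping from host name to target index; all names are placeholders.
HOST_TO_INDEX = {
    "web01": "web_logs",
    "db01": "db_logs",
}
DEFAULT_INDEX = "main"

# Assumes each raw line begins with the host name followed by whitespace.
LEADING_HOST = re.compile(r"^(\S+)\s+(.*)$")


def route(raw_line):
    """Return (host, index, remainder) for one raw line, or None if no leading host token is found."""
    match = LEADING_HOST.match(raw_line)
    if match is None:
        return None
    host, remainder = match.group(1), match.group(2)
    index = HOST_TO_INDEX.get(host, DEFAULT_INDEX)
    return host, index, remainder
```

If the leading token can be matched this way, a regex-based transform at parse time could in principle make the same decision; whether that works here depends on what the diode actually writes into the file.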
I was asking about metadata concerning the source host. The diode itself is not important in this context (apart from the fact that you can't differentiate on source IP). The typical diode setup I've seen uses syslog over UDP, since it's the easiest "diodeable" form of transport: it's inherently unidirectional. Are you using that or another transport/protocol? Do you really have a Heavy Forwarder inside the diode-separated environment? From my experience, I seriously doubt it. If you do use syslog/UDP, the easiest approach is to set up a syslog server (sc4s, rsyslog) and write rules for it so that it adds the proper metadata to the events (index, source, sourcetype, optionally other indexed fields) and sends them to HEC on your HF. That's what I would do.
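As a rough sketch of what "adding proper metadata and sending to HEC" amounts to, the Python below builds the JSON body and the authorization header for a single HEC event. The helper names, the default `source` value, and the example values are hypothetical; the `event`, `host`, `index`, `sourcetype` and `source` keys and the `Authorization: Splunk <token>` header follow the HEC event format. In practice sc4s or rsyslog would do this for you; this only shows the shape of the data.

```python
import json


def build_hec_event(raw_line, host, index, sourcetype, source="diode"):
    """Build the JSON body for one event sent to the HEC /services/collector/event endpoint."""
    return json.dumps({
        "event": raw_line,         # the raw log line as received
        "host": host,              # original source host, not the diode
        "index": index,            # target index chosen by the routing rules
        "sourcetype": sourcetype,
        "source": source,          # hypothetical default; set it to whatever suits your data
    })


def build_hec_headers(token):
    """HEC authenticates with an 'Authorization: Splunk <token>' header."""
    return {"Authorization": "Splunk " + token}
```

The body would be sent with an HTTP POST to the HEC endpoint on the HF, using a token created for that input.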