Getting Data In

How can we parse data from different hosts and direct it to the indexers?

radparik
Engager

We are receiving data via a diode. However, event logs are from multiple hosts. How can we parse data from different hosts and direct it to the indexers?


venky1544
Builder

Hi @radparik 

Yes, the hostname is good enough. Just for clarity: do you want to assign a different sourcetype to each host, or are you creating a separate index per server (which is not a good idea)?

What's your use case here?

Splunk can route events to a specific index based on the hostname. Below is an outline of the props.conf and transforms.conf stanzas where you can perform the index routing that you described.

https://docs.splunk.com/Documentation/Splunk/8.2.6/Admin/Transformsconf

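props.conf

[<spec>]
# <spec> is a sourcetype, source::<path>, or host::<host> that matches the incoming data
TRANSFORMS-theindex = theindexbyhost

transforms.conf

[theindexbyhost]
# index where data is to be ingested is set via DEST_KEY and FORMAT
SOURCE_KEY = MetaData:Host
REGEX = <regex to match the host; the value is prefixed with host::>
DEST_KEY = _MetaData:Index
FORMAT = <target index name>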
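
For illustration, here is a filled-in sketch of the same routing. The sourcetype `diode:raw`, the host pattern `web\d+`, and the index `web_logs` are all hypothetical names, so substitute your own. Note that the transform needs DEST_KEY and FORMAT in addition to the regex, and that values read from MetaData:Host carry a `host::` prefix.

```ini
# props.conf (hypothetical sourcetype assigned to the diode data)
[diode:raw]
TRANSFORMS-theindex = theindexbyhost

# transforms.conf
[theindexbyhost]
# Match on the host metadata key; its value looks like "host::web01"
SOURCE_KEY = MetaData:Host
REGEX = ^host::web\d+$
# Rewrite the index metadata for matching events
DEST_KEY = _MetaData:Index
# Hypothetical target index; it must already exist on the indexers
FORMAT = web_logs
```

Events whose host does not match the regex keep whatever index was assigned at input time.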


richgalloway
SplunkTrust

Please explain your use case.  Why are you parsing the data instead of letting Splunk do it?


radparik
Engager

Hello,

We have a diode between two Splunk environments (that is the design). Right now, the data flow is:

HF->Diode->HF->IDX

Unfortunately, the diode forwards data from all hosts into one file in raw format.  Is there a way to parse data from that one file?

 


richgalloway
SplunkTrust

Heavy forwarders communicate using the Splunk-to-Splunk protocol. Is that the "raw format" to which you refer? If so, no action is needed on your part. The receiving HF understands the protocol and will process the data as necessary.

If, however, the diode is modifying the data, then it must be prevented from doing that.

P.S.  Why is the second HF there?  What value does it add in this environment?


PickleRick
SplunkTrust

It simply depends on what is in your events. If the events in the stream are indistinguishable between hosts and the diode itself isn't able to add any kind of metadata, how would you decide which events to send where?

radparik
Engager

Raw logs include the host name at the beginning, with no metadata from the diode itself. Would that be enough to send to the indexers for parsing?
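
If the host name really is the first token of every raw event, one possible approach is a parse-time host override, so that each event gets the right host value before any further routing. The following is only a sketch under assumptions: the sourcetype `diode:raw` is a hypothetical name, and the regex assumes the host name is the first whitespace-delimited token on the line. Because a heavy forwarder parses the data it reads, these settings would most likely need to live on the HF that monitors the diode's output file rather than on the indexers.

```ini
# props.conf (hypothetical sourcetype assigned to the file written by the diode)
[diode:raw]
TRANSFORMS-sethost = set_host_from_raw

# transforms.conf
[set_host_from_raw]
# Default SOURCE_KEY is _raw; capture the first whitespace-delimited token
REGEX = ^(\S+)
# Overwrite the host metadata with the captured value
DEST_KEY = MetaData:Host
FORMAT = host::$1
```

Adjust the regex to the actual layout of the events, for example if a timestamp or priority field precedes the host name.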

PickleRick
SplunkTrust

I was asking about metadata concerning the source host. The diode itself is not important in this context (apart from the fact that you can't differentiate on source IP). The typical setup that I've seen involving a diode used syslog over UDP, since it's the easiest "diodeable" form of transport: it's inherently unidirectional. Are you using it, or some other transport/protocol? Do you really have a Heavy Forwarder inside the diode-separated environment? From my experience, I seriously doubt it.

If you do indeed use syslog/UDP, it's easiest to set up a syslog server (sc4s, rsyslog) and write proper rules for it so that it adds proper metadata to the events (like index, source, sourcetype, and optionally other indexed fields) and sends them to HEC on your HF. That's what I would do.
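
As a rough sketch of the receiving side of that setup, HEC can be enabled on the HF through inputs.conf. The stanza name `diode_syslog` is hypothetical, and the token, index, and sourcetype values are placeholders, not real values. The syslog server can then supply index, source, sourcetype, and host per event in the HEC payload, and the values below act only as defaults.

```ini
# inputs.conf on the heavy forwarder
[http]
# Enable the HTTP Event Collector globally
disabled = 0
port = 8088

# Hypothetical token stanza for events coming from the syslog server
[http://diode_syslog]
disabled = 0
token = <your generated token GUID>
# Defaults used when the sender does not set these per event
index = <default index>
sourcetype = <default sourcetype>
```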
