Getting Data In

How can we parse data from different hosts and direct it to the indexers?

radparik
Engager

We are receiving data via a diode. However, event logs are from multiple hosts. How can we parse data from different hosts and direct it to the indexers?


venky1544
Builder

Hi @radparik 

Yes, the hostname is good enough. Just for clarity: do you want to assign a different sourcetype to each host, or are you creating a separate index per server (which is not a good idea)?

What's your use case here?

Splunk can route events to a specific index based on the hostname. Below are sample props.conf and transforms.conf stanzas where you can perform the index routing that you described.

https://docs.splunk.com/Documentation/Splunk/8.2.6/Admin/Transformsconf

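props.conf

# index where data is to be ingested
[<sourcetype of the incoming data>]
TRANSFORMS-theindex = theindexbyhost

transforms.conf

[theindexbyhost]
# the value of this key is prefixed with host::
SOURCE_KEY = MetaData:Host
REGEX = <regex to match the host>
DEST_KEY = _MetaData:Index
FORMAT = <name of the target index>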
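
One caveat with the stanzas above: they key on the host metadata, and when everything arrives through a single file, every event initially carries whatever host the input assigned. If each raw event really does start with the originating hostname, a second transform could first overwrite host from the raw text. The following is only a sketch under that assumption. The sourcetype name diode:raw, the stanza name sethostfromraw, and the regex (first whitespace-delimited token) are made up for illustration and would need adjusting to the actual log format.

```ini
# props.conf (on the HF that reads the diode's output file)
# "diode:raw" is a hypothetical sourcetype assigned to that file in inputs.conf
[diode:raw]
# Transforms listed in one class are applied in list order:
# set the host first, then route on it
TRANSFORMS-hostthenindex = sethostfromraw, theindexbyhost

# transforms.conf
[sethostfromraw]
# No SOURCE_KEY, so the regex runs against the raw event text.
# Assumes the hostname is the first whitespace-delimited token of each event.
REGEX = ^(\S+)\s
DEST_KEY = MetaData:Host
FORMAT = host::$1
```

The TRANSFORMS-hostthenindex line would replace the TRANSFORMS-theindex line rather than sit next to it, so the routing transform runs only once. It is worth confirming on a test instance that the routing transform picks up the rewritten host before relying on it.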

If it helps, karma points are appreciated; if it resolves your issue, acceptance as a solution is appreciated.


richgalloway
SplunkTrust

Please explain your use case.  Why are you parsing the data instead of letting Splunk do it?

---
If this reply helps you, Karma would be appreciated.

radparik
Engager

Hello,

We have a diode between two Splunk environments (that is the design). Right now, the data flow is:

HF->Diode->HF->IDX

Unfortunately, the diode forwards data from all hosts into one file in raw format.  Is there a way to parse data from that one file?

 


richgalloway
SplunkTrust

Heavy forwarders communicate using Splunk-to-Splunk protocol.  Is that the "raw format" to which you refer?  If so, no action is needed on your part.  The receiving HF understands the protocol and will process the data as necessary.

If, however, the diode is modifying the data then it must be prevented from doing that.

P.S.  Why is the second HF there?  What value does it add in this environment?


PickleRick
SplunkTrust

It simply depends on what is in your events. If the events in the stream are indistinguishable between hosts, and the diode itself isn't able to add any kind of metadata, how would you decide which events to send where?


radparik
Engager

The raw logs include the host name at the beginning of each event, but there is no metadata from the diode itself. Would that be enough to send the data to the indexers for parsing?


PickleRick
SplunkTrust

I was asking about metadata concerning the source host. The diode itself is not important in this context (apart from the fact that you can't differentiate on source IP). The typical diode setups I've seen used syslog over UDP, since it's the easiest "diodeable" form of transport: it's inherently unidirectional. Are you using that, or some other transport/protocol? Do you really have a Heavy Forwarder inside the diode-separated environment? From my experience, I seriously doubt it.

If you are indeed using syslog/UDP, the easiest approach is to set up a syslog server (SC4S, rsyslog) and write proper rules for it, so that it adds the proper metadata to the events (index, source, sourcetype, and optionally other indexed fields) and sends them to HEC on your HF. That's what I would do.
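
On the receiving HF, the HEC side of that setup could look roughly like the sketch below. The stanza name syslog_via_diode and all bracketed values are placeholders for illustration. The token is whatever gets generated when the input is created, and the index and sourcetype here act only as fallbacks for events that arrive without their own metadata.

```ini
# inputs.conf on the HF (sketch; names and values are placeholders)
[http]
# Enable the HTTP Event Collector globally
disabled = 0
port = 8088

[http://syslog_via_diode]
disabled = 0
token = <token-guid>
# Fallbacks, used only when the sender does not set these per event
index = <default_index>
sourcetype = <default_sourcetype>
```

With this route, the syslog server sets host, index, source, and sourcetype per event in the HEC payload, so the HF needs no index-time transforms to tell the hosts apart.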
