Solved: Where does HEC push come in from?

ddrillic · ‎06-08-2019

A customer is asking:

"How can we tell where an HEC push is actually coming in from? or is that just not logged anywhere?"

MuS · ‎06-10-2019

Hi ddrillic,

I started using a props.conf/transforms.conf setting to work around exactly this issue. It adds a metadata field containing the host that parsed the event, so you will always know which Splunk instance parsed the events or where they came from.

Here is my config:
props.conf

[default]
TRANSFORMS-000-add-relay-info-to-meta = add-relay-info-to-meta

transforms.conf

[add-relay-info-to-meta]
FORMAT = splunk_hwf::<hostNameHere>
REGEX = .
WRITE_META = true

You can then search for index=_internal splunk_hwf::* to see the Splunk instance that parsed the events.
The downside is that the hostname value needs to be hard-coded, but I have an app that works around this as well 😉

Also be aware that, with a default parsing pipeline setup, this only works on events that do not use INDEXED_EXTRACTIONS.

Hope this helps ...

cheers, MuS

UPDATE

The correct answer would be to change connection_host in inputs.conf for the corresponding [http...] stanza:

connection_host = [ip|dns|proxied_ip|none]
* Specifies the host if an event doesn't have a host set.
* "ip" sets the host to the IP address of the system sending the data.
* "dns" sets the host to the reverse DNS entry for IP address of the system
  sending the data.
* "proxied_ip" checks whether an X-Forwarded-For header was sent
  (presumably by a proxy server) and if so, sets the host to that value.
  Otherwise, the IP address of the system sending the data is used.
* "none" leaves the host as specified in the HTTP header.
* No default.
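
As a minimal sketch of the setting described above: the stanza name my_hec_input is a hypothetical placeholder for an existing HEC token input, and "ip" is just one of the documented values, picked for illustration.

```ini
# inputs.conf on the instance that runs the HEC input
# "my_hec_input" is a hypothetical stanza name; use your own token stanza.
[http://my_hec_input]
# Use the sender's IP address as host when the event does not set a host itself.
connection_host = ip
```

If the HEC endpoint sits behind a proxy or load balancer, "proxied_ip" may be the better fit, since per the excerpt above it uses the X-Forwarded-For header when one is sent.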

MuS · ‎06-10-2019

Update ping - I misunderstood the question, but this gives you a two-in-one answer/solution 🙂

ddrillic · ‎06-10-2019

Very interesting information as always @MuS.

ddrillic · ‎06-12-2019

@MuS, my buddy says -

-- If the host isn't specified for data that is coming through HEC, then it takes the VIP/HF hostname…which is what we don't want.

Is that right?

MuS · ‎06-12-2019

Just the messenger here 😉 The docs say:

* Specifies the host if an event doesn't have a host set.
* "ip" sets the host to the IP address of the system sending the data.
* "dns" sets the host to the reverse DNS entry for IP address of the system
  sending the data.

It is most likely true, because the default for connection_host is empty and therefore you would get the hostname of the instance running the HEC input.
I never used this setting nor had to play with it, because we use dedicated HEC inputs and also only have one sender for each HEC input.

cheers, MuS

ddrillic · ‎06-12-2019

@MuS - we don't set the connection_host parameter and the host field ends up being one of the indexers, but let me check....

Eldenhanjoel · ‎06-10-2019

Did you create multiple tokens and want to find out which token is sending the logs, or do you want to know which host you are receiving logs from?

Eldenhanjoel · ‎06-10-2019

| tstats count where index=* sourcetype=X by host
| sort 0 -count

ddrillic · ‎06-10-2019

The second - which host we are receiving logs from.

VatsalJagani · ‎06-08-2019

@ddrillic

I think you want to understand how HEC works. The easiest way to understand HEC is to think of it as a REST endpoint.
Just like a REST endpoint, it is always listening and responds whenever someone accesses it. In the case of HEC, when someone sends an event, HEC receives it as a request, reads the parameters from it, and stores them as an event in Splunk.

I hope you understand what I'm trying to say.

ddrillic · ‎06-09-2019

Fair enough, the question is whether any data about the sender is stored somewhere...

VatsalJagani · ‎06-09-2019

Not sure about the sender: if two senders use the same token, Splunk treats them as the same. But for debugging purposes you can use index="_introspection" sourcetype=http_event_collector_metrics.

ddrillic · ‎06-10-2019

Great. index="_internal" <HEC Key> was also useful ; -)

VatsalJagani · ‎06-11-2019

Hey right, thanks for sharing this information.
