Getting Data In

Sequence of activities at index time

gcusello
SplunkTrust

Hi all,

I have a new question about the sequence of activities at index time.
I have a data flow arriving via HEC on an HF that I need to process, because the data come from a concentrator and belong to many different data flows (linux, oracle, etc.). So I have to assign the correct sourcetype to these data, and I also have to process the logs because they are modified by securelog: the original logs are inserted into a field of a JSON document, with some metadata added.

I configured the following flow:

in props.conf:

[source::http:logstash*]
TRANSFORMS-000 = global_set_metadata
TRANSFORMS-001 = set_sourcetype_by_regex
TRANSFORMS-001 = set_index_by_sourcetype

in transforms.conf:

[global_set_metadata]
INGEST_EVAL = host := coalesce(json_extract(_raw, "host.name"), json_extract(_raw, "host.hostname")), relay_hostname := json_extract(_raw, "hub"), source := "http:logstash".coalesce("::".json_extract(_raw, "log.file.path"), "")

[set_sourcetype_by_regex]
INGEST_EVAL = sourcetype := case(searchmatch("/var/log/audit/audit.log"), "linux_audit", true(), "logstash")

[set_index_by_sourcetype]
INGEST_EVAL = index:=case(sourcetype=linux, "index_linux", sourcetype=logstash, "index_logstash")

in which:
the first transformation extracts (using INGEST_EVAL) metadata such as host, source and relay_hostname (the concentrator from which the logs arrive),
the second one assigns the correct sourcetype based on a regex,
the third one assigns the correct index based on the sourcetype, using INGEST_EVAL to avoid re-running a regex.
The first two transformations are executed correctly, but the third doesn't use the sourcetype assigned by the second one.

I also tried a different approach using CLONE_SOURCETYPE in the second one (instead of INGEST_EVAL) and it works, but I'm checking whether the flow above can work, because it's more linear and should be lighter on the system.

Where should I look for the issue?
Is there something wrong in the activity flow?

Thank you all.
Ciao.
Giuseppe


isoutamo
SplunkTrust
You probably used the raw endpoint on HEC?

gcusello
SplunkTrust

Hi @isoutamo ,

Nice to hear from you!

Yes, I'm using HEC on premises, so I cannot use Edge.

Ciao.

Giuseppe


isoutamo
SplunkTrust
But are you using HEC's raw endpoint instead of the event endpoint?

Also, you have two TRANSFORMS settings with the same name:
TRANSFORMS-001 = set_sourcetype_by_regex
TRANSFORMS-001 = set_index_by_sourcetype

Which means that only one of them is used!
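
A minimal sketch of the props.conf stanza with a unique class name for each transform. The -002 suffix is only an assumption for illustration; any distinct name that sorts after -001 should do, since TRANSFORMS classes are applied in the sort order of their names.

```
# props.conf (sketch: the stanza from the question with unique class names)
[source::http:logstash*]
TRANSFORMS-000 = global_set_metadata
TRANSFORMS-001 = set_sourcetype_by_regex
# Renamed from the duplicated TRANSFORMS-001 (the name is illustrative)
TRANSFORMS-002 = set_index_by_sourcetype
```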

gcusello
SplunkTrust

Hi @isoutamo ,

Thank you for your support.

That was a typo. The actual issue was that the searchmatch() function doesn't work in INGEST_EVAL; using the match() function instead, my INGEST_EVAL is working.

Thank you again for your support.

Ciao.

Giuseppe
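
For anyone landing here later, a rough sketch of how the two transforms from the question might look with match() in place of searchmatch(). This is not the exact configuration that was deployed. The quoted string literals and the "linux_audit" value in the index lookup are assumptions: in eval, an unquoted word such as linux is read as a field name, and the sourcetype assigned by the second transform is "linux_audit", not "linux".

```
# transforms.conf (sketch, not the author's final config)

[set_sourcetype_by_regex]
# match(<field>, <regex>) replaces searchmatch(); the pattern is the string
# from the question, where "." matches any character.
INGEST_EVAL = sourcetype := case(match(_raw, "/var/log/audit/audit.log"), "linux_audit", true(), "logstash")

[set_index_by_sourcetype]
# Assumption: the compared values are quoted and match the sourcetypes set above.
INGEST_EVAL = index := case(sourcetype=="linux_audit", "index_linux", sourcetype=="logstash", "index_logstash")
```

Events matching neither branch of the second case() get a null result; whether they then keep their original index is worth verifying in a test environment.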
