Splunk Search

How to extract a variable field from the log file path and add it to each log line when sending to Splunk

Deep
Engager

Hi Everyone,
I am using the Splunk forwarder and have the following requirement.
We have log files under the path /opt/airflow/logs/*/*/*/*.log,
for example
/opt/airflow/logs/getServerInfo/some_run_id/get_uptime/1.log  or
/opt/airflow/logs/build_upgrade/some_run_id/ami_snapshot_task/5.log

Now I want to extract the field some_run_id from the log file path and add it to each log line while sending the logs to Splunk.

Below is my normal log format:

[2024-01-17, 03:17:02 UTC] {subprocess.py:89} INFO - PLAY [Gather host information]
[2024-01-17, 03:17:01 UTC] {taskinstance.py:1262} INFO - Executing <Task(BashOperator): get_os_info> on 2024-01-17 03:16:37+00:00
[2024-01-17, 03:17:01 UTC] {standard_task_runner.py:52} INFO - Started process 1081826 to run task

Now I want the below format in Splunk (I want this format only in Splunk, not in the actual log files):

some_run_id [2024-01-17, 03:17:02 UTC] {subprocess.py:89} INFO - PLAY [Gather host information]
some_run_id [2024-01-17, 03:17:01 UTC] {taskinstance.py:1262} INFO - Executing <Task(BashOperator): get_os_info> on 2024-01-17 03:16:37+00:00
some_run_id [2024-01-17, 03:17:01 UTC] {standard_task_runner.py:52} INFO - Started process 1081826 to run task

Any help is much appreciated!


ITWhisperer
SplunkTrust

Check your events in Splunk - there is a Splunk-provided field called source which holds the name of the file the event came from. Can you use this to extract the data you want?

| eval some_run_id=mvindex(split(source,"/"),5)
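For reference, the index arithmetic in that eval can be sanity-checked outside Splunk. A minimal Python sketch of the same split-and-index logic, using the example path from the question (note the empty first element produced by the leading slash):

```python
# Mimic Splunk's split()/mvindex(): splitting an absolute path on "/"
# yields an empty first element, so the run id sits at index 5 for
# paths shaped like /opt/airflow/logs/<dag>/<run_id>/<task>/<n>.log
def extract_run_id(source: str) -> str:
    parts = source.split("/")
    # parts: ['', 'opt', 'airflow', 'logs', '<dag>', '<run_id>', '<task>', '<n>.log']
    return parts[5]

print(extract_run_id("/opt/airflow/logs/getServerInfo/some_run_id/get_uptime/1.log"))
# -> some_run_id
```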

isoutamo
SplunkTrust

Hi,

If there is no real reason to add it at the ingest phase, you should use @ITWhisperer's example.

But if you really need it at ingest time, then look into e.g. how INGEST_EVAL can be used to manipulate events during the ingest phase.

r. Ismo


Deep
Engager

@isoutamo
If I choose to extract these fields from the file path and append them at the ingest phase, will the below approach work?

props.conf

[source::/opt/airflow/logs/*/*/*/*.log]
TRANSFORMS-set_run_id = extract_run_id

transforms.conf

[extract_run_id]
INGEST_EVAL = _runid = mvindex(split(source,"/"),5)


PickleRick
SplunkTrust

1. I'm not sure you can easily create fields with names beginning with an underscore. I'm not saying you definitely can't, but by convention those are Splunk's internal fields, so I wouldn't be surprised if you couldn't (or had problems accessing them later).

2. If you already have that info in the source field, there is not much point in creating an additional indexed field that duplicates the value. (I could see a use for such an indexed field in some very rare cases if the info were stored in the raw event itself, but since it's contained in source, which is itself an indexed field, there is little point in just rewriting it elsewhere.)
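To illustrate that point: because source is already an indexed field, you can filter on the run id directly and extract it at search time without any ingest-time work. A hedged SPL sketch (the index name here is a placeholder, not from the original thread):

```
index=airflow source="/opt/airflow/logs/*/some_run_id/*"
| eval run_id=mvindex(split(source,"/"),5)
| table run_id, source, _raw
```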

isoutamo
SplunkTrust

I agree with @PickleRick: don't use _ as a prefix for your own fields. I'm not sure whether it even works.

Also, it's usually better to do this at search time, not at ingest time.

If you really need it, then your solution should work as shown. One thing to remember: you must put those props & transforms on the first full Splunk instance (heavy forwarder or indexer) in the path from the source to the indexers for it to take effect.

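If ingest-time extraction is still required, a minimal sketch of the config pair without the leading underscore might look like the following. This is an untested assumption based on the thread, not a verified setup; note that searching an indexed field efficiently generally also needs a fields.conf stanza with INDEXED = true on the search head:

```
# props.conf
[source::/opt/airflow/logs/*/*/*/*.log]
TRANSFORMS-set_run_id = extract_run_id

# transforms.conf
[extract_run_id]
INGEST_EVAL = run_id = mvindex(split(source,"/"),5)

# fields.conf (on the search head)
[run_id]
INDEXED = true
```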