Hi Everyone,
I am using the Splunk forwarder and I have the following requirement.
We have log files under the path /opt/airflow/logs/*/*/*/*.log,
for example:
/opt/airflow/logs/getServerInfo/some_run_id/get_uptime/1.log or
/opt/airflow/logs/build_upgrade/some_run_id/ami_snapshot_task/5.log
Now I want to extract the field some_run_id from the log file path and add this some_run_id to each log line while sending the logs to Splunk.
Below is my normal log format:
[2024-01-17, 03:17:02 UTC] {subprocess.py:89} INFO - PLAY [Gather host information]
[2024-01-17, 03:17:01 UTC] {taskinstance.py:1262} INFO - Executing <Task(BashOperator): get_os_info> on 2024-01-17 03:16:37+00:00
[2024-01-17, 03:17:01 UTC] {standard_task_runner.py:52} INFO - Started process 1081826 to run task
Now I want the logs in Splunk in the format below (I want this format only in Splunk, not in the actual log files):
some_run_id [2024-01-17, 03:17:02 UTC] {subprocess.py:89} INFO - PLAY [Gather host information]
some_run_id [2024-01-17, 03:17:01 UTC] {taskinstance.py:1262} INFO - Executing <Task(BashOperator): get_os_info> on 2024-01-17 03:16:37+00:00
some_run_id [2024-01-17, 03:17:01 UTC] {standard_task_runner.py:52} INFO - Started process 1081826 to run task
Any help is much appreciated!
Check your events in Splunk - there is a Splunk-provided field called source which holds the name of the file the event came from. Can you use this to extract the data you want?
| eval some_run_id=mvindex(split(source,"/"),5)
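If you want that calculation applied automatically at search time, a minimal sketch (untested) would be a calculated field in props.conf on the search head, reusing the same source pattern from your question:

props.conf (search head)
[source::/opt/airflow/logs/*/*/*/*.log]
EVAL-some_run_id = mvindex(split(source,"/"),5)

The index 5 assumes the run id is always the sixth path segment, as in your examples.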
Hi
If there is no real reason to add it at ingest time, you should use @ITWhisperer's example.
But if you really need it at ingest time, then you can look at how to use e.g. INGEST_EVAL to manipulate events in the ingest phase.
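For example, a minimal, untested sketch that prepends the run id to _raw at ingest time (the stanza and transform names are made up, and index 5 assumes the run id is always the sixth path segment):

props.conf
[source::/opt/airflow/logs/*/*/*/*.log]
TRANSFORMS-prepend_run_id = prepend_run_id

transforms.conf
[prepend_run_id]
INGEST_EVAL = _raw=mvindex(split(source,"/"),5)." "._raw

This has to run on a heavy forwarder or indexer, not on the universal forwarder.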
r. Ismo
@isoutamo
If I choose to extract these fields from the file path and append them in the ingest phase, will the approach below work?
props.conf
[source::/opt/airflow/logs/*/*/*/*.log]
TRANSFORMS-set_run_id = extract_run_id
transforms.conf
[extract_run_id]
INGEST_EVAL = _runid = mvindex(split(source,"/"),5)
1. I'm not sure you can easily create fields with names beginning with an underscore. I'm not saying you definitely can't, but by convention those are internal Splunk fields, so I wouldn't be surprised if you couldn't (or had problems accessing them later).
2. If you already have that info in the source field, there is not much point in creating an additional indexed field that duplicates the value. (I could agree that in some very rare cases such an indexed field might be useful if the info were stored in the raw event itself, but since it is contained in source, which is itself an indexed field, there is not much point in just rewriting it elsewhere.)
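For example (the index name is hypothetical), you can already filter on source and derive the run id at search time:

index=airflow source="/opt/airflow/logs/*/*/*/*.log"
| eval some_run_id=mvindex(split(source,"/"),5)
| stats count by some_run_id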
I agree with @PickleRick: don't use _ as a prefix for your own fields. I'm not sure whether it even works.
Also, it's usually better to do that at search time rather than at ingest time.
If you really need it, then your solution should work as you show it. One thing to remember is that you must put those props & transforms on the first full Splunk instance (a heavy forwarder or an indexer) on the path from the source to your Splunk indexers to get it working.
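If you do decide you need it at ingest time, a minimal sketch pulling the advice above together (untested; the field is renamed to run_id to avoid the leading underscore, and the files go on that first heavy forwarder or indexer):

props.conf
[source::/opt/airflow/logs/*/*/*/*.log]
TRANSFORMS-set_run_id = extract_run_id

transforms.conf
[extract_run_id]
INGEST_EVAL = run_id=mvindex(split(source,"/"),5)

Since this creates an indexed field, it is typically also declared in fields.conf on the search head (INDEXED = true) so that run_id=<value> searches resolve it correctly.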