I'm trying to extract a field using regex, with the extraction depending on the hostname.
Below is the raw payload.
{"group_id": "aa2211-3b22-4263-8fe7-e7ef6c7859ed", "Status": "OPEN", "description": "aa01 : GOT EQQE056I JOB T375P201(JOB91552), OPERATION(0010), OPERATION TEXT( ), ENDED IN ERROR JCL . PRTY=1, APPL = T375P2 , WORK STATION = CPU4, IE= 2603161420, NO E2E RC", "hostname": "aa01", "CI_company": "zzz Group", "CEC_ID": "job-585796313", "ORIGIN_By_Product": "Microsoft"}
{"group_id": "bb2211-3b22-4263-8fe7-e7ef6c7859ed", "Status": "OPEN", "description": "bb01 : GOT EQQE006I JOB T375Q201(JOB91632), OPERATION(0010), OPERATION TEXT( ), ENDED IN ERROR JCL . RTY=1, APPL = T375Q2 , WORK STATION = CPU5, IE= 2606661420, NO E2E RC", "hostname": "bb01", "CI_company": "xxx Group", "CEC_ID": "job-565796313", "ORIGIN_By_Product": "Microsoft"}
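I want to extract the highlighted value as the field core_key, based on the hostname:
If hostname is aa01, then core_key = T37 (the first three characters of the job name T375P201).
If hostname is bb01, then core_key = 75Q (the third through fifth characters of the job name T375Q201).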
Hi @ra_52194724,
Try one of the following EXTRACT settings in props.conf:
[your_sourcetype_or_other_spec]
EXTRACT-core_key_1 = JOB (?|(?=.+"hostname": "aa01")|..(?=.+"hostname": "bb01"))(?<core_key>...)
EXTRACT-core_key_2 = "description": "(?:aa01 .+ JOB |bb01 .+ JOB ..)(?<core_key>...)
The core_key_1 extract uses alternating positive lookaheads to discard leading characters in the job identifier based on a trailing hostname value.
If the hostname always appears at the start of the description value, you can use the core_key_2 extract.
You can also use both regular expressions in SPL:
| rex "JOB (?|(?=.+\"hostname\": \"aa01\")|..(?=.+\"hostname\": \"bb01\"))(?<core_key>...)"
| rex "\"description\": \"(?:aa01 .+ JOB |bb01 .+ JOB ..)(?<core_key>...)"
If you have an arbitrarily long list of hostnames to match, I might take a different approach. Let us know.
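To check the second (description-anchored) expression before putting it in props.conf, a search along these lines may help. This is only a sketch: the two events are shortened copies of the sample payload, built with makeresults and append, and the expected results are T37 for aa01 and 75Q for bb01.

```spl
| makeresults
| eval _raw="{\"description\": \"aa01 : GOT EQQE056I JOB T375P201(JOB91552), OPERATION(0010)\", \"hostname\": \"aa01\"}"
| append
    [| makeresults
    | eval _raw="{\"description\": \"bb01 : GOT EQQE006I JOB T375Q201(JOB91632), OPERATION(0010)\", \"hostname\": \"bb01\"}"]
| rex "\"description\": \"(?:aa01 .+ JOB |bb01 .+ JOB ..)(?<core_key>...)"
| table _raw, core_key
```

The greedy .+ is safe here only because " JOB " (with a space on both sides) occurs once per event; if it could occur more than once, a lazy .+? would be the more cautious choice.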
Hi @ra_52194724,
N.B.: I posted this answer a moment ago, and the forum lost it. Apologies if it appears twice.
You can achieve this using alternating positive lookaheads to discard leading characters in the job identifier based on a trailing hostname value:
| makeresults format=csv data="
_raw
\"{\"\"group_id\"\": \"\"aa2211-3b22-4263-8fe7-e7ef6c7859ed\"\", \"\"Status\"\": \"\"OPEN\"\", \"\"description\"\": \"\"aa01 : GOT EQQE056I JOB T375P201(JOB91552), OPERATION(0010), OPERATION TEXT( ), ENDED IN ERROR JCL . PRTY=1, APPL = T375P2 , WORK STATION = CPU4, IE= 2603161420, NO E2E RC\"\", \"\"hostname\"\": \"\"aa01\"\", \"\"CI_company\"\": \"\"zzz Group\"\", \"\"CEC_ID\"\": \"\"job-585796313\"\", \"\"ORIGIN_By_Product\"\": \"\"Microsoft\"\"}\"
\"{\"\"group_id\"\": \"\"bb2211-3b22-4263-8fe7-e7ef6c7859ed\"\", \"\"Status\"\": \"\"OPEN\"\", \"\"description\"\": \"\"bb01 : GOT EQQE006I JOB T375Q201(JOB91632), OPERATION(0010), OPERATION TEXT( ), ENDED IN ERROR JCL . RTY=1, APPL = T375Q2 , WORK STATION = CPU5, IE= 2606661420, NO E2E RC\"\", \"\"hostname\"\": \"\"bb01\"\", \"\"CI_company\"\": \"\"xxx Group\"\", \"\"CEC_ID\"\": \"\"job-565796313\"\", \"\"ORIGIN_By_Product\"\": \"\"Microsoft\"\"}\"
"
| rex "JOB (?|(?=.+\"hostname\": \"aa01\")|..(?=.+\"hostname\": \"bb01\"))(?<core_key>...)"
| table core_key

core_key
T37
75Q

Using a search-time extraction in props.conf:
[your_sourcetype_or_other_spec]
EXTRACT-core_key = JOB (?|(?=.+"hostname": "aa01")|..(?=.+"hostname": "bb01"))(?<core_key>...)
EDIT:
Since the hostname also appears at the beginning of the description, you can simplify the regular expression:
"description": "(?:aa01 .+ JOB |bb01 .+ JOB ..)(?<core_key>...)
If you have an arbitrarily long list of hostnames to match, I might take a different approach. Let us know.
Hi @ra_52194724
How about this?
| makeresults
| eval _raw="{\"hostname\": \"aa01\", \"description\": \"aa01 : GOT EQQE056I JOB T375P201(JOB91552)\"}"
| append [
| makeresults
| eval _raw="{\"hostname\": \"bb01\", \"description\": \"bb01 : GOT EQQE006I JOB T375Q201(JOB91632)\"}"
]
| spath
```Use replace() to capture exact characters without a separate rex command```
| eval core_key = case(
hostname == "aa01" AND match(description, "JOB\s+"), replace(description, "(?s)^.*?JOB\s+(.{3}).*$", "\1"),
hostname == "bb01" AND match(description, "JOB\s+"), replace(description, "(?s)^.*?JOB\s+..(.{3}).*$", "\1")
)
| table hostname, description, core_key
This could be made into a calculated field for props.conf:
#props.conf
[yourSourcetype]
EVAL-core_key = case(hostname == "aa01" AND match(description, "JOB\\s+"), replace(description, "(?s)^.*?JOB\\s+(.{3}).*$", "\\1"), hostname == "bb01" AND match(description, "JOB\\s+"), replace(description, "(?s)^.*?JOB\\s+..(.{3}).*$", "\\1"))
I'll be able to manage my use case with the regex you provided. Could you also help me with the regex if I have to extract from the highlighted position after "APPL = " instead?
Example : APPL = T375P2
APPL = T375Q2
{"group_id": "aa2211-3b22-4263-8fe7-e7ef6c7859ed", "Status": "OPEN", "description": "aa01 : GOT EQQE056I JOB T375P201(JOB91552), OPERATION(0010), OPERATION TEXT( ), ENDED IN ERROR JCL . PRTY=1, APPL = T375P2 , WORK STATION = CPU4, IE= 2603161420, NO E2E RC", "hostname": "aa01", "CI_company": "zzz Group", "CEC_ID": "job-585796313", "ORIGIN_By_Product": "Microsoft"}
{"group_id": "bb2211-3b22-4263-8fe7-e7ef6c7859ed", "Status": "OPEN", "description": "bb01 : GOT EQQE006I JOB T375Q201(JOB91632), OPERATION(0010), OPERATION TEXT( ), ENDED IN ERROR JCL . RTY=1, APPL = T375Q2 , WORK STATION = CPU5, IE= 2606661420, NO E2E RC", "hostname": "bb01", "CI_company": "xxx Group", "CEC_ID": "job-565796313", "ORIGIN_By_Product": "Microsoft"}
This is a very similar issue: instead of matching on "JOB ", you need to match on "APPL = ".
#props.conf
[yourSourcetype]
EVAL-core_key = case(hostname == "aa01" AND match(description, "APPL\\s+=\\s+"), replace(description, "(?s)^.*?APPL\\s+=\\s+(.{3}).*$", "\\1"), hostname == "bb01" AND match(description, "APPL\\s+=\\s+"), replace(description, "(?s)^.*?APPL\\s+=\\s+..(.{3}).*$", "\\1"))
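Before committing the calculated field, the same case() expression can be tried directly in a search. The following is a sketch under a few assumptions: the descriptions are shortened, comma-free versions of the sample events so they fit the inline CSV, and hostname and description are already available as fields (in production that depends on the JSON being extracted at search time). Note that the regular expressions use single backslashes here, unlike the props.conf version above.

```spl
| makeresults format=csv data="hostname,description
aa01,aa01 : GOT EQQE056I JOB T375P201(JOB91552) PRTY=1 APPL = T375P2 WORK STATION = CPU4
bb01,bb01 : GOT EQQE006I JOB T375Q201(JOB91632) RTY=1 APPL = T375Q2 WORK STATION = CPU5"
| eval core_key = case(
    hostname == "aa01" AND match(description, "APPL\s+=\s+"), replace(description, "(?s)^.*?APPL\s+=\s+(.{3}).*$", "\1"),
    hostname == "bb01" AND match(description, "APPL\s+=\s+"), replace(description, "(?s)^.*?APPL\s+=\s+..(.{3}).*$", "\1")
  )
| table hostname, core_key
```

With these two rows, core_key should come out as T37 for aa01 and 75Q for bb01. The match() guard matters because replace() returns the whole description unchanged when the pattern does not match.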
@ra_52194724 - You can set up a custom props.conf so that Splunk automatically extracts both the hostname and the job_id from the description field, then derives core_key from job_id. In practice this means adding an EXTRACT- setting with the appropriate regex to pull out the values, followed by an EVAL- setting that computes core_key. That way, every event of that sourcetype has hostname, job_id, and core_key available without running rex in your searches.
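As a rough search-time sketch of that extract-then-derive idea (not a tested props.conf configuration): pull the whole job name into job_id first, then pick the starting position per hostname and slice with substr(). The offset field and the shortened sample descriptions are made up for illustration.

```spl
| makeresults format=csv data="hostname,description
aa01,aa01 : GOT EQQE056I JOB T375P201(JOB91552)
bb01,bb01 : GOT EQQE006I JOB T375Q201(JOB91632)"
| rex field=description "JOB (?<job_id>\w+)\("
| eval offset = case(hostname == "aa01", 1, hostname == "bb01", 3)
| eval core_key = substr(job_id, offset, 3)
| table hostname, job_id, core_key
```

Since substr() is 1-based, offset 1 gives T37 from T375P201 and offset 3 gives 75Q from T375Q201. Keeping the per-host rule as a number rather than a separate regex also means that a longer list of hostnames could, in principle, move into a lookup that maps hostname to offset.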
Use regex101, my personal favorite for building regex.
Hope this helps.
Assuming this has been already parsed as JSON, and that for "aa01" you want the next 3 characters after "JOB ", and for "bb01" you want to skip an extra 2 characters before taking the next 3, you could try this:
| rex field=description "aa01.+?JOB (?<core_key>...)"
| rex field=description "bb01.+?JOB ..(?<core_key>...)"
Can't we do this in props.conf rather than writing two inline rex commands?