Field Extraction with Dynamic Data Structure (Field/Value Pair)?

SplunkDash
Motivator

Hello,

I have a data source with a dynamic structure: the position of the comma-separated field/value pairs changes for some events. A few sample events and the extraction I used are given below. My extraction works for the first event, but not for the other two, because the field/value positions change there. Is there any way to handle this with a single field extraction? Any help will be highly appreciated. Thank you so much.

Timestamp:(?P<TIME_STAMP>.+), Type:(?P<TYPE>.+), EType:(?P<EType>.+), TCode:(?P<TCode>.+), EventId: (?P<EventId>.+), Id: (?P<Id>.+), SAddress: (?P<SAddress>.+), System: (?P<System>.+), SId: (?P<SId>.+), eSignCode: (?P<eSignCode>.+), RCode: (?P<RCode>.+), Error: (?P<Error>.+)

2022-10-12 06:42:36.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:39.591Z, eSignCode: 3012, Type: REGT, EType: ESIGN, TCode: 23005, EventId: GET_SIGN, Id: 12045, SAddress: 35.168.40.67,  System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===,  RCode: 000, Error: nullm

2022-10-12 06:42:30.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:30.591Z, Type: REGT, TCode: 23305,  Id: 12045, SAddress: 35.168.40.67, System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, eSignCode: 3012, EventId: GET_SIGN, Error: nullm

2022-10-14 06:42:26.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:26.591Z, Type: REGT, TCode: 23015, EventId: GET_SIGN, RCode: 010, Id: 12045, SAddress: 35.168.40.65, System: EIVES, SId: =/=S()A.b(X(-yJrV/98do)f(Q_)tca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, EventId: GET_SIGN,  Error: nullm

 


SplunkDash
Motivator

Hello all, 

Thank you so much for your quick responses, but I cannot use any of them in the inline field extraction available in Splunk Web.


yuanliu
SplunkTrust

This is confusing. Why can't field-by-field extraction (as @jdunlea suggested) be used in inline field extraction? You just enter the regexes one by one.

For timestamp, enter

"Timestamp:\s*(?P<TIME_STAMP>[^,]+)"

Similarly, enter

"\bType:\s*(?P<TYPE>[^,]+)"
"EType:\s*(?P<EType>[^,]+)"
"TCode:\s*(?P<TCode>[^,]+)"
"EventId:\s*(?P<EventId>[^,]+)"
"\bId:\s*(?P<Id>[^,]+)"
"SAddress:\s*(?P<SAddress>[^,]+)"
"System:\s*(?P<System>[^,]+)"
"SId:\s*(?P<SId>[^,]+)"
"eSignCode:\s*(?P<eSignCode>[^,]+)"
"RCode:\s*(?P<RCode>[^,]+)"

and

"Error:\s*(?P<Error>[^,]+)"

Note the leading \b on Type and Id: without it, "Type:" can also match inside "EType:", and "Id:" inside "EventId:" or "SId:", whichever comes first in the event.

 

SplunkDash
Motivator

Hello,

Yes, we can use that approach and go field by field. But sometimes source fields are created dynamically, and in that case we don't know the field/value pairs in advance; also, we would need to create around 10 to 12 separate extractions. How would we address that? Thank you again.


yuanliu
SplunkTrust

That is where @johnhuang's suggestion comes into play.

| extract pairdelim=",",kvdelim=":"

That goes in the search line, of course; it is not automatic. Also, it doesn't work with the Timestamp field. If your developers refuse to maintain an agreed-upon log format (yes, I know that happens), you are left with few choices.

Speaking of log format, the existing format is regular enough that they could have simply used "=" instead of ":" and you would have no problem of this sort.  It may be worth exerting any influence that you can.

In the meantime, you can put either @johnhuang's or @jdunlea's solution in a macro and insert it whenever needed.

SplunkDash
Motivator

Hello @yuanliu,

Thank you so much again; that sounds good to me. I have one more question: is there any way to use props and transforms configurations to implement this extraction?


yuanliu
SplunkTrust

If there is any, I haven't found it. (And not for lack of trying.)  You can still extract individual fields automatically as jdunlea suggested.
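
One avenue that may be worth testing is a search-time transform whose FORMAT takes the field name from a capture group ($1::$2), so that a single regex picks up whatever keys appear, in any order. The following is only a sketch and has not been verified against these events on a live instance. The stanza name colon_kv_pairs and the sourcetype my_sourcetype are placeholders, not names from this thread. The regex assumes that keys start with a letter and that values contain neither commas nor spaces, which holds for the three sample events.

```
# transforms.conf (stanza name is hypothetical)
[colon_kv_pairs]
REGEX = \b([A-Za-z]\w*):\s*([^,\s]+)
FORMAT = $1::$2

# props.conf (replace my_sourcetype with the real sourcetype)
[my_sourcetype]
REPORT-colon_kv = colon_kv_pairs
```

Requiring a letter at the start of the key is meant to skip the colons in the leading log time (06:42:36.591), so the first pair picked up should be Timestamp. Values that legitimately contain spaces would be cut short by this pattern and would need a looser value group.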

johnhuang
Motivator

Looks like a good use case for kv extraction:

| makeresults
| eval _raw="2022-10-12 06:42:36.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:39.591Z, eSignCode: 3012, Type: REGT, EType: ESIGN, TCode: 23005, EventId: GET_SIGN, Id: 12045, SAddress: 35.168.40.67,  System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===,  RCode: 000, Error: nullm"
| append [| makeresults 
| eval _raw="2022-10-12 06:42:30.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:30.591Z, Type: REGT, TCode: 23305,  Id: 12045, SAddress: 35.168.40.67, System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, eSignCode: 3012, EventId: GET_SIGN, Error: nullm"]
| append [| makeresults 
| eval _raw="   2022-10-14 06:42:26.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:26.591Z, Type: REGT, TCode: 23015, EventId: GET_SIGN, RCode: 010, Id: 12045, SAddress: 35.168.40.65, System: EIVES, SId: =/=S()A.b(X(-yJrV/98do)f(Q_)tca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, EventId: GET_SIGN,  Error: nullm"]
| extract pairdelim=",",kvdelim=":"

 

jdunlea
Contributor

I would recommend breaking up your rex statement into a few different regexes. This way, you can anchor on the items that are closer to the data you want to extract. 

For example:

| rex field=_raw "<TYPE regex here>"
| rex field=_raw "<EType regex here>"
| rex field=_raw "<TCode regex here>"
etc. 
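
As a concrete illustration, here is a minimal sketch of the field-by-field approach, filled in with the per-field patterns posted elsewhere in this thread. It runs against a trimmed copy of the first sample event (SId and a few other pairs are left out for brevity), and only four of the fields are shown. The \b in front of Type and Id is an addition meant to stop those patterns from matching inside EType and EventId.

```
| makeresults
| eval _raw="2022-10-12 06:42:36.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:39.591Z, eSignCode: 3012, Type: REGT, EType: ESIGN, TCode: 23005, EventId: GET_SIGN, Id: 12045, RCode: 000, Error: nullm"
| rex field=_raw "\bType:\s*(?<TYPE>[^,]+)"
| rex field=_raw "TCode:\s*(?<TCode>[^,]+)"
| rex field=_raw "EventId:\s*(?<EventId>[^,]+)"
| rex field=_raw "\bId:\s*(?<Id>[^,]+)"
| table TYPE TCode EventId Id
```

Because each rex anchors on its own key, the order of the pairs in the event should not matter. In this event EventId appears before Id, which is exactly the case where a plain "Id:" pattern would pick up GET_SIGN instead of 12045.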

 

Alternatively, you can construct two or three large regexes that accommodate the different event structures you have and, in each regex, give the fields slightly different names.

I.e., regex 1 would extract TCode1, and regex 2 would extract TCode2.

Then you can use the eval command with the coalesce function to merge these fields into TCode later on.

For example:

| eval TCode=coalesce(TCode1,TCode2)
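
Put together, the coalesce idea might look like the sketch below. The two regexes are illustrative guesses at two layouts (one with EType between Type and TCode, one without) and are not taken from the original post. The test event is a shortened, made-up line in the second layout.

```
| makeresults
| eval _raw="Timestamp: 2022-10-12T11:42:30.591Z, Type: REGT, TCode: 23305, Id: 12045"
| rex field=_raw "Type:\s*[^,]+,\s*EType:\s*[^,]+,\s*TCode:\s*(?<TCode1>[^,]+)"
| rex field=_raw "Type:\s*[^,]+,\s*TCode:\s*(?<TCode2>[^,]+)"
| eval TCode=coalesce(TCode1, TCode2)
| table TCode1 TCode2 TCode
```

For this event only the second regex should match, so TCode1 stays empty and coalesce falls through to TCode2. Field names are case-sensitive, so the names passed to coalesce must match the capture group names exactly.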