Field Extraction with Dynamic Data Structure (Field/Value Pair)?

SplunkDash
Motivator

Hello,

I have a data source with a dynamic structure: the position of the comma-separated field/value pairs changes for some of the events. A few sample events and the extraction I used are given below. My extraction works for event one, but not for the other two events, where the field/value positions change. Is there any way to handle this with a single field extraction? Any help will be highly appreciated. Thank you so much.

Timestamp:(?P<TIME_STAMP>.+), Type:(?P<TYPE>.+), EType:(?P<EType>.+), TCode:(?P<TCode>.+), EventId: (?P<EventId>.+), Id: (?P<Id>.+),  SAddress: (?P<SAddress>.+), System: (?P<System>.+), SId: (?P<SId>.+), eSignCode: (?P<eSignCode>.+), RCode: (?P<RCode>.+), Error: (?P<Error>.+)

2022-10-12 06:42:36.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:39.591Z, eSignCode: 3012, Type: REGT, EType: ESIGN, TCode: 23005, EventId: GET_SIGN, Id: 12045, SAddress: 35.168.40.67,  System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===,  RCode: 000, Error: nullm

2022-10-12 06:42:30.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:30.591Z, Type: REGT, TCode: 23305,  Id: 12045, SAddress: 35.168.40.67, System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, eSignCode: 3012, EventId: GET_SIGN, Error: nullm

2022-10-14 06:42:26.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:26.591Z, Type: REGT, TCode: 23015, EventId: GET_SIGN, RCode: 010, Id: 12045, SAddress: 35.168.40.65, System: EIVES, SId: =/=S()A.b(X(-yJrV/98do)f(Q_)tca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, EventId: GET_SIGN,  Error: nullm

 


SplunkDash
Motivator

Hello all, 

Thank you so much for your quick responses, but I cannot use any of these in the in-line field extraction available in Splunk Web.


yuanliu
SplunkTrust

This is confusing. Why can't field-by-field extraction (as @jdunlea suggested) be used in inline field extraction? You just enter them one by one.

For timestamp, enter

"Timestamp:\s*(?P<TIME_STAMP>[^,]+)"

Similarly, enter

"Type:\s*(?P<TYPE>[^,]+)"
"EType:\s*(?P<EType>[^,]+)"
"TCode:\s*(?P<TCode>[^,]+)"
"EventId:\s*(?P<EventId>[^,]+)"
"Id:\s*(?P<Id>[^,]+)"
"SAddress:\s*(?P<SAddress>[^,]+)"
"System:\s*(?P<System>[^,]+)"
"SId:\s*(?P<SId>[^,]+)"
"eSignCode:\s*(?P<eSignCode>[^,]+)"
"RCode:\s*(?P<RCode>[^,]+)"

and

"Error:\s*(?P<Error>[^,]+)"

 

SplunkDash
Motivator

Hello,

Yes, we can use that approach and go field by field. But sometimes source fields are created dynamically, and in that case we don't know the field/value pairs; also, we would need to create around 10 to 12 separate extractions. How would we address that? Thank you again.


yuanliu
SplunkTrust

That is where @johnhuang's suggestion comes into play.

| extract pairdelim=",",kvdelim=":"

In the search line, of course, not automatic. Also, it doesn't work with the Timestamp field. If your developers refuse to maintain an agreed-upon log format - yes, I know that happens - you are left with few choices.

Speaking of log format, the existing format is regular enough that they could have simply used "=" instead of ":" and you would have no problem of this sort.  It may be worth exerting any influence that you can.

In the meantime, you can put either johnhuang's or jdunlea's solution in a macro and insert it whenever needed.
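For example, a minimal macro sketch in macros.conf (the macro name kv_extract is just an illustration):

# macros.conf
[kv_extract]
definition = extract pairdelim=",",kvdelim=":"

which you would then call in your search as ... | `kv_extract`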

SplunkDash
Motivator

Hello @yuanliu,

Thank you so much again, that sounds good to me. I have one more question: is there any way we can use props and transforms configurations to implement this extraction?


yuanliu
SplunkTrust

If there is any, I haven't found it. (And not for lack of trying.)  You can still extract individual fields automatically as jdunlea suggested.

johnhuang
Motivator

Looks like a good use case for kv extraction:

| makeresults
| eval _raw="2022-10-12 06:42:36.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:39.591Z, eSignCode: 3012, Type: REGT, EType: ESIGN, TCode: 23005, EventId: GET_SIGN, Id: 12045, SAddress: 35.168.40.67,  System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===,  RCode: 000, Error: nullm"
| append [| makeresults 
| eval _raw="2022-10-12 06:42:30.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:30.591Z, Type: REGT, TCode: 23305,  Id: 12045, SAddress: 35.168.40.67, System: EIVES, SId: =/=S()A.b(X(-yJrV/+do)f(Q_)uca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, eSignCode: 3012, EventId: GET_SIGN, Error: nullm"]
| append [| makeresults 
| eval _raw="   2022-10-14 06:42:26.591 { INFO } [default task-79] - Timestamp: 2022-10-12T11:42:26.591Z, Type: REGT, TCode: 23015, EventId: GET_SIGN, RCode: 010, Id: 12045, SAddress: 35.168.40.65, System: EIVES, SId: =/=S()A.b(X(-yJrV/98do)f(Q_)tca-/6+o_v.k|39OYc+Fh_=YOX-iDA++===, EventId: GET_SIGN,  Error: nullm"]
| extract pairdelim=",",kvdelim=":"
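If the Timestamp field does not come out of extract cleanly (its value contains the ":" delimiter itself), one option is to follow the extract with the targeted rex shown earlier in this thread, for example:

| rex "Timestamp:\s*(?P<TIME_STAMP>[^,]+)"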

 

jdunlea
Contributor

I would recommend breaking up your rex statement into a few different regexes. This way, you can anchor on the items that are closer to the data you want to extract. 

For example:

| rex field=_raw "<TYPE regex here>"
| rex field=_raw "<EType regex here>"
| rex field=_raw "<TCode regex here>"
etc. 
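Filled in with the key/value pattern used elsewhere in this thread, that could look like this (a sketch; each character class simply stops at the next comma):

| rex field=_raw "Type:\s*(?P<TYPE>[^,]+)"
| rex field=_raw "EType:\s*(?P<EType>[^,]+)"
| rex field=_raw "TCode:\s*(?P<TCode>[^,]+)"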

 

Alternatively, you can construct 2 or 3 large regexes that accommodate the different event structures you have and, in each regex, give the fields slightly different names.

I.E. Regex 1 would extract TCode1, and regex 2 would extract TCode2. 

Then you can use the eval command with the coalesce function to merge these fields together later on into TCode.

For example:

| eval TCode=coalesce(TCode1,TCode2)
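A fuller sketch of that idea, assuming two hypothetical regexes that each anchor on the field preceding TCode in the corresponding event shape:

| rex field=_raw "EType:\s*\w+,\s*TCode:\s*(?P<TCode1>[^,]+)"
| rex field=_raw "Type:\s*\w+,\s*TCode:\s*(?P<TCode2>[^,]+)"
| eval TCode=coalesce(TCode1,TCode2)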