Splunk Search

Extract Key value data from raw events

bijodev1
Communicator

Hi Team,

I am trying to pull the data for the below raw events.

{"name":"Content-Length","valueList":["94"]}
{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}

 

The end result I am looking for is :

Content-Length :  94

Referer : /xyz/pageID

 

I am not sure if this is really possible to pull make a value to a key pair.  Still trying to understand the regex but no luck. if someone could please give a hint or help on how to work with this.

Labels (1)
0 Karma
1 Solution

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value  | eval result=name.": ".value
| eval orderID = if(name="orderID",value,null())
| stats values(result) as result values(orderID) as orderID by a
| table orderID result

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

View solution in original post

bijodev1
Communicator

@kamlesh_vaghela 

Hi Kamlesh, I am stuck with pulling the stats for multiple fields with the same regex. It doesn't provides me the data which I am looking for. It gives the number of counts.

search .. same regex | eval orderID = if(name="OrderID",value,null()) || eval sessionID = if(name="session-ID",value,null()) | stats count by orderID, sessionID

doesn't works. I have tried all means  no luck. 

I am looking for , when the orderID and Jession ID are same what's the count.

 

for eg.

OrderID   | sessionID | Count

ab12 | gtyhd34 | 55
ab13 | gtyd54d | 13

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please share sample events with OrderID and SessionId?

KV

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value  | eval result=name.": ".value
| eval orderID = if(name="orderID",value,null())

along with the orderID. I am trying to eval sessionID = if (name="session-ID", value,null()).

When I combine both the orderID and sessionID to pull the stats count, it doesn't works like this - stats count by orderID,sessionID

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

 

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"session-ID\",\"valueList\":[\"101\"]},{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| append [| makeresults 
| eval _raw="\"Header\":[{\"name\":\"session-ID\",\"valueList\":[\"1011\"]},{\"name\":\"orderID\",\"valueList\":[\"1011\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." ]
| append [| makeresults 
| eval _raw="\"Header\":[{\"name\":\"session-ID\",\"valueList\":[\"101\"]},{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." ]
| rex field=_raw "\"Header\":\[(?<raw>.*).$" | rex field=raw mode=sed "s/},{/}|{/g" | eval raw=split(raw,"|") | eval a=1 | accum a | mvexpand raw | rename raw as _raw | extract | rename valueList{} as value | eval result=name.": ".value 
| eval orderID = if(name="orderID",value,null()) 
| eval sessionID = if (name="session-ID", value,null())
| stats values(orderID) as orderID values(sessionID) as sessionID by a
| stats count by orderID, sessionID

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

@kamlesh_vaghela 

can we try to pull the data without using (stats .......... by a).

by a - only picks one orderID.

 

for this query output looks like this :

orderID  | SESSIONID | COUNT

A123 | ADRFE345 | 55
A123 | AVFRD563 | 19
A123 | ATRDF637 |  20

 

what I am actually looking is for all the orderID which are there too.

orderID  | SESSIONID | COUNT

A123 | ADRFE345 | 55
A123 | AVFRD563 | 19
A123 | ATRDF637 |  20
A121 | ADRFE335 | 59
A121 | AVFRD564 | 29
A124 | ATRDF627 |  30

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

 

| stats values(orderID) as orderID values(sessionID) as sessionID by a

will give you ```orderID``` and ```sessionID`` both..  ```a``` field represents unique events.  So by doing ```by a`` we are making data with unique event.

Check this.

 

Screenshot 2021-06-15 at 8.07.17 PM.png

 

After processing these events , upto by clause it will show like this.

 

Screenshot 2021-06-15 at 8.08.52 PM.png

So when multiple event has same OrderId it will consider separately.

like ```OderId 101``` this.

Screenshot 2021-06-15 at 8.10.33 PM.png

 So I think this logic will work with your requirement.

In case still it is not working, please send me OP of below search. 🙂

YOUR_SEARCH
| rex field=_raw "\"Header\":\[(?<raw>.*).$" | rex field=raw mode=sed "s/},{/}|{/g" | eval raw=split(raw,"|") | eval a=1 | accum a | mvexpand raw | rename raw as _raw | extract | rename valueList{} as value | eval result=name.": ".value 
| eval orderID = if(name="orderID",value,null()) 
| eval sessionID = if (name="session-ID", value,null())
| table orderID sessionID a 

 

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

sorry to bother you again. for the same query 

{"name":"Content-Length","valueList":["94"]}
{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}
{"name":"Cookie","valueList:[abc=123;s_id=9wefdrtunhfkd6; df_id=xijuhygsdd342;data=helloworld]}

 

I want to pull the complete value List for the name Cookie. Based on that I need to run the query

stats count by Cookie - where it display how much was the count of ----  abc,  s_id, df_id, data 

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

| makeresults 
| eval raw="{\"name\":\"Content-Length\",\"valueList\":[\"94\"]}|{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]}|{\"name\":\"Cookie\",\"valueList\":[\"abc=123;s_id=9wefdrtunhfkd6; df_id=xijuhygsdd342;data=helloworld\"]}|{\"name\":\"Content-Length\",\"valueList\":[\"94\"]}|{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID]\"}|{\"name\":\"Cookie\",\"valueList\":[\"abc=123;s_id=9wefdrtunhfkd6; df_id=xijuhygsdd342;data=helloworld]\"}", raw=split(raw,"|") 
| mvexpand raw | rename raw as _raw | extract | where name= "Cookie" | rename valueList{} as _raw  | extract | stats count(eval(isnotnull(abc))) as abc, count(eval(isnotnull(s_id))) as s_id, count(eval(isnotnull(df_id))) as df_id, count(eval(isnotnull(data))) as data

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

can't we make something in the same regex.

| rex field=_raw "\"Headers\":\[(?<raw>.*).$"

| rex field=raw mode=sed "s/},{/}|{/g"

 

Because the "cookie" field is the part of Headers and it's values are separated with semi colon ;. There are many values under Cookie fields which uses semi colon as separator.

"Headers" :{"name":"Content-Length","valueList":["94"]}, [{"name":"Cookie","valueList":["xrm7=-762BMB0; exp-ck=1; CQTEST=1; xTpYacs=; DQ=Y; DX=wsdaquijhs; S_ID=xyat; Latency=1; TB_N=10; TB_SFOU-100=; C_Flag=0; vct_id=9XgVPsnKid7aaiY; bct_id=X89wgVnSdKdiU1gqaa]},…………….

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

 

| makeresults 
| eval _raw="\"Headers\" :{\"name\":\"Content-Length\",\"valueList\":[\"94\"]}, [{\"name\":\"Cookie\",\"valueList\":[\"xrm7=-762BMB0; exp-ck=1; CQTEST=1; xTpYacs=; DQ=Y; DX=wsdaquijhs; S_ID=xyat; Latency=1; TB_N=10; TB_SFOU-100=; C_Flag=0; vct_id=9XgVPsnKid7aaiY; bct_id=X89wgVnSdKdiU1gqaa]\"}" 
| rex field=_raw "\[\{\"name\":\"Cookie\",\"valueList\":\[\"(?<coockie_value>[^\]]+)" 
| search coockie_value=* 
| rename coockie_value as _raw 
| extract pairdelim=";" kvdelim="=" 
| stats count(*) as *

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

@kamlesh_vaghela 

thank you so much Kamlesh. It was worth look into this.

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

Can you put some light on the regex which you have wrote :

| rex field=_raw "\"requestHeaderList\":\[(?<raw>.*).$"

| rex field=raw mode=sed "s/},{/}|{/g"

| eval raw=split(raw,"|")| mvexpand raw | rename raw  as _raw

 

if you can just guide what these query basically doing. It would be really helpful. 

Thank you so much.

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

| rex field=_raw "\"Header\":\[(?<raw>.*).$" 

 This will extract JSON data from _raw event and assign into new field raw.

From: 

"Header":[{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}.

to:

{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}

--

 

| rex field=raw mode=sed "s/},{/}|{/g"

This will replace commas between different json with pipe(|). It is required for next operation 

From:

{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}

To:

{"name":"Content-Length","valueList":["94"]}|{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}|{"name":"User-Agent","valueList":["Androidv11"]}

--

| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw

This will split raw into multiple events and assign into _raw and keep unique value, here it is field a.

From

{"name":"Content-Length","valueList":["94"]}|{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}|{"name":"User-Agent","valueList":["Androidv11"]}


To:

{"name":"Content-Length","valueList":["94"]}
{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}
{"name":"User-Agent","valueList":["Androidv11"]}

 

Upto now we cab easily access the name and valueList fields to perform next operation

😀

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

 

bijodev1
Communicator

thanks @kamlesh_vaghela  for the response. The problem is the events are very dense and each field has it's own nested key value pairs. So I am not sure how to pull this data.

"Header":[{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}.

like this the entire events consist of name and valueList. I am not sure how to remove this from their raw events.

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

YOUR_SEARCH
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|")| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value | table name value | eval result=name.": ".value

 

My Sample Search :

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|")| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value | table name value | eval result=name.": ".value

 

Sample event:

"Header":[{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}.

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

Hi @kamlesh_vaghela  thank you so much for the query. Is it possible to display the table for the "path" and have the "result" column. 

PATH | Result.

like PATH - should contain only single row and Result can have multiple data.

 

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

PATH means Referer field?? Can you please share your expected op from your given sample?

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

thanks for your query. It was really helpful. I was able to extract the data exactly the way I wanted. Inside that headerlist - we have one more field orderID. I want to use that as one column and the rest of them as the result column.

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value  | eval result=name.": ".value
| eval orderID = if(name="orderID",value,null())
| stats values(result) as result values(orderID) as orderID by a
| table orderID result

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

@kamlesh_vaghela 

It's kind of working but not fully. Let's say there are 20000+ events.

Like in a raw events there can be request where the order ID would be same for many of the request.

I will like to pull the logs for specific orderID and the second column would be list of headers with it.

 

It would look like :

 

orderID |  Result

a123 | content-length, referer, etc

a214 | content-length, referer, accept-language

0 Karma
Get Updates on the Splunk Community!

Splunk Decoded: Service Maps vs Service Analyzer Tree View vs Flow Maps

It’s Monday morning, and your phone is buzzing with alert escalations – your customer-facing portal is running ...

What’s New in Splunk Observability – September 2025

What's NewWe are excited to announce the latest enhancements to Splunk Observability, designed to help ITOps ...

Fun with Regular Expression - multiples of nine

Fun with Regular Expression - multiples of nineThis challenge was first posted on Slack #regex channel ...