Splunk Search

Extract Key value data from raw events

bijodev1
Communicator

Hi Team,

I am trying to pull the data for the below raw events.

{"name":"Content-Length","valueList":["94"]}
{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}

 

The end result I am looking for is :

Content-Length :  94

Referer : /xyz/pageID

 

I am not sure if this is really possible to pull make a value to a key pair.  Still trying to understand the regex but no luck. if someone could please give a hint or help on how to work with this.

Labels (1)
0 Karma
1 Solution

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value  | eval result=name.": ".value
| eval orderID = if(name="orderID",value,null())
| stats values(result) as result values(orderID) as orderID by a
| table orderID result

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

View solution in original post

bijodev1
Communicator

@kamlesh_vaghela 

Hi Kamlesh, I am stuck with pulling the stats for multiple fields with the same regex. It doesn't provides me the data which I am looking for. It gives the number of counts.

search .. same regex | eval orderID = if(name="OrderID",value,null()) || eval sessionID = if(name="session-ID",value,null()) | stats count by orderID, sessionID

doesn't works. I have tried all means  no luck. 

I am looking for , when the orderID and Jession ID are same what's the count.

 

for eg.

OrderID   | sessionID | Count

ab12 | gtyhd34 | 55
ab13 | gtyd54d | 13

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please share sample events with OrderID and SessionId?

KV

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value  | eval result=name.": ".value
| eval orderID = if(name="orderID",value,null())

along with the orderID. I am trying to eval sessionID = if (name="session-ID", value,null()).

When I combine both the orderID and sessionID to pull the stats count, it doesn't works like this - stats count by orderID,sessionID

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

 

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"session-ID\",\"valueList\":[\"101\"]},{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| append [| makeresults 
| eval _raw="\"Header\":[{\"name\":\"session-ID\",\"valueList\":[\"1011\"]},{\"name\":\"orderID\",\"valueList\":[\"1011\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." ]
| append [| makeresults 
| eval _raw="\"Header\":[{\"name\":\"session-ID\",\"valueList\":[\"101\"]},{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." ]
| rex field=_raw "\"Header\":\[(?<raw>.*).$" | rex field=raw mode=sed "s/},{/}|{/g" | eval raw=split(raw,"|") | eval a=1 | accum a | mvexpand raw | rename raw as _raw | extract | rename valueList{} as value | eval result=name.": ".value 
| eval orderID = if(name="orderID",value,null()) 
| eval sessionID = if (name="session-ID", value,null())
| stats values(orderID) as orderID values(sessionID) as sessionID by a
| stats count by orderID, sessionID

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

@kamlesh_vaghela 

can we try to pull the data without using (stats .......... by a).

by a - only picks one orderID.

 

for this query output looks like this :

orderID  | SESSIONID | COUNT

A123 | ADRFE345 | 55
A123 | AVFRD563 | 19
A123 | ATRDF637 |  20

 

what I am actually looking is for all the orderID which are there too.

orderID  | SESSIONID | COUNT

A123 | ADRFE345 | 55
A123 | AVFRD563 | 19
A123 | ATRDF637 |  20
A121 | ADRFE335 | 59
A121 | AVFRD564 | 29
A124 | ATRDF627 |  30

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

 

| stats values(orderID) as orderID values(sessionID) as sessionID by a

will give you ```orderID``` and ```sessionID`` both..  ```a``` field represents unique events.  So by doing ```by a`` we are making data with unique event.

Check this.

 

Screenshot 2021-06-15 at 8.07.17 PM.png

 

After processing these events , upto by clause it will show like this.

 

Screenshot 2021-06-15 at 8.08.52 PM.png

So when multiple event has same OrderId it will consider separately.

like ```OderId 101``` this.

Screenshot 2021-06-15 at 8.10.33 PM.png

 So I think this logic will work with your requirement.

In case still it is not working, please send me OP of below search. 🙂

YOUR_SEARCH
| rex field=_raw "\"Header\":\[(?<raw>.*).$" | rex field=raw mode=sed "s/},{/}|{/g" | eval raw=split(raw,"|") | eval a=1 | accum a | mvexpand raw | rename raw as _raw | extract | rename valueList{} as value | eval result=name.": ".value 
| eval orderID = if(name="orderID",value,null()) 
| eval sessionID = if (name="session-ID", value,null())
| table orderID sessionID a 

 

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

sorry to bother you again. for the same query 

{"name":"Content-Length","valueList":["94"]}
{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}
{"name":"Cookie","valueList:[abc=123;s_id=9wefdrtunhfkd6; df_id=xijuhygsdd342;data=helloworld]}

 

I want to pull the complete value List for the name Cookie. Based on that I need to run the query

stats count by Cookie - where it display how much was the count of ----  abc,  s_id, df_id, data 

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

| makeresults 
| eval raw="{\"name\":\"Content-Length\",\"valueList\":[\"94\"]}|{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]}|{\"name\":\"Cookie\",\"valueList\":[\"abc=123;s_id=9wefdrtunhfkd6; df_id=xijuhygsdd342;data=helloworld\"]}|{\"name\":\"Content-Length\",\"valueList\":[\"94\"]}|{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID]\"}|{\"name\":\"Cookie\",\"valueList\":[\"abc=123;s_id=9wefdrtunhfkd6; df_id=xijuhygsdd342;data=helloworld]\"}", raw=split(raw,"|") 
| mvexpand raw | rename raw as _raw | extract | where name= "Cookie" | rename valueList{} as _raw  | extract | stats count(eval(isnotnull(abc))) as abc, count(eval(isnotnull(s_id))) as s_id, count(eval(isnotnull(df_id))) as df_id, count(eval(isnotnull(data))) as data

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

can't we make something in the same regex.

| rex field=_raw "\"Headers\":\[(?<raw>.*).$"

| rex field=raw mode=sed "s/},{/}|{/g"

 

Because the "cookie" field is the part of Headers and it's values are separated with semi colon ;. There are many values under Cookie fields which uses semi colon as separator.

"Headers" :{"name":"Content-Length","valueList":["94"]}, [{"name":"Cookie","valueList":["xrm7=-762BMB0; exp-ck=1; CQTEST=1; xTpYacs=; DQ=Y; DX=wsdaquijhs; S_ID=xyat; Latency=1; TB_N=10; TB_SFOU-100=; C_Flag=0; vct_id=9XgVPsnKid7aaiY; bct_id=X89wgVnSdKdiU1gqaa]},…………….

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

 

| makeresults 
| eval _raw="\"Headers\" :{\"name\":\"Content-Length\",\"valueList\":[\"94\"]}, [{\"name\":\"Cookie\",\"valueList\":[\"xrm7=-762BMB0; exp-ck=1; CQTEST=1; xTpYacs=; DQ=Y; DX=wsdaquijhs; S_ID=xyat; Latency=1; TB_N=10; TB_SFOU-100=; C_Flag=0; vct_id=9XgVPsnKid7aaiY; bct_id=X89wgVnSdKdiU1gqaa]\"}" 
| rex field=_raw "\[\{\"name\":\"Cookie\",\"valueList\":\[\"(?<coockie_value>[^\]]+)" 
| search coockie_value=* 
| rename coockie_value as _raw 
| extract pairdelim=";" kvdelim="=" 
| stats count(*) as *

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

@kamlesh_vaghela 

thank you so much Kamlesh. It was worth look into this.

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

Can you put some light on the regex which you have wrote :

| rex field=_raw "\"requestHeaderList\":\[(?<raw>.*).$"

| rex field=raw mode=sed "s/},{/}|{/g"

| eval raw=split(raw,"|")| mvexpand raw | rename raw  as _raw

 

if you can just guide what these query basically doing. It would be really helpful. 

Thank you so much.

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

| rex field=_raw "\"Header\":\[(?<raw>.*).$" 

 This will extract JSON data from _raw event and assign into new field raw.

From: 

"Header":[{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}.

to:

{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}

--

 

| rex field=raw mode=sed "s/},{/}|{/g"

This will replace commas between different json with pipe(|). It is required for next operation 

From:

{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}

To:

{"name":"Content-Length","valueList":["94"]}|{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}|{"name":"User-Agent","valueList":["Androidv11"]}

--

| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw

This will split raw into multiple events and assign into _raw and keep unique value, here it is field a.

From

{"name":"Content-Length","valueList":["94"]}|{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}|{"name":"User-Agent","valueList":["Androidv11"]}


To:

{"name":"Content-Length","valueList":["94"]}
{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]}
{"name":"User-Agent","valueList":["Androidv11"]}

 

Upto now we cab easily access the name and valueList fields to perform next operation

😀

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

 

bijodev1
Communicator

thanks @kamlesh_vaghela  for the response. The problem is the events are very dense and each field has it's own nested key value pairs. So I am not sure how to pull this data.

"Header":[{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}.

like this the entire events consist of name and valueList. I am not sure how to remove this from their raw events.

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

YOUR_SEARCH
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|")| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value | table name value | eval result=name.": ".value

 

My Sample Search :

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|")| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value | table name value | eval result=name.": ".value

 

Sample event:

"Header":[{"name":"Content-Length","valueList":["94"]},{"name":"Referer","valueList":["https://www.abc.com/xyz/pageID"]},{"name":"User-Agent","valueList":["Androidv11"]}.

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

Hi @kamlesh_vaghela  thank you so much for the query. Is it possible to display the table for the "path" and have the "result" column. 

PATH | Result.

like PATH - should contain only single row and Result can have multiple data.

 

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

PATH means Referer field?? Can you please share your expected op from your given sample?

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

thanks for your query. It was really helpful. I was able to extract the data exactly the way I wanted. Inside that headerlist - we have one more field orderID. I want to use that as one column and the rest of them as the result column.

 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please try this?

| makeresults 
| eval _raw="\"Header\":[{\"name\":\"orderID\",\"valueList\":[\"101\"]},{\"name\":\"Content-Length\",\"valueList\":[\"94\"]},{\"name\":\"Referer\",\"valueList\":[\"https://www.abc.com/xyz/pageID\"]},{\"name\":\"User-Agent\",\"valueList\":[\"Androidv11\"]}." 
| rex field=_raw "\"Header\":\[(?<raw>.*).$" 
| rex field=raw mode=sed "s/},{/}|{/g"
| eval raw=split(raw,"|") |eval a=1 | accum a| mvexpand raw| rename raw  as _raw
| extract | rename valueList{} as value  | eval result=name.": ".value
| eval orderID = if(name="orderID",value,null())
| stats values(result) as result values(orderID) as orderID by a
| table orderID result

 

Thanks
KV
▄︻̷̿┻̿═━一

If any of my reply helps you to solve the problem Or gain knowledge, an upvote would be appreciated.

bijodev1
Communicator

@kamlesh_vaghela 

It's kind of working but not fully. Let's say there are 20000+ events.

Like in a raw events there can be request where the order ID would be same for many of the request.

I will like to pull the logs for specific orderID and the second column would be list of headers with it.

 

It would look like :

 

orderID |  Result

a123 | content-length, referer, etc

a214 | content-length, referer, accept-language

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...