Splunk Search

How to extract fields from txt format?

drogo
Explorer

Hello, I want to extract fiends from below log format. Can someone please help.

Log format -

2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, messages: 16, bytes: 13 KiB, actCusumers: 4, numSubjects: 1
2023-03-21 04:14:13.859, queue_name:stream-SampleProfile, messages: 3,522, bytes: 2.4 MiB, actCusumers: 4, numSubjects: 1

Fields I want to extract are queue name, messages, actCusumers, numSubjects. 

I am using below eval commands but looks like I am not getting all logs, also getting duplicate events.

I am want to extract only latest ones.

Query - 

| eval ArrayAttrib=split(_raw,",")
| eval numSubjects=mvindex(split(mvindex(ArrayAttrib,-1) ,": "),1)
| eval actConsumers=mvindex(split(mvindex(ArrayAttrib,-2) ,": "),1)
| eval bytes=mvindex(split(mvindex(ArrayAttrib,-3) ,": "),1)
| eval messages=mvindex(split(mvindex(ArrayAttrib,-4) ,": "),1)
| eval stream=mvindex(split(mvindex(ArrayAttrib,-5) ,":"),1)
| eval dtm=strftime(_time,"%Y-%m-%d %H:%M")
| stats max(dtm) by stream numSubjects actConsumers bytes messages
| fields "stream", "messages", "actConsumers", "numSubjects", "max(dtm)"
| dedup "messages" | dedup "stream" | sort "stream"

 

 

 

 

 

Labels (2)
0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

Hi @drogo ,

what's your problem rexes to extract fields?

if this is your issue, you can use this regex

| rex "queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>[^,]+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)"

that you can test at https://regex101.com/r/aPEZ6B/1

Ciao.

Giuseppe

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust

Try extracting the fields this way (note the renames are required because your sample data doesn't match the field names you are using and I have assumed _time has been extracted properly already)

| extract pairdelim="," kvdelim=":"
| rename queue_name as stream
| rename actCusumers as actConsumers
| stats max(_time) as _time by stream numSubjects actConsumers bytes messages

 The dedups you have used would have kept the first event for each messages (given that this appears to be just a count(?) you will have lost some data here). This could have been further reduced by the next dedup if you had more than one different messages value for a stream.

What is it that you are actually trying to determine from your events?

gcusello
SplunkTrust
SplunkTrust

Hi @drogo ,

what's your problem rexes to extract fields?

if this is your issue, you can use this regex

| rex "queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>[^,]+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)"

that you can test at https://regex101.com/r/aPEZ6B/1

Ciao.

Giuseppe

drogo
Explorer

Thanks gcusello, this really helps.
I am getting values which are prior to , in messages but messages are having thousands of count and those in below pattern. How can I get whole value. Update value on below page.
https://regex101.com/r/aPEZ6B/1

Sample -
2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, messages: 16,2303, bytes: 13 KiB, actCusumers: 4, numSubjects: 1
2023-03-21 04:14:13.859, queue_name:stream-SampleProfile, messages: 3,522, bytes: 2.4 MiB, actCusumers: 4, numSubjects: 1

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @drogo,

please try this:

^(?<_time>[^,]+),\s+queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>.+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)

that you can test at https://regex101.com/r/aPEZ6B/2

Ciao.

Giuseppe

0 Karma

drogo
Explorer

Hi @gcusello,
I got the solution, thanks for your help!
https://regex101.com/r/aPEZ6B/1 

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...