Splunk Search

How to extract fields from txt format?

drogo
Explorer

Hello, I want to extract fiends from below log format. Can someone please help.

Log format -

2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, messages: 16, bytes: 13 KiB, actCusumers: 4, numSubjects: 1
2023-03-21 04:14:13.859, queue_name:stream-SampleProfile, messages: 3,522, bytes: 2.4 MiB, actCusumers: 4, numSubjects: 1

Fields I want to extract are queue name, messages, actCusumers, numSubjects. 

I am using below eval commands but looks like I am not getting all logs, also getting duplicate events.

I am want to extract only latest ones.

Query - 

| eval ArrayAttrib=split(_raw,",")
| eval numSubjects=mvindex(split(mvindex(ArrayAttrib,-1) ,": "),1)
| eval actConsumers=mvindex(split(mvindex(ArrayAttrib,-2) ,": "),1)
| eval bytes=mvindex(split(mvindex(ArrayAttrib,-3) ,": "),1)
| eval messages=mvindex(split(mvindex(ArrayAttrib,-4) ,": "),1)
| eval stream=mvindex(split(mvindex(ArrayAttrib,-5) ,":"),1)
| eval dtm=strftime(_time,"%Y-%m-%d %H:%M")
| stats max(dtm) by stream numSubjects actConsumers bytes messages
| fields "stream", "messages", "actConsumers", "numSubjects", "max(dtm)"
| dedup "messages" | dedup "stream" | sort "stream"

 

 

 

 

 

Labels (2)
0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

Hi @drogo ,

what's your problem rexes to extract fields?

if this is your issue, you can use this regex

| rex "queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>[^,]+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)"

that you can test at https://regex101.com/r/aPEZ6B/1

Ciao.

Giuseppe

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust

Try extracting the fields this way (note the renames are required because your sample data doesn't match the field names you are using and I have assumed _time has been extracted properly already)

| extract pairdelim="," kvdelim=":"
| rename queue_name as stream
| rename actCusumers as actConsumers
| stats max(_time) as _time by stream numSubjects actConsumers bytes messages

 The dedups you have used would have kept the first event for each messages (given that this appears to be just a count(?) you will have lost some data here). This could have been further reduced by the next dedup if you had more than one different messages value for a stream.

What is it that you are actually trying to determine from your events?

gcusello
SplunkTrust
SplunkTrust

Hi @drogo ,

what's your problem rexes to extract fields?

if this is your issue, you can use this regex

| rex "queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>[^,]+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)"

that you can test at https://regex101.com/r/aPEZ6B/1

Ciao.

Giuseppe

drogo
Explorer

Thanks gcusello, this really helps.
I am getting values which are prior to , in messages but messages are having thousands of count and those in below pattern. How can I get whole value. Update value on below page.
https://regex101.com/r/aPEZ6B/1

Sample -
2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, messages: 16,2303, bytes: 13 KiB, actCusumers: 4, numSubjects: 1
2023-03-21 04:14:13.859, queue_name:stream-SampleProfile, messages: 3,522, bytes: 2.4 MiB, actCusumers: 4, numSubjects: 1

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @drogo,

please try this:

^(?<_time>[^,]+),\s+queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>.+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)

that you can test at https://regex101.com/r/aPEZ6B/2

Ciao.

Giuseppe

0 Karma

drogo
Explorer

Hi @gcusello,
I got the solution, thanks for your help!
https://regex101.com/r/aPEZ6B/1 

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In November, the Splunk Threat Research Team had one release of new security content via the Enterprise ...

Index This | Divide 100 by half. What do you get?

November 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...

Stay Connected: Your Guide to December Tech Talks, Office Hours, and Webinars!

❄️ Celebrate the season with our December lineup of Community Office Hours, Tech Talks, and Webinars! ...