Splunk Search

How to extract fields from txt format?

drogo
Explorer

Hello, I want to extract fields from the log format below. Can someone please help?

Log format -

2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, messages: 16, bytes: 13 KiB, actCusumers: 4, numSubjects: 1
2023-03-21 04:14:13.859, queue_name:stream-SampleProfile, messages: 3,522, bytes: 2.4 MiB, actCusumers: 4, numSubjects: 1

Fields I want to extract are queue name, messages, actCusumers, numSubjects. 

I am using the eval commands below, but it looks like I am not getting all the logs, and I am also getting duplicate events.

I want to extract only the latest ones.

Query - 

| eval ArrayAttrib=split(_raw,",")
| eval numSubjects=mvindex(split(mvindex(ArrayAttrib,-1) ,": "),1)
| eval actConsumers=mvindex(split(mvindex(ArrayAttrib,-2) ,": "),1)
| eval bytes=mvindex(split(mvindex(ArrayAttrib,-3) ,": "),1)
| eval messages=mvindex(split(mvindex(ArrayAttrib,-4) ,": "),1)
| eval stream=mvindex(split(mvindex(ArrayAttrib,-5) ,":"),1)
| eval dtm=strftime(_time,"%Y-%m-%d %H:%M")
| stats max(dtm) by stream numSubjects actConsumers bytes messages
| fields "stream", "messages", "actConsumers", "numSubjects", "max(dtm)"
| dedup "messages" | dedup "stream" | sort "stream"

gcusello
SplunkTrust

Hi @drogo ,

is your problem finding the regexes to extract the fields?

If so, you can use this regex:

| rex "queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>[^,]+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)"

that you can test at https://regex101.com/r/aPEZ6B/1

Ciao.

Giuseppe
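
For anyone who wants to check the rex pattern in the post above outside Splunk, here is a minimal Python sketch. It assumes the pattern is ported to Python's `re` module, which spells named groups `(?P<name>...)` where rex uses `(?<name>...)`. The helper name `extract_fields` is made up for illustration, and the input is the first sample line from the question.

```python
import re

# Python port of the rex pattern from the post above. Python's `re` module
# spells named groups (?P<name>...) where Splunk's rex uses (?<name>...).
PATTERN = re.compile(
    r"queue_name:\s*(?P<queue_name>[^,]+),\s+messages:\s*(?P<messages>[^,]+),"
    r".*bytes:\s*(?P<bytes>[^,]+),\s*actCusumers:\s*(?P<actCusumers>[^,]+),"
    r"\s*numSubjects:\s*(?P<numSubjects>\d+)"
)

# First sample line from the question.
SAMPLE = (
    "2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, "
    "messages: 16, bytes: 13 KiB, actCusumers: 4, numSubjects: 1"
)


def extract_fields(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


fields = extract_fields(SAMPLE)
print(fields)
```

Every captured value is a string, so a count such as messages still needs a numeric conversion before any arithmetic.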

ITWhisperer
SplunkTrust

Try extracting the fields this way. Note that the renames are required because your sample data doesn't match the field names you are using, and I have assumed _time has already been extracted properly:

| extract pairdelim="," kvdelim=":"
| rename queue_name as stream
| rename actCusumers as actConsumers
| stats max(_time) as _time by stream numSubjects actConsumers bytes messages

The dedups you have used would have kept only the first event for each messages value; given that this appears to be just a count, you will have lost some data here. The next dedup could have reduced the results further if a stream had more than one distinct messages value.

What is it that you are actually trying to determine from your events?
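
The data loss from the chained dedups described above can be simulated in a few lines of Python. This is only a sketch: the `dedup` helper is a simplified stand-in that keeps the first row seen per value, and the three rows are made-up data, not taken from the thread.

```python
def dedup(rows, field):
    """Keep the first row seen for each distinct value of `field`,
    mimicking how `dedup <field>` keeps the first result per value."""
    seen = set()
    kept = []
    for row in rows:
        if row[field] not in seen:
            seen.add(row[field])
            kept.append(row)
    return kept


# Made-up rows, newest first (hypothetical data, not from the thread).
rows = [
    {"stream": "stream-A", "messages": "16", "time": "04:14"},
    {"stream": "stream-B", "messages": "16", "time": "04:14"},
    {"stream": "stream-A", "messages": "20", "time": "04:13"},
]

# dedup "messages" | dedup "stream": stream-B disappears because it shares
# a messages value with a row that came earlier.
chained = dedup(dedup(rows, "messages"), "stream")

# Deduplicating on stream alone keeps the first (newest) row per stream.
per_stream = dedup(rows, "stream")

print([r["stream"] for r in chained])
print([r["stream"] for r in per_stream])
```

In this toy data the chained version returns only stream-A, while deduplicating on stream alone returns one row for each stream.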

drogo
Explorer

Thanks gcusello, this really helps.
For messages I am only getting the part of the value before the comma, but the message counts can run into the thousands and then contain a comma themselves, as in the pattern below. How can I get the whole value? I have updated the sample on the page below.
https://regex101.com/r/aPEZ6B/1

Sample -
2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, messages: 16,2303, bytes: 13 KiB, actCusumers: 4, numSubjects: 1
2023-03-21 04:14:13.859, queue_name:stream-SampleProfile, messages: 3,522, bytes: 2.4 MiB, actCusumers: 4, numSubjects: 1

gcusello
SplunkTrust

Hi @drogo,

please try this:

^(?<_time>[^,]+),\s+queue_name:\s*(?<queue_name>[^,]+),\s+messages:\s*(?<messages>.+),.*bytes:\s*(?<bytes>[^,]+),\s*actCusumers:\s*(?<actCusumers>[^,]+),\s*numSubjects:\s*(?<numSubjects>\d+)

that you can test at https://regex101.com/r/aPEZ6B/2

Ciao.

Giuseppe
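
To see why the revised pattern keeps the whole count, here is a small Python sketch that compares the messages part of both patterns on the two sample lines from the previous post. The patterns are trimmed to the messages capture and ported to Python's `(?P<name>...)` group syntax. The names `FIRST`, `REVISED`, `capture` and `to_int` are made up for illustration.

```python
import re

# Trimmed Python ports of the two patterns from the thread.
# [^,]+ stops at the first comma, so it cuts "3,522" down to "3".
FIRST = re.compile(r"messages:\s*(?P<messages>[^,]+),.*bytes:")
# Greedy .+ backtracks to the last comma that is still followed by "bytes:".
REVISED = re.compile(r"messages:\s*(?P<messages>.+),.*bytes:")

# Sample lines from the follow-up question.
LINES = [
    "2023-03-21 04:14:13.859, queue_name:stream-AccountProfile, "
    "messages: 16,2303, bytes: 13 KiB, actCusumers: 4, numSubjects: 1",
    "2023-03-21 04:14:13.859, queue_name:stream-SampleProfile, "
    "messages: 3,522, bytes: 2.4 MiB, actCusumers: 4, numSubjects: 1",
]


def capture(pattern, line):
    """Return the messages capture for one line, or None if nothing matches."""
    match = pattern.search(line)
    return match.group("messages") if match else None


def to_int(value):
    """Drop the comma separators and convert the count to an integer."""
    return int(value.replace(",", ""))


truncated = [capture(FIRST, line) for line in LINES]
complete = [capture(REVISED, line) for line in LINES]
counts = [to_int(value) for value in complete]

print(truncated)
print(complete)
print(counts)
```

Removing the commas before converting matters if the counts are going to be summed or compared as numbers.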

drogo
Explorer

Hi @gcusello,
I got the solution, thanks for your help!
https://regex101.com/r/aPEZ6B/1 
