Splunk Search

How do you extract fields from an existing field's value?

samlinsongguo
Communicator

I have a field that contains one long string looks like below

18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith(Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob(Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 

This message is in one event, I want to extract 3 fields from this message: time, name, and comment which will look as below

time                 name       comment 
18/10/2018 03:42:26 Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 
17/10/2018 23:14:04 Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx
17/10/2018 23:13:33 Bob Bob   (Work notes) commentxxx commentxxx commentxxx commentxxx 
15/10/2018 23:13:33 Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx

Ideally, this one event will be split to multiple events, but I guess if it is acceptable as a field with multi value field.

My final goal is that, later on, I will be able to search what comment a user (Chirs Lee) put in the job.

Any suggestion how I can extract the string from this field?

0 Karma
1 Solution

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

View solution in original post

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

samlinsongguo
Communicator

fix the problem by change second one to

| rex field=work_notes max_match=10 "(?<full_comment>[\w\s\W\d]+?)(?:\|\|\|)"

not sure why, any suggestion?

0 Karma

Raschko
Communicator

It is hard to tell without knowing the original logs or fields. Maybe it's a problem with line end ($).

I guess with your rex line, you will miss the last workers comment.

0 Karma

samlinsongguo
Communicator

Hi Raschko
I put your code against production log, the first rex works fine, added ||| into the string, but the second rex commend didnt return any value in full_comments field, any suggestion why?

I put the string value in your original code and it works fine as well
Thank you so much for your help
Cheers
Sam

0 Karma

samlinsongguo
Communicator

This is amazing, didnt expect get answer this fast, you are great

0 Karma
Get Updates on the Splunk Community!

Routing Data to Different Splunk Indexes in the OpenTelemetry Collector

This blog post is part of an ongoing series on OpenTelemetry. The OpenTelemetry project is the second largest ...

Getting Started with AIOps: Event Correlation Basics and Alert Storm Detection in ...

Getting Started with AIOps:Event Correlation Basics and Alert Storm Detection in Splunk IT Service ...

Register to Attend BSides SPL 2022 - It's all Happening October 18!

Join like-minded individuals for technical sessions on everything Splunk!  This is a community-led and run ...