Splunk Search

How do you extract fields from an existing field's value?

samlinsongguo
Communicator

I have a field that contains one long string looks like below

18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith(Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob(Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 

This message is in one event, I want to extract 3 fields from this message: time, name, and comment which will look as below

time                 name       comment 
18/10/2018 03:42:26 Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 
17/10/2018 23:14:04 Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx
17/10/2018 23:13:33 Bob Bob   (Work notes) commentxxx commentxxx commentxxx commentxxx 
15/10/2018 23:13:33 Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx

Ideally, this one event will be split to multiple events, but I guess if it is acceptable as a field with multi value field.

My final goal is that, later on, I will be able to search what comment a user (Chirs Lee) put in the job.

Any suggestion how I can extract the string from this field?

0 Karma
1 Solution

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

View solution in original post

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

samlinsongguo
Communicator

fix the problem by change second one to

| rex field=work_notes max_match=10 "(?<full_comment>[\w\s\W\d]+?)(?:\|\|\|)"

not sure why, any suggestion?

0 Karma

Raschko
Communicator

It is hard to tell without knowing the original logs or fields. Maybe it's a problem with line end ($).

I guess with your rex line, you will miss the last workers comment.

0 Karma

samlinsongguo
Communicator

Hi Raschko
I put your code against production log, the first rex works fine, added ||| into the string, but the second rex commend didnt return any value in full_comments field, any suggestion why?

I put the string value in your original code and it works fine as well
Thank you so much for your help
Cheers
Sam

0 Karma

samlinsongguo
Communicator

This is amazing, didnt expect get answer this fast, you are great

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...