Splunk Search

How do you extract fields from an existing field's value?

samlinsongguo
Communicator

I have a field that contains one long string looks like below

18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith(Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob(Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 

This message is in one event, I want to extract 3 fields from this message: time, name, and comment which will look as below

time                 name       comment 
18/10/2018 03:42:26 Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 
17/10/2018 23:14:04 Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx
17/10/2018 23:13:33 Bob Bob   (Work notes) commentxxx commentxxx commentxxx commentxxx 
15/10/2018 23:13:33 Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx

Ideally, this one event will be split to multiple events, but I guess if it is acceptable as a field with multi value field.

My final goal is that, later on, I will be able to search what comment a user (Chirs Lee) put in the job.

Any suggestion how I can extract the string from this field?

0 Karma
1 Solution

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

View solution in original post

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

samlinsongguo
Communicator

fix the problem by change second one to

| rex field=work_notes max_match=10 "(?<full_comment>[\w\s\W\d]+?)(?:\|\|\|)"

not sure why, any suggestion?

0 Karma

Raschko
Communicator

It is hard to tell without knowing the original logs or fields. Maybe it's a problem with line end ($).

I guess with your rex line, you will miss the last workers comment.

0 Karma

samlinsongguo
Communicator

Hi Raschko
I put your code against production log, the first rex works fine, added ||| into the string, but the second rex commend didnt return any value in full_comments field, any suggestion why?

I put the string value in your original code and it works fine as well
Thank you so much for your help
Cheers
Sam

0 Karma

samlinsongguo
Communicator

This is amazing, didnt expect get answer this fast, you are great

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...