Splunk Search

How do you extract fields from an existing field's value?

samlinsongguo
Communicator

I have a field that contains one long string looks like below

18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith(Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob(Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 

This message is in one event, I want to extract 3 fields from this message: time, name, and comment which will look as below

time                 name       comment 
18/10/2018 03:42:26 Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 
17/10/2018 23:14:04 Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx
17/10/2018 23:13:33 Bob Bob   (Work notes) commentxxx commentxxx commentxxx commentxxx 
15/10/2018 23:13:33 Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx

Ideally, this one event will be split to multiple events, but I guess if it is acceptable as a field with multi value field.

My final goal is that, later on, I will be able to search what comment a user (Chirs Lee) put in the job.

Any suggestion how I can extract the string from this field?

0 Karma
1 Solution

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

View solution in original post

Raschko
Communicator

Try the following search:

| makeresults count=1 
| eval test="18/10/2018 03:42:26 - Chirs Lee (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:14:04 - Sam Smith (Work notes) commentxxx commentxxx commentxxx commentxxx 17/10/2018 23:13:33 - Bob Bob (Work notes) commentxxx commentxxx commentxxx commentxxx 15/10/2018 23:13:33 - Chris Lee (Work notes) commentxxx commentxxx commentxxx commentxxx" 
| fields - _time 
| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 
| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"
| fields - test
| mvexpand full_comments
| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

| table time, worker, comment, full_comments

The first rex command is prepending the delimiter ||| in front of every date and at the end of the string.

| rex field=test mode=sed "s/(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2} -)|$/|||\1/g" 

The second rex command extracts the comments until each delimiter.

| rex field=test max_match=10 "(?<full_comments>\d{2}/\d{2}/\d{4}.*?)(?:\|\|\|)|\$"

The mvexpand splits the multi-valued comments into single-valued ones.

| mvexpand full_comments

The last rex command extracts your wanted fields.

| rex field=full_comments "(?<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s-\s(?<worker>[^(]+)(?<comment>\(.*)"

HTH...

samlinsongguo
Communicator

fix the problem by change second one to

| rex field=work_notes max_match=10 "(?<full_comment>[\w\s\W\d]+?)(?:\|\|\|)"

not sure why, any suggestion?

0 Karma

Raschko
Communicator

It is hard to tell without knowing the original logs or fields. Maybe it's a problem with line end ($).

I guess with your rex line, you will miss the last workers comment.

0 Karma

samlinsongguo
Communicator

Hi Raschko
I put your code against production log, the first rex works fine, added ||| into the string, but the second rex commend didnt return any value in full_comments field, any suggestion why?

I put the string value in your original code and it works fine as well
Thank you so much for your help
Cheers
Sam

0 Karma

samlinsongguo
Communicator

This is amazing, didnt expect get answer this fast, you are great

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...