Splunk Search

rex field extraction not working as expected / mishandling ")" in regex

samlinsongguo
Communicator

Hi,
I have a field with the following value:

16/08/2018 03:04:11 - Christian (Work notes) Remote Desktop Notes: - still unable to remote in to the machine  10/08/2018 07:11:53 - Christian (Work notes) Remote Desktop Notes: - machine is offline - 08/08/2018 01:11:53 - Sam (Work notes) Remote Desktop Notes: - machine is comprimised 

These are all the job comments related to the work. I want to extract only the latest comment, which is the string between the first and second timestamps:

 - Christian (Work notes) Remote Desktop Notes: - still unable to remote in to the machine  

I tried the following regex on regex101.com, and it seems to work fine:

^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s(?<lastcomment>.+?(?=\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s))

But when I put the rex into the query, it does not return anything:

... | rex field=work_notes "^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s(?<lastcomment>.+?(?=\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s))" | table number lastcomment

So I did some testing, and it looks like Splunk is misreading the ")". If I run the following query:

... | rex field=work_notes "^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s(?<lastcomment>.*)" | table number lastcomment

it returns

Christian (Work notes)

instead of the whole string, which is what ".*" should match:

Christian (Work notes) Remote Desktop Notes: - still unable to remote in to the machine  10/08/2018 07:11:53 - Christian (Work notes) Remote Desktop Notes: - machine is offline - 08/08/2018 01:11:53 - Sam (Work notes) Remote Desktop Notes: - machine is comprimised 

And if I put a space between * and ) like below:

...| rex field=work_notes "^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s(?<lastcomment>.* )" | table number  lastcomment

it returns

 Christian (Work

Sorry for the long post. Any suggestions as to what is going on here?


richgalloway
SplunkTrust

This alternative regex string may work better.

^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s(?<lastcomment>.+?)(?=\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s)
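As a quick check outside Splunk, the two patterns can be compared in Python. This is only a minimal sketch and not part of the original posts: Python's re module needs (?P<lastcomment>...) instead of (?<lastcomment>...), the name TS is a hypothetical helper, and the sample is the single-line value from the question. Under those assumptions both placements of the lookahead capture the same text, so moving it outside the group would not be expected to change the result.

```python
import re

# Timestamp plus " - " separator, as used in both patterns (helper name is hypothetical).
TS = r"\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s"

# Single-line sample value copied from the question.
sample = (
    "16/08/2018 03:04:11 - Christian (Work notes) Remote Desktop Notes: "
    "- still unable to remote in to the machine  "
    "10/08/2018 07:11:53 - Christian (Work notes) Remote Desktop Notes: "
    "- machine is offline - "
    "08/08/2018 01:11:53 - Sam (Work notes) Remote Desktop Notes: "
    "- machine is comprimised "
)

# Original pattern: lookahead inside the capture group.
inside = re.compile("^" + TS + r"(?P<lastcomment>.+?(?=" + TS + "))")
# Alternative pattern: lookahead moved outside the capture group.
outside = re.compile("^" + TS + r"(?P<lastcomment>.+?)(?=" + TS + ")")

captured_inside = inside.match(sample).group("lastcomment")
captured_outside = outside.match(sample).group("lastcomment")

print(repr(captured_inside))
print(captured_inside == captured_outside)  # True: a lookahead consumes no text
```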


samlinsongguo
Communicator

That regex does not make any difference.


ddrillic
Ultra Champion

I ran the following -

index=os
| eval _raw="16/08/2018 03:04:11 - Christian (Work notes) Remote Desktop Notes: - still unable to remote in to the machine  10/08/2018 07:11:53 - Christian (Work notes) Remote Desktop Notes: - machine is offline - 08/08/2018 01:11:53 - Sam (Work notes) Remote Desktop Notes: - machine is comprimised "
| rex field=_raw  "^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s(?<lastcomment>.+?(?=\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s))"

lastcomment came out as - Christian (Work notes) Remote Desktop Notes: - still unable to remote in to the machine.


samlinsongguo
Communicator

Hi ddrillic,
I tried running your query and got the result you mentioned, but that is not the only string in the value, so I did:
eval _raw=work_notes | rex field=_raw "^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s(?<lastcomment>.+?(?=\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s))"
| table lastcomment
It still comes out with the same problem.
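
One possible explanation, which nobody in the thread confirmed, is that the real work_notes value contains line breaks that were flattened when it was pasted here. By default "." does not match a newline, so a break right after "(Work notes)" would produce exactly the symptoms described in the question. The Python sketch below illustrates this under stated assumptions: the positions of the line breaks are guessed, TS and capture are hypothetical helper names, and Python's (?P<lastcomment>...) stands in for (?<lastcomment>...).

```python
import re

# Timestamp plus " - " separator (helper name is hypothetical).
TS = r"\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}:\d{2}\s-\s"

# ASSUMPTION: the stored value has line breaks at these positions.
multiline = (
    "16/08/2018 03:04:11 - Christian (Work notes)\n"
    "Remote Desktop Notes: - still unable to remote in to the machine\n\n"
    "10/08/2018 07:11:53 - Christian (Work notes)\n"
    "Remote Desktop Notes: - machine is offline\n"
)

def capture(pattern, text):
    """Return the lastcomment group, or None when the pattern does not match."""
    m = re.search(pattern, text)
    return m.group("lastcomment") if m else None

# ".*" stops at the first newline.
greedy = capture("^" + TS + r"(?P<lastcomment>.*)", multiline)
# ".* " must end on a space within the first line.
greedy_space = capture("^" + TS + r"(?P<lastcomment>.* )", multiline)
# The lazy pattern cannot reach the second timestamp, so nothing matches.
lazy = capture("^" + TS + r"(?P<lastcomment>.+?(?=" + TS + "))", multiline)
# With (?s) the dot also matches newlines and the lazy pattern succeeds.
lazy_dotall = capture("(?s)^" + TS + r"(?P<lastcomment>.+?(?=" + TS + "))", multiline)

print(repr(greedy))        # 'Christian (Work notes)'
print(repr(greedy_space))  # 'Christian (Work '
print(repr(lazy))          # None
print(repr(lazy_dotall))
```

If line breaks do turn out to be the cause, the equivalent change in Splunk would be to start the rex pattern with (?s) so that the dot also matches newlines. This has not been tested against the poster's data.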
