I have the following log structure:
2023-11-25T21:18:54.244444 [ info ] I am a log message request = GET /api/myendpoint request_id = ff223452
I can capture the date and time (without the 244444 part) using:
rex field=myfield "(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+"
and timestamp is properly captured.
But if I try to extend this to capture the log level as well, for example:
rex field=myfield "(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+\s+\[\s*(?<loglevel>\w+)\s*\]\s+"
it doesn't work: neither the timestamp nor the loglevel is captured.
What am I doing wrong?
You don't appear to be doing anything wrong, given the example you have shared.
| makeresults
| eval _raw="2023-11-25T21:18:54.244444 [ info ] I am a log message request = GET /api/myendpoint request_id = ff223452"
| rex "(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+\s+\[\s*(?<loglevel>\w+)\s*\]\s+"
Thanks for verifying! When I copy and paste my log directly from the log message field into the search box and use your makeresults, I see that some of the spaces are actually a different character. Do you know why this isn't shown in the results themselves, so that I have to copy and paste to see it?
Assuming that the non-word characters are inside the square brackets, you could try something like this, which uses \W* instead of \s* around the log level:
| makeresults
| eval _raw="2023-11-25T21:18:54.244444 [ info ] I am a log message request = GET /api/myendpoint request_id = ff223452"
| rex "(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+\s+\[\W*(?<loglevel>\w+)\W*\]\s+"
but, ideally, you should ask the developers of the application to not use these characters in the first place.
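To find out which characters are really in the field, one option is to paste a copied sample into a short script outside Splunk. The following is a minimal Python sketch, not Splunk code. The sample line assumes the hidden characters are non-breaking spaces (U+00A0), which is only a guess, since the thread does not say which character it is. Python spells named groups (?P<name>...) instead of (?<name>...), and re.ASCII is used so that \s and \w match only ASCII characters, which is an assumption about how the rex pattern behaved here.

```python
import re
import unicodedata

# Hypothetical sample: the log line from the thread, but with non-breaking
# spaces (U+00A0) inside the brackets. The real hidden character is unknown.
line = ("2023-11-25T21:18:54.244444 [\u00a0info\u00a0] I am a log message "
        "request = GET /api/myendpoint request_id = ff223452")

# List every non-ASCII character with its position, code point and Unicode name.
hidden = [(i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
          for i, ch in enumerate(line) if ord(ch) > 127]

# Shared start of both patterns: timestamp, fractional seconds, opening bracket.
PREFIX = r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+\s+\["

# re.ASCII restricts \s, \w and \d to ASCII (an assumption, see above).
strict = re.compile(PREFIX + r"\s*(?P<loglevel>\w+)\s*\]\s+", re.ASCII)
lenient = re.compile(PREFIX + r"\W*(?P<loglevel>\w+)\W*\]\s+", re.ASCII)

strict_match = strict.search(line)
lenient_match = lenient.search(line)

print(hidden)
print("with \\s*:", strict_match)
print("with \\W*:", lenient_match.groupdict() if lenient_match else None)
```

With this sample, the \s* pattern finds no match at all, while the \W* pattern extracts both timestamp and loglevel.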