I have what I think should be a simple question: how can I find out in Splunk why a regex extraction failed? I bring in a log file with events that look pretty similar, but some of the records parse correctly and others fail with the error message "Error in 'rex' command: Regex match error, please check log". Where is the log in question? Could I see this error in index=_internal? How can I find out what Splunk thinks happened with the records that failed?
If I paste the failed data and the regex into regex101.com they match fine. So maybe some records aren't using the correct field extractor? Who knows!
Thanks for your help
Do it like this:
Your Base Search Here NOT YourBrokenFieldNameHere="*"
This will return all events where the field that should have been extracted does not exist. (Use NOT rather than != here: != only matches events in which the field exists, so it would return nothing.) Then test these events and your RegEx with a tool like http://www.RegEx101.com. Fix your RegEx, deploy, and keep looping until your search returns 0 events.
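If you would rather test offline than in a browser, the same idea can be sketched in Python. This is only a rough analog of the search above, not anything Splunk runs: the function name `events_missing_field`, the pattern, and the sample events are all made up for illustration, and Python's `re` engine is not identical to the PCRE engine Splunk uses.

```python
import re

def events_missing_field(events, pattern, field):
    """Return the events for which `pattern` does not extract `field`."""
    compiled = re.compile(pattern)
    missing = []
    for event in events:
        match = compiled.search(event)
        # Count the event as failed if nothing matched or the group is empty.
        if match is None or not match.group(field):
            missing.append(event)
    return missing

# Hypothetical pattern and events, for illustration only.
PATTERN = r"(?P<log_level>INFO|WARN|ERROR)\s+\[(?P<log_thread>[0-9 ]+)\]"
EVENTS = [
    "2016-01-29 20:32:33.724 INFO [ 1] Finished report",
    "2016-01-29 20:32:34.001 ERROR [main] Something failed",
]

for event in events_missing_field(EVENTS, PATTERN, "log_thread"):
    print(event)
```

Export a sample of raw events, run them through a helper like this, and the ones it prints are the ones to paste into the regex tester.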
The plot thickens...
I wonder if there is a regex that strips out CR/LF characters, and whether I should be doing that in Event Breaks on load so that those characters never find their way into _raw?
Appreciate your help!!
Got it... (?s) puts the regex into single-line mode, which means that the dot also matches line feed characters.
Works now. Thanks for your help!!
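For anyone who finds this later, here is a small Python sketch of what (?s) changes. It is an illustration under assumptions, not what Splunk runs internally: it uses Python's `re` module rather than Splunk's PCRE engine, `HEAD` is a simplified version of the regex posted in this thread (the optional thread-name prefix is dropped), and the event is a shortened two-line version of the failing one. The names `DOT_ONLY`, `ALTERNATION`, `SINGLE_LINE`, and `log_msg` are made up for the example.

```python
import re

# Everything before the message, simplified from the regex posted in this thread.
HEAD = (
    r"(?P<log_timestamp>\d+-\d+-\d+\s+\d+:\d+:\d+\.\d+)\s+"
    r"(?P<log_level>\w+)\s+\[(?P<log_thread>[0-9 ]+)\]\s+"
)

# Three ways of capturing the message part.
DOT_ONLY = re.compile(HEAD + r"(?P<log_msg>.+)")                # dot stops at \n
ALTERNATION = re.compile(HEAD + r"(?P<log_msg>(?:.|\n|\r)+)")   # original approach
SINGLE_LINE = re.compile(r"(?s)" + HEAD + r"(?P<log_msg>.+)")   # dot matches \n too

# Shortened two-line version of the failing event (assumed \n line ending).
EVENT = (
    "2016-01-30 00:59:49.468 ERROR [ 1] Precious Account Statement Raised exception\n"
    "   at SBL.RB.CAM.ReportEngineEOD.PreciousStatement.GenerateReports()"
)

def log_msg(regex, event):
    """Return the captured log_msg group, or None if the regex does not match."""
    match = regex.match(event)
    return match.group("log_msg") if match else None

print(repr(log_msg(DOT_ONLY, EVENT)))     # first line of the message only
print(repr(log_msg(SINGLE_LINE, EVENT)))  # message including the stack trace line
```

In this sketch the alternation and the (?s) version capture the same text. The (?s) form does it with a single `.+` instead of trying three alternatives per character, which is generally less work for a backtracking engine on long multi-line events; whether that is what triggered the rex error here is a guess, not something confirmed in this thread.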
OK, then upvote any answers that were helpful and then click Accept on the best one to close the question.
Are these settings meant to extract one value per event, or multiple values?
Thanks so much for your help. To continue, the regex is this:
(?P<log_timestamp>\d+\-\d+\-\d+\s+\d+:\d+:\d+\.\d+)\s+(?P<log_level>\w+)\s+[\[](?:[$]|[a-zA-Z\-\_]*)(?P<log_thread>[0-9 ]+)[\]]\s+(?P<log_msg>(?:.|\n|\r)+)
A raw event that works looks like this:
2016-01-29 20:32:33.724 INFO [ 1] Finished Precious Statement report (GenerateReports:128)
A raw event that failed looks like this:
2016-01-30 00:59:49.468 ERROR [ 1] Precious Account Statement Raised exception System.Data.SqlClient.SqlException (0x80131904): Login failed for user 'CAMInterfaceUser'. Reason: The password of the account must be changed.
at SBL.RB.CAM.ReportEngine.StatementHelper.GetLedgersForReportGeneration(SqlConnection sqlconn, Int64 reportTypeId, DateTime currentBusinessDate, DateTime NextBusinessDate) in d:\Code\GMO\merges\Cortex\Enterprise\CAM\Services\Win\SBL.RB.CAM.ReportEngine\SBL.RB.CAM.ReportEngine\StatementHelper.cs:line 143
at SBL.RB.CAM.ReportEngineEOD.PreciousStatement.GenerateReports() in d:\Code\GMO\merges\Cortex\Enterprise\CAM\Services\Win\SBL.RB.CAM.ReportEngine\SBL.RB.CAM.ReportEngineEOD\ReportGenerators\PreciousStatement.cs:line 30 (PreciousStatement:30)
I am afraid the log (if any) is probably just going to tell you the same error, not the reason for the error. There is no log that outputs details or indications of failed extractions when the issue is simply something wrong with the extraction method/pattern.
It is always worth checking the output of btool to make sure there isn't some syntax error that constitutes a bigger problem. Run btool like this from the command line.
$ <SPLUNK_HOME>/bin/splunk btool check
or this for more debug output
$ <SPLUNK_HOME>/bin/splunk btool check --debug
Also, make sure you are properly extracting fields inside of rex using the format (?<fieldName>SOME_PATTERN).
Post your rex command and some failing data if you want regex help.