Getting Data In

How to make my regex more efficient?

ebs
Communicator

Hi, I've exceeded my configured match_limit in limits.conf with this regex:

"log":\s"(?<log_source>.*?)\s(?<ISO8601>.*?)\| (?<exchangeId>.*?)\|(?<AUDIT_trackingId>.*?)\| (?<client_ip>.*?)\|(?<FAPI_ip>.*?)\|(?<AUDIT_roundTripMS>.*?) ms\| (?<AUDIT_proxyRoundTripMS>.*?) ms\| (?<AUDIT_userInfoRoundTripMS>.*?) ms\| (?<AUDIT_resource>.*?)\s\[\]\s\/(?<AUDIT_subject>.*?)\/\*\:(?<dest_port>.*?)\|(?<AUDIT_authMech>.*?)\|(?<AUDIT_scopes>.*?)\| (?<AUDIT_client>.*?)\| (?<AUDIT_method>.*?)\| (?<AUDIT_requestUri>[^\s\?"|]++)(?<uri_query>\?[^\s"]*)?.*?\| (?<AUDIT_responseCode>.*?)\|(?<AUDIT_failedRuleType>.*?)\|(?<AUDIT_failedRuleName>.*?)\| (?<AUDIT_applicationName>.*?)\| (?<AUDIT_resourceName>.*?)\| (?<AUDIT_pathPrefix>.*?)\s

Is there a way to make it more efficient? Please advise.

0 Karma
1 Solution

ITWhisperer
SplunkTrust

As @venkatasri said, don't use * and be more specific, e.g. (?<auditid>\S+) for a run of non-whitespace characters. You know your data best, so you should be able to define the pattern more rigorously.
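For illustration, here is a minimal sketch of that advice in Python, using only the first few fields of the regex in the question. The sample line is invented (no sample event was posted in this thread), and Python's re module needs (?P<name>...) where Splunk accepts (?<name>...), so treat it as a sketch rather than a drop-in extraction. Each lazy .*? is swapped for a class that cannot cross the delimiter, such as [^|]* or \S+, which leaves the engine far fewer ways to retry when a later part of the pattern fails.

```python
import re

# Hypothetical sample event, invented for this sketch.
sample = '"log": "audit 2021-06-01T12:00:00| ex-1|tid-2| 10.0.0.1|10.0.0.2|15 ms| rest'

# Shortened form of the original pattern: every field is a lazy wildcard.
loose = re.compile(
    r'"log":\s"(?P<log_source>.*?)\s(?P<ISO8601>.*?)\| (?P<exchangeId>.*?)'
    r'\|(?P<AUDIT_trackingId>.*?)\| (?P<client_ip>.*?)\|(?P<FAPI_ip>.*?)'
    r'\|(?P<AUDIT_roundTripMS>.*?) ms\|'
)

# Same fields, but each class stops at the next delimiter on its own.
tight = re.compile(
    r'"log":\s"(?P<log_source>\S+)\s(?P<ISO8601>[^|]+)\| (?P<exchangeId>[^|]*)'
    r'\|(?P<AUDIT_trackingId>[^|]*)\| (?P<client_ip>[^|]*)\|(?P<FAPI_ip>[^|]*)'
    r'\|(?P<AUDIT_roundTripMS>\d+) ms\|'
)

loose_match = loose.search(sample)
tight_match = tight.search(sample)

# Both patterns extract the same values from this sample line.
print(tight_match.groupdict())
print(loose_match.groupdict() == tight_match.groupdict())
```

The same substitution could be applied to the remaining pipe-separated fields. Fields that end at some other delimiter would need a class that fits that delimiter instead.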


venkatasri
SplunkTrust

Hi @ebs 

The log format looks like PSV (pipe-separated values). Why don't you use delimiter-based extractions?

Do not use *; be specific. The list goes on...
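As a rough sketch of the delimiter-based idea, in Python with an invented sample line: split on the pipe first, then handle the few pieces that are not purely pipe-delimited (the leading source and timestamp pair, and the " ms" suffixes) separately. The field positions are assumptions read off the regex in the question.

```python
# Hypothetical, shortened event body; only the first few fields are shown.
line = "audit 2021-06-01T12:00:00| ex-1|tid-2| 10.0.0.1|10.0.0.2|15 ms| 10 ms"

# Split on the delimiter and trim the optional space after each pipe.
parts = [p.strip() for p in line.split("|")]

# The first piece holds two space-separated values, so split it once more.
log_source, iso8601 = parts[0].split(None, 1)

fields = {
    "log_source": log_source,
    "ISO8601": iso8601,
    "exchangeId": parts[1],
    "AUDIT_trackingId": parts[2],
    "client_ip": parts[3],
    "FAPI_ip": parts[4],
    # Drop the " ms" unit and keep the number.
    "AUDIT_roundTripMS": int(parts[5].split()[0]),
    "AUDIT_proxyRoundTripMS": int(parts[6].split()[0]),
}
print(fields)
```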

0 Karma

ebs
Communicator

Because it's not all delimiter-based. If you could give me an example of making this extraction more efficient, I would be appreciative.

0 Karma

venkatasri
SplunkTrust

Can you share a sample event?

0 Karma
