Splunk Search

Why is there a regex character limit?

jadengoho
Builder

I have a very long regex (about 12,000 characters). It consists of different hostname and IP address combinations.

When I run the regex, Splunk shows this error: "Regex: regular expression is too large."

From my testing, the regex can hold at most 8190 characters.

In my test I used the letter "a" 8190 times and the search ran, but as soon as I added one more letter the error appeared.

Can somebody explain why this is happening and how I can run my regex properly?

richgalloway
SplunkTrust

For reasons known only to those who wrote the code, Splunk can't handle a regular expression longer than 8190 characters.  The workaround is to make the regex short enough to fit into 8190 characters.  Sometimes a single rex command can be split into multiple smaller rex commands.

jadengoho
Builder

Hi @richgalloway 

We tried to shorten the regex from 14,000 to 11,000 characters.

Is there any limits configuration setting we can tweak to override this regex limit?

isoutamo
SplunkTrust
Usually that kind of tweak is done with parameters in limits.conf, but I could not find a suitable one for this.
@cpetterborg, do you have any ideas about this?

Out of curiosity, how do you manage that regex? Even much shorter ones are usually hard to maintain.

jadengoho
Builder

 

We have 20,000+ word/phrase combinations that must be present in the logs for them to be routed to the proper index.

Example:

CAT should have DOG - routed to sample1 index
RAT should have COUNT - routed to sample1 index

In transforms.conf:
REGEX = (cat.*dog|rat.*count|computer.*calculator|computer.*device.*v2)

https://goolge/sites/cat/page/dog
https://goolge/sites/rat/page/count
https://goolge/sites/computer/page/calculator
https://goolge/sites/computer/page/device/machine/v2
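As a quick sanity check of the pattern above, here is a minimal Python sketch that runs the same alternation against the four sample lines. It assumes those lines are representative events, and it uses Python's `re` module only as a stand-in for the PCRE engine Splunk uses, so edge cases may differ. The names `PATTERN`, `SAMPLE_EVENTS` and `matched_text` are made up for this illustration.

```python
import re

# The alternation from the post. Python's re is only a stand-in for Splunk's PCRE.
PATTERN = re.compile(r"(cat.*dog|rat.*count|computer.*calculator|computer.*device.*v2)")

# The sample lines from the post, assumed here to be representative events.
SAMPLE_EVENTS = [
    "https://goolge/sites/cat/page/dog",
    "https://goolge/sites/rat/page/count",
    "https://goolge/sites/computer/page/calculator",
    "https://goolge/sites/computer/page/device/machine/v2",
]


def matched_text(event):
    """Return the part of the event matched by the pattern, or None if nothing matches."""
    match = PATTERN.search(event)
    return match.group(1) if match else None


for event in SAMPLE_EVENTS:
    print(event, "->", matched_text(event))
```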

 

I've done all the possibilities to compress the regex but that is the best i can do. 

 

0 Karma

mtulett_splunk
Splunk Employee

In case this was never resolved, or for others who are interested, the solution here is to use multiple transforms stanzas so that each individual regex stays under 8190 characters, like so:

props.conf:

[my_sourcetype]
TRANSFORMS-index_routing = ruleset1, ruleset2

transforms.conf:

[ruleset1]
REGEX = (cat.*dog|rat.*count)
FORMAT = sample1
DEST_KEY = _MetaData:Index

[ruleset2]
REGEX = (computer.*calculator|computer.*device.*v2)
FORMAT = sample1
DEST_KEY = _MetaData:Index
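With thousands of combinations, splitting by hand gets tedious, so the stanzas could be generated instead. The following is a rough Python sketch, not a Splunk tool: it greedily packs alternatives into groups whose combined `(a|b|...)` form stays within the limit and prints stanzas shaped like the ones above. It assumes the 8190 limit applies to the whole REGEX value counted in characters, that the patterns are plain ASCII, and that every alternative routes to the same index. The function names and the `ruleset` prefix are invented for the example.

```python
MAX_REGEX_LEN = 8190  # limit reported in this thread; assumed to cover the whole REGEX value


def chunk_alternatives(alternatives, max_len=MAX_REGEX_LEN):
    """Greedily group alternatives so each "(a|b|...)" regex fits in max_len characters."""
    chunks = []
    current = []
    current_len = 2  # the enclosing parentheses
    for alt in alternatives:
        if len(alt) + 2 > max_len:
            raise ValueError(f"single alternative exceeds {max_len} characters: {alt[:40]}")
        extra = len(alt) + (1 if current else 0)  # +1 for the "|" separator
        if current_len + extra > max_len:
            # Current group is full: close it and start a new one with this alternative.
            chunks.append(current)
            current, current_len = [], 2
            extra = len(alt)
        current.append(alt)
        current_len += extra
    if current:
        chunks.append(current)
    return chunks


def render_conf(chunks, sourcetype, index, prefix="ruleset"):
    """Return (props.conf text, transforms.conf text) for the given groups."""
    names = [f"{prefix}{i}" for i in range(1, len(chunks) + 1)]
    props = f"[{sourcetype}]\nTRANSFORMS-index_routing = {', '.join(names)}\n"
    stanzas = [
        f"[{name}]\nREGEX = ({'|'.join(chunk)})\nFORMAT = {index}\nDEST_KEY = _MetaData:Index\n"
        for name, chunk in zip(names, chunks)
    ]
    return props, "\n".join(stanzas)


if __name__ == "__main__":
    rules = [
        "cat.*dog",
        "rat.*count",
        "computer.*calculator",
        "computer.*device.*v2",
    ]
    # A tiny max_len is used here only to force a split with four short rules.
    props, transforms = render_conf(
        chunk_alternatives(rules, max_len=43), "my_sourcetype", "sample1"
    )
    print(props)
    print(transforms)
```

The greedy packing keeps the number of stanzas small, but it does nothing to reduce the matching cost of each regex.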

I would also argue that in this specific case a different approach should be used, since a regex this large will cause high CPU overhead during ingestion, especially if the source is high-volume.
