Splunk Transforms REGEX Wildcard Help

Contributor

We are routing events to "some_other_index" based on their source during parsing.

Events from part of the source go to "originalindex", which is set in inputs.conf, and the rest go to "some_other_index".

props.conf
    [source::some_part_of_source]
    TRANSFORMS-index_routing = route_to_some_other_index

transforms.conf
    [route_to_some_other_index]
    REGEX = .
    DEST_KEY = _MetaData:Index
    FORMAT = some_other_index

We receive a large number of events per second, and we are concerned that this transform is causing a delay in indexing (we are seeing indexing lag).

Now the query I have is:

a) REGEX = .
b) REGEX = (.)
c) REGEX = .*
d) REGEX = .*?
e) REGEX = ^.

Do all of the above REGEX values behave the same, or is one of them better than the others in a way that could speed up the transform and reduce the indexing lag?

Re: Splunk Transforms REGEX Wildcard Help

SplunkTrust

You can paste one of your sample log events into https://regex101.com/ and test which regex runs fastest, with the fewest steps. In addition to the four you listed, I would try REGEX = ^. as well.

Re: Splunk Transforms REGEX Wildcard Help

New Member

Given the combined list:

  1. REGEX = .
  2. REGEX = (.)
  3. REGEX = .*
  4. REGEX = .*?
  5. REGEX = ^.

I'd expect 1 and 5 to perform very similarly and to be the best choices. 2 requires the regex engine to create a capture group, which you don't appear to need. With 3, the regex engine may, depending on how well it optimizes, end up scanning all the characters in the event. 4 is lazy, so it should stop as early as 1 does (it can even match zero characters), but the regex engine has to take an extra step to get there.
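The comparison above can be checked empirically. The following is a minimal sketch that uses Python's `re` module as a stand-in; Splunk uses a PCRE-based engine, so absolute timings will differ and only the relative behavior is suggestive. The sample event and the helper names `matched_text` and `time_pattern` are assumptions made for illustration.

```python
import re
import timeit

# Hypothetical sample event; substitute a real line from your own data.
SAMPLE_EVENT = "2020-01-01T00:00:00Z host=web01 action=login status=ok " + "x" * 500

# The five candidate patterns from the question.
PATTERNS = {
    "1": r".",
    "2": r"(.)",
    "3": r".*",
    "4": r".*?",
    "5": r"^.",
}

def matched_text(pattern, event=SAMPLE_EVENT):
    """Return the text consumed by the first match, or None if there is no match."""
    m = re.compile(pattern).search(event)
    return None if m is None else m.group(0)

def time_pattern(pattern, event=SAMPLE_EVENT, number=20_000):
    """Return the seconds taken by `number` searches with a precompiled pattern."""
    compiled = re.compile(pattern)
    return timeit.timeit(lambda: compiled.search(event), number=number)

if __name__ == "__main__":
    for label, pattern in PATTERNS.items():
        consumed = matched_text(pattern)
        print(f"{label}: {pattern!r:8} consumed {len(consumed):3d} chars, "
              f"{time_pattern(pattern):.4f}s for 20k searches")
```

On a single-line event, patterns 1, 2 and 5 consume one character, pattern 3 consumes the whole line, and pattern 4 consumes nothing. The timings are best treated as a rough guide rather than a prediction of indexer behavior.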

Re: Splunk Transforms REGEX Wildcard Help

SplunkTrust

What does your inputs.conf entry look like for this? The best scenario would be to split the input stanza for this source from the original one and assign the index in inputs.conf (on the forwarder), which avoids index-time routing to a different index altogether.

Re: Splunk Transforms REGEX Wildcard Help

Contributor

The inputs come from a Google Pub/Sub queue, so I cannot assign both the original index and the other index in inputs.conf.

Re: Splunk Transforms REGEX Wildcard Help

SplunkTrust

Any specific reason to separate them out by indexes?

Re: Splunk Transforms REGEX Wildcard Help

Contributor

@somesoni2 I am afraid ^. does not MATCH ALL in https://regex101.com
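A note on what "match" means here: regex101 highlights only the text a pattern consumes, so ^. shows a single highlighted character rather than the whole event. For a routing transform with a fixed FORMAT, what should matter is whether the pattern matches at all, not how much text it covers. A minimal sketch in Python (whose `re` module is not the engine Splunk uses), with hypothetical events and a hypothetical helper `first_match_span`:

```python
import re

def first_match_span(regex, event):
    """Return the (start, end) span of the first match, or None if there is none."""
    m = re.search(regex, event)
    return m.span() if m else None

# Hypothetical one-line events, invented for illustration.
EVENTS = ["first sample event", '{"json": true}', " leading space"]

if __name__ == "__main__":
    for event in EVENTS:
        # Each non-empty event matches, but only its first character is consumed.
        print(repr(event), first_match_span(r"^.", event))
```

Every non-empty event yields a match covering just its first character, which is enough for the pattern to count as matching.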

Re: Splunk Transforms REGEX Wildcard Help

Ultra Champion

-- if this transforms is causing the delay in indexing..
I doubt that the choice of regex makes the difference. I would check the standard causes of indexing delay first.

Re: Splunk Transforms REGEX Wildcard Help

Engager

If you use any one of these REGEX values, you will redirect all events from your "source" to some_other_index. If you want to redirect only part of the source, your REGEX needs to include a keyword that appears only in the events you want to send to the other index. The best REGEX to match "all" with a single match is .* without any group.
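To illustrate the difference between a catch-all pattern and a keyword pattern, here is a small Python simulation of the routing decision. It is not Splunk's implementation; the `route` helper, the sample events, and the keyword `app=payment-service` are all invented for illustration.

```python
import re

DEFAULT_INDEX = "originalindex"     # index assigned in inputs.conf in this thread
OTHER_INDEX = "some_other_index"    # FORMAT value from the transforms.conf stanza

def route(event, regex, default_index=DEFAULT_INDEX, dest_index=OTHER_INDEX):
    """Return dest_index if regex matches anywhere in the event, else default_index."""
    return dest_index if re.search(regex, event) else default_index

# Hypothetical events; "payment-service" is an invented keyword.
EVENTS = [
    "2020-01-01 app=payment-service status=ok",
    "2020-01-01 app=web status=ok",
]

if __name__ == "__main__":
    # The catch-all pattern reroutes everything; the keyword pattern is selective.
    for regex in (r".", r"app=payment-service"):
        print(f"{regex!r}: {[route(e, regex) for e in EVENTS]}")
```

With the catch-all pattern both events land in some_other_index, while the keyword pattern leaves the second event in its original index.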
