Is there an alternative to using regex in my search for better performance?

sieutruc · ‎06-16-2016

Hello,

After reading some answers, I see that using regex to find events matching a pattern takes a lot of time, because Splunk has to read all events from disk.

For example, index=X email="test@*" is much faster than index=X | regex email="test@.*".

So my question is: besides *, can I use other regex terms in the base search, without the regex command, and get the same performance as the original search?

For example:
index=X email="test@[a-z]+.com" ?
index=X email="test@[0-9]*.com" ?

jmallorquin · ‎06-17-2016

Hi,

Have you tried extracting the parts of the email into separate fields, like field1@field2?

Then a search using these fields should be faster, like:

index=aaaa field1=test field2=google.com

Hope this helps.

sieutruc · ‎06-17-2016

That does not really answer my question: if field1 needs to match a regular expression rather than just the "*" wildcard, you still have to use the regex command, and Splunk reads all events for the comparison. I think the temporary solution may be the hybrid approach from @somesoni2's answer.

jmallorquin · ‎06-20-2016

Ok regards 🙂

jdunlea · ‎06-16-2016

From quickly scanning through some documentation, it seems that "regex" is actually a "distributed streaming" command, which means it can run on the indexers themselves, so you don't have to worry about inefficiencies with map-reduce.

However, to better structure your search you can provide all the "known search tokens" to your search and you could do something like this:

index=x test @ | regex email="test@.*"

What this does is pass the known tokens "test" and "@" to the indexer, which lets the indexer pull out only events with those two tokens anywhere in the event. THEN the "regex" does the specific pattern match. I don't think running regex on its own lets the indexer narrow the search to events that contain "test" and "@". It will have to scan ALL events first.

So the search above reduces inefficiencies because it puts everything that you know you need before the first pipe and then lets regex do the pattern matching afterwards.
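The two-stage idea above (cheap token lookup first, regex only on what survives) can be illustrated with a toy model. This is only a sketch in Python, not how Splunk is implemented: the sample events, the `tokenize` rule (split on anything that is not a letter or digit), and the `hybrid_search` helper are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical sample events, invented for this sketch.
events = [
    "user=alice email=test@example.com action=login",
    "user=bob email=bob@example.com action=login",
    "user=carol email=test@sample.org action=logout",
    "user=dave email=contest@example.com action=login",
]

def tokenize(event):
    # Assumption: a crude stand-in for index-time segmentation.
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", event) if t}

# Toy inverted index: token -> positions of the events containing it.
index = defaultdict(set)
for i, event in enumerate(events):
    for tok in tokenize(event):
        index[tok].add(i)

def hybrid_search(terms, pattern):
    """Stage 1: intersect index lookups for the known terms (at least one).
    Stage 2: run the regex only on the surviving candidates."""
    candidates = set.intersection(*(index.get(t.lower(), set()) for t in terms))
    rx = re.compile(pattern)
    hits = [events[i] for i in sorted(candidates) if rx.search(events[i])]
    return hits, len(candidates)

hits, scanned = hybrid_search(["test"], r"email=test@.*")
print(f"regex ran on {scanned} of {len(events)} events, {len(hits)} matched")
```

In this toy data the regex only has to look at two of the four events, because the lookup for the token "test" already excluded the others (including "contest", which is a different token).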

sieutruc · ‎06-17-2016

Your answer is only right for some specific cases. If I search for email="test.*hello@.*", the search with tokens like "test hello @" will return nothing.

jdunlea · ‎06-17-2016

Well, that is only true if "test" and "hello" are not individual tokens.

I.e., if I search as follows:

index=X test hello @ | regex email="test.*hello@.*"

This will NOT return any results IF the data you are looking for is something like

"testworldhello@something.com"

This is because you cannot search for "test" or "hello" on their own if they are just a part of a larger token (testworldhello).

The search above WILL return results if the data looks like:

"test.world-hello@something.com"

The main point I am trying to make is that to create better search efficiency you can provide as many actual tokens as you can, up front. Tokens are separated by things like dots, dashes, slashes, etc.

To see how tokens are identified and separated in Splunk you can research segmenters.conf which shows you how Splunk breaks out tokens in any event.
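The difference between the two example values can be sketched in a few lines of Python. This is a simplified model, not Splunk's actual segmenter: it assumes every non-alphanumeric character is a breaker, whereas the real rules are configured in segmenters.conf. The function names are hypothetical.

```python
import re

def tokens(event):
    # Assumption: split on anything that is not a letter or digit.
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", event) if t}

def has_all_terms(event, terms):
    """True only if every search term is a whole token of the event."""
    event_tokens = tokens(event)
    return all(term.lower() in event_tokens for term in terms)

glued = "testworldhello@something.com"
separated = "test.world-hello@something.com"

print(sorted(tokens(glued)))      # "test" and "hello" are buried in one token
print(sorted(tokens(separated)))  # "test" and "hello" are tokens of their own

print(has_all_terms(glued, ["test", "hello"]))      # False
print(has_all_terms(separated, ["test", "hello"]))  # True
```

Under this model, a base search for test hello can only find the second value, which is why the up-front terms have to be real tokens and not arbitrary substrings.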

sieutruc · ‎06-17-2016

I got your point, but that is the reason I asked this question: I want to know if Splunk supports more than the asterisk wildcard in the base search. Thank you anyway.

jdunlea · ‎06-17-2016

Ah I see. Looks like you got your answer above! Good luck! 🙂

somesoni2 · ‎06-16-2016

Regular expressions are not supported in the base search (only the asterisk wildcard, *). I would suggest adding some filters in the base search using wildcards and then using regex to do the precise filtering (a hybrid of both types of filter).

sieutruc · ‎06-17-2016

Thanks for your reply. So only * is supported in the base search (I cannot use ?, [0-9], or [a-z]), is that right?
I ask because I could not find where the Splunk docs list all the expressions that can be used in the base search.

somesoni2 · ‎06-17-2016

The base search provides all the options the "| search" command provides (they are actually the same; the search command is implicit at the start of the base search). It basically uses logical expressions (not regular expressions). See more info here:
http://docs.splunk.com/Documentation/Splunk/6.4.1/SearchReference/Search#Usage
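To make the "only * is a wildcard" point concrete, here is a small Python model of that matching rule. It is an illustration under the assumption stated in this thread (every character except * is compared literally), not Splunk's own code, and `wildcard_match` is a hypothetical helper.

```python
import re

def wildcard_to_regex(term):
    # Escape every piece literally; only '*' becomes "match anything".
    parts = [re.escape(piece) for piece in term.split("*")]
    return re.compile("^" + ".*".join(parts) + "$")

def wildcard_match(term, value):
    return wildcard_to_regex(term).match(value) is not None

# '*' behaves as a wildcard.
print(wildcard_match("test@*", "test@example.com"))          # True

# '[a-z]+' is not a character class here, just literal characters.
print(wildcard_match("test@[a-z]+.com", "test@abc.com"))     # False
print(wildcard_match("test@[a-z]+.com", "test@[a-z]+.com"))  # True
```

So in this model a term like email="test@[a-z]+.com" would only match a value that literally contains the brackets and the plus sign, which is why the finer pattern has to go into a later regex step.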

sieutruc · ‎06-17-2016

Thank you for the information; now I know only * is supported. I hope Splunk will support more wildcards in a future version 🙂, for example "?" or "|".
