I have a very long regex with tens of thousands of characters (approx. 21k)

Path Finder

I have a regex with tens of thousands of characters (approx. 21k).
It is for event filtering, with a config like the one below:

Props.conf
[source::udp:514]
TRANSFORMS-filter&route_syslog = setnull, ip_interface, ip_xxx

Transforms.conf
[ip_interface]
REGEX = (around 21k characters)
DEST_KEY = _SYSLOG_ROUTING
FORMAT = TargetGroup

[ip_xxx]
REGEX = (around hundreds characters)
DEST_KEY = _SYSLOG_ROUTING
FORMAT = TargetGroup

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

I found that the filtering is not working as it is supposed to: some other data is also routed.

Is there a problem with a limit?
Where can I adjust the limit?
Maybe I have to break the transform down into a few filter groups (ip_interface1, ip_interface2, ...)?

In short: I am asking whether there is a limit on the regex, such as a maximum total number of characters allowed.
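
The idea of breaking the transform into several filter groups can be sketched in Python as below. This is only an illustration under assumptions: `chunk_alternatives`, `render_stanzas`, and the `max_chars` value are hypothetical names and numbers, not Splunk settings, and this thread does not establish that a character limit exists at all. The stanza names follow the ip_interface1, ip_interface2 naming mentioned above.

```python
def chunk_alternatives(alternatives, max_chars):
    """Group regex alternatives so each joined pattern is at most max_chars long.

    A single alternative longer than max_chars still gets its own group.
    """
    groups, current, length = [], [], 0
    for alt in alternatives:
        # +1 accounts for the "|" separator when the group is not empty
        extra = len(alt) + (1 if current else 0)
        if current and length + extra > max_chars:
            groups.append("|".join(current))
            current, length = [], 0
            extra = len(alt)
        current.append(alt)
        length += extra
    if current:
        groups.append("|".join(current))
    return groups


def render_stanzas(groups, base_name="ip_interface"):
    """Render one transforms.conf-style stanza per group (numbered from 1)."""
    lines = []
    for i, pattern in enumerate(groups, start=1):
        lines += [
            f"[{base_name}{i}]",
            f"REGEX = {pattern}",
            "DEST_KEY = _SYSLOG_ROUTING",
            "FORMAT = TargetGroup",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder alternatives; max_chars=17 is an arbitrary illustrative value.
    groups = chunk_alternatives(["(A1.*A2)", "(B1.*B2)", "(C1.*C2)"], max_chars=17)
    print(render_stanzas(groups))
```

If the transform is split this way, every generated stanza name would also have to be listed on the TRANSFORMS line in props.conf.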

Legend

I suggest that you install some version of syslog to do this filtering and data segregation for you. Both syslog-ng and rsyslog have the ability to filter. If you have syslog doing the initial data capture from UDP:514, then you can also set it up to split the data into multiple files. Here is a snippet from the syslog-ng documentation.

"The destination filename may include macros which get expanded when the message is written, thus a simple file() driver may create several files: for example, syslog-ng OSE can store the messages of client hosts in a separate file for each host. "

If syslog-ng is writing to a set of files, then you get 2 advantages: first, syslog gives you buffering between the network port and Splunk. Second, in Splunk you can more easily specify the routing (or the host name, etc) on a file-by-file basis in inputs.conf and/or props.conf. This is much more efficient than processing the inputs event-by-event in transforms.conf. And you won't need a 21K regular expression.

Path Finder

I was asking whether there is a limit on the regex, such as a maximum total number of characters allowed.

Contributor

Hi,

Same question as @pyro_wood - that seems like a very, very long regex.

Could you add some background about what you're trying to achieve?

  • You have a UDP/514 input, so it sounds like you have devices sending Syslog to Splunk.
  • What is it that you want to happen next?
  • Are you then wanting to get Splunk to route these Syslog messages to different onward systems (Splunk and/or something else)?
  • Or are you trying to filter some messages out so that they don't get indexed?

If you could update the question with more about your scenario and requirements, I'm sure folks here will be able to suggest some alternative approaches.

Path Finder

What is it that you want to happen next?
Filter events and route the filtered events to the target group.

Are you then wanting to get Splunk to route these Syslog messages to different onward systems (Splunk and/or something else)?
Yes.

Or are you trying to filter some messages out so that they don't get indexed?
I am trying to filter selected events and route them.

If you could update the question with more about your scenario and requirements, I'm sure folks here will be able to suggest some alternative approaches.

I want to filter and route selected events. The rule is (specific IP AND specific interface name) only. That's it. And yes, the list is large and long.

There is no generic consolidated pattern in the filtering rules.
The regex formula is this: (A1.*A2)|(B1.*B2)|(C1.*C2)|............

Thanks,
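
One possible cause of "some other data is also routed" with an alternation of this shape is an unescaped, unanchored IP address: written literally, 10.1.1.1 also matches inside 10.1.1.10, because each dot matches any character and nothing stops the match at the end of the address. The Python sketch below builds the alternation from (IP, interface) pairs with escaping and digit boundaries. The helper name `build_pattern`, the sample pair, and the sample events are hypothetical. Python's `re` is used only to demonstrate the idea; Splunk uses a PCRE-style engine, so the generated pattern should be checked there before use.

```python
import re


def build_pattern(pairs):
    """Build an alternation of 'IP ... interface' rules with escaped literals."""
    alternatives = []
    for ip, interface in pairs:
        alternatives.append(
            r"(?<!\d)" + re.escape(ip) + r"(?!\d)"   # IP not embedded in a longer number
            + r".*"
            + re.escape(interface) + r"(?!\d)"       # Gi0/1 must not match Gi0/10
        )
    return "|".join("(?:" + alt + ")" for alt in alternatives)


if __name__ == "__main__":
    # Hypothetical rule and events, for illustration only.
    naive = "(10.1.1.1.*Gi0/1)"
    strict = build_pattern([("10.1.1.1", "Gi0/1")])
    events = [
        "host 10.1.1.1 Interface Gi0/1 down",
        "host 10.1.1.10 Interface Gi0/1 down",
        "host 10.1.1.1 Interface Gi0/10 down",
    ]
    for event in events:
        print(event, bool(re.search(naive, event)), bool(re.search(strict, event)))
```

The naive pattern matches all three sample events, while the escaped and bounded pattern matches only the first.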

SplunkTrust

Just one... one question.... why do you have a regex with 21k characters?

Path Finder

The regex is based on specific IP addresses and interface names.
