Hi Splunkers, I have a problem with a Per-Event Index Routing use case.
In the environment involved, some data is currently collected in an index named ot.
Here we have some logs that must be split and redirected to other indexes, following the naming convention ot_<technology>. The inputs.conf file involved is placed under a dedicated app, named simply customer_inputs.
The procedure to use is clear to us: inside the above app, we created props.conf and transforms.conf and worked with keys and regexes. The strange behavior is this: if we redirect one kind of log, it works perfectly. When we add another log subset, nothing works properly. Let me share an example.
Scenario 1
In this case, we want:
We can identify the logs involved by port; they arrive as a network input on UDP port 514, in CEF format.
First, our props.conf
[source::udp:514]
TRANSFORMS-ot_windows = windows_logs
Second, our transforms.conf
[windows_logs]
SOURCE_KEY = _raw
REGEX = <our_regex>
DEST_KEY = _MetaData:Index
FORMAT = ot_windows
This configuration works fine: Windows logs go to the ot_windows index, and all remaining ones still go to the ot index.
Then we tried another configuration, described in the second scenario.
Scenario 2
In this case, we want:
Again, we can identify the logs involved by port; they arrive as a network input on UDP port 514, in CEF format.
First, our props.conf
[source::udp:514]
TRANSFORMS-ot_nozomi = nozomi_logs
Second, our transforms.conf
[nozomi_logs]
SOURCE_KEY = _raw
REGEX = <our_second_regex>
DEST_KEY = _MetaData:Index
FORMAT = ot_nozomi
Again, this conf works fine: all Nozomi logs go to the dedicated index, ot_nozomi, while all remaining ones still go to the ot index.
ISSUE
So, if we set either of the above confs, we get the expected behavior. However, when we try to merge them, nothing works: both Windows and Nozomi logs continue to go to the ot index. Since they work fine individually, we suspect the error is not in the regexes used, but in how we perform the merge. Currently, our merged conf files look like this:
props.conf
[source::udp:514]
TRANSFORMS-ot_windows = windows_logs
TRANSFORMS-ot_nozomi = nozomi_logs
transforms.conf
[windows_logs]
SOURCE_KEY = _raw
REGEX = <our_regex>
DEST_KEY = _MetaData:Index
FORMAT = ot_windows
[nozomi_logs]
SOURCE_KEY = _raw
REGEX = <our_second_regex>
DEST_KEY = _MetaData:Index
FORMAT = ot_nozomi
Is our assumption right? If yes, what is the correct merge structure?
First and foremost - don't receive syslogs directly on your Splunk component (UF or HF/idx).
Its performance is sub-par, it doesn't capture reasonable metadata about the transport layer, it doesn't scale well - as you can see.
But if you're trying to do it anyway, the easier way would be to simply send your events to separate ports and associate specific sourcetypes and indexes with specific ports. That's much easier to handle than overwriting metadata per event like this.
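To illustrate the separate-ports approach, here is a minimal inputs.conf sketch. The port numbers (5514, 5515) and the sourcetype names are hypothetical placeholders, not values from the original setup, and the senders would have to be reconfigured to use the matching port:

```
# inputs.conf (sketch only; ports and sourcetype names are assumptions)

# Windows CEF events sent to their own UDP port
[udp://5514]
index = ot_windows
sourcetype = cef:windows

# Nozomi CEF events sent to their own UDP port
[udp://5515]
index = ot_nozomi
sourcetype = cef:nozomi
```

With this layout the index is assigned at input time, so no index-overwriting transforms are needed.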
But if you insist on doing it this way, the most important thing to remember when analyzing such configurations is that:
1. Transform classes are ordered in alphabetical order (so TRANSFORMS-my_transforms_a will be used _after_ TRANSFORMS-aaaa_my_transforms_zzz)
2. Transforms within a single transform class are used in a left to right order.
3. In a simple configuration (without some fancy ruleset-based entries), _all_ matching transforms are "executed" - the processing isn't stopped just because you've already overwritten some particular metadata field. If any subsequent transform should overwrite the field it will.
So in your configuration TRANSFORMS-ot_nozomi should be executed first, resulting in some events being redirected to the ot_nozomi index. Then Splunk would execute TRANSFORMS-ot_windows and redirect events to ot_windows (even if some of them have already been redirected to the ot_nozomi index - the destination index will get overwritten if they match both regexes).
It _should_ work this way. If it doesn't I'd check if there are no issues with either config file precedence or sourcetype/source/host precedence.
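If you want the ordering to be explicit rather than dependent on class names, one option is to put both transforms in a single class, relying on the left-to-right rule described above. This is only a sketch: the class name ot_routing is a made-up example, and the transforms.conf stanzas stay exactly as you already have them.

```
# props.conf (sketch; the class name "ot_routing" is hypothetical)
[source::udp:514]
# Applied left to right: nozomi_logs first, then windows_logs.
# An event matching both regexes ends up in the index set by the last match.
TRANSFORMS-ot_routing = nozomi_logs, windows_logs
```

This does not change what each transform does; it only makes the evaluation order visible in one line.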