I have some logs where there are events that are like this:
Apr 5 21:16:33 myhost001.company.com key=value key2=value2 LogType=RA-User
Apr 5 21:16:33 myhost002.company.com key=value key2=value2 LogType=RA-System
Apr 5 21:16:33 myhost002.company.com key=value key2=value2 LogType=RA-Admin
Apr 5 21:16:33 myhost001.company.com key=value key2=value2 LogType=RA-Junk
What I want to accomplish is, based on the LogType= string, have the events go to different indexes.
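The layout I'm trying for is like so:
LogType=RA-Admin or LogType=RA-System -> idx-system
LogType=RA-User -> idx-user
anything else (e.g. LogType=RA-Junk) -> idx-other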
This is what I have so far. I have a directory called /opt/mylogs, where rsyslog is writing filenames like: /opt/mylogs/myhost001.example.com.log
inputs.conf:
[monitor:///opt/mylogs]
index=idx-other
sourcetype=syslog
props.conf:
[host::myhost*]
TRANSFORMS-setIdx = set_idx_other,set_idx_system,set_idx_user
transforms.conf:
[set_idx_other]
DEST_KEY = _MetaData:Index
FORMAT = idx-other
[set_idx_system]
REGEX = LogType=RA-(Admin|System)
DEST_KEY = _MetaData:Index
FORMAT = idx-system
[set_idx_user]
REGEX = LogType=RA-User
DEST_KEY = _MetaData:Index
FORMAT = idx-user
The inputs.conf is on a standard Universal Forwarder, which is sending over to our indexers. All of our indexers have the props.conf/transforms.conf files deployed out to them.
My thought here is that the hostname extraction from the syslog sourcetype might be happening after these props/transforms rules take place, meaning my rules above in props.conf for a hostname would never match, but I don't have a way of confirming this.
Also, what's the precedence/processing order for the TRANSFORMS-stuff line in props.conf? My initial idea was that it would keep going down the list, modifying things if there was a match in the REGEX line in transforms. Based on that idea, I should put the "fallback" (set_idx_other) case first, followed by the more specific ones next to ensure everything at least gets some sort of match.
As of now, everything is ending up in the idx-other index.
Any ideas?
This is a great way to use indexes to restrict access to specific log types. You are close, but the TRANSFORMS are done in left to right order. However, since you already have an index defined, it is already the fallback, and anything that matches will be used instead. So, by omitting the set_idx_other, it should pick up the new index configuration. (This is off the top of my head - I haven't tested this.)
TRANSFORMS-setIdx = set_idx_system,set_idx_user
Other things to check:
- Does the REGEX in transforms.conf actually match the data correctly? no match == no index
- What does btool say?
Come find us on IRC (#splunk on efnet.org), Slack (apply at www.splunk402.com/chat ), or send me an email!
@Ricapar - Is there an accepted answer? What solution was implemented at your end?
I have the same scenario, where logs should go into different indexes and their access and retention are defined differently.
Your configuration (and logic/assumptions) look exactly correct to me. Have you deployed this to your Indexers and restarted all Splunk instances there? If so, be aware that only data that has been indexed after the restart will be correct (old events will stay wrong).
I think your hostname theory is correct. You could apply the transforms to the entire source or sourcetype and then also check for the host in the REGEX. If this test works, then you have confirmed your theory.
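For what it's worth, here is a rough, untested sketch of what that test could look like. It assumes the events really do arrive with sourcetype=syslog (as set in the inputs.conf above) and that the hostname appears in the raw event text the way it does in the sample lines; the host pattern in the REGEX is my own guess at your naming scheme, so adjust it to your data.

```ini
# props.conf (on the indexers)
# Keyed on the sourcetype instead of [host::myhost*], so the stanza
# does not depend on when the host field gets its final value.
[syslog]
TRANSFORMS-setIdx = set_idx_system,set_idx_user

# transforms.conf (on the indexers)
# REGEX is evaluated against _raw by default, so the hostname is
# matched in the event text itself. The host pattern is an assumption
# based on the sample events (myhost001.company.com and so on).
[set_idx_system]
REGEX = \smyhost\d+\.company\.com\s.*LogType=RA-(Admin|System)
DEST_KEY = _MetaData:Index
FORMAT = idx-system

[set_idx_user]
REGEX = \smyhost\d+\.company\.com\s.*LogType=RA-User
DEST_KEY = _MetaData:Index
FORMAT = idx-user
```

Events that match neither transform should simply keep the index assigned in inputs.conf (idx-other). If newly indexed events start landing in idx-system and idx-user with this in place, that would point to the host-based stanza as the part that was not matching.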
This type of partitioning is typically done by sourcetype or eventtype (or tag). Why are you overcomplicating your situation with a highly atypical approach?
Because as far as sourcetypes go, these are all the same log. It's all coming in via syslog to my syslog box, which is then forwarding them to Splunk.
A single host, in a single syslog file, is writing it all to the same file for the host.
However, the end user's requirement is that there be different permissions and retention times based on the "LogType" value that comes in on the log. I can't accomplish this if they're all in the same index, so we need to do some filtering based on the _raw content of the event to determine what index it should end up in.
Otherwise, yes, I would agree with your approach - we would simply just tag the events (or tell the user to search using the already search-time extracted LogType key/value on the log).
Access and Retention variations are excellent justifications, especially Retention.
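To make the access and retention point concrete, a per-index setup might look roughly like the sketch below. The index names come from the thread; the retention periods and role names are purely hypothetical placeholders, since the actual requirements were never stated here.

```ini
# indexes.conf (on the indexers)
# Retention is controlled per index via frozenTimePeriodInSecs.
# Both periods below are made-up examples, not values from this thread.
[idx-user]
homePath   = $SPLUNK_DB/idx-user/db
coldPath   = $SPLUNK_DB/idx-user/colddb
thawedPath = $SPLUNK_DB/idx-user/thaweddb
# 90 days (hypothetical)
frozenTimePeriodInSecs = 7776000

[idx-system]
homePath   = $SPLUNK_DB/idx-system/db
coldPath   = $SPLUNK_DB/idx-system/colddb
thawedPath = $SPLUNK_DB/idx-system/thaweddb
# 365 days (hypothetical)
frozenTimePeriodInSecs = 31536000

# authorize.conf (on the search head)
# Access is controlled per role via srchIndexesAllowed.
# Role names are hypothetical.
[role_ra_user_readers]
importRoles = user
srchIndexesAllowed = idx-user

[role_ra_system_readers]
importRoles = user
srchIndexesAllowed = idx-system
```

Neither of these is possible at the event level inside a single index, which is why routing on the LogType value at index time is a reasonable fit here.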