
How to dynamically set an index based on a string in events with a fallback option?

Ricapar
Communicator

I have some logs containing events like these:

Apr  5 21:16:33 myhost001.company.com  key=value key2=value2 LogType=RA-User
Apr  5 21:16:33 myhost002.company.com key=value key2=value2 LogType=RA-System
Apr  5 21:16:33 myhost002.company.com key=value key2=value2 LogType=RA-Admin
Apr  5 21:16:33 myhost001.company.com key=value key2=value2 LogType=RA-Junk

What I want to accomplish is, based on the LogType= string, have the events go to different indexes.

The layout I'm trying for is like so:

  • LogType=RA-User goes to index=idx-user
  • LogType=RA-System OR RA-Admin goes to index=idx-system
  • Any other LogType goes to index=idx-other

This is what I have so far. I have a directory called /opt/mylogs, where rsyslog writes one file per host, with names like: /opt/mylogs/myhost001.example.com.log

inputs.conf:

[monitor:///opt/mylogs]
index=idx-other
sourcetype=syslog

props.conf:

[host::myhost*]
TRANSFORMS-setIdx = set_idx_other,set_idx_system,set_idx_user

transforms.conf:

[set_idx_other]
DEST_KEY = _MetaData:Index
FORMAT = idx-other

[set_idx_system]
REGEX = LogType=RA-(Admin|System)
DEST_KEY = _MetaData:Index
FORMAT = idx-system

[set_idx_user]
REGEX = LogType=RA-User
DEST_KEY = _MetaData:Index
FORMAT = idx-user

The inputs.conf is on a standard Universal Forwarder, which is sending over to our indexers. All of our indexers have the props.conf/transforms.conf files deployed out to them.

My thought is that the hostname extraction from the syslog sourcetype might happen after these props/transforms rules are evaluated, meaning my host-based rules in props.conf would never match, but I don't have a way of confirming this.

Also, what's the precedence/processing order for the TRANSFORMS-stuff line in props.conf? My initial idea was that it would keep going down the list, modifying things if there was a match in the REGEX line in transforms. Based on that idea, I should put the "fallback" (set_idx_other) case first, followed by the more specific ones next to ensure everything at least gets some sort of match.

As of now, everything ends up in the idx-other index.

Any ideas?


alacercogitatus
SplunkTrust

This is a great way to use indexes to restrict access to specific log types. You are close. The transforms listed in a TRANSFORMS- line are applied in left-to-right order. However, since your input already defines an index (index=idx-other in inputs.conf), that index is already the fallback, and the index set by any transform that matches is used instead. So if you omit set_idx_other, events should still pick up the new index configuration. (This is off the top of my head; I haven't tested it.)

TRANSFORMS-setIdx = set_idx_system,set_idx_user
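
A minimal sketch of how that might look end to end, assuming the index in inputs.conf stays as the fallback. The stanza names are the ones from the question; the comments reflect a general understanding of index-time transforms and have not been tested against this deployment. One related observation: the original [set_idx_other] stanza has no REGEX line, and as far as I know an index-time transform needs one in order to apply, so a catch-all stanza would need something like REGEX = . if you wanted to keep it.

```ini
# props.conf (on the instances that parse the data, here the indexers)
[host::myhost*]
# Transforms run left to right; when several write the same DEST_KEY,
# the last one that matches wins.
TRANSFORMS-setIdx = set_idx_system,set_idx_user

# transforms.conf
[set_idx_system]
REGEX = LogType=RA-(Admin|System)
DEST_KEY = _MetaData:Index
FORMAT = idx-system

[set_idx_user]
REGEX = LogType=RA-User
DEST_KEY = _MetaData:Index
FORMAT = idx-user

# Events that match neither REGEX keep index=idx-other from inputs.conf.
```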

Other things to check:

  1. Are all of the indexes actually created?
  2. Does the REGEX in transforms.conf actually match the data? No match means the index is not overridden.
  3. What does btool say about the effective props.conf and transforms.conf?

Come find us on IRC (#splunk on efnet.org), Slack (apply at www.splunk402.com/chat ), or send me an email!


rashi83
Path Finder

@Ricapar - Is there an accepted answer? What solution did you implement on your end?

I have the same scenario, where logs should go to different indexes, each with its own access and retention settings.


woodcock
Esteemed Legend

Your configuration (and logic/assumptions) looks correct to me. Have you deployed this to your indexers and restarted all Splunk instances there? If so, be aware that only data indexed after the restart will be routed correctly (older events stay in the index they were originally written to).

I think your hostname theory is correct. You could apply the transforms to the entire source or sourcetype instead, and then also check for the host in the REGEX. If this test works, then you have confirmed your theory.
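
One way to run that test, sketched under assumptions: the props.conf stanza is keyed on the monitored path from the question instead of on host, and each REGEX also requires a myhost* hostname to appear in the raw event. The exact patterns are hypothetical and would need to be checked against real events.

```ini
# props.conf
# Match on source rather than host, so the result does not depend on
# when the syslog host extraction happens.
[source::/opt/mylogs/*.log]
TRANSFORMS-setIdx = set_idx_system,set_idx_user

# transforms.conf
[set_idx_system]
# Require a myhost* hostname somewhere before the LogType field in _raw.
REGEX = \smyhost\S+\s.*LogType=RA-(Admin|System)
DEST_KEY = _MetaData:Index
FORMAT = idx-system

[set_idx_user]
REGEX = \smyhost\S+\s.*LogType=RA-User
DEST_KEY = _MetaData:Index
FORMAT = idx-user
```

If events start landing in idx-system and idx-user with this in place, that would support the theory that the [host::myhost*] stanza never matched.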


woodcock
Esteemed Legend

This type of partitioning is typically done by sourcetype or eventtype (or tag). Why are you overcomplicating your situation with a highly atypical approach?


Ricapar
Communicator

Because as far as sourcetypes go, these are all the same log. It's all coming in via syslog to my syslog box, which then forwards it to Splunk.

A single host writes all of it to one syslog file for that host.

However, the end user's requirement is that there be different permissions and retention times based on the "LogType" value that comes in on the log. I can't accomplish this if they're all in the same index, so we need to do some filtering based on the _raw content of the event to determine what index it should end up in.

Otherwise, yes, I would agree with your approach - we would simply tag the events (or tell the user to search using the LogType key/value that is already extracted at search time).
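
For illustration, the per-index retention and permissions described above could be sketched roughly as follows. The index names come from the question; the retention periods and the role name are hypothetical placeholders, not values from this deployment.

```ini
# indexes.conf (on the indexers)
[idx-user]
homePath   = $SPLUNK_DB/idx-user/db
coldPath   = $SPLUNK_DB/idx-user/colddb
thawedPath = $SPLUNK_DB/idx-user/thaweddb
# Hypothetical retention: 90 days (90 * 86400 seconds)
frozenTimePeriodInSecs = 7776000

[idx-system]
homePath   = $SPLUNK_DB/idx-system/db
coldPath   = $SPLUNK_DB/idx-system/colddb
thawedPath = $SPLUNK_DB/idx-system/thaweddb
# Hypothetical retention: 365 days (365 * 86400 seconds)
frozenTimePeriodInSecs = 31536000

# authorize.conf (on the search heads)
# Hypothetical role that may search only the user index.
[role_ra_user_readers]
srchIndexesAllowed = idx-user
srchIndexesDefault = idx-user
```

Because retention and search access are both set per index, events have to be routed to separate indexes at index time for settings like these to apply per LogType.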


woodcock
Esteemed Legend

Differences in access and retention are excellent justifications, especially retention.
