Getting Data In

Change index metadata dynamically during ingestion

iam_ironman
Explorer

Hi all,

I'm trying to see if logs can be sent to different indexes at index time, depending on a regex. Is it possible to route logs to an index whose name is part of the Source metadata?

Below are my props.conf and transforms.conf

  • props.conf:

    [test:logs]
    TRANSFORMS-new_index = new_index

  • transforms.conf:

    [new_index]
    SOURCE_KEY = MetaData:Source
    REGEX = (?<index>\w+)\-\d+
    # This needs to be dynamic
    FORMAT = $1
    DEST_KEY = _MetaData:Index

Thanks in advance.


iam_ironman
Explorer

What I meant by "dynamic" is that the index value should be whatever the regex captures, used via FORMAT. I know I can use a static value, but I wanted to confirm whether it is possible to use the regex capture to dynamically pick the correct index, which is part of the Source.

Examples of sources: phone-1234, tablet-23456, pc-45623, pc-79954

[new_index]
SOURCE_KEY = MetaData:Source
REGEX = (\w+)\-\d+
# This needs to be either phone, tablet, pc, etc. - I don't want to make it static
FORMAT = $1
DEST_KEY = _MetaData:Index
WRITE_META = true


gcusello
SplunkTrust

Hi @iam_ironman ,

does it work this way?

Ciao.

Giuseppe


iam_ironman
Explorer

Haven't tried it yet, but I wanted to confirm whether it would work before building a POC.

Thanks.


gcusello
SplunkTrust

Hi @iam_ironman ,

I usually use this configuration.

Ciao.

Giuseppe

PickleRick
SplunkTrust

Yes, that's exactly what that is for. Still, consider what @gcusello already said - multiplying indexes is not always a good practice. There are different mechanisms for data "separation" depending on your use case.

Unless you need

- different access permissions

- different retention periods

or you have significantly different data characteristics (cardinality, volume, and "sparsity"), you should leave the data in the same index and limit your searches by adding conditions.
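
For example, a minimal sketch of that search-time approach, assuming the data stays in one shared index (here a hypothetical index named devices) and the device type is still recoverable from the source values given above:

    index=devices source=phone-*
    | stats count by source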

gcusello
SplunkTrust

Hi @iam_ironman ,

only one question: why?

Indexes aren't database tables; they are containers where logs are stored, and log categorization is done with the sourcetype field.

Custom indexes are mainly created when there are different requirements for retention and access permissions, and secondarily for different log volumes.

So why do you want to create so many indexes, which you'll have to maintain and which, after the retention period, will sit empty?

Anyway, the regex you used is wrong: you don't need to extract an index field to assign a dynamic value to it; you just have to capture a group and use it for the index value:

[new_index]
# read the source metadata field
SOURCE_KEY = MetaData:Source
# capture the prefix before "-<digits>" (e.g. "phone" from "phone-1234")
REGEX = ^(\w+)\-\d+
# use the captured group as the destination index name
FORMAT = $1
DEST_KEY = _MetaData:Index
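
One caveat worth adding: the destination indexes (phone, tablet, pc) must already exist on the indexers, otherwise the routed events are dropped (unless a lastChanceIndex is configured in indexes.conf). A minimal indexes.conf sketch for one of the example targets, with illustrative paths:

[phone]
homePath   = $SPLUNK_DB/phone/db
coldPath   = $SPLUNK_DB/phone/colddb
thawedPath = $SPLUNK_DB/phone/thawedb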

Ciao.

Giuseppe

PickleRick
SplunkTrust

While the general question is of course valid and needs to be considered properly, I have seen similar cases in my experience - splitting data from a single source into separate indexes.

The most typical case is when you have a single solution providing logs for separate business entities (like a central security appliance protecting multiple divisions or even companies from a single business group).

You might want to split events so that each unit has access only to its own events (possibly with some overseeing security team having access to all those indexes).
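
As a sketch of what that access split could look like in authorize.conf (hypothetical role names, reusing the phone/tablet/pc indexes from this thread):

[role_phone_team]
srchIndexesAllowed = phone
srchIndexesDefault = phone

[role_security_oversight]
srchIndexesAllowed = phone;tablet;pc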

So there are valid use cases for similar setups 🙂

PickleRick
SplunkTrust

What do you mean by "dynamic" here?

Also, you might need WRITE_META = true

Also also, you might want to use ingest actions.

 
