Splunk Cloud Platform

Routing log data to different indexes based on the source

dj064
Explorer

I have been working on routing logs into different indexes based on their source. I configured the props.conf and transforms.conf below on my HF, but it didn't work. We currently follow this naming convention for our CloudWatch log group names:

/starflow-app-logs-<platform-name>/<team-id>/<app-name>/<app-environment-name>

--------------------------------------------------------------------------
Example sources:
--------------------------------------------------------------------------

us-east-1:/starflow-app-logs/sandbox/test/prod
us-east-1:/starflow-app-logs-dev/sandbox/test/dev
us-east-1:/starflow-app-logs-stage/sandbox/test/stage

Note: We are currently receiving log data for the above use case from the us-east-1 region.

--------------------------------------------------------------------------
Condition:
--------------------------------------------------------------------------
Logs whose source path contains a <team-id> should be routed to the matching <team-id>-based index, which already exists in our Splunk environment.
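For instance, with the example sources above (where the <team-id> segment is "sandbox", and assuming the target index is simply named after the <team-id>), all three would land in the same index:

us-east-1:/starflow-app-logs/sandbox/test/prod        ->  index "sandbox"
us-east-1:/starflow-app-logs-dev/sandbox/test/dev     ->  index "sandbox"
us-east-1:/starflow-app-logs-stage/sandbox/test/stage ->  index "sandbox"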

--------------------------------------------------------------------------
props.conf
--------------------------------------------------------------------------
[source::us-east-1:/starflow-app-logs*]
TRANSFORMS-set_starflow_logging = new_sourcetype, route_to_teamid_index

--------------------------------------------------------------------------
transforms.conf
--------------------------------------------------------------------------
[new_sourcetype]
REGEX = .*
SOURCE_KEY = source
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::aws:kinesis:starflow
WRITE_META = true

[route_to_teamid_index]
REGEX = us-east-1:\/starflow-app-logs(?:-[a-z]+)?\/([a-zA-Z0-9]+)\/
SOURCE_KEY = source
FORMAT = index::$1
DEST_KEY = _MetaData:Index




I’d be grateful for any feedback or suggestions to improve this configuration. Thanks in advance!


PickleRick
SplunkTrust
From the props.conf spec:

*   matches anything but the path separator 0 or more times.
    The path separator is '/' on unix, or '\' on Windows.
    Intended to match a partial or complete directory or filename.

So for your props.conf stanza you should rather use ... which, per the same spec:

... recurses through directories until the match is met
    or equivalently, matches any number of characters.

 

dj064
Explorer

Thanks @PickleRick for the suggestion. Shall I use the config below?

[source::.../starflow-app-logs*/...]


PickleRick
SplunkTrust

That's one of the options, but the "*/..." part makes no sense. It's enough to just use ...

dj064
Explorer

Hi @PickleRick, it is still not working.


PickleRick
SplunkTrust

First things first.

1. Just for the sake of completeness - are the logs ingested by inputs on this HF, not forwarded from a remote instance?

2. To debug one thing at a time, I'd start with something foolproof like a simple SEDCMD that adds a single letter to each event, attached to the same source. That way you're not left wondering whether the props part is wrong or the transform itself. Once you've confirmed the props entry is OK because your SEDCMD is actually getting applied, move on to debugging your index overwriting.
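As a sketch of that debugging step (the SEDCMD class name "debug_marker" and the marker letter "X" are arbitrary choices here), a temporary props.conf stanza could look like:

-----------------------------------------------------------------------------
props.conf (temporary, debugging only)
-----------------------------------------------------------------------------
#prepend "X" to every event matching the source stanza
[source::.../starflow-app-logs...]
SEDCMD-debug_marker = s/^/X/

If newly ingested events arrive with a leading "X", the source stanza matches and any transforms attached to it are actually being called.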

dj064
Explorer

@PickleRick Thanks for the suggestion.

I made the following changes in transforms.conf:

1) For [new_sourcetype]
- Removed SOURCE_KEY = source

2) For [route_to_teamid_index]
- Updated the regex
- Set WRITE_META = true

After these changes, the sourcetype value successfully changed to "aws:kinesis:starflow", but the data did not route to the specified index. Instead, it went to the default index.

current configs:
-----------------------------------------------------------------------------
props
-----------------------------------------------------------------------------
#custom-props-for-starflow-logs
[source::.../starflow-app-logs...]
TRANSFORMS-set_new_sourcetype = new_sourcetype
TRANSFORMS-set_route_to_teamid_index = route_to_teamid_index

-----------------------------------------------------------------------------
transforms
-----------------------------------------------------------------------------
#custom-transforms-for-starflow-logs
[new_sourcetype]
REGEX = .*
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::aws:kinesis:starflow
WRITE_META = true

[route_to_teamid_index]
REGEX = .*\/starflow-app-logs(?:-[a-z]+)?\/([a-zA-Z0-9]+)\/
SOURCE_KEY = source
FORMAT = index::$1
DEST_KEY = _MetaData:Index
WRITE_META = true

I'm confident that both my props.conf and [new_sourcetype] stanza in transforms.conf are functioning correctly. The only issue seems to be with [route_to_teamid_index].


PickleRick
SplunkTrust

Try

FORMAT = $1
DEST_KEY = _MetaData:Index

dj064
Explorer

Hi @PickleRick, Thank you for your suggestions. 
 
After following your suggestions, the configuration is now working correctly for my use case. Here are the changes I made to the [route_to_teamid_index] stanza in transforms.conf:

1) For [route_to_teamid_index]
- Set FORMAT = $1
- Updated SOURCE_KEY = MetaData:Source

Current working configs for my use cases:
-----------------------------------------------------------------------------
props
-----------------------------------------------------------------------------
#custom-props-for-starflow-logs
[source::.../starflow-app-logs...]
TRANSFORMS-set_new_sourcetype = new_sourcetype
TRANSFORMS-set_route_to_teamid_index = route_to_teamid_index

-----------------------------------------------------------------------------
transforms
-----------------------------------------------------------------------------
#custom-transforms-for-starflow-logs
[new_sourcetype]
REGEX = .*
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::aws:kinesis:starflow
WRITE_META = true

[route_to_teamid_index]
REGEX = .*\/starflow-app-logs(?:-[a-z]+)?\/([a-zA-Z0-9]+)\/
SOURCE_KEY = MetaData:Source
FORMAT = $1
DEST_KEY = _MetaData:Index
WRITE_META = true

Previously, the configuration had SOURCE_KEY = source, which was causing the issue. The SOURCE_KEY = <field> setting tells Splunk which KEY the regex should be applied to. For index-time transforms the KEY names are case-sensitive and must be written exactly as they appear in the KEYs list, so the correct value is MetaData:Source, not source. After spending time reading through transforms.conf, I noticed that the global settings call this out specifically:

SOURCE_KEY = <string>
* NOTE: This setting is valid for both index-time and search-time field
  extractions.
* Optional. Defines the KEY that Splunk software applies the REGEX to.
* For search time extractions, you can use this setting to extract one or
  more values from the values of another field. You can use any field that
  is available at the time of the execution of this field extraction
* For index-time extractions use the KEYs described at the bottom of this
  file.
  * KEYs are case-sensitive, and should be used exactly as they appear in
    the KEYs list at the bottom of this file. (For example, you would say
    SOURCE_KEY = MetaData:Host, *not* SOURCE_KEY = metadata:host .)

Keys 

MetaData:Source     : The source associated with the event. 

 
Thank you sincerely for all of your genuine help!


PickleRick
SplunkTrust

Ahhh... the SOURCE_KEY part I missed 🙂  Good catch!

marnall
Motivator

Is that commented line " #to exract <team-id> from source" on the same line as the regex in your transforms.conf? If so, it should be on a separate line; otherwise Splunk will consider it part of the regex.
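For reference, an inline comment becomes part of the setting's value, so a hypothetical stanza like the one below would never match, because the trailing "#to extract..." text is treated as part of the regex:

#WRONG: the trailing comment is parsed as part of the REGEX value
[route_to_teamid_index]
REGEX = .*\/starflow-app-logs(?:-[a-z]+)?\/([a-zA-Z0-9]+)\/ #to extract <team-id> from source

Comments in .conf files must start at the beginning of their own line.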

dj064
Explorer

Nope, I only included it for clarity while writing this post; it’s not part of my actual configuration.

Note: I have removed that part from my post as well.
