Getting Data In

Writing regex for specific /var/log/*.log files

Na_Kang_Lim
Path Finder

The Splunk app for Linux already provides a stanza that collects all the .log files in the /var/log folder ([monitor:///var/log]). But what if I want to write specific regexes/transformations for a specific .log file, given its path?

For example, I want to apply transformations by writing specific stanzas in props.conf and transforms.conf for the files /var/log/abc/def.log and /var/log/abc/ghi.log.

How do I give both files the same sourcetype, "alphabet_log", and then write the regex transforms for them?

I also have a question regarding the Splunk docs.

The props.conf docs state:

For settings that are specified in multiple categories of matching [<spec>]
stanzas, [host::<host>] settings override [<sourcetype>] settings.
Additionally, [source::<source>] settings override both [host::<host>]
and [<sourcetype>] settings.

What does "override" mean here? Does it override everything, or are the stanzas combined so that only the duplicated settings are overridden?

1 Solution

PickleRick
SplunkTrust

1. The add-on for *nix does contain some questionable items. They are good for demonstrating functionality but not necessarily for production use. I definitely wouldn't just bulk ingest everything under /var/log with one sourcetype. You might just disable the "global" /var/log monitor stanza and ingest each needed file/dir with its own sourcetype. That's easier than pulling in all files at once and overwriting the sourcetype later (like @PrewinThomas showed). But you can also do it by overriding the settings applied at input.

2. For the sourcetype assignment, you can override the sourcetype set at input level by defining a props.conf source:: stanza with a sourcetype assignment.

If you create an entry in inputs.conf

[monitor:///var/log]
sourcetype=s1

but add to props.conf

[source::.../var/log/apache/access.log]
sourcetype=s2

All files from /var/log will be ingested with sourcetype s1, except for access.log, which will get sourcetype s2.
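
Applied to the two files from the question, the same approach might look like the sketch below. The paths and the sourcetype name alphabet_log come from the original post; leaving the existing [monitor:///var/log] input untouched and placing this props.conf on the instance that runs the monitor input are assumptions.

```ini
# props.conf (sketch; assumed to live on the instance that monitors /var/log)
# Reassign the sourcetype for the two files named in the question.
[source::/var/log/abc/def.log]
sourcetype = alphabet_log

[source::/var/log/abc/ghi.log]
sourcetype = alphabet_log
```

Every other file under /var/log would keep whatever sourcetype the monitor stanza assigns.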

3. "Override" in the quote you posted means that if the same config item is set in different stanzas or config files, one of them takes precedence over the others and only that one is used. For example, if you have a general sourcetype setting

[s2]
LINE_BREAKER=(\r\n)+

and a source-specific one

[source::.../var/log/apache/access.log]
LINE_BREAKER=(##)

For any s2-sourcetyped file, the line breaker will be assigned according to the general sourcetype rule, except for access.log, which will be line-broken according to the source-specific setting, since that one takes precedence.

PrewinThomas
Builder

To onboard logs from /var/log/abc/def.log and /var/log/abc/ghi.log, you can add the following to inputs.conf:

[monitor:///var/log/abc/def.log]
sourcetype = alphabet_log

[monitor:///var/log/abc/ghi.log]
sourcetype = alphabet_log


And in props.conf:
[source::/var/log/abc/def.log]
TRANSFORMS-apply_def = def_log_transform

[source::/var/log/abc/ghi.log]
TRANSFORMS-apply_ghi = ghi_log_transform

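And in transforms.conf:
[def_log_transform]
# Placeholder regex; capture group 1 supplies the field value.
REGEX = your_regex_for_def_log
FORMAT = field_name::$1
# Index-time fields are written with WRITE_META; FIELD_NAME is not a valid DEST_KEY.
WRITE_META = true

[ghi_log_transform]
REGEX = your_regex_for_ghi_log
FORMAT = another_field::$1
WRITE_META = true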


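Override means that if the same setting (e.g., TRANSFORMS-xyz) is present in both a [source::...] stanza and a [<sourcetype>] stanza, the value from [source::...] is used instead of the one from [<sourcetype>]. Settings that appear in only one of the stanzas are not affected and still apply.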
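
A minimal sketch of how this plays out, assuming the alphabet_log sourcetype from the question. The transform names alphabet_common_transform and alphabet_time_transform are hypothetical and would need matching stanzas in transforms.conf.

```ini
# props.conf (sketch; the alphabet_* transform names are hypothetical)
[alphabet_log]
TRANSFORMS-apply = alphabet_common_transform
TRANSFORMS-time = alphabet_time_transform

[source::/var/log/abc/def.log]
# Same setting name as in [alphabet_log], so this value wins for def.log.
TRANSFORMS-apply = def_log_transform
```

With these stanzas, def.log would use def_log_transform for TRANSFORMS-apply, while TRANSFORMS-time, which is set only in [alphabet_log], would still apply to it. ghi.log would get both sourcetype-level values.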

 

Regards,
Prewin
Splunk Enthusiast | Always happy to help! If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!

Na_Kang_Lim
Path Finder

So the stanza

[monitor:///var/log/abc/def.log]
sourcetype = alphabet_log

will take precedence over the already present [monitor:///var/log] stanza?

PrewinThomas
Builder

@Na_Kang_Lim
What we highlighted above was about props.conf and transforms.conf. As for inputs.conf:

Splunk generally applies the most specific [monitor://] stanza that matches a file.
If both stanzas exist, the more specific path (/var/log/abc/def.log) overrides the more general one (/var/log).

So, to answer your question, def.log will use sourcetype = alphabet_log, even if [monitor:///var/log] has a different sourcetype or other settings.
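
Put together, the inputs.conf discussed in this thread might look like the following sketch. The sourcetype s1 on the general stanza is a placeholder borrowed from the example earlier in the thread; the add-on's actual stanza may set something different.

```ini
# inputs.conf (sketch)
# General stanza, standing in for the one shipped with the add-on.
[monitor:///var/log]
sourcetype = s1

# More specific stanza; per the answer above, its settings apply to def.log.
[monitor:///var/log/abc/def.log]
sourcetype = alphabet_log
```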
