Indexing hostname by segment issue

cvajs
Contributor

v4.3.1 on SLES Linux
I have a source that is a file in a dynamic path, and the source is configured to use segment #4 of the path as the hostname for the indexed events.

/logs/syslog/linux/.../log
The real path is /logs/syslog/linux/$HOSTNAME/$YEAR/$MONTH/log
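
To make the segment numbering concrete, here is a minimal Python sketch of the counting, assuming segments are numbered from 1 after the leading slash. It is only an illustration, not Splunk's code: `host_from_segment` is a hypothetical helper, and the year and month directories in the sample path are made-up values.

```python
def host_from_segment(path: str, segment: int) -> str:
    """Return the 1-based path segment, ignoring empty parts from slashes."""
    parts = [p for p in path.split("/") if p]
    if not 1 <= segment <= len(parts):
        raise ValueError(f"path has no segment {segment}: {path}")
    return parts[segment - 1]


# Illustrative path only: the host name comes from the sample event in this
# thread, the year/month directories are assumed for the example.
example_path = "/logs/syslog/linux/host01x03/2012/04/log"
print(host_from_segment(example_path, 4))  # host01x03
```

With this numbering, segment 4 is the $HOSTNAME directory, which is why host values such as "Tue" cannot be coming from the path.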

I went to Search App > Dashboards & Views > Summary and looked at the Hosts list. Oddly, the list includes hosts named after abbreviated weekdays: "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun". I don't have any hosts or paths with these names. In this case the events all come from the same real host, one of my Linux boxes. The event dates match the weekday: for example, host=Tue has events dated 4/3/2012 and 3/27/2012, host=Mon has 4/2 and 3/26, etc.

So where and why is Splunk indexing events as host=Tue or host=Mon?

BobM
Builder

Please can you tell us the sourcetype and paste a couple of lines of the log.

I suspect it is similar enough to syslog that Splunk is trying to extract the host field from the data, but where syslog would normally have a host, your data has the weekday. If so, setting the sourcetype to something else should fix it.

Bob

cvajs
Contributor

I changed my syslog template to ("$WEEKDAY $DATE $TZ $HOST $FACILITY [$LEVEL] $MSG\n")

BobM
Builder

oops. That last line should have been

TRANSFORMS =

cvajs
Contributor

I added those two lines to /opt/splunk/etc/system/local/props.conf and bounced the service, but it is still logging host=Tue.

cvajs
Contributor

Perhaps you explained why I see what I see, but it is completely confusing: the source page gives the option to define the hostname by path segment, and offers no explanation or option to override the default "syslog" behavior as you describe it. This falls into my book of kludginess.

BobM
Builder

This is the default action for syslog data. It overwrites the host with whatever it finds after the timestamp. You can get round this by changing the sourcetype to something else or turning off the default processing for syslog data.
To stop the default processing add the following two lines to a local/props.conf file.

[syslog]

TRANSFORMS

cvajs
Contributor

Maybe the issue is my defined path of /logs/syslog/linux/.../log
Perhaps Splunk is not expanding this before assigning the hostname from the path, so it tries to extract it from the raw data? If so, this is not documented. I would expect it to expand the path before extracting the hostname from the path of the log file it reads. I rely on my syslog-ng config to store host data in the correct location regardless of how the raw data is formatted: the raw data may have the wrong hostname, but syslog-ng puts it in the correctly defined path. And why just this one Linux host and not all my data?

cvajs
Contributor

OK, the data is syslog data and it does have the abbreviated weekday as a field. However, that is not how the data source is configured: the hostname is supposed to be assigned by path segment. So perhaps this is a bug in Splunk?

Specify which segment of the source path to set as the Host field.
For example: 3 (sets to 'hostname' for the path /var/log/hostname/)

My syslog-ng data gets written as template("$DATE $TZ $WEEKDAY $ISODATE $HOST $FACILITY [$LEVEL] $MSG\n")

example raw syslog entry:
Apr 1 00:10:01 -04:00 Sun 2012-04-01T00:10:01-04:00 host01x03 cron [info] crond[21399]: (root) CMD (/usr/lib/sa/sa1 1 1)
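
A rough way to see how the weekday can end up as the host: the Python sketch below assumes the default syslog host extraction behaves roughly like "find a time ending in :NN, then take the next word-like token of three or more characters". The regex is a simplified approximation written for this illustration, not necessarily the pattern Splunk ships, and `guess_syslog_host` is a hypothetical helper name.

```python
import re
from typing import Optional

# Simplified, assumed approximation of a syslog-style host extraction:
# ":NN", whitespace, then a token that starts with a word character and is
# at least three characters long, followed by whitespace.
HOST_AFTER_TIME = re.compile(r":\d\d\s+\[?(\w[\w.\-]{2,})\]?\s")


def guess_syslog_host(raw_event: str) -> Optional[str]:
    """Return the token such a rule would pick as host, or None."""
    match = HOST_AFTER_TIME.search(raw_event)
    return match.group(1) if match else None


# Raw entry copied from the post above.
sample = (
    "Apr 1 00:10:01 -04:00 Sun 2012-04-01T00:10:01-04:00 host01x03 "
    "cron [info] crond[21399]: (root) CMD (/usr/lib/sa/sa1 1 1)"
)
print(guess_syslog_host(sample))  # Sun
```

Under this assumption, the token after 00:10:01 is "-04:00", which does not start with a word character, so the match moves on to the ":00" at the end of the timezone offset and captures "Sun" instead of host01x03.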
