Hello everyone! I am trying to extract the hostname from a syslog header and then trim that header. Is this technically possible?
My props.conf:
[my_sourcetype]
EXTRACT-host = my_regex_here
SEDCMD-strip_heading = my_regex_here
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
disabled = false
pulldown_type = true
This is not working. The field is extracted when I don't trim the header, but the extraction and the trimming do not work together.
There are several things going on here.
The EXTRACT directive creates a search-time extraction, so you can't use it to fetch data from a part of the event that you already discarded during ingestion.
Anyway, host is an indexed field (and one of the important default fields), so you should not try to overwrite it with search-time extractions.
Typically, the hostname can be overwritten by Splunk at index time with one of the standard transforms (see the transforms defined for the syslog sourcetype).
Thank you for your response! Extracting the host field itself is not required; extracting a separate field such as "hostname" would be acceptable. But if I understood correctly, that is technically not possible, because I trim that part of the event.
"Typically, hostname can be overwritten by splunk in index-time with one of standard transforms (see transforms defined for the syslog sourcetype)."
Can you give me a link to materials to read about this?
There are plenty of materials about index-time extractions.
For example: https://docs.splunk.com/Documentation/Splunk/9.0.3/Data/Configureindex-timefieldextraction
As for the example I mentioned, if you run
splunk btool props list syslog | grep TRANSFORMS
You'll see
TRANSFORMS = syslog-host
TRANSFORMS-syslog_auditing = linux_audit_enriched
Of those two transforms, the one that is of interest to us right now is "syslog-host".
So let's see how it's defined:
$ /opt/splunk/bin/splunk btool transforms list syslog-host
[syslog-host]
CAN_OPTIMIZE = True
CLEAN_KEYS = True
DEFAULT_VALUE =
DEPTH_LIMIT = 1000
DEST_KEY = MetaData:Host
FORMAT = host::$1
KEEP_EMPTY_VALS = False
LOOKAHEAD = 4096
MATCH_LIMIT = 100000
MV_ADD = False
REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s
SOURCE_KEY = _raw
WRITE_META = False
As you can see, it uses REGEX to find the host part of the line (the \[?(\w[\w\.\-]{2,})\]? capture group) and then overwrites the host field with the captured value (the DEST_KEY and FORMAT settings).
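To connect this back to the original question, here is a minimal, untested sketch of how the same mechanism could be used on a custom sourcetype to keep the hostname and still trim the header. The stanza names (my_extract_hostname, my_strip_syslog_header), the class name (host_then_trim), the field name (hostname) and both regexes are assumptions made for illustration. The header pattern assumes a line like "Jan 12 10:15:01 myhost {...}" and has to be adapted to the real data. The trimming is done here by a second transform that rewrites _raw instead of by SEDCMD, so that the order of the two steps is explicit.

```ini
# transforms.conf (stanza names are hypothetical)

# Step 1: copy the hostname from the syslog header into an indexed field.
[my_extract_hostname]
# Assumed header layout: "Mon DD HH:MM:SS hostname <payload>"
REGEX = ^\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d\s+(\S+)\s
FORMAT = hostname::$1
WRITE_META = true

# Step 2: rewrite _raw so that only the JSON payload remains.
[my_strip_syslog_header]
# Assumes the payload starts at the first "{" in the event.
# LOOKAHEAD (4096 by default, see the listing above) limits how far
# the regex looks into the event; it may need raising for long events.
REGEX = (?s)^[^{]*(\{.*)$
FORMAT = $1
DEST_KEY = _raw

# props.conf
[my_sourcetype]
# Transforms listed in one class are applied left to right:
# extract the hostname first, trim the header second.
TRANSFORMS-host_then_trim = my_extract_hostname, my_strip_syslog_header
```

If the goal is to overwrite host itself rather than create a separate field, the first transform could use DEST_KEY = MetaData:Host and FORMAT = host::$1, as syslog-host does. For a custom indexed field such as hostname, the index-time extraction documentation linked above also describes a fields.conf entry ([hostname] with INDEXED = true) on the search side.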
Addendum: I trim the syslog header so that JSON auto-extraction works, so I need the trimming.