I'm trying to extract domain info from the host field at search time and have the following props and transforms set, but it doesn't seem to work. My example hostname would be art.mozart.apac.com, and I'm trying to extract mozart.apac.com. Here are my props and transforms:
props.conf
[xyz]
REPORT-extract_domain_name = domain_name_extract
transforms.conf
[domain_name_extract]
SOURCE_KEY = host
REGEX = (\.\w+\.\w+\.\w+)
FORMAT = domain_name::$1
Is my configuration correct? Is there any reason why this doesn't work?
thanks
pmr
If you want to be able to search for this field, you have to either make it an indexed field (better performance):
http://www.splunk.com/base/Documentation/latest/Data/Configureindex-timefieldextraction
props.conf
[xyz]
TRANSFORMS-extract-domain = extract-domain-name
transforms.conf
[extract-domain-name]
SOURCE_KEY = MetaData:Source
REGEX = source::\w+\.([\w\.]+)$
FORMAT = domain_name::$1
WRITE_META = true
fields.conf
[domain_name]
INDEXED = true
or tell Splunk that the event content (_raw) might not contain the field value:
fields.conf
[domain_name]
INDEXED_VALUE = false
If you want to use this field just for reporting, then it should be sufficient to just extract the field:
props.conf
[xyz]
EXTRACT-domain-name = \.(?<domain_name>[\w\.]+) in source
Ah, I unintentionally wrote the examples with the source field. You just have to append "in host" instead of "in source".
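To spell that out, here is a sketch of both extractions reading from host instead of source. The stanza name [xyz] is the sourcetype from the question. The index-time variant assumes the MetaData:Host key carries a host:: prefix, in the same way the source example matches source::, so treat it as untested:

```
# props.conf (search time): same regex, now reading from the host field
[xyz]
EXTRACT-domain-name = \.(?<domain_name>[\w\.]+) in host

# transforms.conf (index time): host variant of [extract-domain-name]
[extract-domain-name]
SOURCE_KEY = MetaData:Host
REGEX = host::\w+\.([\w\.]+)$
FORMAT = domain_name::$1
WRITE_META = true
```

For a host of art.mozart.apac.com, both regexes should capture mozart.apac.com: the first \. matches the dot after art, and [\w\.]+ takes everything that follows.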
OK, but how do I specify extraction from the host field in the props.conf you mention for reporting (your last props.conf entry)?
You need to escape the dots and add a backslash before your "w" characters. A dot in regex is a special character meaning 'any character'.
Your regex should probably look something like this:
(\.\w+\.\w+\.\w+)$
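Note that with the capture group written this way, the leading dot is part of the captured value (.mozart.apac.com). A sketch of the original transform with the dot moved outside the group, using the stanza name from the question and not tested against a live Splunk instance:

```
# transforms.conf
[domain_name_extract]
SOURCE_KEY = host
# The leading dot sits outside the group, so $1 is mozart.apac.com
REGEX = \.(\w+\.\w+\.\w+)$
FORMAT = domain_name::$1
```

This pattern needs a dot followed by exactly three trailing labels, so a host with fewer than four labels in total would not match.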
