Getting Data In

How to extract a value in a log and use it as hostname

avoelk
Communicator

I've been working from props.conf.spec and transforms.conf.spec, plus a regex, to extract a value from a log file and use it as my host value. My regex extracts the value I want, but I have two problems:

1.) When I use the GUI to get data in, I can only choose a fixed value for the host before indexing, or use a regex on the path in which my log file lies. When I put my tested regex into the host field, it of course doesn't work. So I guess I first have to set up the sourcetype in props.conf and configure the extraction in transforms.conf.

2.) I can't seem to find an explanation of how to configure the extraction correctly. As I said, the regex seems okay, but in transforms.conf I seem to need the following settings, which I don't know how to use:

SOURCE_KEY

DEST_KEY

FORMAT

In the log file it looks similar to this (the host value is "DC1ASM1.dc1.greendotcorp.com"):

Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked","Attack signature detected","4523972057501657341","207.154.35.240","GET /Content/Images/img_logo04_module02.gif HTTP/1.1\r\nHost:...

It's mostly this host name, and I want to extract it and use it as the host at index time.

This is what I did so far:

props.conf:

 

[f5asm]
BREAK_ONLY_BEFORE = \w+ \d+ \d+:\d+:\d+ \d+
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
LINE_BREAKER = \w+ \d+ \d+:\d+:\d+ \d+
NO_BINARY_CHECK = true
TIME_FORMAT = %b %d %H:%M:%S
TIME_PREFIX = \d+.\d+.\d+.\d+
category = Custom
disabled = false
pulldown_type = true
TRANSFORMS-hostname = changehost

 

transforms.conf

 

[changehost]
DEST_KEY = MetaData:Host
SOURCE_KEY = MetaData:Host
REGEX = ([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)
FORMAT = host::$1

 

I'm fairly certain that I have to change something in transforms.conf, but I can't seem to find an answer. Any ideas how to set up FORMAT, DEST_KEY and SOURCE_KEY correctly in this case?


avoelk
Communicator

I have the solution.

First I changed my regex and deleted the SOURCE_KEY in transforms.conf:

 

[changehost]
DEST_KEY = MetaData:Host
REGEX = [\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)
FORMAT = host::$1

 

This regex is clearer about what it should do and matched the value perfectly. According to the documentation here: https://docs.splunk.com/Documentation/Splunk/8.0.6/Data/Overridedefaulthostassignments I don't need a SOURCE_KEY, and the FORMAT should be host::$1. $1 refers to the first capture group of the regex, while the host:: prefix tells FORMAT to put the value into the host field. (If this explanation is off, please correct me.)

Still, I had a small problem. When I tried to input the data, the host value was suddenly the literal $1, not the value I tried to extract. The reason was that I forgot to wrap my regex in () so that it becomes a capture group. So the new regex was:

 

([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)

 

and it worked like a charm.  
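
Putting the pieces above together, the working pair of stanzas might look like the sketch below. The comments are annotations added for illustration, not part of the original post. The note that the regex runs against _raw when SOURCE_KEY is omitted reflects the default described in transforms.conf.spec, and the remaining [f5asm] settings from the question are assumed unchanged and left out for brevity.

```ini
# props.conf
[f5asm]
# Run the changehost transform at index time for this sourcetype
# (other settings from the original stanza omitted here)
TRANSFORMS-hostname = changehost

# transforms.conf
[changehost]
# SOURCE_KEY is omitted, so the regex runs against the raw event text (_raw)
# The parentheses form capture group 1; the lookahead only anchors the match
REGEX = ([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)
# Write capture group 1 into the host metadata field
FORMAT = host::$1
DEST_KEY = MetaData:Host
```

As far as I know, index-time transforms like this are applied where the data is parsed (an indexer or heavy forwarder) and only affect data indexed after the change; events that are already indexed keep their old host value.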

Also, since my value sits right before the characters ASM, I used a positive lookahead, (?=\sASM), to anchor the match there.
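
As a side note, FORMAT only uses what is inside the capture group, so the lookahead is not strictly required: the trailing text can simply sit outside the parentheses. The variant below is an assumption added for illustration, not part of the original solution, and has only been reasoned through against the sample line in the question.

```ini
# transforms.conf (hypothetical simpler variant)
[changehost]
# Capture the run of non-whitespace characters directly before " ASM:"
REGEX = \s(\S+)\sASM:
FORMAT = host::$1
DEST_KEY = MetaData:Host
```

This version does not depend on the first two parts of the host name being exactly 7 and 3 characters long, so it should also cope with other host names that appear in the same position.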
