Getting Data In

How to extract a value in a log and use it as hostname

avoelk
Communicator

I've tried using props.conf.spec and transforms.conf.spec and some regex to extract a value from a logfile in order to use it as my hostname value. I see that with my regex I can extract the given value but I have two problems: 

 

1.) when I use the gui to get data in I can only choose a given value for hostname pre indexing or use regex only for the path in which my logfile lies. When I put in my tested regex in the hostname field it ofc doesn't work.  So I guess I first have to set up the sourcetype in props.conf and configure the extraction in transforms.conf

2.) I can't seem to find an explanation on how to configure the extraction correctly. like I said the regex seems okay but in transforms I seem to need the following fields which I don't know how to use:

SOURCE_KEY

DEST_KEY

FORMAT

 

In the Logfile it looks similar to that  (the host value is "DC1ASM1.dc1.greendotcorp.com"):

 

Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked","Attack signature detected","4523972057501657341","207.154.35.240","GET /Content/Images/img_logo04_module02.gif HTTP/1.1\r\nHost:...

 

mostly it's this host name. and I want to extract  and use it as hostname at indextime. 

this is what I did so far:

props.conf:

 

[f5asm]
BREAK_ONLY_BEFORE = \w+ \d+ \d+:\d+:\d+ \d+
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
LINE_BREAKER = \w+ \d+ \d+:\d+:\d+ \d+
NO_BINARY_CHECK = true
TIME_FORMAT = %b %d %H:%M:%S
TIME_PREFIX = \d+.\d+.\d+.\d+
category = Custom
disabled = false
pulldown_type = true
TRANSFORMS-hostname = changehost

 

transforms.conf

 

[changehost]
DEST_KEY = MetaData:Host
SOURCE_KEY = MetaData:Host
REGEX = ([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)FORMAT = host::$1

 

I'm fairly certain that I have to change up something in transforms.conf but I can't seem to find an answer. any ideas how to set up the FORMAT, DEST_KEY and SOURCE_KEY correctly in that case? 

Labels (4)
0 Karma
1 Solution

avoelk
Communicator

I have the solution. 

first I've changed up my regex and deleted the SOURCE_KEY in transforms.conf: 

 

[changehost]
DEST_KEY = MetaData:Host
REGEX = [\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)
FORMAT = host::$1

 

 this regex is more clear in what it should do and matched the value perfectly. Given the documentation here: https://docs.splunk.com/Documentation/Splunk/8.0.6/Data/Overridedefaulthostassignments  I don't need a SOURCE_KEY and the FORMAT should be host::$1 . $1 is refering to the regex while host tells the FORMAT to put the value into the host field. (If this explanation is off - please correct me).

Still I had a little problem. when trying to input the data the hostvalue was suddenly $1 not the value I tried to extract. the reason was that I forgot to encapsulate my regex in () so it'll become a capture group. so the new regex was :

 

([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)

 

and it worked like a charm.  

Also, since my value was right before the characters ASM I used a positive lookahead (?=\sASM)

View solution in original post

0 Karma

avoelk
Communicator

I have the solution. 

first I've changed up my regex and deleted the SOURCE_KEY in transforms.conf: 

 

[changehost]
DEST_KEY = MetaData:Host
REGEX = [\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)
FORMAT = host::$1

 

 this regex is more clear in what it should do and matched the value perfectly. Given the documentation here: https://docs.splunk.com/Documentation/Splunk/8.0.6/Data/Overridedefaulthostassignments  I don't need a SOURCE_KEY and the FORMAT should be host::$1 . $1 is refering to the regex while host tells the FORMAT to put the value into the host field. (If this explanation is off - please correct me).

Still I had a little problem. when trying to input the data the hostvalue was suddenly $1 not the value I tried to extract. the reason was that I forgot to encapsulate my regex in () so it'll become a capture group. so the new regex was :

 

([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)

 

and it worked like a charm.  

Also, since my value was right before the characters ASM I used a positive lookahead (?=\sASM)

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...