How to extract a value in a log and use it as hostname

avoelk
Path Finder

Working from props.conf.spec and transforms.conf.spec, I've tried to use a regex to extract a value from a log file and use it as my host value. My regex does extract the value, but I have two problems:

 

1.) When I use the GUI to get data in, I can only set a constant host value before indexing, or use a regex against the path where my log file lies. When I put my tested regex into the host field, it of course doesn't work. So I guess I first have to set up the sourcetype in props.conf and configure the extraction in transforms.conf.

2.) I can't find an explanation of how to configure the extraction correctly. As I said, the regex seems okay, but in transforms.conf I apparently need the following settings, which I don't know how to use:

SOURCE_KEY

DEST_KEY

FORMAT

 

In the log file, an event looks similar to this (the host value is "DC1ASM1.dc1.greendotcorp.com"):

 

Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked","Attack signature detected","4523972057501657341","207.154.35.240","GET /Content/Images/img_logo04_module02.gif HTTP/1.1\r\nHost:...

 

Mostly it's this host name, and I want to extract it and use it as the host value at index time.

This is what I did so far:

props.conf:

 

[f5asm]
BREAK_ONLY_BEFORE = \w+ \d+ \d+:\d+:\d+ \d+
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
LINE_BREAKER = \w+ \d+ \d+:\d+:\d+ \d+
NO_BINARY_CHECK = true
TIME_FORMAT = %b %d %H:%M:%S
TIME_PREFIX = \d+.\d+.\d+.\d+
category = Custom
disabled = false
pulldown_type = true
TRANSFORMS-hostname = changehost

 

transforms.conf:

 

[changehost]
DEST_KEY = MetaData:Host
SOURCE_KEY = MetaData:Host
REGEX = ([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)
FORMAT = host::$1

 

I'm fairly certain that I have to change something in transforms.conf, but I can't find an answer. Any ideas how to set up FORMAT, DEST_KEY and SOURCE_KEY correctly in this case?
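
As a quick check outside Splunk, the REGEX from the [changehost] stanza above can be tried in Python against both candidate inputs. This is only a sketch: Python's re module stands in for Splunk's regex engine, the event is a shortened copy of the sample line, and ASSIGNED_HOST is a hypothetical value for the host the input already assigned (it is not taken from the post). It suggests two separate issues: with SOURCE_KEY = MetaData:Host the regex is tested against the existing host value rather than the event text, and even against the event text, capture group 1 holds only the last repeated label rather than the whole host name.

```python
import re

# Shortened copy of the sample event from the post (tail omitted).
RAW = ('Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 '
       'DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"')

# Hypothetical host value already assigned by the input; an assumption for illustration.
ASSIGNED_HOST = "host::10.50.3.100"

# REGEX from the original [changehost] stanza.
ORIGINAL = re.compile(
    r"([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)"
)

# SOURCE_KEY = MetaData:Host: the regex sees the host value, not the event text.
against_host = ORIGINAL.search(ASSIGNED_HOST)

# No SOURCE_KEY (default _raw): the regex sees the event text.
against_raw = ORIGINAL.search(RAW)

print(against_host)                # None: nothing to extract from the host value
print(repr(against_raw.group(0)))  # whole match, including the trailing space
print(repr(against_raw.group(1)))  # group 1: only the last repetition of the label group
```

Because the outer group is repeated with +, group 1 keeps only its final iteration, so a FORMAT of host::$1 would not receive the full host name from this pattern.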


avoelk
Path Finder

I have the solution.

First I changed my regex and deleted the SOURCE_KEY in transforms.conf:

 

[changehost]
DEST_KEY = MetaData:Host
REGEX = [\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)
FORMAT = host::$1

 

This regex is clearer about what it should do and matched the value perfectly. According to the documentation here: https://docs.splunk.com/Documentation/Splunk/8.0.6/Data/Overridedefaulthostassignments I don't need a SOURCE_KEY, because without one the regex is applied to the raw event text (_raw), and the FORMAT should be host::$1. $1 refers to the first capture group of the REGEX, while the host:: prefix tells Splunk to write the value into the host field. (If this explanation is off, please correct me.)

Still, I had a small problem: when I tried to ingest the data, the host value was suddenly the literal string $1 instead of the value I wanted to extract. The reason was that I forgot to wrap my regex in () so that it becomes a capture group. So the new regex was:

 

([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)

 

and it worked like a charm.  
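
The difference between the two versions can be reproduced outside Splunk with a small Python sketch. The expand_format helper is hypothetical: it is a toy stand-in for how FORMAT fills in $1, written to mirror the behaviour described above (an unresolvable $1 is left as literal text), and is not Splunk's actual implementation. The event is a shortened copy of the sample line.

```python
import re

# Shortened copy of the sample event from the post (tail omitted).
RAW = ('Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 '
       'DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"')


def expand_format(fmt, match):
    """Toy model of FORMAT: replace $N with capture group N, keep $N if there is no such group."""
    def repl(ref):
        n = int(ref.group(1))
        if n <= match.re.groups:
            return match.group(n) or ""
        return ref.group(0)  # no such group: leave the literal "$N" in place
    return re.sub(r"\$(\d+)", repl, fmt)


# First attempt: the pattern matches, but defines no capture group.
without_group = re.search(r"[\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)", RAW)

# Fixed version: the same pattern wrapped in () so that $1 has something to refer to.
with_group = re.search(r"([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)", RAW)

print(without_group.re.groups)                   # number of capture groups in the first attempt
print(expand_format("host::$1", without_group))  # $1 stays literal
print(expand_format("host::$1", with_group))     # $1 replaced by the host name
```

Both patterns match the same text; only the second one exposes it as group 1, which is what FORMAT = host::$1 relies on.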

Also, since my value sits right before the characters ASM, I used a positive lookahead, (?=\sASM), to anchor the match there.
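
To illustrate what the lookahead does, here is a minimal Python sketch comparing it with a variant that consumes the " ASM" text instead. The CONSUMING pattern is an assumption added for comparison and does not appear in the post; the event is a shortened copy of the sample line.

```python
import re

# Shortened copy of the sample event from the post (tail omitted).
RAW = ('Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 '
       'DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"')

# Pattern from the post: the lookahead checks for " ASM" without consuming it.
LOOKAHEAD = re.compile(r"([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)")

# Comparison variant (not from the post): " ASM" becomes part of the overall match.
CONSUMING = re.compile(r"([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)\sASM")

la = LOOKAHEAD.search(RAW)
co = CONSUMING.search(RAW)

print(repr(la.group(0)))           # overall match stops before " ASM"
print(repr(co.group(0)))           # overall match includes " ASM"
print(la.group(1) == co.group(1))  # capture group 1 is identical in both
```

Since FORMAT only uses $1, either form should yield the same host value here; the lookahead mainly keeps the overall match limited to the host name itself.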
