How to extract a value in a log and use it as hostname

avoelk
Communicator

Going by props.conf.spec and transforms.conf.spec, I've tried to use a regex to extract a value from a log file and use it as my host value. My regex does extract the value, but I have two problems:

 

1.) When I use the GUI to get data in, I can only set a fixed host value before indexing, or use a regex on the path where my log file lies. Putting my tested regex into the host field of course doesn't work. So I guess I first have to set up the sourcetype in props.conf and configure the extraction in transforms.conf.

2.) I can't find an explanation of how to configure the extraction correctly. As I said, the regex seems okay, but in transforms.conf I apparently need the following settings, which I don't know how to use:

SOURCE_KEY

DEST_KEY

FORMAT

 

In the log file, an event looks similar to this (the host value is "DC1ASM1.dc1.greendotcorp.com"):

 

Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked","Attack signature detected","4523972057501657341","207.154.35.240","GET /Content/Images/img_logo04_module02.gif HTTP/1.1\r\nHost:...

 

It's mostly this host name, and I want to extract it and use it as the host at index time.

This is what I did so far:

props.conf:

 

[f5asm]
BREAK_ONLY_BEFORE = \w+ \d+ \d+:\d+:\d+ \d+
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
LINE_BREAKER = \w+ \d+ \d+:\d+:\d+ \d+
NO_BINARY_CHECK = true
TIME_FORMAT = %b %d %H:%M:%S
TIME_PREFIX = \d+.\d+.\d+.\d+
category = Custom
disabled = false
pulldown_type = true
TRANSFORMS-hostname = changehost

 

transforms.conf

 

[changehost]
DEST_KEY = MetaData:Host
SOURCE_KEY = MetaData:Host
REGEX = ([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)
FORMAT = host::$1
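
A side note on the REGEX above, checked outside Splunk. This is only a minimal Python sketch: it assumes Python's re module behaves like Splunk's PCRE for the constructs used here, and it runs against a shortened copy of the sample event. The variable names are made up for the illustration. It suggests that even when the pattern matches, group 1 is a repeated group, so it holds only the last label it matched rather than the whole host name.

```python
import re

# Shortened copy of the sample event from the post.
LINE = r'Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"'

# REGEX exactly as posted in [changehost].
ORIGINAL_REGEX = r'([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)'

match = re.search(ORIGINAL_REGEX, LINE)
whole_match = match.group(0)   # everything the pattern consumed
first_group = match.group(1)   # what $1 would refer to

print(repr(whole_match))   # 'DC1ASM1.dc1.greendotcorp.com ' (note the trailing space)
print(repr(first_group))   # 'greendotcorp.' (last repetition of the group only)
```

Separately, as far as transforms.conf.spec describes it, SOURCE_KEY = MetaData:Host points the regex at the current host value instead of the event text, so with that setting the pattern would not be run against this line at all.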

 

I'm fairly certain that I have to change something in transforms.conf, but I can't find an answer. Any ideas how to set up FORMAT, DEST_KEY and SOURCE_KEY correctly in this case?


avoelk
Communicator

I have the solution. 

First I changed my regex and deleted the SOURCE_KEY in transforms.conf:

 

[changehost]
DEST_KEY = MetaData:Host
REGEX = [\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)
FORMAT = host::$1

 

This regex is clearer about what it should match, and it matched the value perfectly. According to the documentation at https://docs.splunk.com/Documentation/Splunk/8.0.6/Data/Overridedefaulthostassignments I don't need a SOURCE_KEY (it defaults to _raw, the event text), and the FORMAT should be host::$1. $1 refers to the first capture group of the regex, while the host:: prefix tells FORMAT to write that value into the host field named by DEST_KEY = MetaData:Host. (If this explanation is off, please correct me.)

I still had one small problem. When I ingested the data, the host value was the literal string $1 rather than the value I was trying to extract. The reason was that I had forgotten to wrap my regex in () to make it a capture group. So the new regex was:

 

([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)

 

and it worked like a charm.  
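
For anyone who wants to check the two regex versions outside Splunk, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's PCRE for these constructs and uses a shortened copy of the sample event; the variable names are only for illustration. It shows that the version without parentheses matches the right text but defines no group 1, which fits the literal $1 I saw as host value, while the parenthesised version captures the full host name.

```python
import re

# Shortened copy of the sample event from the question.
LINE = r'Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"'

# First attempt: no capture group, so there is nothing for $1 to refer to.
NO_GROUP = r'[\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)'
# Fixed version: the same pattern wrapped in parentheses.
WITH_GROUP = r'([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)'

no_group = re.search(NO_GROUP, LINE)
with_group = re.search(WITH_GROUP, LINE)

print(no_group.group(0))    # DC1ASM1.dc1.greendotcorp.com
print(no_group.groups())    # () -> no group 1 exists
print(with_group.group(1))  # DC1ASM1.dc1.greendotcorp.com
```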

Also, since my value sits right before the characters ASM, I used a positive lookahead, (?=\sASM), to anchor the match at that position.
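
To illustrate what the lookahead does, here is a small Python sketch under the same assumptions as before (Python's re standing in for Splunk's PCRE, a shortened sample event, illustrative names). The lookahead requires " ASM" to follow without making it part of the match. The second pattern, which consumes " ASM" instead, is my own variant for comparison and not something from the docs; since only group 1 feeds $1, both give the same captured value.

```python
import re

# Shortened copy of the sample event from the question.
LINE = r'Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"'

# Pattern from the solution: " ASM" must follow but is not consumed.
LOOKAHEAD = r'([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)'
# Comparison variant (an assumption, not from the post): " ASM" is consumed.
CONSUMING = r'([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)\sASM'

with_lookahead = re.search(LOOKAHEAD, LINE)
with_consuming = re.search(CONSUMING, LINE)

print(with_lookahead.group(0))  # DC1ASM1.dc1.greendotcorp.com
print(with_consuming.group(0))  # DC1ASM1.dc1.greendotcorp.com ASM
print(with_lookahead.group(1) == with_consuming.group(1))  # True
```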

