How to extract a value in a log and use it as hostname

avoelk
Communicator

I've tried using props.conf and transforms.conf (following the .spec files) with a regex to extract a value from a log file and use it as my host value. My regex does extract the value, but I have two problems:

 

1.) When I use the GUI to get data in, I can only set a fixed value for the hostname before indexing, or use a regex on the path where my log file lies. Putting my tested regex into the hostname field of course doesn't work. So I guess I first have to set up the sourcetype in props.conf and configure the extraction in transforms.conf.

2.) I can't find an explanation of how to configure the extraction correctly. As I said, the regex seems okay, but transforms.conf seems to need the following settings, which I don't know how to use:

SOURCE_KEY

DEST_KEY

FORMAT

 

In the log file an event looks similar to this (the host value is "DC1ASM1.dc1.greendotcorp.com"):

 

Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked","Attack signature detected","4523972057501657341","207.154.35.240","GET /Content/Images/img_logo04_module02.gif HTTP/1.1\r\nHost:...

 

It's mostly this host name, and I want to extract it and use it as the host at index time.

This is what I did so far:

props.conf:

 

[f5asm]
BREAK_ONLY_BEFORE = \w+ \d+ \d+:\d+:\d+ \d+
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
LINE_BREAKER = \w+ \d+ \d+:\d+:\d+ \d+
NO_BINARY_CHECK = true
TIME_FORMAT = %b %d %H:%M:%S
TIME_PREFIX = \d+.\d+.\d+.\d+
category = Custom
disabled = false
pulldown_type = true
TRANSFORMS-hostname = changehost

 

transforms.conf

 

[changehost]
DEST_KEY = MetaData:Host
SOURCE_KEY = MetaData:Host
REGEX = ([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)
FORMAT = host::$1

 

I'm fairly certain that I have to change something in transforms.conf, but I can't find an answer. Any ideas how to set up FORMAT, DEST_KEY and SOURCE_KEY correctly in this case?


avoelk
Communicator

I have the solution. 

First I changed my regex and deleted the SOURCE_KEY in transforms.conf:

 

[changehost]
DEST_KEY = MetaData:Host
REGEX = [\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)
FORMAT = host::$1

 

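This regex is clearer about what it should do and matched the value perfectly. According to the documentation here: https://docs.splunk.com/Documentation/Splunk/8.0.6/Data/Overridedefaulthostassignments I don't need a SOURCE_KEY, and the FORMAT should be host::$1. Without SOURCE_KEY the regex runs against the raw event text (_raw is the default), which is where the hostname actually appears; with SOURCE_KEY = MetaData:Host it was being applied to the existing host value instead. $1 refers to the first capture group of the regex, while the host:: prefix tells FORMAT to write the value into the host field. (If this explanation is off, please correct me.)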
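For anyone wondering why the first attempt could not have worked even with the right SOURCE_KEY, here is a rough check in Python. This is only a sketch: Python's re module stands in for Splunk's PCRE engine (they should agree for this pattern), the event is a shortened copy of the sample line above, and the value host::10.50.3.100 is a hypothetical stand-in for whatever the host key held before the transform.

```python
import re

# Shortened copy of the sample event from the question.
raw = (r'Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 '
       r'DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"')

# The regex from the original transforms.conf stanza.
old_regex = re.compile(
    r'([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6} +?(?=ASM)'
)

m = old_regex.search(raw)
whole_match = m.group(0)  # full hostname plus the trailing space
first_group = m.group(1)  # a repeated group keeps only its LAST repetition

print(repr(whole_match))  # 'DC1ASM1.dc1.greendotcorp.com '
print(repr(first_group))  # 'greendotcorp.'

# Hypothetical pre-existing host key value: there is no "ASM" in it,
# so the regex finds nothing to work with.
print(old_regex.search('host::10.50.3.100'))  # None
```

So with that pattern $1 would have been greendotcorp. rather than the whole hostname, because the group sits inside a + repetition. Wrapping the entire hostname in a single group avoids that.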

Still, I had a small problem. When I indexed the data, the host value was suddenly the literal $1 and not the value I tried to extract. The reason was that I forgot to wrap my regex in () so that it becomes a capture group. So the new regex was:

 

([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)

 

and it worked like a charm.  
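The difference the parentheses make can be seen in a quick Python sketch. Again this is an approximation, not Splunk itself: Python's re module is used in place of PCRE, and the event is a shortened copy of the sample line from the question.

```python
import re

# Shortened copy of the sample event from the question.
raw = (r'Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 '
       r'DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"')

# Regex as first posted in the solution: it matches, but defines no groups.
without_group = re.search(r'[\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*(?=\sASM)', raw)

# Same regex wrapped in () so that group 1 exists.
with_group = re.search(r'([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)(?=\sASM)', raw)

print(without_group.group(0))  # DC1ASM1.dc1.greendotcorp.com
print(without_group.groups())  # ()  -> nothing for $1 to refer to
print(with_group.group(1))     # DC1ASM1.dc1.greendotcorp.com
```

Both versions match the same text; only the second one has a group 1 for FORMAT = host::$1 to pick up.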

Also, since my value sits right before the characters ASM, I used a positive lookahead, (?=\sASM), to anchor the match there without including " ASM" in it.
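A small Python sketch of what the lookahead does, under the same assumptions as before (Python's re module instead of Splunk's PCRE engine, and a shortened copy of the sample event). It compares the lookahead with a variant that simply consumes " ASM".

```python
import re

# Shortened copy of the sample event from the question.
raw = (r'Sep 20 11:13:36 10.50.3.100 Sep 20 11:13:33 '
       r'DC1ASM1.dc1.greendotcorp.com ASM:"MONEYPAK_WEBAPP","MONEYPAK_CLASS","Blocked"')

host_pattern = r'([\d\w\s]{7}.[\d\w\s]{3}.\w*.\w*)'

# Lookahead: " ASM" must follow, but it is not part of the match.
lookahead = re.search(host_pattern + r'(?=\sASM)', raw)

# Consuming variant: " ASM" becomes part of the overall match.
consuming = re.search(host_pattern + r'\sASM', raw)

print(repr(lookahead.group(0)))  # 'DC1ASM1.dc1.greendotcorp.com'
print(repr(consuming.group(0)))  # 'DC1ASM1.dc1.greendotcorp.com ASM'
print(lookahead.group(1) == consuming.group(1))  # True
```

Since FORMAT only uses $1, both variants would yield the same host value here; the lookahead mainly keeps the overall match clean.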
