Need to extract fields at index time

Ajinkya1992
Path Finder

Hello Experts,
I am new to Splunk and am trying to extract fields at index time.
I have a distributed setup with 2 clustered indexers, 1 cluster master, 1 search head (SH), 1 deployment server (DS), and 1 application server running a universal forwarder (UF).
I have placed the configuration files as follows:
1) inputs.conf and outputs.conf on the DS at $SplunkHome/etc/deployment-apps/
2) fields.conf on the SH at $SplunkHome/etc/system/local/
3) props.conf and transforms.conf on both indexers at $SplunkHome/etc/system/local/

Sample event from IIS logs :
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 127.0.0.1 GET /WAAM/9002a/Api/v1.svc/RoleStatus() $filter=false%20or%20('Integration'%20eq%20ApplicationInstance/ApplicationGroup)&$expand=ApplicationInstance,ApplicationInstance/Application 80 - 127.0.0.1 Microsoft+ADO.NET+Data+Services - localhost 401 2 5 7000 466 0

Below are the configuration files:
inputs.conf
[monitor://C:\Program Files\SplunkUniversalForwarder\var\iislogs.bin]
index= iis
sourcetype = iis_generic

outputs.conf
[tcpout:xyz]
server = Indexer1IP:9997, Indexer2IP:9997

fields.conf
[http_response_code]
INDEXED = True

[response_time]
INDEXED= True

props.conf
[source::C:\Program Files\SplunkUniversalForwarder\var\iislogs.bin]
TRANSFORMS-abc = http_response_code

[source::C:\Program Files\SplunkUniversalForwarder\var\iislogs.bin]
TRANSFORMS-pqr = response_time

transforms.conf
[http_response_code]
REGEX = http_response_code = (?P\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
FORMAT= http_response_code :: $1
WRITE_META =True

[response_time]
REGEX = response_time = \d+\s\d+\s\d+\s\d+\s\d+\s(?P\d+)
FORMAT= response_time :: $1
WRITE_META =True

After restarting the Splunk services, I can search all events from the IIS logs with the iis_generic sourcetype from the SH.
However, the fields are not being extracted at index time, and I am not sure what is wrong.
I followed the documentation below for this configuration:
http://docs.splunk.com/Documentation/Splunk/7.1.3/Data/Configureindex-timefieldextraction

493669
Super Champion

Are you able to get the fields extracted at all? This looks like an issue with the REGEX. Can you share a full sample event containing the fields you want to extract?

Ajinkya1992
Path Finder

Sample Log File
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 127.0.0.1 GET /WAAM/9002a/Api/v1.svc/RoleStatus() $filter=false%20or%20('Integration'%20eq%20ApplicationInstance/ApplicationGroup)&$expand=ApplicationInstance,ApplicationInstance/Application 80 - 127.0.0.1 Microsoft+ADO.NET+Data+Services - localhost 401 2 5 7000 466 0
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 127.0.0.1 GET /WAAM/9002a/Api/v1.svc/RoleStatus() $filter=false%20or%20('Integration'%20eq%20ApplicationInstance/ApplicationGroup)&$expand=ApplicationInstance,ApplicationInstance/Application 80 ZSSERVICES\SD_Stg_Shared_IDM 127.0.0.1 Microsoft+ADO.NET+Data+Services - localhost 200 0 0 1729116 608 109
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 10.1.21.111 POST /Integration/IdentityManager/app/Service/MembershipService.asmx - 80 - 10.1.21.1 Mozilla/4.0+(compatible;+MSIE+6.0;+MS+Web+Services+Client+Protocol+4.0.30319.42000) - internal.javelin.staging.zsservices.local 200 0 0 910 1009 358
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /Integration/0019jams/WCF/v4.svc/ - 80 Stg_Integration_0012b 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 200 0 0 4145 253 374
2017-06-16 00:00:28 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /Integration/0019a/isalive - 80 - 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 301 0 0 703 165 0
2017-06-16 00:00:28 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /integration/0019a/web/isalive - 80 - 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 200 0 0 469 169 109

The regexes I am using to extract these two values are:
REGEX = http_response_code = (?P\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
REGEX = response_time = \d+\s\d+\s\d+\s\d+\s\d+\s(?P\d+)

This is the actual REGEX, but I am not sure whether the community page is modifying it on its own.

If I use this regex with the rex command in a search query, I easily get the desired result.

493669
Super Champion

When posting a query, select it and use the 101010 (code) button or Ctrl+K so that no special characters get dropped.

493669
Super Champion

Try this:

[http_response_code]
REGEX =(\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
FORMAT= http_response_code :: $1
WRITE_META =True

[response_time]
REGEX =\d+\s\d+\s\d+\s\d+\s\d+\s(\d+)
FORMAT= response_time :: $1
WRITE_META =True
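
For reference, a minimal sketch of how both transforms could be wired up from a single props.conf stanza. This is an illustration under assumptions, not a confirmed fix for this setup: it assumes a stanza keyed on the iis_generic sourcetype from inputs.conf rather than on the source path, the class name iis_fields is hypothetical, and FORMAT is written without spaces around :: to match the form shown in the index-time extraction documentation.

```
# props.conf on the indexers: one stanza, both transforms in one list
# (assumption: keyed on the sourcetype set in inputs.conf)
[iis_generic]
TRANSFORMS-iis_fields = http_response_code, response_time

# transforms.conf on the indexers
[http_response_code]
# first of the six trailing numeric columns
REGEX = (\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
FORMAT = http_response_code::$1
WRITE_META = true

[response_time]
# last of the six trailing numeric columns
REGEX = \d+\s\d+\s\d+\s\d+\s\d+\s(\d+)
FORMAT = response_time::$1
WRITE_META = true
```

Index-time transforms only apply to data indexed after the change and a restart, so any check should be made against newly arriving events.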

Ajinkya1992
Path Finder

No luck 😞
I updated transforms.conf with the above changes on both indexers, restarted Splunk, and added new lines to the log file so that there would be fresh unindexed data, but the fields are still not extracted.

Jeremiah
Motivator

Have you looked at just using INDEXED_EXTRACTIONS? It has been available on the forwarder for several releases now and supports the IIS format (w3c). It is much simpler to implement than a custom index-time extraction for a single field.

https://docs.splunk.com/Documentation/Splunk/7.2.0/Data/Extractfieldsfromfileswithstructureddata
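
A minimal sketch of what that could look like, assuming the iis_generic sourcetype from the inputs.conf above and assuming the log file still carries its W3C #Fields: header line, since the field names are taken from that header. Structured-data extraction is done by the forwarder, so this props.conf would go to the UF (for example in the same deployment app as inputs.conf), not to the indexers.

```
# props.conf, deployed to the universal forwarder
# (assumption: same sourcetype name as in inputs.conf)
[iis_generic]
INDEXED_EXTRACTIONS = w3c
```

With this approach the separate fields.conf, props.conf TRANSFORMS, and transforms.conf entries for individual fields should not be needed for the columns listed in the header.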

Ajinkya1992
Path Finder

The regexes I am using to extract these two values are:
REGEX = http_response_code = (?P<http_response_code>\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
REGEX = response_time = \d+\s\d+\s\d+\s\d+\s\d+\s(?P<response_time>\d+)

I am not sure whether you will see quotation marks around http_response_code and response_time.
If you do, they come from the Ctrl+K formatting; I am not using them in my actual conf file.
