Need to extract fields at index time

Ajinkya1992
Path Finder

Hello Experts,
I am new to Splunk and am trying to extract fields at index time.
I have a distributed setup with 2 clustered indexers, 1 cluster master, 1 search head (SH), 1 deployment server (DS), and 1 application server with a universal forwarder (UF).
In this setup, I have placed the config files as follows:
1) inputs.conf and outputs.conf on the DS at $SPLUNK_HOME/etc/deployment-apps/
2) fields.conf on the SH at $SPLUNK_HOME/etc/system/local/
3) props.conf and transforms.conf on both indexers at $SPLUNK_HOME/etc/system/local/

Sample event from IIS logs :
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 127.0.0.1 GET /WAAM/9002a/Api/v1.svc/RoleStatus() $filter=false%20or%20('Integration'%20eq%20ApplicationInstance/ApplicationGroup)&$expand=ApplicationInstance,ApplicationInstance/Application 80 - 127.0.0.1 Microsoft+ADO.NET+Data+Services - localhost 401 2 5 7000 466 0

Below are the configuration files:
inputs.conf
[monitor://C:\Program Files\SplunkUniversalForwarder\var\iislogs.bin]
index= iis
sourcetype = iis_generic

outputs.conf
[tcpout:xyz]
server = Indexer1IP:9997, Indexer2IP:9997

fields.conf
[http_response_code]
INDEXED = True

[response_time]
INDEXED= True

props.conf
[source::C:\Program Files\SplunkUniversalForwarder\var\iislogs.bin]
TRANSFORMS-abc = http_response_code

[source::C:\Program Files\SplunkUniversalForwarder\var\iislogs.bin]
TRANSFORMS-pqr = response_time

transforms.conf
[http_response_code]
REGEX = http_response_code = (?P<http_response_code>\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
FORMAT= http_response_code :: $1
WRITE_META =True

[response_time]
REGEX = response_time = \d+\s\d+\s\d+\s\d+\s\d+\s(?P<response_time>\d+)
FORMAT= response_time :: $1
WRITE_META =True

After restarting the Splunk services, I can fetch all events from the IIS logs with the iis_generic sourcetype on the SH.
I am not sure what is wrong, but the fields are not being extracted at index time.
I referred to the link below for this configuration:
http://docs.splunk.com/Documentation/Splunk/7.1.3/Data/Configureindex-timefieldextraction

493669
Super Champion

Are you able to get the fields extracted at all? It seems to be an issue with the REGEX. Can you share a full sample event that contains the fields you want to extract?

Ajinkya1992
Path Finder

Sample Log File
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 127.0.0.1 GET /WAAM/9002a/Api/v1.svc/RoleStatus() $filter=false%20or%20('Integration'%20eq%20ApplicationInstance/ApplicationGroup)&$expand=ApplicationInstance,ApplicationInstance/Application 80 - 127.0.0.1 Microsoft+ADO.NET+Data+Services - localhost 401 2 5 7000 466 0
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 127.0.0.1 GET /WAAM/9002a/Api/v1.svc/RoleStatus() $filter=false%20or%20('Integration'%20eq%20ApplicationInstance/ApplicationGroup)&$expand=ApplicationInstance,ApplicationInstance/Application 80 ZSSERVICES\SD_Stg_Shared_IDM 127.0.0.1 Microsoft+ADO.NET+Data+Services - localhost 200 0 0 1729116 608 109
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 10.1.21.111 POST /Integration/IdentityManager/app/Service/MembershipService.asmx - 80 - 10.1.21.1 Mozilla/4.0+(compatible;+MSIE+6.0;+MS+Web+Services+Client+Protocol+4.0.30319.42000) - internal.javelin.staging.zsservices.local 200 0 0 910 1009 358
2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /Integration/0019jams/WCF/v4.svc/ - 80 Stg_Integration_0012b 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 200 0 0 4145 253 374
2017-06-16 00:00:28 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /Integration/0019a/isalive - 80 - 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 301 0 0 703 165 0
2017-06-16 00:00:28 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /integration/0019a/web/isalive - 80 - 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 200 0 0 469 169 109

The regexes I am using to extract these two values are:
REGEX = http_response_code = (?P\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
REGEX = response_time = \d+\s\d+\s\d+\s\d+\s\d+\s(?P\d+)

This is the actual regex, but I am not sure whether the community page is modifying it on its own.

If I use this regex with the rex command in a search query, I easily get the desired result.
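
One thing that can be checked outside Splunk: transforms.conf applies REGEX to the raw event text, so any literal text in the pattern has to appear in the event. The sketch below uses Python's `re` module (not Splunk itself, so treat it as an approximation) to test the pattern as posted, including the `http_response_code = ` prefix, against one of the sample events, and then the same pattern without the prefix. It assumes the prefix really is part of the value in the conf file.

```python
import re

# Last sample event from the thread, as raw text.
event = ("2017-06-16 00:00:28 W3SVC1 SA-SSDWEB21 10.1.21.111 GET "
         "/integration/0019a/web/isalive - 80 - 10.1.21.1 - - "
         "internal.0019.javelin.staging.zsservices.local 200 0 0 469 169 109")

# Pattern as posted, with the literal "http_response_code = " prefix.
with_prefix = re.compile(
    r"http_response_code = (?P<http_response_code>\d+)\s\d+\s\d+\s\d+\s\d+\s\d+")

# Same pattern with the literal prefix removed.
without_prefix = re.compile(
    r"(?P<http_response_code>\d+)\s\d+\s\d+\s\d+\s\d+\s\d+")

prefix_match = with_prefix.search(event)
bare_match = without_prefix.search(event)

# The prefix text never occurs in the event, so the first search finds nothing.
print(prefix_match)
# The bare pattern finds the six trailing numbers and captures the first one.
print(bare_match.group("http_response_code"))
```

If the rex command in the search query was run without that prefix, this would explain why it works there while the index-time transform does not match.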

493669
Super Champion

When posting a query, select it and use the 101010 (code) button or Ctrl+K so that no special characters are dropped.

493669
Super Champion

Try this:

[http_response_code]
REGEX =(\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
FORMAT= http_response_code :: $1
WRITE_META =True

[response_time]
REGEX =\d+\s\d+\s\d+\s\d+\s\d+\s(\d+)
FORMAT= response_time :: $1
WRITE_META =True
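
As a rough way to sanity-check these two patterns before touching the indexers, the sketch below runs them over two of the sample events in Python and builds the `name::value` strings that FORMAT describes. This is an illustration with Python's `re` module, not Splunk's own engine. Two things here are assumptions and not part of the reply above: the trailing `$`, added to pin the match to the end of the event, and writing `::` without surrounding spaces, which is how the examples in the documentation linked in the question write it. The `transforms` dict and `indexed_fields` helper are hypothetical names used only for this example.

```python
import re

# Two sample events from earlier in the thread.
events = [
    "2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /Integration/0019jams/WCF/v4.svc/ - 80 Stg_Integration_0012b 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 200 0 0 4145 253 374",
    "2017-06-16 00:00:28 W3SVC1 SA-SSDWEB21 10.1.21.111 GET /Integration/0019a/isalive - 80 - 10.1.21.1 - - internal.0019.javelin.staging.zsservices.local 301 0 0 703 165 0",
]

# Hypothetical stand-ins for the two transforms.conf stanzas: field name -> regex.
# The trailing $ is an assumption added here to anchor on the last six numbers.
transforms = {
    "http_response_code": r"(\d+)\s\d+\s\d+\s\d+\s\d+\s\d+$",
    "response_time": r"\d+\s\d+\s\d+\s\d+\s\d+\s(\d+)$",
}

def indexed_fields(event):
    """Return a name::value string for every pattern that matches the event."""
    out = []
    for name, pattern in transforms.items():
        m = re.search(pattern, event)
        if m:
            out.append(f"{name}::{m.group(1)}")
    return out

results = [indexed_fields(e) for e in events]
for r in results:
    print(r)
```

If the patterns behave here but the fields still do not appear, the regex itself is probably not the cause, and the props.conf stanza or the FORMAT line would be the next places to look.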

Ajinkya1992
Path Finder

No luck 😞
I changed transforms.conf as suggested above on both indexers, restarted Splunk, and modified the log file so that new, unindexed data would come in, but the fields are still not extracted.

Jeremiah
Motivator

Have you looked at just using INDEXED_EXTRACTIONS? It has been available on the forwarder for several releases now and supports the IIS format (W3C). It is much simpler to implement than a custom index-time extraction for a single field.

https://docs.splunk.com/Documentation/Splunk/7.2.0/Data/Extractfieldsfromfileswithstructureddata
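
To illustrate the idea behind header-driven extraction: a W3C log normally starts with a `#Fields:` line that names every column, so no regex is needed. The Python sketch below is only a conceptual illustration, not how Splunk implements it. The header is an assumption (a typical IIS field list that happens to match the 20 columns in the sample events), since the thread does not show the real one, and `parse_w3c` is a hypothetical helper.

```python
# Assumed header: the thread does not show the real "#Fields:" line.
fields_header = ("#Fields: date time s-sitename s-computername s-ip cs-method "
                 "cs-uri-stem cs-uri-query s-port cs-username c-ip "
                 "cs(User-Agent) cs(Referer) cs-host sc-status sc-substatus "
                 "sc-win32-status sc-bytes cs-bytes time-taken")

# One of the sample events from the thread.
event = ("2017-06-16 00:00:22 W3SVC1 SA-SSDWEB21 10.1.21.111 POST "
         "/Integration/IdentityManager/app/Service/MembershipService.asmx - 80 - "
         "10.1.21.1 Mozilla/4.0+(compatible;+MSIE+6.0;+MS+Web+Services+Client+Protocol+4.0.30319.42000) "
         "- internal.javelin.staging.zsservices.local 200 0 0 910 1009 358")

def parse_w3c(header, line):
    """Pair each space-separated value with the column name from the header."""
    names = header.split()[1:]  # drop the leading "#Fields:" token
    values = line.split()
    if len(names) != len(values):
        raise ValueError("field count does not match header")
    return dict(zip(names, values))

record = parse_w3c(fields_header, event)
print(record["sc-status"], record["time-taken"])
```

Under that assumed header, the two values the question is after would come out as `sc-status` and `time-taken` and not as http_response_code and response_time.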

Ajinkya1992
Path Finder

The regexes I am using to extract these two values are:
REGEX = http_response_code = (?P<http_response_code>\d+)\s\d+\s\d+\s\d+\s\d+\s\d+
REGEX = response_time = \d+\s\d+\s\d+\s\d+\s\d+\s(?P<response_time>\d+)

I am not sure whether you will see quotation marks around http_response_code and response_time.
If you do, they come from Ctrl+K; I am not using them in my actual conf file.
