How to configure props.conf and transforms.conf for field extractions to occur at index-time, not search-time?

redc
Builder

We use a custom format for our Apache access logs. Long ago, I put together a regex to extract the fields from the custom format. At that time, I set it up as a field extraction on the indexer.

The problem is that this field extraction is applied at search time. If I search more than a couple of hours' worth of Apache access logs, the search head starts complaining that the extraction is taking excessively long, and the search itself takes forever.

To improve performance on the search head, I want the fields extracted at index time, not at search time.

I tried setting up transforms.conf and props.conf on the forwarder (which is where I thought it SHOULD go for this), but it never seemed to get applied to the data. I saw in this thread that when you're using the Universal Forwarder (which I am), you need to configure the transforms and props on the indexer. However, after doing so, I find that the extraction is being applied at search time, not at index time.

How do I get it to extract the fields at index time?

props.conf content:

[Apache_access]
REPORT-Apache_access = Apache_access_log

transforms.conf content:

[Apache_access_log]
CLEAN_KEYS = 0
REGEX = (?<site_client_dir>\S+?) (?<remote_host>\S+?) (?<remote_logname>\S+?) (?<remote_user>\S+?) \[(?<request_start_time>\S+?) (?<request_time_offset>\S+?)\] \"(?<request_method>.*?) (?<request>.*?) (?<request_http_version>.*?)\" (?<response_status>\S+?) (?<response_size>\S+?) \"(?<referrer>.*?)\" \"(?<user_agent>.*?)\" (?<cookie>\S+?) (?<response_time>\S+[\r\n]?)

hortonew
Builder

Did you go through the following document? I think you might be missing a few things, like a fields.conf and some more info in your props.conf pointing to your transform. I've never configured this myself, but I would read through it to see if anything jumps out at you.

http://docs.splunk.com/Documentation/Splunk/6.1.3/Data/Configureindex-timefieldextraction
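
For reference, the pieces that page describes can be sketched roughly as follows. This is a minimal, untested sketch under stated assumptions, not a drop-in configuration: the stanza name apache_access_status_indexed is hypothetical, only response_status is indexed for illustration, and the simplified regex assumes the status code is the first token after the quoted request line. The key differences from a search-time setup are that props.conf references the transform with TRANSFORMS- (REPORT- is always applied at search time), the transform sets WRITE_META = true with a FORMAT of field::value, and fields.conf marks the field as indexed.

```
# transforms.conf (on the indexer, since a Universal Forwarder does not parse events)
# Stanza name is hypothetical; the regex is a simplified assumption about the log layout.
[apache_access_status_indexed]
REGEX = \]\s"[^"]*"\s(\d{3})\s
FORMAT = response_status::$1
WRITE_META = true

# props.conf (same instance as transforms.conf)
[Apache_access]
TRANSFORMS-apache_access_status_indexed = apache_access_status_indexed

# fields.conf (on the search head, so searches treat the field as indexed)
[response_status]
INDEXED = true
```

A restart is needed for the change to take effect, and it applies only to data indexed afterwards; events already in the index keep relying on the search-time extraction. Each indexed field also adds to index size, so it is probably worth indexing only the few fields that searches filter on most.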

hortonew
Builder

How's your scripting? I bet you could quickly set up a script that sends syslog to Splunk in the same form that Apache does. You could just hard-code a string and spam it at the server.

redc
Builder

Ah, I see. Yes, that would be what I'm missing.

Having read some more "help" threads on Answers and reviewed additional similar documentation, I've concluded that maybe this isn't the way to go. My test environment, unfortunately, doesn't get enough data to be able to test the performance enhancements I might get, so I'll have to do it in the live system; I'll try this out in that environment in a more controlled fashion at a later date.

Thanks for the help!
