Despite having recently finished the Splunk Admin course, I'm still fuzzy on the terms "index-time" and "search-time" especially when it comes to actually configuring the indexer and search head in a distributed search environment. When determining where to put certain modifications to the props.conf and transforms.conf, should "index-time" options only reside on the indexer and "search-time" options only reside on the search head? Or am I just conflating different ideas with unfortunately similar names?
I currently have an issue on a production Splunk deployment (which, due to politics, cannot simply be rebuilt from scratch). We are indexing some IIS logs, and following a 5.0.4 to 6.2.2 upgrade, log entries are being truncated: seemingly random characters are lost, resulting in fields containing the wrong data or fragments of other fields. There are also missing events and duplicated events. When I compare the actual IIS log files to a generic sourcetype=iis search, the log file is fine, but the search results do not match.
I have tried multiple times to rework the props.conf and transforms.conf in etc/system/local on the indexer and search head, but my new configurations don't seem to take effect (or only partially). Based on the way .conf precedence works, nothing should override my new conf files. I feel like I am missing something very fundamental, but I can't work out what it is. Help?
Some configurations belong on the forwarder and others on the indexer. Please refer to this wiki page, https://wiki.splunk.com/Community:HowIndexingWorks, which shows which configuration applies where.
So, one thing you might be seeing is that between 5 and 6, we introduced indexed extractions for structured data formats such as JSON, CSV, TSV, and XML. The default behavior, if you updated, should have changed...
So aside from the wiki, think about index time as what is done while the data is in the pipelines, before it's written to disk.
Whereas search time is manipulating the data after it's written to disk, i.e. it's read-only, so the only changes we can make to it are via search. (This means you can't change the sourcetype, timestamp, host, or source fields on existing buckets of data.) If you wanted to do this, you would have to transform it and write it to a summary index, or only have it changed from the SH via search.
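To make the split concrete, here is a rough sketch of one props.conf stanza with the two kinds of settings separated. The sourcetype name, the regex, the field names, and the referenced transforms.conf stanza are all made up for illustration; only the setting names themselves are standard props.conf settings.

```
# props.conf, hypothetical sourcetype used only for illustration
[my_app_log]

# Index-time (parsing) settings: applied once, while the data is in the
# pipelines and before it is written to disk. They need to be on the
# instance that parses the data (an indexer or a heavy forwarder).
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TRUNCATE = 10000
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
# my_app_set_index is a hypothetical stanza in transforms.conf
TRANSFORMS-route = my_app_set_index

# Search-time settings: applied every time a search runs, so they can be
# changed later without re-indexing. They are normally deployed to the
# search head.
KV_MODE = none
EXTRACT-status = \sstatus=(?<status>\d{3})
FIELDALIAS-client = c_ip AS src_ip
```

A change to the first group only affects data indexed after the change; a change to the second group shows up on the next search, including for old events.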
You can read a bit more here : http://docs.splunk.com/Documentation/Splunk/6.4.2/Data/Aboutindexedfieldextraction
Feel free to contact your instructor and ask also. Typically our EDU team will love to clarify.
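Also, with regard to your configs, check btool and confirm your changes are applied the way you think they should be. For example, running splunk btool props list iis --debug prints the merged iis stanza along with the file each setting comes from.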
If you have a distributed search deployment, processing is split between search peers (indexers) and a search head. You must deploy the changes as follows:
Deploy the props.conf and transforms.conf changes to each of the search peers.
Deploy the fields.conf changes to the search head.
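The passage quoted above appears to describe index-time field extractions specifically, which is why fields.conf is involved. As a minimal sketch of how the three files fit together, assuming a made-up indexed field called site_id and a made-up transform name and regex:

```
# transforms.conf, on each search peer (indexer)
# Hypothetical transform: pull the site number out of "W3SVC<n>"
[iis_site_indexed]
REGEX = \sW3SVC(\d+)\s
FORMAT = site_id::$1
WRITE_META = true

# props.conf, on each search peer (indexer)
[iis]
TRANSFORMS-site = iis_site_indexed

# fields.conf, on the search head
# Tells the search head that site_id is an indexed field
[site_id]
INDEXED = true
```

Ordinary search-time extractions (EXTRACT-, REPORT-, and so on) are a separate case and do not need a fields.conf entry.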
So, based on that, I should only be changing the props and transforms on my indexer, and not making any changes on my search head? Currently I have various settings split between both servers. Should I just consolidate all my field extractions and transforms to the indexer?
You're not alone 🙂 It does take some getting used to. But once you get it, it makes perfect sense. Here is a nice post that helped me get my arms around this. Hopefully, it clarifies things for you.