Okay, so let's level-set here. You have IIS logs arriving in HDFS via some mechanism, and you are searching them via Hunk. The IIS events are basic W3C-formatted strings inside a JSON wrapper. Your goal is to extract the "sub-fields" from within the larger field.
Let's talk about some things that won't work.
First, INDEXED_EXTRACTIONS won't work because Splunk isn't actually indexing your data. Data picked up through Hunk virtual indexes never goes through index-time processing, so index-time extractions never run.
Second, CLONE_SOURCETYPE won't work either, for the same reason: it is applied at index time, and you're not indexing the data.
Third, while the vendor:product:logtype naming convention is common, it is only a naming convention. Sourcetypes that share a prefix do not inherit configuration from one another.
Fourth, while sourcetype renaming is a useful feature, I don't think it does what you want here.
I've got an idea that may work, watch this space. UPDATE!
Thanks @dshpritz for solving one final bugaboo for me. Turns out you can do a DELIMS extraction inside an existing field. But that field has to have been brought to life by a REPORT or an EXTRACT. JSON and other auto-KV extractions don't bring the field to life in time, because automatic key-value extraction runs after EXTRACT and REPORT in the search-time sequence.
So assume we make a new sourcetype for this, since the data doesn't match any of the existing sourcetypes well. In props.conf:
[foo]
EXTRACT-foo = "Event":"(?<Event>[^"]+)"
REPORT-foo = foobarbaz
So our new foo sourcetype uses a regex-based extraction to get Event from the JSON data, treating it as a quoted string. Then our REPORT points at the foobarbaz transform, which uses DELIMS on that new field to split out the individual values. In transforms.conf:
[foobarbaz]
SOURCE_KEY = Event
DELIMS = " "
FIELDS = date,time,s-sitename,s-ip,cs-method,cs-uri-stem,cs-uri-query,s-port,cs-username,c-ip,cs(User-Agent),sc-status,sc-substatus,sc-win32-status
Note that this approach is terrible. I'm glad it works, but I'm made sad by its necessity. In an environment where a forwarder is collecting IIS logs, you'd be far better off letting INDEXED_EXTRACTIONS handle the IIS sourcetype natively. But in your situation, where collection happens elsewhere and everything you do has to happen at search time via Hunk, this may be the best you can get.
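If it helps to see the two stages in isolation, here is a rough Python sketch of the same logic: a regex pulls Event out of the JSON wrapper, then a split on spaces maps the pieces onto the field list. This is only an illustration, not how Splunk implements it. The sample record (including the "Host" key and every value in it) and the function name extract are made up, and Python spells the named group (?P<Event>...) where the props.conf regex uses (?<Event>...).

```python
import re

# Hypothetical sample record; a real JSON wrapper may carry different keys.
RAW = (
    '{"Host":"web01","Event":"2015-03-02 14:05:11 W3SVC1 10.0.0.5 GET '
    '/default.aspx - 80 - 192.168.1.20 Mozilla/5.0+(Windows+NT+6.1) 200 0 0"}'
)

# Same field order as the FIELDS line in the foobarbaz transform.
FIELDS = [
    "date", "time", "s-sitename", "s-ip", "cs-method", "cs-uri-stem",
    "cs-uri-query", "s-port", "cs-username", "c-ip", "cs(User-Agent)",
    "sc-status", "sc-substatus", "sc-win32-status",
]

# Stage 1: the EXTRACT-foo regex, in Python named-group syntax.
EVENT_RE = re.compile(r'"Event":"(?P<Event>[^"]+)"')


def extract(raw):
    """Return Event plus its space-delimited sub-fields, or {} if no Event."""
    match = EVENT_RE.search(raw)
    if match is None:
        return {}
    event = match.group("Event")
    # Stage 2: the DELIMS = " " step, using Event as the source key.
    values = event.split(" ")
    result = {"Event": event}
    result.update(zip(FIELDS, values))
    return result


if __name__ == "__main__":
    for name, value in extract(RAW).items():
        print(f"{name} = {value}")
```

The sketch also shows why the approach is fragile: the mapping is purely positional, so if the IIS field list ever changes, FIELDS has to be updated to match.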