Possible to define a sub-sourcetype?

Motivator

We are ingesting IIS logs wrapped in JSON because we add some extra fields to the log file that contain information we need to pull. IIS itself writes the W3C format, in which the fields are predefined as follows:

Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken

These values reside in the 'Event' key:

{"EventReceivedTime":"2017-02-21 08:00:20","SourceModuleName":"EWIPRD","SourceModuleType":"im_file","FileName":"L:\\Logs\\W3SVC1\\u_ex170221.log","SiteId":"1","WebServer":"<servername>","Event":"2017-02-21 13:00:00 x.x.x.x POST /autodiscover/autodiscover.xml - 443 - x.x.x.x Microsoft+Office/16.0+(Windows+NT+6.2;+Microsoft+Outlook+16.0.7571;+Pro) - 301 0 0 0"}

Is it possible to define extractions for the 'Event' key in transforms.conf, as in the stanza below?
[auto_kv_for_iis_default]
DELIMS = " "
FIELDS = date time s-sitename s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status

Thx

Re: Possible to define a sub-sourcetype?

Contributor

Not sure if I understand what you're trying to do exactly. The subject of your post mentions creating a sub-sourcetype (which I don't think you can do), but I see you mentioning extracting additional fields from a key. Are you trying to extract fields from the values of that other field? Or are you trying to separate events like that into another area (like an eventtype)?

Re: Possible to define a sub-sourcetype?

Motivator

I'm trying to extract fields from the 'Event' key automatically. So far I have created a regex to extract the IIS fields.

So "Event" contains the following information in my example:

"2017-02-21 13:00:00 x.x.x.x POST /autodiscover/autodiscover.xml - 443 - x.x.x.x Microsoft+Office/16.0+(Windows+NT+6.2;+Microsoft+Outlook+16.0.7571;+Pro) - 301 0 0 0"

which breaks down into these IIS key/value pairs:
date - 2017-02-21
time - 13:00:00
s-ip - x.x.x.x
cs-method - POST
cs-uri-stem - /autodiscover/autodiscover.xml
cs-uri-query -
s-port - 443
cs-username -
c-ip - x.x.x.x
cs(User-Agent) - Microsoft+Office/16.0+(Windows+NT+6.2;+Microsoft+Outlook+16.0.7571;+Pro)
cs(Referer) -
sc-status - 301
sc-substatus - 0
sc-win32-status - 0
time-taken - 0
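
To sanity-check a field list against a sample event before putting it into a DELIMS/FIELDS stanza, the breakdown above can be reproduced outside Splunk. This is only a minimal Python sketch, not Splunk configuration: the names `IIS_FIELDS` and `split_event` are made up for illustration, and the sample is a shortened copy of the event posted earlier.

```python
import json

# Field names in the order shown in the breakdown above (an assumption
# based on this one sample event, not on the IIS server's actual config).
IIS_FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status", "time-taken",
]


def split_event(raw_json, field_names=IIS_FIELDS):
    """Parse the JSON wrapper and map the space-delimited Event value to names."""
    wrapper = json.loads(raw_json)
    tokens = wrapper["Event"].split(" ")
    # A mismatch here means the field list does not fit the log line.
    if len(tokens) != len(field_names):
        raise ValueError(f"expected {len(field_names)} tokens, got {len(tokens)}")
    return dict(zip(field_names, tokens))


# Shortened copy of the sample event (some wrapper keys omitted).
sample = (
    '{"SiteId":"1","WebServer":"<servername>",'
    '"Event":"2017-02-21 13:00:00 x.x.x.x POST /autodiscover/autodiscover.xml'
    ' - 443 - x.x.x.x Microsoft+Office/16.0+(Windows+NT+6.2;+Microsoft+Outlook+16.0.7571;+Pro)'
    ' - 301 0 0 0"}'
)

fields = split_event(sample)
print(fields["cs-method"], fields["sc-status"], fields["time-taken"])
```

The length check is the useful part: a field list with a different number of names than the event has tokens (for example, one that includes s-sitename but omits cs(Referer) and time-taken) would raise an error here instead of silently shifting values into the wrong fields.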

If we weren't ingesting these files as json, I'd simply modify the transforms.conf file as follows:

[auto_kv_for_iis_default]
DELIMS = " "
FIELDS = date time s-sitename s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status

I wasn't sure if it is possible to do another field extraction from within an event that has already had its sourcetype defined. Perhaps the regex I have, which extracts the fields at search time, is the way to go.

Thx

Re: Possible to define a sub-sourcetype?

Contributor

Oh, ok, so you were trying to create field extractions without specifying them in a search string? If yes, you can always add them either via the gui at Settings > Fields > Field extractions and specify your sourcetype there, or you can do it manually in props.conf for the sourcetype specified for each event. Is that what you were asking?

Re: Possible to define a sub-sourcetype?

Motivator

Kind of.

The events are already set to the sourcetype _json, so before I created the regex to extract fields at search time from the "Event" field, some fields were already being extracted automatically, such as "SourceModuleName", "SourceModuleType", "FileName", "SiteId", "WebServer", and "Event".

What I thought might be possible was to define extractions for the "Event" field by modifying the transforms.conf file, along the lines of http://docs.splunk.com/Documentation/AddOns/released/MSIIS/Setupaddon (Perform additional steps for search-time field extraction), but I wasn't sure how to extract the fields/values from the already defined "Event" field.

Thx

Re: Possible to define a sub-sourcetype?

Contributor

It shouldn't matter whether the "Event" field is parsed at index time or search time. You can create additional search-time field extractions on the search head for any sourcetype, including for events whose fields are already being extracted. New search-time extractions work on raw events even when existing stanzas (on the indexer or the search head) already parse fields with the same or similar values from those events.

If you want to create new key/multivalue fields from the field "Event" within the same sourcetype where you aren't specifying the regex in the search string, you can do that by going to Settings > Fields > Field extractions and specify your sourcetype there, or you can do it manually in props.conf in the app. Your fields would be extracted "automatically" instead of you having to specify it during search time.

If you want to create new key/multivalue fields from the events (in the json sourcetype) and have those specific events sent to another sourcetype, you may want to explore cloning the events to a different sourcetype of your choosing so you can run your own custom field extractions for that sourcetype. This can be accomplished using CLONE_SOURCETYPE in transforms.conf. Basically, it clones the data from that sourcetype into another sourcetype for you to play with. You would need to do this on your indexers and make sure to specify the appropriate stanzas in both your props.conf and transforms.conf.

http://docs.splunk.com/Documentation/Splunk/latest/admin/propsconf
http://docs.splunk.com/Documentation/Splunk/latest/admin/transformsconf

If I misunderstood you a third time, please let me know.

Re: Possible to define a sub-sourcetype?

Esteemed Legend

Cisco, Palo Alto (and others) do this by using colons in the sourcetype, such as cisco:esa:textmail, cisco:wsa:squid, pan:wildfire_report, pan:newapps, pan:logs, etc.

Then when searching, you can do stuff like sourcetype=pan:* or sourcetype=cisco:*mail, etc.
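
The colons are only a naming convention; the wildcard simply matches against the whole sourcetype string. As a rough illustration outside Splunk, this Python sketch approximates that matching with the standard library's `fnmatchcase` (an approximation chosen for illustration, not how Splunk implements it; `matching` is a made-up helper name):

```python
from fnmatch import fnmatchcase

# Sourcetype names taken from the examples above.
sourcetypes = [
    "cisco:esa:textmail",
    "cisco:wsa:squid",
    "pan:wildfire_report",
    "pan:newapps",
    "pan:logs",
]


def matching(pattern, names=sourcetypes):
    """Return the names that match a '*' wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(matching("pan:*"))
print(matching("cisco:*mail"))
```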

You use rename to update the sourcetype at search-time to something new, when taking a second pass at parsing your stuff.

As far as the other part of your question (adding extra fields), anything can be done SO LONG AS the values that you are adding are inside the raw event. If the data is not inside the raw event, you will need to add the data using SEDCMD so that it is inside the raw event. Fields (at index time) always point to data inside _raw.

So I think that all the nuts-and-bolts are there for you and if I understood you better, I might be able to assemble them for you.

It is a bad idea to allow Splunk to sourcetype your stuff for you; you should always explicitly set the sourcetype.

Re: Possible to define a sub-sourcetype?

Motivator

Thx for the reply and information.

Technically, I can rename the sourcetype. We're ingesting these logs into Hunk, so every time we start to ingest a new log source I have to define the sourcetype in /opt/splunk/etc/apps/search/local/props.conf.

Here is the stanza for our IIS logs:

[source::/LogCentral/IIS/EWIPRD/...]
sourcetype = _json
INDEXED_EXTRACTIONS = JSON

Is it possible to rename the sourcetype from _json to something else, like ms:iis:default, while still keeping the automated field extractions via JSON?

My worry is that by renaming the sourcetype from _json to ms:iis:default I'll wipe out the automated field extractions, unless INDEXED_EXTRACTIONS = JSON is what keeps the automated extractions in place even though the sourcetype is set to something other than _json.

Thx

Re: Possible to define a sub-sourcetype?

Esteemed Legend

Yes, the rename is a search-time alias. Read about it here:
http://docs.splunk.com/Documentation/Splunk/latest/admin/propsconf

In particular:

# The following attribute/value pairs can only be set for a stanza that
# begins with [<sourcetype>]:

rename = <string>
* Renames [<sourcetype>] as <string> at search time
* With renaming, you can search for the [<sourcetype>] with sourcetype=<string>
* To search for the original source type without renaming it, use the field _sourcetype.
* Data from a renamed sourcetype will only use the search-time configuration for the target sourcetype. Field extractions (REPORTS/EXTRACT) for this stanza sourcetype will be ignored.
* Defaults to empty.

Re: Possible to define a sub-sourcetype?

SplunkTrust

My first question would be "how did you collect w3c inside of JSON?" Normally, we'd allow a forwarder to pick up w3c files directly and use INDEXED_EXTRACTIONS=w3c and that'd be the end of it.