Hi,
I read the fields.conf examples, but I still don't understand how to set it up. I am using Field Extraction from the web Manager without an issue, but I would like to know the proper configuration syntax to set up fields.conf with the same regex and number of fields.
^(?<date>[^ ]+)\s[^ ]+\,[0-9]+ \[(?<codetype>[^ ]+)\] New documents[ ]+: (?<newdocs>[^a-zA-Z._]*) \<[0-9 ]+ \((?<processid>[^ ]+)\)\> *DATASOURCE.(?<datasource>.*)\.
This isn't something you want to do in fields.conf. You'll want to do this in transforms/props.
In transforms.conf:
[foo]
REGEX = ^(?<date>[^ ]+)\s[^ ]+\,[0-9]+ \[(?<codetype>[^ ]+)\] New documents[ ]+: (?<newdocs>[^a-zA-Z._]*) \<[0-9 ]+ \((?<processid>[^ ]+)\)\> *DATASOURCE.(?<datasource>.*)\.
And then in props.conf:
[sourcetype]
REPORT-sourcetype = foo
Check the documentation for both props.conf and transforms.conf for more details.
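To make the link between the two files explicit, here is a sketch of the same configuration with comments. The names my_sourcetype and newdocs_fields are made-up placeholders (standing in for sourcetype and foo above); substitute your real sourcetype and whatever transform name you prefer.

```
# transforms.conf
# The stanza name is arbitrary, but props.conf must reference it exactly.
# Each named group (?<name>...) becomes a field of that name.
[newdocs_fields]
REGEX = ^(?<date>[^ ]+)\s[^ ]+\,[0-9]+ \[(?<codetype>[^ ]+)\] New documents[ ]+: (?<newdocs>[^a-zA-Z._]*) \<[0-9 ]+ \((?<processid>[^ ]+)\)\> *DATASOURCE.(?<datasource>.*)\.

# props.conf
# The stanza name must match the sourcetype of your events (hypothetical here).
[my_sourcetype]
# REPORT-<class> applies the transform at search time; <class> is just a unique label.
REPORT-newdocs = newdocs_fields
```

If the fields do not show up, the first thing to check is that the value after REPORT-... matches the transforms.conf stanza name character for character.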
If it's a simple situation where you just have one extract for one sourcetype, you can eliminate the transforms.conf step, and do it all in props.conf:
[sourcetype]
EXTRACT-sourcetype = ^(?<date>[^ ]+)\s[^ ]+\,[0-9]+ \[(?<codetype>[^ ]+)\] New documents[ ]+: (?<newdocs>[^a-zA-Z._]*) \<[0-9 ]+ \((?<processid>[^ ]+)\)\> *DATASOURCE.(?<datasource>.*)\.
Unless you have a really good reason to extract fields at index time instead of at search time, you probably don't want to be using fields.conf. Take a look at the following doc and its related links for the different types of extractions:
http://docs.splunk.com/Documentation/Splunk/5.0/Data/Aboutindexedfieldextraction
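For comparison, index-time extraction is the case where fields.conf does come into play. The following is only a minimal sketch, assuming you wanted just codetype stored as an indexed field; the stanza names my_sourcetype and codetype_indexed are hypothetical, and the doc linked above remains the authority on the details.

```
# transforms.conf
# Index-time transform: WRITE_META writes the field into the index,
# and FORMAT names the field and the capture group that fills it.
[codetype_indexed]
REGEX = \[([^ ]+)\] New documents
FORMAT = codetype::$1
WRITE_META = true

# props.conf
# TRANSFORMS-<class> (not REPORT-) runs the transform at index time.
[my_sourcetype]
TRANSFORMS-codetype = codetype_indexed

# fields.conf
# Tells search that codetype is an indexed field.
[codetype]
INDEXED = true
```

Keep in mind that index-time settings only affect data indexed after the change, which is one reason search-time extraction is usually the better default.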
That being said, you have used the Interactive Field Extractor feature in Splunk, which uses search-time extractions. If this is how you wish to proceed, but you would like the extraction done without entering it into your search via the rex command, then you have several options.
All these options differ based on what you are looking to do. Without knowing what exactly you are looking to do, your question is difficult to answer. If the regex you posted is within the TOKENIZER parameter of your fields.conf then it will most likely not behave as you expect it to. However, the same regex could be used with props.conf and transforms.conf to achieve what you are looking to do.
If so, you will want to read this doc for how you go about doing that:
That first link is dead now; here is the updated link:
http://docs.splunk.com/Documentation/Splunk/6.4.2/Knowledge/Createandmaintainsearch-timefieldextract...
Got it. Thanks!
The field extraction UI just builds the transforms/props.conf stanzas for you. So it really depends on which is easier for you. I find this easier, as I bundle it into an app that is pushed to all my search heads via the deployment server.
Oh, I see. Then just entering it under Manager --> Field extractions is much easier. Are there any disadvantages to creating so many field extractions that way instead of creating a transforms.conf configuration?