Can anybody tell me the major difference between extracting a field from the event at index time and extracting a field using regex in a search? And which is more efficient?
If you extract the field at index time, it may be easier to find your results later on, especially when searching large amounts of data, because a search-time regex has to run against every event the search retrieves while the search is running.
I'd personally stick with index-time extractions, as your searches will be much more performant.
Regex matching is quite expensive in either case, so keep an eye on your data ingestion pipelines. The Monitoring Console (https://docs.splunk.com/Documentation/Splunk/7.1.2/DMC/DMCoverview) may be able to give you an idea of how much processing the regex is taking at extraction time.
Let me know if this helps.
Thanks
Will
@navd Fields can be extracted from events at either search time or index time.
A metadata-like field that applies to all events, such as a folder path name that indicates the type of System, may be extracted at index time depending on the use case. The advantage is that if most of your queries rely on this field (System, for example), you can use tstats on the index-time field. However, it also means additional processing happens while the data is being indexed, which may delay indexing. Understand the impact of index-time field extraction and confirm your use case (including performance testing) before creating one. Refer to the documentation: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Indextimeversussearchtime
Search-time field extraction can be created using Interactive Field Extraction (IFX) and/or KV_MODE, depending on the type of data. Before creating a field extraction in props.conf or transforms.conf, you can try several commands such as rex, erex, extract, kv, etc. Refer to the documentation: http://docs.splunk.com/Documentation/Splunk/latest/Search/Extractfieldswithsearchcommands
Even if you use regular-expression-based extraction with the rex command, once you have tested it with your sample data (including a performance test), you can move the regular expression to props.conf and transforms.conf. Refer to the documentation: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextrac...
The advantage of using IFX or props.conf/transforms.conf over rex is that the field extraction can be configured and maintained in a single place and can be scaled to multiple data sources as well. The rex command, by contrast, ties the extraction to each individual search that uses it.
PS: You should also test the performance of your regular expression, either in Splunk or with a utility like https://regex101.com
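If you would rather time candidate expressions offline before putting them into rex or props.conf, something like the rough Python sketch below may help. Everything in it is an assumption for illustration: the sample events, the "system" field and both patterns are made up, and Python's re module is not the PCRE engine Splunk uses, so treat the numbers as a relative comparison between patterns only, not as a prediction of Splunk performance. Also note Python needs (?P<name>...) for named groups where rex accepts (?<name>...).

```python
import re
import timeit

# Hypothetical sample events; replace these with lines from your own data.
SAMPLE_EVENTS = [
    "2018-07-12 10:15:01 host=web01 system=billing status=200 bytes=5120",
    "2018-07-12 10:15:02 host=web02 system=orders status=500 bytes=128",
    "2018-07-12 10:15:03 host=web01 system=billing status=404 bytes=0",
] * 1000

# Two candidate patterns that extract the same hypothetical "system" field.
CANDIDATES = {
    "anchored": re.compile(r"\bsystem=(?P<system>\w+)"),
    "greedy": re.compile(r".*system=(?P<system>\w+).*"),
}


def extract_all(pattern, events):
    """Return the extracted 'system' value for each event (None if no match)."""
    values = []
    for event in events:
        match = pattern.search(event)
        values.append(match.group("system") if match else None)
    return values


def time_pattern(pattern, events, repeat=3):
    """Best-of-N wall-clock seconds for one full pass over the events."""
    runs = timeit.repeat(lambda: extract_all(pattern, events), number=1, repeat=repeat)
    return min(runs)


if __name__ == "__main__":
    for name, pattern in CANDIDATES.items():
        print(f"{name}: {time_pattern(pattern, SAMPLE_EVENTS):.4f}s")
```

Both patterns should return the same values; the point is only to compare how long each takes on a representative sample. Patterns with a leading .* often do more backtracking than a tightly scoped one, but measure on your own data rather than relying on that.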
Thank you for the details