I have the following message format.
2013-06-17 15:33:01+0200 appid="myapplication" responsetimems="155" message="Calling method="calculate" class="math" data="size="98123" rows="9811" firstcolumn="customername"""
Splunk parsed that into the following fields:
appid = myapplication
responsetimems = 155
message = Calling method=
class = math
data = size=
rows = 9811
firstcolumn = customername
But I want it to be parsed into:
appid = myapplication
responsetimems = 155
message = Calling method="calculate" class="math" data="size="98123" rows="9811" firstcolumn="customername"
method = calculate
class = math
data = size="98123" rows="9811" firstcolumn="customername"
size = 98123
rows = 9811
firstcolumn = customername
How can I do that?
What Kristian writes is correct. You do need to manipulate the extraction mode using a customization. Based on your data sample, there may be two steps to a solution.
Assuming your data is assigned the sourcetype "answers-1371500719", create an entry in props.conf and transforms.conf with the following:
#props.conf
[answers-1371500719]
REPORT-get_kv_fields = get_kv_fields

#transforms.conf
[get_kv_fields]
REGEX = ([a-zA-Z0-9]+)\=\"([a-zA-Z0-9]+)\"
FORMAT = $1::$2
MV_ADD = true
This ensures that you obtain all of the fields and corresponding values that follow this convention, so you get the appropriate key/value pairs. Please note that the message field is still incorrect.
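To see which pairs that pattern picks up, here is a minimal Python sketch that runs the same regular expression over the sample event from the question. It only approximates the behavior: Python's re module is not Splunk's regex engine, and the variable names are made up for illustration.

```python
import re

# Sample event copied from the question.
event = (
    '2013-06-17 15:33:01+0200 appid="myapplication" responsetimems="155" '
    'message="Calling method="calculate" class="math" '
    'data="size="98123" rows="9811" firstcolumn="customername"""'
)

# Same pattern as the REGEX line above; the backslashes before = and "
# are not needed in Python, so they are dropped here.
kv_pattern = re.compile(r'([a-zA-Z0-9]+)="([a-zA-Z0-9]+)"')

# findall returns every (name, value) pair, roughly what FORMAT = $1::$2
# does with the two capture groups.
pairs = kv_pattern.findall(event)
for name, value in pairs:
    print(f"{name} = {value}")
```

On the sample event this yields appid, responsetimems, method, class, size, rows and firstcolumn. The message and data fields are skipped because their values contain characters other than letters and digits (spaces, = and nested quotes), which is why message still needs its own extraction.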
The message field can then be overridden using an inline regular expression at search time,
sourcetype="answers-1371500719" | rex field=_raw "message\=\"(?<message>.+\"?)\"\""
or automatically by updating your props.conf entry
#props.conf
[answers-1371500719]
REPORT-get_kv_fields = get_kv_fields
EXTRACT-message_field = message\=\"(?<message>.+\"?)\"\"
In the end you end up with this:
BTW: Thanks for posting a data sample. It is always easier if we can see the data.
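As a quick check of the message regex used above, the following Python sketch applies it to the sample event. Note one assumption about syntax: Python spells named groups as (?P<name>...), whereas the Splunk examples use (?<name>...), so the pattern is adapted accordingly. Variable names are illustrative only.

```python
import re

# Sample event copied from the question.
event = (
    '2013-06-17 15:33:01+0200 appid="myapplication" responsetimems="155" '
    'message="Calling method="calculate" class="math" '
    'data="size="98123" rows="9811" firstcolumn="customername"""'
)

# The message regex from the answer, with the named group written the
# Python way and the unnecessary backslashes removed.
message_pattern = re.compile(r'message="(?P<message>.+"?)""')

match = message_pattern.search(event)
# The greedy .+ runs to the end of the event and backtracks until the
# two trailing quotes required by the pattern are left over.
message = match.group("message") if match else None
print(message)
```

The captured value keeps the nested key="value" pairs intact and drops only the last two of the three closing quotes, which is the message value the question asks for.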
Yes, you can do it, either through rex extractions in each search or through some configuration in props.conf. The latter would involve EXTRACTs where you specify exactly what you want extracted (pretty much the same regex syntax as for rex). You might want to set KV_MODE to none as well.
EXTRACT-<class> = [<regex>|<regex> in <src_field>]
* Used to create extracted fields (search-time field extractions) that do not reference transforms.conf stanzas.
* Performs a regex-based field extraction from the value of the source field.
* <class> is a unique literal string that identifies the namespace of the field you're extracting.
  **Note:** <class> values do not have to follow field name syntax restrictions. You can use characters other than a-z, A-Z, and 0-9, and spaces are allowed. <class> values are not subject to key cleaning.
* The <regex> is required to have named capturing groups. When the <regex> matches, the named capturing groups and their values are added to the event.
* Use '<regex> in <src_field>' to match the regex against the values of a specific field. Otherwise it just matches against _raw (all raw event data).
* NOTE: <src_field> can only contain alphanumeric characters (a-z, A-Z, and 0-9).
* If your regex needs to end with 'in <string>' where <string> is *not* a field name, change the regex to end with '[i]n <string>' to ensure that Splunk doesn't try to match <string> to a field name.

KV_MODE = [none|auto|multi|json|xml]
* Used for search-time field extractions only.
* Specifies the field/value extraction mode for the data.
* Set KV_MODE to one of the following:
  * none: if you want no field/value extraction to take place.
  * auto: extracts field/value pairs separated by equal signs.
  * auto_escaped: extracts fields/value pairs separated by equal signs and honors \" and \\ as escaped sequences within quoted values, e.g field="value with \"nested\" quotes"
  * multi: invokes the multikv search command to expand a tabular event into multiple events.
  * xml : automatically extracts fields from XML data.
  * json: automatically extracts fields from JSON data.
* Setting to 'none' can ensure that one or more user-created regexes are not overridden by automatic field/value extraction for a particular host, source, or source type, and also increases search performance.
* Defaults to auto.
* The 'xml' and 'json' modes will not extract any fields when used on data that isn't of the correct format (JSON or XML).
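The '<regex> in <src_field>' form described above matches against one field's value instead of _raw. The Python sketch below mimics that idea to pull the data field out of an already extracted message value. The data pattern and the extract_in helper are hypothetical and not part of either answer or of Splunk itself; whether a field is available as a source field at that point in Splunk depends on extraction order, which is not covered here.

```python
import re

# Value of the message field once the outer quotes have been stripped,
# as the question describes the desired result.
message = (
    'Calling method="calculate" class="math" '
    'data="size="98123" rows="9811" firstcolumn="customername"'
)

# Hypothetical pattern: data runs to the end of the message value,
# because the closing quote of data is no longer present there.
data_pattern = re.compile(r'data="(?P<data>.+)$')


def extract_in(pattern, fields, src_field):
    """Match pattern against one field's value and merge any named groups."""
    value = fields.get(src_field)
    if value is None:
        return fields  # source field missing: nothing to extract
    found = pattern.search(value)
    if found:
        fields = {**fields, **found.groupdict()}
    return fields


fields = extract_in(data_pattern, {"message": message}, "message")
print(fields["data"])
```

Matching against the message value rather than the whole event keeps the pattern short, since it does not have to account for the extra closing quotes at the end of the raw event.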
Hope this helps,