Problem parsing fields with spaces at index time for metrics

Engager

Hello all,

I am currently having some problems with filtering my raw data into a metric index. My raw data currently looks like this:

Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376

My main issue is with the 'counter' and 'collection' fields, whose values contain spaces (e.g. Available Bytes).

I initially used the field_extraction transform to parse the data. Here are the relevant stanzas from my props.conf and transforms.conf:

props.conf:

[mkv:meminfo:Memory]
TRANSFORMS-EXTRACT = field_extraction
METRIC-SCHEMA-TRANSFORMS = metric-schema:extract_metrics
category = Log to Metrics

transforms.conf:

[metric-schema:extract_metrics]
METRIC-SCHEMA-MEASURES = Value

But this only seemed to take the first word of each value; e.g. in Splunk, counter would only be 'Available'.

I then tried to manually extract the field using REGEX through the config files. This is what my transforms.conf and props.conf look like at this point:

props.conf:

[mkv:meminfo:Memory]
TRANSFORMS-metricsfields = custom_field_extractor
METRIC-SCHEMA-TRANSFORMS = metric-schema:extract_metrics
category = Log to Metrics

transforms.conf:

[custom_field_extractor]
REGEX = ([a-zA-Z]+)=([^,]*)
FORMAT = $1::$2
WRITE_META = true
REPEAT_MATCH = true

[metric-schema:extract_metrics]
METRIC-SCHEMA-MEASURES = Value

This produces the same result: the counter and collection values are still only 'Available'.
Can anybody see a problem with the strategy that I'm implementing?
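
As a sanity check outside Splunk, the pattern from [custom_field_extractor] can be run against the sample event in Python. This is only a sketch: Python's re module is not the regex engine Splunk uses (although the pattern relies only on syntax common to both), and the names RAW and PATTERN are made up for illustration. In this check the pattern captures the full value including the space, which suggests the truncation may be happening somewhere other than in the regex itself. It also shows that [a-zA-Z]+ cannot match an underscore, so the key for metric_name comes out as name.

```python
import re

# Sample event copied from the post.
RAW = (
    "Date=2019-02-15_00:06:04_+0000,collection=Available Memory,"
    "object=Memory,counter=Available Bytes,metric_name=available_bytes,"
    "instance=0,Value=5155557376"
)

# Same pattern as the REGEX line in [custom_field_extractor].
PATTERN = re.compile(r"([a-zA-Z]+)=([^,]*)")

# findall returns one (key, value) tuple per match, which is roughly
# what REPEAT_MATCH = true asks for (an analogy, not Splunk's behavior).
pairs = PATTERN.findall(RAW)
fields = dict(pairs)

for key, value in pairs:
    print(f"{key!r} -> {value!r}")
```

If the underscore matters, a key class such as [a-zA-Z_]+ would keep metric_name whole.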

NOTE: I have also added a stanza to fields.conf, although I'm not sure it's doing anything:

[metricsfields]
INDEXED=true

Esteemed Legend

Keep everything the same but change this:

REGEX = ,([^=]+)\s*=\s*([^,]+)

Engager

Hey Gregg,

I made the REGEX change you suggested, and when I restarted Splunk it gave me this error:
Bad regex value: ',([^=]+)\s*=\s*(?[^,]+)', of param: transforms.conf / [custom_field_extractor] / REGEX; why: unrecognized character after (? or (?-
One or more regexes in your configuration are not valid. For details, please see btool.log or directly above.

Engager

Hey Gregg, it still doesn't seem to be working 😞 I am still only seeing 'Available' instead of 'Available Bytes'. Could this be some sort of Splunk bug?

Esteemed Legend

I can PROVE that this works. Run this search and look at the results:

| makeresults 
| eval _raw="Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376" 
| rex max_match=0 ",(?<key>[^=]+)\s*=\s*(?<value>[^,]+)"
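
For readers without a Splunk instance at hand, the same pattern can be approximated in Python. This is a sketch under assumptions: Python's re module stands in for the engine behind rex, named groups are spelled (?P<name>...) in Python rather than (?<name>...), and RAW, PATTERN and extracted are illustrative names only.

```python
import re

# Sample event copied from the search above.
RAW = (
    "Date=2019-02-15_00:06:04_+0000,collection=Available Memory,"
    "object=Memory,counter=Available Bytes,metric_name=available_bytes,"
    "instance=0,Value=5155557376"
)

# Same pattern as the rex command, with Python's named-group spelling.
PATTERN = re.compile(r",(?P<key>[^=]+)\s*=\s*(?P<value>[^,]+)")

# finditer walks every non-overlapping match, like max_match=0.
extracted = {m.group("key"): m.group("value") for m in PATTERN.finditer(RAW)}

for key, value in extracted.items():
    print(f"{key!r} -> {value!r}")
```

In this check the values keep their spaces. Because every match has to start with a comma, the leading Date pair is not picked up; a prefix such as (?:^|,) in place of the bare comma would include it, if that field is needed.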

So, why might this not be working? Did you:
Use the ORIGINAL sourcetype value in your stanza header if you are doing sourcetype override/overwrite?
Deploy to the first full instance of Splunk that handles these events (HF or Indexers)?
Restart all Splunk instances there?
Send fresh data in after the restarts?
Test with a search using a Time picker value of All time and SPL like this:

index=* sourcetype=YourOriginalSourcetypeHere _index_earliest=-5m

Esteemed Legend

I blew it and left a stray ? in there. I edited my original answer and fixed it. Try it now.
