Splunk Search

Is it possible to extract dimensions from the source field for metrics imported from CSV files?

Communicator

We already use a custom CSV format to report application metrics. The format is very similar to the one introduced in Splunk 7.
But while Splunk extracts dimensions from the CSV lines, we extract some of the dimensions from the source field.
According to the docs something like that is possible for nearly all other methods of importing metrics, but not for CSV files.

Is there any way I can achieve this without modifying the CSV files (e.g. via search- or index-time field extractions)?

EDIT 1:

This is an example CSV file (source = X:\LogFiles\MyEnvironment\MyApplication\MyInstance\Values.amf):

metric_name,_value
Process.IO,16620.4
Process.ProcessorTime,4.0666666666666664
Process.ThreadCount,40
Process.WorkingSet,258634547.2

We currently use a search time field extraction to extract the following three fields (~dimensions) from the source field:

Environment=MyEnvironment
Application=MyApplication
Instance=MyInstance

Those fields are extracted via the following line in our props.conf:

EXTRACT-source = (?i)LogFiles\\(?<Environment>[^\\]+)\\(?<Application>[^\\]+)\\(?<Instance>[^\\]+)\\ in source

Using either this or an equivalent transform to extract the fields at index time did not work.
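
To check the regex itself independently of Splunk, here is a minimal Python sketch that applies the same pattern to the example source path. It is only an approximation: Python's re module spells named groups (?P<name>...) rather than (?<name>...), and the helper name dimensions_from_source is made up for this illustration.

```python
import re

# Same pattern as the EXTRACT line, adapted to Python's named-group syntax.
# Assumption: Python's `re` behaves like Splunk's PCRE for this simple pattern.
SOURCE_PATTERN = re.compile(
    r"LogFiles\\(?P<Environment>[^\\]+)\\(?P<Application>[^\\]+)\\(?P<Instance>[^\\]+)\\",
    re.IGNORECASE,
)


def dimensions_from_source(source):
    """Return the dimensions encoded in the path, or {} if it does not match."""
    match = SOURCE_PATTERN.search(source)
    return match.groupdict() if match else {}


source = r"X:\LogFiles\MyEnvironment\MyApplication\MyInstance\Values.amf"
print(dimensions_from_source(source))
# {'Environment': 'MyEnvironment', 'Application': 'MyApplication', 'Instance': 'MyInstance'}
```

If this matches but the fields still do not appear at index time, the regex is probably not the culprit and the transform settings are the next thing to look at.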

EDIT 2:
I've tried using the following conf file stanzas:

props.conf:

[metrics_csv]
TRANSFORMS-amf2 = amf2

transforms.conf:

[amf2]
SOURCE_KEY = field:source
REGEX = (?i)LogFiles\\(?<Environment>[^\\]+)\\(?<Application>[^\\]+)\\(?<Instance>[^\\]+)\\

SplunkTrust

Hey, try this please for your transforms.conf:

SOURCE_KEY = MetaData:Source
REGEX = yourregex
WRITE_META = true

You can find more information on how to extract index time fields here:
http://docs.splunk.com/Documentation/Splunk/7.1.0/Data/Configureindex-timefieldextraction

For metrics data, you can completely ignore anything on that page relating to fields.conf.

Communicator

Thanks for your support, you clearly pointed me in the right direction:

  1. SOURCE_KEY = MetaData:Source is required.
  2. WRITE_META = true is required.
  3. yourregex MUST NOT end with a backslash (I ran into that issue long ago but forgot about it: Splunk thinks I want to escape the newline).
  4. Named capturing groups are not working; FORMAT = Environment::$1 Application::$2 Instance::$3 does work.

This transforms.conf stanza works:

[amf2]
SOURCE_KEY = MetaData:Source
REGEX = (?i)LogFiles\\([^\\]+)\\([^\\]+)\\([^\\]+)
FORMAT = Environment::$1 Application::$2 Instance::$3
WRITE_META = true
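
To illustrate what the REGEX and FORMAT pair above produces for the example source path, here is a rough Python sketch. It only emulates the $N substitution for illustration and is not how Splunk implements it; the function name expand_format is hypothetical.

```python
import re

# REGEX and FORMAT copied from the [amf2] stanza; note there is no trailing backslash.
REGEX = re.compile(r"LogFiles\\([^\\]+)\\([^\\]+)\\([^\\]+)", re.IGNORECASE)
FORMAT = "Environment::$1 Application::$2 Instance::$3"


def expand_format(source):
    """Emulate FORMAT expansion: replace each $N with capture group N."""
    match = REGEX.search(source)
    if match is None:
        return None  # no match means no dimensions would be written
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), FORMAT)


source = r"X:\LogFiles\MyEnvironment\MyApplication\MyInstance\Values.amf"
print(expand_format(source))
# Environment::MyEnvironment Application::MyApplication Instance::MyInstance
```

The third group still stops at the next backslash because [^\\]+ cannot consume one, so dropping the trailing backslash from the regex does not change what is captured for paths like this.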

Path Finder

Thanks for pointing this out!
Especially 4. is a no-go for us.

@Splunk: Any news on this?
Support for named capture groups in transforms should also work for metric dimensions.


SplunkTrust

Can you please post a line of sample data in your current CSV format and an explanation of how it should be split?


Communicator

Thanks for the response - I've edited the question.


SplunkTrust

This should be possible even with CSV using index-time field extractions. Can you please show the props + transforms you tried?


Communicator

Done - see edit 2.
