Getting Data In

Looking for working example of metrics from a CSV file with no headers and search-time extraction.

eugenek
Path Finder

Is there a working example out there for ingesting metrics from a CSV file without headers using search-time extraction?

 

I can't get it working when NOT using INDEXED_EXTRACTIONS = csv.

 

1 Solution

eugenek
Path Finder

As my other comment indicates, the information about INDEXED_EXTRACTIONS = csv is misleading. This is what ended up working:

props.conf

[my_metrics]
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD = 25
METRIC-SCHEMA-TRANSFORMS = metric-schema:my_metrics
NO_BINARY_CHECK = true
TIME_FORMAT = %Y/%m/%d %T.%3N
TIME_PREFIX = ^
TRANSFORMS = create_my_idx_fields
category = Log to Metrics
pulldown_type = 1

 

transforms.conf

[create_my_idx_fields]
SOURCE_KEY = _raw
REGEX = ([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^$]*)
FORMAT = my_time::$1 field1::$2 field2::$3 field3::$4 field4::$5 field5::$6 field6::$7 field7::$8 field8::$9 field9::$10 field10::$11 field11::$12 field12::$13
WRITE_META = true
# SAMPLE
# 2020/03/06 13:19:01.142, Tesla P100-SXM2-16GB, 00000000:85:00.0, 418.87.00, P0, 3, 3, 67, 90, 37, 16280, 9453, 6827

[metric-schema:my_metrics]
METRIC-SCHEMA-BLACKLIST-DIMS = my_time
METRIC-SCHEMA-MEASURES = field5,field6,field7,field8,field9,field10,field11,field12
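
To sanity-check the REGEX outside Splunk, here is a minimal Python sketch that runs the same pattern against the SAMPLE line and maps the capture groups to names. It is only an approximation of what the transform does, not Splunk's own processing. The names FIELD_NAMES, MEASURES and extract are made up for this illustration, the eleventh value is named field10 to match the METRIC-SCHEMA-MEASURES list, and the strptime pattern is my assumed Python equivalent of TIME_FORMAT.

```python
import re
from datetime import datetime

# REGEX from [create_my_idx_fields], copied as posted (13 capture groups).
REGEX = re.compile(
    r"([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),"
    r"([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^$]*)"
)

# Hypothetical helper names; field10 is assumed for the eleventh value.
FIELD_NAMES = ["my_time"] + [f"field{i}" for i in range(1, 13)]
MEASURES = [f"field{i}" for i in range(5, 13)]

SAMPLE = (
    "2020/03/06 13:19:01.142, Tesla P100-SXM2-16GB, 00000000:85:00.0, "
    "418.87.00, P0, 3, 3, 67, 90, 37, 16280, 9453, 6827"
)


def extract(line):
    """Return {field name: captured text}, or None if the line does not match."""
    match = REGEX.search(line)
    if match is None:
        return None
    return dict(zip(FIELD_NAMES, match.groups()))


fields = extract(SAMPLE)

# The captures keep the space that follows each comma in the sample;
# float() tolerates that leading whitespace.
measures = {name: float(fields[name]) for name in MEASURES}

# Assumed Python equivalent of TIME_FORMAT = %Y/%m/%d %T.%3N
event_time = datetime.strptime(fields["my_time"], "%Y/%m/%d %H:%M:%S.%f")

for name, value in fields.items():
    print(f"{name} = {value!r}")
```

Printing the values with repr makes the leading space in each captured value visible, which is worth knowing if the dimensions need to match exact strings at search time.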


to4kawa
Ultra Champion
FIELDS = <quoted string list>
* NOTE: This setting is only valid for search-time field extractions.
* Used in conjunction with DELIMS when you are performing delimiter-based
  field extraction and only have field values to extract.
* FIELDS enables you to provide field names for the extracted field values,
  in list format according to the order in which the values are extracted.
* NOTE: If field names contain spaces or commas they must be quoted with " "
  To escape, use \.
* The following example is a delimiter-based field extraction where three
  field values appear in an event. They are separated by a comma and then a
  space.
    [commalist]
    DELIMS = ", "
    FIELDS = field1, field2, field3
* Default: ""

https://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf

If the file has a header line, it is better to send it to the nullQueue.

props.conf

 

[your sourcetype]
SHOULD_LINEMERGE = false
INDEXED_EXTRACTIONS = none
TRANSFORMS-header = null
REPORT-csv = noheader_csv

 

transforms.conf

 

[null]
REGEX = your_header_word
DEST_KEY = queue
FORMAT = nullQueue

[noheader_csv]
DELIMS = ","
FIELDS = field1, field2, field3
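
As a rough illustration of what these two stanzas are meant to do for events, here is a small Python sketch: drop any line that matches the header pattern, then split the remaining lines on commas and pair the values with the names from FIELDS. This is only a conceptual model, not how Splunk implements DELIMS or nullQueue routing. HEADER_PATTERN, FIELD_NAMES and parse_lines are hypothetical names, and stripping whitespace around each value is my own assumption.

```python
import re

# Placeholder pattern taken from the [null] stanza above.
HEADER_PATTERN = re.compile(r"your_header_word")
# Names from FIELDS in [noheader_csv].
FIELD_NAMES = ["field1", "field2", "field3"]


def parse_lines(lines):
    """Skip header lines, then map comma-separated values to field names."""
    rows = []
    for line in lines:
        if HEADER_PATTERN.search(line):
            continue  # analogous to routing the line to nullQueue
        values = [value.strip() for value in line.split(",")]
        rows.append(dict(zip(FIELD_NAMES, values)))
    return rows


rows = parse_lines(["your_header_word,b,c", "1, 2, 3"])
print(rows)
```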

 

@eugenek

I made a slight mistake, so I fixed it.

Setting INDEXED_EXTRACTIONS = none makes it clear that no index-time extraction is performed.


eugenek
Path Finder

@to4kawa  were you able to get this working with INDEXED_EXTRACTIONS = none?

I'm sure it works for events, but were you able to get this working for metrics?


eugenek
Path Finder

From talking to some people, it seems that index-time extractions might be the only option for metrics.

The documentation in props.conf notes:

 

NOTE: Index-time field extractions have performance implications.
      Create additions to the default set of indexed fields ONLY
      in specific circumstances. Whenever possible, extract
      fields only at search time.

 

 

Other guidance also discourages index-time extractions:
https://docs.splunk.com/Documentation/Splunk/8.0.4/Data/Configureindex-timefieldextraction

So this guidance might need to be updated for metrics.
