Getting Data In

slow "command.search.kv" phase

lraynal
Explorer

I have slow searches on one particular index, which is receiving apache access.log files.
When I inspect my jobs, I see a very long "command.search.kv" phase.
I guess I made a rookie mistake on the regular expressions.
The log format is set by others, I can't change it. It contains stuff like :

1.2.3.4 - - [23/Dec/2015:14:44:33 +0100] "GET http://1.2.3.4/ABCDEF/Pop.do HTTP/1.0" 200 886 16352 "-" "check_http/v1.4.15 (nagios-plugins 1.4.15)"

or

1.2.3.4 - - [23/Dec/2015:14:54:08 +0100] "GET /ABCDEF/Pop.do HTTP/1.1" 200 10738 18287 "http://172.18.56.35/ABCDEF/Map.do" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)"

(notice sometimes complete hostname, sometimes not)

props.conf

[web_access_log:myapp]
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 50
TIME_PREFIX = \[
TIME_FORMAT =
KV_MODE = none
category = app
EXTRACT-fields = ^(?P<c_ip>[^ ]+)[^\[\n]*\[(?P<date>[^:]+):(?P<time>[^ ]+)[^ \n]* (?P<timezone>\+\d+)\]\s+"(?P<cs_method>\w+)[^ \n]* (?P<cs_uri>[^ ]+)\s+(?P<protocol>[^"]+)[^ \n]* (?P<sc_status>\d+)\s+(?P<sc_bytes>\d+)\s+(?P<final_time_taken>\d+)\s+"(?P<cs_referer>[^"]+)"\s+"(?P<cs_useragent>[^"]+)
EXTRACT-cs_uri_query = ^(?:[^ ]+)[^\[\n]*\[(?:[^:]+):(?:[^ ]+)[^ \n]* (?:\+\d+)\]\s++"(?:\w+)[^ \n]* (?:[^\/]*)(?:\/{2})?(?:[^\/]*)(?P<cs_uri_query>[^ ]+)\s
EXTRACT-cs_hostname = ^(?:[^ ]+)[^\[\n]*\[(?:[^:]+):(?:[^ ]+)[^ \n]* (?:\+\d+)\]\s++"(?:\w+)[^ \n]* (?:[^\/]*)(?:\/{2})?(?:[^\/]*)\/(?P<cs_hostname>\w*)\/
EVAL-cs_hostname = if(isnull(cs_hostname),host,cs_hostname)
EVAL-final_time_taken = final_time_taken/1000000
EVAL-cs_uri_stem = cs_uri_query
Tags (3)
0 Karma

lraynal
Explorer

I ended up doing index-time field extraction, for the field used in the event type I was querying a lot in my reports.
The transition is a bit of a pain to handle when you have a lot of data in your index already, but it seems easier than using a datamodel in this case..

transforms.conf:

[web_access_log_asset]
REGEX = (?:GET|POST)\s(?:[^ ]*(?<asset>\.css|\.gif|\.jpg|\.js[^p]|\.ico|\.css|\.png))
FORMAT = asset::$1
WRITE_META = true

fields.conf:

[asset]
INDEXED=true
0 Karma

sduchene_splunk
Splunk Employee
Splunk Employee

Why not creating an accelerated datamodel, and use this datamodel in your search ? this would solve your performance issue. Can you gives this idea a try ?

I'm sure you know how to do this. for readers who don't :
http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Aboutdatamodels

and then accelerate it :
http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Acceleratedatamodels

0 Karma

lraynal
Explorer

Problem is this index is 80 times larger than any other on this indexer ..
Anyway, a search will take 10 seconds on this index and 1 second on another
index="foo" | head 10000

PerformanceTroubleshooting: I saw it; nothing I can detect on systems check (not a linux superstar..)
I have an anti-pattern on an eventtype (too many NOTs), but I want to be sure there is no other way before going to index-time field extraction

[slm_request]
color =
description =
disabled = 0
priority = 1
search = sourcetype="web_access_log:foo" NOT ( cs_uri_stem="*.css*" OR cs_uri_stem="*.gif*" OR cs_uri_stem="*.ico*" OR cs_uri_stem="*.jpg*" OR cs_uri_stem="*.js" OR cs_uri_stem="*.js?*" OR cs_uri_stem="*.png*")
tags =

I didn't set a transforms.conf on this app
props list :

[web_access_log:foo]
ANNOTATE_PUNCT = True
AUTO_KV_JSON = true
BREAK_ONLY_BEFORE = 
BREAK_ONLY_BEFORE_DATE = True
CHARSET = UTF-8
DATETIME_CONFIG = /etc/datetime.xml
EVAL-cs_hostname = if(isnull(cs_hostname),host,cs_hostname)
EVAL-cs_uri_stem = cs_uri_query
EVAL-final_time_taken = final_time_taken/1000000
EXTRACT-cs_hostname = ^(?:[^ ]+)[^\[\n]*\[(?:[^:]+):(?:[^ ]+)[^ \n]* (?:\+\d+)\]\s+(?:\d+)\s+"(?:\w+)[^ \n]* (?:[^\/]*)(?:\/{2})?(?P<cs_hostname>[^\/]*)\/
EXTRACT-cs_uri_query = ^(?:[^ ]+)[^\[\n]*\[(?:[^:]+):(?:[^ ]+)[^ \n]* (?:\+\d+)\]\s+(?:\d+)\s+"(?:\w+)[^ \n]* (?:[^\/]*)(?:\/{2})?(?:[^\/]*)(?P<cs_uri_query>[^ ]+)\s
EXTRACT-fields = ^(?P<c_ip>[^ ]+)[^\[\n]*\[(?P<date>[^:]+):(?P<time>[^ ]+)[^ \n]* (?P<timezone>\+\d+)\]\s+(?P<final_time_taken>\d+)\s+"(?P<cs_method>\w+)[^ \n]* (?P<cs_uri>[^ ]+)\s+(?P<protocol>[^"]+)[^ \n]* (?P<sc_status>\d+)\s+(?P<sc_bytes>\d+)\s+"(?P<cs_referer>[^"]+)"\s+"(?P<cs_useragent>[^"]+)
HEADER_MODE = 
KV_MODE = none
LEARN_SOURCETYPE = true
LINE_BREAKER_LOOKBEHIND = 100
MAX_DAYS_AGO = 31
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 3600
MAX_DIFF_SECS_HENCE = 604800
MAX_EVENTS = 256
MAX_TIMESTAMP_LOOKAHEAD = 50
MUST_BREAK_AFTER = 
MUST_NOT_BREAK_AFTER = 
MUST_NOT_BREAK_BEFORE = 
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
SHOULD_LINEMERGE = false
TIME_FORMAT = 
TIME_PREFIX = \[
TRANSFORMS = 
TRUNCATE = 10000
category = app
description = foo
detect_trailing_nulls = false
maxDist = 100
priority = 
sourcetype = 
0 Karma

lraynal
Explorer

I set the TIME_FORMAT in props.conf, no noticeable difference in performance.

0 Karma

sduchene_splunk
Splunk Employee
Splunk Employee

Hello Laurent can you provide a complete props list? (Cli : SPLUNK btool props list) as well as transforms?
Can you confirm its faster with other source type on this same indexer? (See jobs running time through rest)
I would specify time format.
Also have you read this ?
https://wiki.splunk.com/Community:PerformanceTroubleshooting

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...