I have slow searches on one particular index, which receives Apache access.log files.
When I inspect my jobs, I see a very long "command.search.kv" phase.
I guess I made a rookie mistake in the regular expressions.
The log format is set by others and I can't change it. It contains lines like:
1.2.3.4 - - [23/Dec/2015:14:44:33 +0100] "GET http://1.2.3.4/ABCDEF/Pop.do HTTP/1.0" 200 886 16352 "-" "check_http/v1.4.15 (nagios-plugins 1.4.15)"
or
1.2.3.4 - - [23/Dec/2015:14:54:08 +0100] "GET /ABCDEF/Pop.do HTTP/1.1" 200 10738 18287 "http://172.18.56.35/ABCDEF/Map.do" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)"
(Notice that the request URI sometimes includes the scheme and host, and sometimes only the path.)
props.conf:
[web_access_log:myapp]
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 50
TIME_PREFIX = \[
TIME_FORMAT =
KV_MODE = none
category = app
EXTRACT-fields = ^(?P<c_ip>[^ ]+)[^\[\n]*\[(?P<date>[^:]+):(?P<time>[^ ]+)[^ \n]* (?P<timezone>\+\d+)\]\s+"(?P<cs_method>\w+)[^ \n]* (?P<cs_uri>[^ ]+)\s+(?P<protocol>[^"]+)[^ \n]* (?P<sc_status>\d+)\s+(?P<sc_bytes>\d+)\s+(?P<final_time_taken>\d+)\s+"(?P<cs_referer>[^"]+)"\s+"(?P<cs_useragent>[^"]+)
EXTRACT-cs_uri_query = ^(?:[^ ]+)[^\[\n]*\[(?:[^:]+):(?:[^ ]+)[^ \n]* (?:\+\d+)\]\s++"(?:\w+)[^ \n]* (?:[^\/]*)(?:\/{2})?(?:[^\/]*)(?P<cs_uri_query>[^ ]+)\s
EXTRACT-cs_hostname = ^(?:[^ ]+)[^\[\n]*\[(?:[^:]+):(?:[^ ]+)[^ \n]* (?:\+\d+)\]\s++"(?:\w+)[^ \n]* (?:[^\/]*)(?:\/{2})?(?:[^\/]*)\/(?P<cs_hostname>\w*)\/
EVAL-cs_hostname = if(isnull(cs_hostname),host,cs_hostname)
EVAL-final_time_taken = final_time_taken/1000000
EVAL-cs_uri_stem = cs_uri_query
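
To check what the EXTRACT-fields pattern actually captures on both URI variants, here is a minimal Python sketch. It is only an offline check under a few assumptions: Python's `re` module is not the PCRE engine Splunk uses, so it says nothing about search-time performance. The helper name `extract`, the `final_time_taken_s` key, and the shortened user agent in the second sample are illustrative and not part of my configuration.

```python
import re

# Pattern copied from the EXTRACT-fields line in props.conf above,
# split across several raw strings for readability only.
EXTRACT_FIELDS = re.compile(
    r'^(?P<c_ip>[^ ]+)[^\[\n]*\[(?P<date>[^:]+):(?P<time>[^ ]+)[^ \n]* '
    r'(?P<timezone>\+\d+)\]\s+"(?P<cs_method>\w+)[^ \n]* (?P<cs_uri>[^ ]+)'
    r'\s+(?P<protocol>[^"]+)[^ \n]* (?P<sc_status>\d+)\s+(?P<sc_bytes>\d+)'
    r'\s+(?P<final_time_taken>\d+)\s+"(?P<cs_referer>[^"]+)"'
    r'\s+"(?P<cs_useragent>[^"]+)'
)

# The two sample events from the post (user agent of the second one shortened).
SAMPLES = [
    '1.2.3.4 - - [23/Dec/2015:14:44:33 +0100] '
    '"GET http://1.2.3.4/ABCDEF/Pop.do HTTP/1.0" 200 886 16352 '
    '"-" "check_http/v1.4.15 (nagios-plugins 1.4.15)"',
    '1.2.3.4 - - [23/Dec/2015:14:54:08 +0100] '
    '"GET /ABCDEF/Pop.do HTTP/1.1" 200 10738 18287 '
    '"http://172.18.56.35/ABCDEF/Map.do" "Mozilla/4.0 (compatible; MSIE 7.0)"',
]


def extract(line):
    """Return the named groups for one event, or None if the pattern fails."""
    match = EXTRACT_FIELDS.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Mirrors EVAL-final_time_taken = final_time_taken/1000000
    fields["final_time_taken_s"] = int(fields["final_time_taken"]) / 1000000
    return fields


if __name__ == "__main__":
    for sample in SAMPLES:
        fields = extract(sample)
        print(fields["cs_uri"], fields["sc_status"], fields["final_time_taken_s"])
```

Both samples match, so the pattern itself handles the full-URL and path-only forms; the open question is only how much time the three separate anchored extractions cost at search time.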