We have JSON logs stored in Splunk. A sample log record looks like:
{
  "data":
  {
    "hostname": "http://server.com",
    "uri": "/api/something/",
    "service": "service_1",
    "http_status_code": "500"
  }
}
The following search query (to find endpoints that throw 5xx errors) runs on a schedule and writes the results to a KV Store lookup table:
host=data_source "data{}.http_status_code"= 5* | eval endpoint_url='data{}.hostname'+'data{}.uri' | stats count(endpoint_url) as error-count by endpoint_url | outputlookup 5xx-error-lookup
The new requirement is that we only show endpoints (results) that were not part of the previous search's results.
I am able to filter results against a simple field like service_name with something like:
host=data_source "data{}.http_status_code"= 5* NOT [| inputlookup 5xx-error-lookup | fields service-name | rename service-name as data{}.service_name ] | eval endpoint_url='data{}.hostname'+'data{}.uri' | stats count(endpoint_url) as error-count by endpoint_url
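The exclusion that the NOT subsearch performs can be sketched outside Splunk. A minimal Python sketch of the same idea, with hypothetical sample values (the field names follow the question, not real data):

```python
# Hypothetical lookup rows and events mirroring the question's fields.
lookup_rows = [{"service-name": "service_1"}]

events = [
    {"data{}.service": "service_1", "data{}.http_status_code": "500"},
    {"data{}.service": "service_2", "data{}.http_status_code": "503"},
]

# The subsearch effectively builds a set of values to exclude...
excluded = {row["service-name"] for row in lookup_rows}

# ...and the outer search keeps only events matching none of them.
kept = [e for e in events if e["data{}.service"] not in excluded]
print([e["data{}.service"] for e in kept])  # ['service_2']
```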
What I actually want to do is split endpoint_url into 'hostname' and 'uri' and filter results based on a match for BOTH of these fields. Any inputs, please?
Thanks in advance.
@technie101, since everything else is working fine for you and you want to split endpoint_url as hostname and uri, I am giving you only that piece.
[| inputlookup 5xx-error-lookup
| fields endpoint_url
| rex field=endpoint_url "(?<hostname>\w+\:\/\/\w+\.\w+)(?<uri>\/\w+\/\w+\/)"
| rename hostname as "data{}.hostname"
| rename uri as "data{}.uri" ]
PS:
I have tested the regular expression against only the one sample record provided. Please check it against the various hostname and uri values you actually have, to make sure the regular expression works as expected for all possible patterns. Use regex101.com for testing regular expressions against sample data.
It would also be better to create a field alias (or rename) in the base index search rather than in the lookup here, since it is preferable to have normalized field names without special characters like {, } and . in them.
Please try it out and confirm.
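As a quick sanity check, the same pattern can be exercised with Python's re module against the single sample URL from the question (regex101.com supports the same named-group syntax):

```python
import re

# The same pattern as in the rex command; tested only against the
# one sample record from the question.
pattern = r"(?P<hostname>\w+\:\/\/\w+\.\w+)(?P<uri>\/\w+\/\w+\/)"

m = re.match(pattern, "http://server.com/api/something/")
print(m.group("hostname"))  # http://server.com
print(m.group("uri"))       # /api/something/
```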
Hi @technie101,
Can you try the approach below? Here I'm adding a "|" (pipe) as a separator in the endpoint_url field. See the lookup search below.
host=data_source "data{}.http_status_code"= 5* | eval endpoint_url='data{}.hostname'+'|'+'data{}.uri' | stats count(endpoint_url) as error-count by endpoint_url | outputlookup 5xx-error-lookup
And split it back apart using split and mvindex. Check the search below.
YOUR SEARCH | eval hostname=mvindex(split(endpoint_url,"|"),0), uri=mvindex(split(endpoint_url,"|"),1)
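What split and mvindex do here can be mirrored in Python; the endpoint_url value below is assumed from the question's sample record:

```python
# Mirror of: eval hostname=mvindex(split(endpoint_url,"|"),0),
#            uri=mvindex(split(endpoint_url,"|"),1)
endpoint_url = "http://server.com|/api/something/"

parts = endpoint_url.split("|")   # split(endpoint_url, "|")
hostname = parts[0]               # mvindex(..., 0)
uri = parts[1]                    # mvindex(..., 1)

print(hostname)  # http://server.com
print(uri)       # /api/something/
```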
Let me know if you need any help.
Thanks
Hi Kamlesh,
The issue was specifically about using eval with split in the subsearch against the lookup. So the ask is:
host=data_source "data{}.http_status_code"= 5* NOT [| inputlookup 5xx-error-lookup | *<SPLIT endpoint_url FROM LOOKUP AND THEN USE THE 2 FIELDS AS FILTERS ON THE OUTER SEARCH>* ] | eval endpoint_url='data{}.hostname'+'data{}.uri' | stats count(endpoint_url) as error-count by endpoint_url
Any pointers here?
Hi Kamlesh - any help here please?
Hi
Apologies for the late reply.
Can you please try this?
host=data_source "data{}.http_status_code"= 5* NOT [| inputlookup 5xx-error-lookup | eval hostname=mvindex(split(endpoint_url,"|"),0),uri=mvindex(split(endpoint_url,"|"),1) | return @hostname @uri] | eval endpoint_url='data{}.hostname'+'data{}.uri' | stats count(endpoint_url) as error-count by endpoint_url
I get an error saying: 'Error in 'eval' command: The arguments to the 'split' function are invalid.'