I have access logs whose entries look like this:
server - - [date& time] "GET /google/page1/page1a/633243463476/googlep1?sc=RT&lo=en_US HTTP/1.1" 200 350 85
The rex I use for it is:
| rex field=_raw "(?<SRC>\d+\.\d+\.\d+\.\d+).+\]\s\"(?<http_method>\w+)\s(?<uri_path>\S+)\s(?<uri_query>\S+)\"\s(?<statusCode>\d+)\s(?<body_size>\d+)\s\s(?<response_time>\d+)"
Is there a way to separate the URI into two or three parts?
/google/page1/page1a/633243463476/googlep1?sc=RT&lo=en_US
TO
/google
/page1/page1a/633243463476/googlep1?sc=RT&lo=en_US
OR
/google
/page1/page1a/633243463476/googlep1
?sc=RT&lo=en_US
This will get the 3 parts
(?<uri_root>/[^/]+)(?<uri_path>[^?\s]+)\s?(?<uri_query>\S+)
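To try the pattern without touching real data, here is a minimal sketch that feeds it the sample URI from the question. It assumes the URI is already in a field called uri (a name chosen for illustration), and the leading ^ anchor is my addition so the match starts at the first slash.

```spl
| makeresults
``` sample URI copied from the question ```
| eval uri = "/google/page1/page1a/633243463476/googlep1?sc=RT&lo=en_US"
``` ^ is added here only to pin the match to the start of the field ```
| rex field=uri "^(?<uri_root>/[^/]+)(?<uri_path>[^?\s]+)\s?(?<uri_query>\S+)"
| table uri_root uri_path uri_query
```

On this input, uri_root should come out as /google, uri_path as /page1/page1a/633243463476/googlep1, and uri_query as ?sc=RT&lo=en_US.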
Thank you, that worked like a charm. However, I used
(?<uri_root>/[^/]+)(?<uri_path>[^?\s]+)\s(?<uri_query>\S+)
uri_query seemed to give results for HTTP/1.1.
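If uri_query picks up HTTP/1.1, one likely cause is a request with no query string: the pattern then skips the space and takes the next token. A possible variant, offered as a sketch rather than a drop-in fix, makes the query group optional and requires it to start with a literal ?. The field name uri and the second test URI (/facebook/page2) are hypothetical.

```spl
| makeresults
``` two test URIs: the sample from the question and a hypothetical one without a query string ```
| eval uri = split("/google/page1/page1a/633243463476/googlep1?sc=RT&lo=en_US,/facebook/page2", ",")
| mvexpand uri
``` uri_query only matches when a literal ? is present; otherwise the field is left unset ```
| rex field=uri "^(?<uri_root>/[^/?\s]+)(?<uri_path>[^?\s]*)(?<uri_query>\?\S*)?"
| table uri uri_root uri_path uri_query
```

When applying this to _raw instead of a uri field, drop the ^ and anchor the pattern after the HTTP method, since other parts of the line (a date, for example) may contain slashes.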
Can you also please check this? It is a follow-up question.
https://community.splunk.com/t5/Splunk-Search/Using-lookup-command-after-rex-field/td-p/624450
An alternative to regex is to use split, which can be more semantically explicit. (And slightly more efficient.)
Now to using split, assuming that you already have the field uri:
| eval uri = split(uri, "?")
| eval uri_query = "?" . mvindex(uri, 1) ``` ?sc=RT&lo=en_US ```
| eval uri = split(mvindex(uri, 0), "/")
| eval root = "/" . mvindex(uri, 1) ``` /google ```
| eval remainder = "/" . mvjoin(mvindex(uri, 2, -1), "/") ``` /page1/page1a/633243463476/googlep1 ```
This gives
remainder | root | uri_query |
/page1/page1a/633243463476/googlep1 | /google | ?sc=RT&lo=en_US |
The search didn't give any results. Also, how do I get results for all the other companies, like facebook and twitter?
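On the other companies: nothing in the pattern is specific to google, since uri_root simply captures whatever the first path segment is. A rough sketch to check this, using made-up URIs for facebook and twitter (the paths and the field name uri are assumptions for illustration only):

```spl
| makeresults
``` hypothetical URIs, one per company ```
| eval uri = split("/google/page1/googlep1?sc=RT&lo=en_US,/facebook/page1/fbp1?sc=RT&lo=en_US,/twitter/page1/twp1?sc=RT&lo=en_US", ",")
| mvexpand uri
| rex field=uri "^(?<uri_root>/[^/]+)(?<uri_path>[^?\s]+)\s?(?<uri_query>\S+)"
``` one row per first path segment ```
| stats count by uri_root
```

If a search against the real logs returns nothing, it may help to run the base search first and confirm that the field being passed to rex actually exists on those events.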