I have a log source that breaks a URL into separate chunks (e.g. domain, uri string, uri query) within the log. All parts are parsed and labeled correctly. However, I would like to create an additional field that combines all the parts of the URL in the correct order to make a complete URL, and name that new field "url".
Here's a sample log:
2019-04-03 16:47:49 4 10.10.10.10 200 TCP_MISS 5005 723 GET https some.example.com 443 /1023/random/url_picture.png ?u=3892342034 UUser - some.example.com image/png https://cdn.example.com/343/index.html "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko" OBSERVED "Web Ads/Analytics;Content Servers" - 10.10.10.11 random-guid-string - - - 203.0.113.0
In this log, there is a URL that is broken up into 3 main chunks.
There is the main domain, some.example.com, then the uri path, /1023/random/url_picture.png, and then the uri query, ?u=3892342034.
I want to create an additional field called "url" that combines (concatenates) these 3 values together within the field so that the end result is "url=some.example.com/1023/random/url_picture.png?u=3892342034"
I do not have access to SSH on this Splunk instance, so I have to do it through the GUI.
I have gone to the Fields > Field transformations section and attempted to create the new field with a regex-based transformation.
I have the following expression:
^\d{4}\x2d\d{2}\x2d\d{2}\s\d{2}\x3a\d{2}\x3a\d{2}\s\d+\s(?:\S+\s){6}\S+\s(\S+)\s\S+\s(?:\x2d|(\S+))\s(?:\x2d|(\S+))\s
This expression gets all the desired information within each capture group.
So then, within the Format section, I would normally put something like field_name::$1 (and so on)... but in this case I want to concatenate multiple captures and assign them to a single field named "url"
So, I put url::$1$2$3
and hoped to see all three captures brought together... but instead, all I get is a statically set value for the new field named "url" of "$1$2$3".
Does Splunk have a way to concatenate multiple captures from within a regular expression into a single field name?
Hopefully this makes sense, and thanks for any help provided.
You cannot do concatenated values in search time field extractions like you tried.
For this you create a calculated field (which is similar to eval expressions in the search bar). In the GUI you find that under Settings -> Fields -> Calculated Fields.
If the fields are extracted inline (EXTRACT) or via REPORT, concatenating them at search time with a calculated field will still work, because calculated fields are evaluated after those extractions in the search-time order of operations. Using EVAL-url=domain_name.uri_path.uri_query in props.conf could be the better option.
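For reference, here is a minimal sketch of what that setting might look like as a props.conf stanza. The sourcetype name proxy_access is a made-up placeholder, and the field names are assumed to match the ones your existing extractions produce. A calculated field created in the GUI under Settings -> Fields -> Calculated Fields should amount to the same kind of entry.

```
# props.conf
# "proxy_access" is a hypothetical sourcetype name; use your own.
[proxy_access]
# Calculated field: runs at search time, after EXTRACT/REPORT extractions,
# so domain_name, uri_path and uri_query already exist when this is evaluated.
EVAL-url = domain_name . uri_path . uri_query
```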
Thanks for pointing me to the calculated field section, I was able to get the result without using a regex, but instead just created the eval expression to say "domain_name + uri_path + uri_query" and called it "url".
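One thing to watch for, given that the regex above leaves a capture empty when the log has "-" in that position: in an eval expression, concatenating with a field that has no value yields no value for the whole result. A hedged variant that treats missing parts as empty strings could look like this (again, proxy_access is a hypothetical sourcetype name, and this assumes the domain is always present):

```
# props.conf
# "proxy_access" is a hypothetical sourcetype name; use your own.
[proxy_access]
# coalesce() returns its first non-null argument, so a missing path or
# query contributes an empty string instead of nulling out the whole url.
EVAL-url = domain_name . coalesce(uri_path, "") . coalesce(uri_query, "")
```

The same expression, without the EVAL-url prefix, can be pasted into the eval expression box of a calculated field in the GUI.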
Not sure what you mean, transforms.conf spec is very clear about using FORMAT to create concatenated fields:
* At index time only, you can use FORMAT to create concatenated fields:
* Example: FORMAT = ipaddress::$1.$2.$3.$4
...
* NOTE: You cannot create concatenated fields with FORMAT at search time.
That functionality is only available at index time.
Using an EVAL (which can be defined from GUI as a Calculated Field) is the only option (apart from doing something similar in the query itself).
You could do that at search time using concatenation, for example: your search | eval url=domain_name.uri_path.uri_query
Would this be acceptable? You can also put that into a macro if you use it often. There are also URL tools and parsers on Splunkbase if you are interested - https://splunkbase.splunk.com/app/2734/
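As a rough sketch of the macro idea, the definition might look like the following. The macro name build_url is invented for illustration. Without file access, the same macro can be created in the GUI under Settings -> Advanced search -> Search macros.

```
# macros.conf
# "build_url" is a hypothetical macro name.
[build_url]
definition = eval url=domain_name.uri_path.uri_query

# Usage in a search (macro names are wrapped in backticks):
#   your search | `build_url`
```

This still has to be added to each search that needs the field, which is the main difference from a calculated field.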
Yeah, that was working great added to a search, but what I wanted was a parsed field all ready to go without having to add that search into every spot.