Hello Everyone,
I'm new to regex. Could you please help me extract the URL name only up to .com or .net?
This regex, GET\s\w+:(?<URL>[^"]+), captures the whole URL, but I need to capture only up to .com or .net.
Please also help me extract the fields http_method and status.
Below are the sample log lines.
<14>Jan 19 04:32:59 XXXXXX accesslog_SIEM: Info: 1674102779.113 336 - 10.X.X.X TCP_MISS/200 271 GET http://us-hnl-anx-r001.router.teamviewer.com/din.aspx?s=00000000&id=909083993&client=DynGate&p=10000... us-hnl-anx-r001.router.teamviewer.com din.aspx?s=00000000&id=909083993&client=DynGate&p=10000001 - application/octet-stream DEFAULT_CASE_12-DOMPVM.WebControl.AP-DOMPVM.WebControl.ID-NONE-NONE-NONE-DefaultGroup-NONE - 53843 us-hnl-anx-r001.router.teamviewer.com 80 1 IW_meet 5.0 0 - "0" 0 0 1 - - - - - 0 0 - - - - IW_meet - "Online Meetings" "TeamViewer" "Presentation / Conferencing" - - 6.45 0 - - 0 "Unknown" - 0 "Unknown" - - - - - - "Mozilla/4.0 (compatible; MSIE 6.0; DynGate)" 191
<14>Jan 19 04:32:59 XXXXX accesslog_SIEM: Info: 1674102779.121 7 - 10.130.130.152 TCP_DENIED_SSL/403 0 POST https://activity.windows.com:443/v3/feeds/me/$batch - v3/feeds/me/$batch "INDIADomain\username@INDIADomain" - DROP_WEBCAT_7-BGC.BlockInternetAccess.DP-DOMPVM.Generalusers.ID-NONE-NONE-NONE-NONE-NONE - 61519 activity.windows.com 443 1 IW_comp 5.0 - - - - - - - - - - - - - - - - - IW_comp - "Computers and Internet" "Unknown" "Unknown" - - 0.00 0 - - - - - - - - - - - - - "SGPlatform 2.0" 21040
Try this regex
\d+\.\d+\.\d+\.\d+\s(?<status>\S+)\s\d+\s(?<method>\S+)\shttps?:\/\/(?<domain>.*?)(?:\.com|\.net)
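To see what the expression above captures, here is a minimal Python sketch run against shortened copies of the two sample events. It rests on a few assumptions: Python's re module needs (?P<name>...) rather than the (?<name>...) spelling that Splunk accepts, the masked client address 10.X.X.X is replaced by the placeholder 10.0.0.1 so that the leading IP pattern can match, and the dots before com and net are escaped so that they match only a literal period. The helper name extract is made up for illustration.

```python
import re

# Same expression as above, respelled for Python: (?P<name>...) instead of (?<name>...).
PATTERN = re.compile(
    r"\d+\.\d+\.\d+\.\d+\s(?P<status>\S+)\s\d+\s(?P<method>\S+)\s"
    r"https?://(?P<domain>.*?)(?:\.com|\.net)"
)

# Shortened sample events; 10.0.0.1 is a placeholder for the masked 10.X.X.X.
SAMPLES = [
    "Info: 1674102779.113 336 - 10.0.0.1 TCP_MISS/200 271 GET "
    "http://us-hnl-anx-r001.router.teamviewer.com/din.aspx?s=00000000",
    "Info: 1674102779.121 7 - 10.130.130.152 TCP_DENIED_SSL/403 0 POST "
    "https://activity.windows.com:443/v3/feeds/me/$batch",
]


def extract(event):
    """Return the named groups as a dict, or None when the event does not match."""
    match = PATTERN.search(event)
    return match.groupdict() if match else None


for event in SAMPLES:
    print(extract(event))
```

In this sketch the status group holds the whole token, for example TCP_MISS/200, because \S+ runs to the next space, and the domain group stops just before .com.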
Hi Rich,
Thank you for your answer.
After applying the regex below, .com and .net are not captured at the end.
https?:\/\/(?<domain>.*?)(?:.com|.net)
Below is the output:
activity.windows
px.ads.linkedin
Also, the above regex does not capture the domain/URL details for the sample log below, which is received from the same device.
Could you please help with this?
<14>Jan 19 04:32:59 xxxxx accesslog_SIEM: Info: 1674102779.144 250 - 1x.1xx.1xx.xx TCP_MISS_SSL/200 0 TCP_CONNECT 192.111.4.115:443 cloud-ec-asn.amp.cisco.com - - - DECRYPT_ADMIN_2-NONE-DOMPVM.Generalusers.ID-NONE-NONE-NONE-DefaultGroup-NONE - 55009 cloud-ec-asn.amp.cisco.com 443 2 IW_comp 9.4 1 - - - - - - - - - - - - - - - - IW_comp - "Computers and Internet" "Unknown" "Unknown" - - 0.00 0 - - - - - - - - - - - - - - 0
I read the OP as wanting the URL up to, but not including, .com or .net. To include them, use this regex:
\d+\.\d+\.\d+\.\d+\s(?<status>\S+)\s\d+\s(?<method>\S+)\shttps?:\/\/(?<domain>.*?(?:\.com|\.net))
This regex works with the two sample events in the OP. The one in your latest reply is quite different and will require a different regex. If these events will be in the same stream then consider using separate regexes for each field you wish to extract.
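The suggestion to use a separate regex per field could be sketched as follows in Python. None of these patterns come from the thread; they are hypothetical and rest on assumptions: the status token looks like TCP_ followed by word characters, a slash and three digits; the method is the second token after it; and the first hostname in the event ending in .com or .net is the one wanted. Python's (?P<name>...) spelling is used in place of (?<name>...).

```python
import re

# Hypothetical per-field patterns (not from the thread); each one is tried on its own,
# so a field can still be found when another pattern does not match.
FIELD_PATTERNS = {
    "status": re.compile(r"\s(?P<status>TCP_\w+/\d{3})\s"),
    "method": re.compile(r"TCP_\w+/\d{3}\s\d+\s(?P<method>\S+)"),
    "domain": re.compile(r"(?P<domain>[\w-]+(?:\.[\w-]+)*\.(?:com|net))\b"),
}

# Shortened copies of sample events from this thread, masked addresses left as posted.
HTTP_EVENT = (
    "Info: 1674102779.113 336 - 10.X.X.X TCP_MISS/200 271 GET "
    "http://us-hnl-anx-r001.router.teamviewer.com/din.aspx?s=00000000"
)
CONNECT_EVENT = (
    "Info: 1674102779.144 250 - 1x.1xx.1xx.xx TCP_MISS_SSL/200 0 TCP_CONNECT "
    "192.111.4.115:443 cloud-ec-asn.amp.cisco.com - - -"
)


def extract_fields(event):
    """Apply every pattern independently and collect the fields that were found."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(event)
        if match:
            fields[name] = match.group(name)
    return fields


print(extract_fields(HTTP_EVENT))
print(extract_fields(CONNECT_EVENT))
```

Because no pattern depends on the client IP or on an http:// prefix, the same three expressions cover both the GET event and the TCP_CONNECT event in this sketch. Real traffic may contain hostnames with other endings, which these patterns would miss.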
If you want .com and .net included, try this:
\d+\.\d+\.\d+\.\d+\s(?<status>\S+)\s\d+\s(?<method>\S+)\shttps?:\/\/(?<domain>.*?(\.com|\.net))
If this isn't what you want, please provide a sample event and the values you are expecting to extract from it.
Hi,
Thank you for your inputs.
I have added the same regex to the props.conf file on the Splunk heavy forwarder, but the fields are not being extracted on the search head.
Could you please tell me whether I have made a mistake? Below is the props.conf file:
[user@XXXXX local]$ cat props.conf
[cp_log1]
category = Custom
pulldown_type = 1
[wsa_test]
category = Custom
EXTRACT-src_ipaddress = .+[^\d](?<ipaddress>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
EXTRACT-dest_ipaddress = TCP_(.+[^\d](?<dstipaddress>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):\d)
EXTRACT-domain = https?:\/\/(?<domain>.*?(?:.com|.net))
EXTRACT-username = GOLDBAR\\(?<username>[\w]+)
pulldown_type = 1
Thank you
Hi,
I have written the regexes in the props.conf file, but the extracted fields are not appearing on the search head. The props.conf configuration is in my previous post. Could you please advise whether anything else needs to be included in the props.conf file?
Thank you
There's nothing wrong with the regexes; however, they will extract fields only when events contain text that matches the expressions.
Make sure the HF and SH were restarted after the props.conf files were modified.
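One way to check the point about matching text before touching Splunk is to run the expressions against sample events offline. The Python sketch below does that for the domain and username expressions from the [wsa_test] stanza, respelled with (?P<name>...) because Python does not accept (?<name>...). The events are shortened copies of samples from this thread, and the helper name matching_fields is made up for illustration.

```python
import re

# Two expressions from the [wsa_test] stanza, respelled with (?P<name>...) for Python.
EXTRACTIONS = {
    "domain": re.compile(r"https?:\/\/(?P<domain>.*?(?:.com|.net))"),
    "username": re.compile(r"GOLDBAR\\(?P<username>[\w]+)"),
}

# Shortened copies of two sample events from this thread.
POST_EVENT = (
    r"TCP_DENIED_SSL/403 0 POST https://activity.windows.com:443/v3/feeds/me/$batch "
    r'- v3/feeds/me/$batch "INDIADomain\username@INDIADomain"'
)
CONNECT_EVENT = (
    "TCP_MISS_SSL/200 0 TCP_CONNECT 192.111.4.115:443 cloud-ec-asn.amp.cisco.com - - -"
)


def matching_fields(event):
    """Return only the fields whose expression finds a match in the event."""
    found = {}
    for name, pattern in EXTRACTIONS.items():
        match = pattern.search(event)
        if match:
            found[name] = match.group(name)
    return found


print(matching_fields(POST_EVENT))
print(matching_fields(CONNECT_EVENT))
```

In this sketch the POST event yields a domain but no username, since its user string does not start with GOLDBAR\, and the TCP_CONNECT event yields neither field because it contains no http:// or https:// prefix.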