We're ingesting log files from Windows DNS servers. These log entries contain the src_domain as
(6)config(4)edge(5)skype(3)com(0)
(6)s-0001(8)s-msedge(3)net(0)
(4)api3(7)central(6)sophos(3)com(0)
(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)
(12)settings-win(4)data(9)microsoft(3)com(0)
To process this information further, I'm trying to normalize the src_domain field so it looks like a regular DNS record, e.g.
config.edge.skype.com
api3.central.sophos.com
I've already replaced every (digit) occurrence with a . using a calculated field in my props.conf:
EVAL-src_domain_punct = replace(src_domain, "(\d+)", ".")
So I get
.config.edge.skype.com.
.api3.central.sophos.com.
But I would like to omit the first and last . so I get a clean DNS record.
Is there a way to do this kind of parsing?
Regards
You can definitely take two passes at it.
| makeresults
| eval src_domain = "(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)"
| eval src_domain_punct1 = replace(src_domain, "\(\d+\)",".")
| eval src_domain_punct = replace(src_domain_punct1, "^\.(.*)\.$", "\1")
My first replace is a bit different than yours to escape the "(" characters so the regex doesn't think it's a capture group. In my second replace, I use a capture group and then collect the contents with a "\1".
It's very likely that you can also do this in a single pass with sed or a somewhat better regular expression.
Hope that helps a little. Also this:
https://unix.stackexchange.com/questions/270023/sed-to-change-dns-log-string-format
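To illustrate why escaping the parentheses matters, here is a minimal Python sketch. It assumes Python's `re.sub` treats these two patterns the same way a PCRE-style engine would; it is not Splunk code, and the variable names are made up for the example.

```python
import re

# One of the sample values from the question.
sample = "(6)s-0001(8)s-msedge(3)net(0)"

# Unescaped: the parentheses form a capture group, so only the digit
# runs are matched and the literal "(" and ")" stay in the string.
unescaped = re.sub(r"(\d+)", ".", sample)

# Escaped: the literal "(", the digits, and the literal ")" are matched
# together and replaced by a single dot.
escaped = re.sub(r"\(\d+\)", ".", sample)

print(unescaped)
print(escaped)
```

Note that the unescaped pattern also touches digits that belong to a label, such as the 0001 in s-0001, which is another reason to match the parentheses literally.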
Here it is (still 2 passes) with rex and sed:
| makeresults
| eval src_domain = "(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)"
| rex field=src_domain mode=sed "s/^\(\d+\)|\(\d+\)$//g"
| rex field=src_domain mode=sed "s/\(\d+\)/./g"
This works, too, with both sed expressions in a single rex call:
| makeresults
| eval src_domain = "(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)"
| rex field=src_domain mode=sed "s/^\(\d+\)|\(\d+\)$//g s/\(\d+\)/./g"
Thanks for your answer, it works great! However, I would prefer to have the field extracted in props.conf so my users don't need to have the query at hand. I took your approach and came up with the following solution in props.conf:
[MSAD:NT6:DNS]
EVAL-src_domain_punct = trim(replace(src_domain, "\(\d+\)", "."),".")
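For anyone who wants to sanity-check the replace-then-trim logic outside of Splunk, here is a rough Python sketch run against the sample values from the question. It assumes that `re.sub` and `str.strip(".")` behave like Splunk's `replace` and `trim` for these inputs; the function name `normalize_dns_name` is hypothetical.

```python
import re

def normalize_dns_name(src_domain: str) -> str:
    """Turn '(6)config(4)edge(5)skype(3)com(0)' into 'config.edge.skype.com'."""
    # Step 1: replace every "(<digits>)" length marker with a dot.
    dotted = re.sub(r"\(\d+\)", ".", src_domain)
    # Step 2: strip the leading and trailing dots left by the outer markers.
    return dotted.strip(".")

samples = [
    "(6)config(4)edge(5)skype(3)com(0)",
    "(6)s-0001(8)s-msedge(3)net(0)",
    "(4)api3(7)central(6)sophos(3)com(0)",
    "(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)",
    "(12)settings-win(4)data(9)microsoft(3)com(0)",
]

for value in samples:
    print(normalize_dns_name(value))
```

Stripping the dots as a second step keeps the regex simple, and it leaves the digits and hyphens inside labels such as s-0001 and settings-win untouched.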