Parsing src_domain from sourcetype=[msad:nt6:dns] to get clean dns record names

hayduk
Path Finder

We're ingesting log files from Windows DNS servers. These log entries contain the src_domain as

(6)config(4)edge(5)skype(3)com(0)
(6)s-0001(8)s-msedge(3)net(0)
(4)api3(7)central(6)sophos(3)com(0)
(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)
(12)settings-win(4)data(9)microsoft(3)com(0)

To process this information further, I'm trying to normalize the src_domain field so that it looks like a regular DNS record, e.g.

config.edge.skype.com
api3.central.sophos.com

I've already replaced every (digit) occurrence with . using a calculated field in my props.conf.

EVAL-src_domain_punct = replace(src_domain, "(\d+)", ".")

So I get

.config.edge.skype.com.
.api3.central.sophos.com.

But I would like to omit the first and last . so I get a clean DNS record.

Is there a way to do this kind of parsing?

Regards


memarshall63
Communicator

You can definitely take two passes at it.

| makeresults
| eval src_domain = "(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)"
| eval src_domain_punct1 = replace(src_domain, "\(\d+\)", ".")
| eval src_domain_punct = replace(src_domain_punct1, "^\.(.*)\.$", "\1")

My first replace differs from yours: it escapes the parentheses so the regex treats them as literal characters rather than as a capture group. In my second replace, I match the leading and trailing dots literally, capture everything between them, and then collect the contents with "\1".
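
To see why the escaping matters, a quick side-by-side check on one of the sample values from the question may help. This is only a sketch: the field names unescaped and escaped are made up for illustration and are not part of the original setup.

```
| makeresults
| eval src_domain = "(6)config(4)edge(5)skype(3)com(0)"
| eval unescaped = replace(src_domain, "(\d+)", ".")
| eval escaped = replace(src_domain, "\(\d+\)", ".")
| table src_domain unescaped escaped
```

With the unescaped pattern only the digits are replaced, so the parentheses should stay in place, as in (.)config(.)edge(.)skype(.)com(.). The escaped pattern consumes each whole (digit) group and should give .config.edge.skype.com. with the leading and trailing dots still attached.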

You can very likely also do this in a single pass with sed or a better regular expression.

Hope that helps a little. Also this:
https://unix.stackexchange.com/questions/270023/sed-to-change-dns-log-string-format

memarshall63
Communicator

Here it is (still 2 passes) with rex and sed:

| makeresults 
| eval src_domain = "(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)"
| rex field=src_domain mode=sed "s/^\(\d+\)|\(\d+\)$//g" 
| rex field=src_domain mode=sed "s/\(\d+\)/./g"

memarshall63
Communicator

This works, too.

| makeresults 
| eval src_domain = "(4)tsfe(14)trafficshaping(3)dsp(2)mp(9)microsoft(3)com(0)"
| rex field=src_domain mode=sed "s/^\(\d+\)|\(\d+\)$//g s/\(\d+\)/./g"

hayduk
Path Finder

Thanks for your answer! It works great. However, I would prefer to have the field extracted in props.conf already, so that my users do not need to have the query at hand. I took your approach and came up with the following solution in props.conf:

[MSAD:NT6:DNS]
EVAL-src_domain_punct = trim(replace(src_domain, "\(\d+\)", "."),".")
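
Before relying on the calculated field, the same expression can be tried in an ad hoc search against a couple of the sample values from the question. This is a minimal sketch: it assumes nothing beyond the eval expression above, and the ";" separator is only a convenience for building the test values.

```
| makeresults
| eval src_domain = split("(6)config(4)edge(5)skype(3)com(0);(4)api3(7)central(6)sophos(3)com(0)", ";")
| mvexpand src_domain
| eval src_domain_punct = trim(replace(src_domain, "\(\d+\)", "."), ".")
| table src_domain src_domain_punct
```

If the expression behaves as intended, src_domain_punct should show config.edge.skype.com and api3.central.sophos.com. The replace turns every (digit) group into a dot, and trim then strips the dots from both ends.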