Solved: How to achieve dnstap field extraction?

yaye · 06-07-2023

Hello,

I am struggling a bit with regex and field extractions. I need to write my own sourcetype because I haven't found anything pre-made for dnstap. Maybe I overlooked something and there is one ready to use.

I have the following raw event text:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:  24094
;; flags: qr aa rd ra    ; QUESTION: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 67816834b9432822c5a508fd59b65054fb5bbab0c5fe14f8
;; QUESTION SECTION:
;www.test.aa.			IN	A
;; ANSWER SECTION:
www.test.aa.		60	IN	CNAME	testserver.domain
www.test.aa.		60	IN	A	192.168.1.20
;; AUTHORITY SECTION:
test.aa.		60	IN	NS	localhost.

I want to extract the "ANSWER SECTION", but my regex fails:

;;\sANSWER\sSECTION:\v(?<response_query>\S+)\s+(?<response_ttl>\S+)\s+(?<response_class>\S+)\s+(?<response_type>\S+)\s+(?<response>\S+)

The problem is that only the first line of the section is captured, but I need to capture every line because I need all the values. The "ANSWER SECTION" can consist of one line or several lines.

I'm using regex101.com with the regex flags "multi line" and "single line" as described in props.conf -> EXTRACT-<class>.

yaye · 06-09-2023

Solution found with the help of the Reddit community:

Regex Challenge - Field Extraction : r/Splunk (reddit.com)

PickleRick · 06-09-2023

So that's exactly what we said: a two-step approach. First you parse the whole section, then you parse the separate entries from it.

My warning about multivalued fields still holds.

PickleRick · 06-07-2023

One thing, as @isoutamo already pointed out: you should first "split" the event into sections, then parse the "sets" of fields from each section.

But there is an additional problem - if you parse the answer section into multivalued fields, you will have separate mvfields with no relation between them. Splunk doesn't handle multi-level structures very well.
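
One way to keep the values of each answer record together is to expand the section into one result per line before extracting the individual fields. A minimal sketch, assuming the event text is in _raw as in the sample from the question; the field names answer_section and answer_line are made up for illustration, and records that contain a semicolon would cut the section short because the first regex stops at the first ";":

```
| rex ";;\sANSWER\sSECTION:\v(?<answer_section>[^;]+)"
| rex field=answer_section max_match=0 "(?<answer_line>[^\r\n]+)"
| mvexpand answer_line
| rex field=answer_line "^(?<response_query>\S+)\s+(?<response_ttl>\S+)\s+(?<response_class>\S+)\s+(?<response_type>\S+)\s+(?<response>\S+)"
| table response_query response_ttl response_class response_type response
```

After mvexpand every answer record is its own result, so the query name, TTL, class, type, and value of one record stay on the same row instead of ending up in unrelated multivalue fields. The trade-off is that one event becomes several results.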

isoutamo · 06-07-2023

Hi

One option is to do this in two steps, like this:

| makeresults
| eval _raw = ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:  24094
;; flags: qr aa rd ra    ; QUESTION: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 67816834b9432822c5a508fd59b65054fb5bbab0c5fe14f8
;; QUESTION SECTION:
;www.test.aa.			IN	A
;; ANSWER SECTION:
www.test.aa.		60	IN	CNAME	testserver.domain
www.test.aa.		60	IN	A	192.168.1.20
;; AUTHORITY SECTION:
test.aa.		60	IN	NS	localhost."
| rex max_match=0 ";;\sANSWER\sSECTION:\v(?<line>[^;]+)"
| rex max_match=0 field=line "(?<response_query>\S+)\s+(?<response_ttl>\S+)\s+(?<response_class>\S+)\s+(?<response_type>\S+)\s+(?<response>\S+)"
| fields - _raw _time

If you do this in props.conf, you need to make sure the extraction names are chosen so that line gets extracted first, because the second extraction depends on it.
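
As a rough, untested sketch of how the two steps could look as a search-time configuration: as far as I know an inline EXTRACT keeps only the first match, so this uses REPORT with transforms.conf and MV_ADD instead. The sourcetype name dnstap, the stanza names, and the field name answer_section are assumptions for illustration only.

```
# props.conf
[dnstap]
# transforms are applied in the order listed, so the section is extracted first
REPORT-dnstap_answer = dnstap_answer_section, dnstap_answer_records

# transforms.conf
[dnstap_answer_section]
REGEX = ;;\sANSWER\sSECTION:\v(?<answer_section>[^;]+)

[dnstap_answer_records]
# read from the field produced by the previous transform
SOURCE_KEY = answer_section
REGEX = (?<response_query>\S+)\s+(?<response_ttl>\S+)\s+(?<response_class>\S+)\s+(?<response_type>\S+)\s+(?<response>\S+)
# keep every match as an additional value instead of only the first one
MV_ADD = true
```

This yields multivalue fields, so the fields of one answer record are still not tied to each other; the order of values is the only link between them.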

r. Ismo
