In the Splunk Add-on for Infoblox, the dest_ip field does not always parse correctly, especially when multiple IP addresses or CNAME records are returned. Here is an instance where the parsing works fine:
... query: www.amazon.com IN A + (54.239.25.200)
Due to this stanza in transforms.conf:
[infoblox_dns_extract_field_7]
REGEX = query:\s.+\sIN\s(\S+)\s\+\s\((\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\)
SOURCE_KEY = client_message
FORMAT = the_query_type::$1 dest_ip::$2
However, here is an instance where it does not work:
... query: download.cdn.mozilla.net IN A response: NOERROR + download.cdn.mozilla.net. 100 IN CNAME 2-01-2967-001b.cdx.cedexis.net.; 2-01-2967-001b.cdx.cedexis.net. 59 IN CNAME wildcard.cdn.mozilla.net.edgesuite.net.; wildcard.cdn.mozilla.net.edgesuite.net. 2490 IN CNAME a1284.g.akamai.net.; a1284.g.akamai.net. 9 IN A 23.15.7.155; a1284.g.akamai.net. 9 IN A 23.15.7.122;
This may be a little tough since the number of fields varies, but I am open to thoughts or an update to the TA.
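To make the failure concrete, here is a rough Python sketch that runs the regex from the stanza above against both sample messages. It only exercises the pattern outside Splunk; the `extract` helper and the variable names are made up for illustration, and Python's `re` engine is assumed to behave like Splunk's PCRE for this particular pattern.

```python
import re

# Regex copied from the [infoblox_dns_extract_field_7] stanza.
PATTERN = re.compile(
    r"query:\s.+\sIN\s(\S+)\s\+\s\((\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\)"
)

working = "query: www.amazon.com IN A + (54.239.25.200)"
failing = (
    "query: download.cdn.mozilla.net IN A response: NOERROR + "
    "download.cdn.mozilla.net. 100 IN CNAME 2-01-2967-001b.cdx.cedexis.net.; "
    "2-01-2967-001b.cdx.cedexis.net. 59 IN CNAME wildcard.cdn.mozilla.net.edgesuite.net.; "
    "wildcard.cdn.mozilla.net.edgesuite.net. 2490 IN CNAME a1284.g.akamai.net.; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.155; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.122;"
)


def extract(message):
    """Return (the_query_type, dest_ip), or None when the pattern misses."""
    match = PATTERN.search(message)
    return match.groups() if match else None


print(extract(working))
print(extract(failing))
```

The second message never contains the "+ (ip)" form the pattern insists on, so nothing is extracted from it at all.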
Hi TonyLeeVT,
Agreed that some of these extractions aren't doing exactly what's intended. Internally we're working on some updates, but we don't have a time frame just yet.
First the message_type needed some rework.
EVAL-message_type = if(match(_raw,"Response:|response:"),"Response",if(match(_raw,"query:\s(\S+)\s(\w+)\s(\S+)\s((?:\+|\-)\S*)\s\((\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\)"),"Query","unknown"))
Then split the request and response regex.
[dns_request]
REGEX = client\s(?<dns_request_client_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|(?:::)?(?:[a-zA-Z\d]{1,4}::?){1,7}[a-zA-Z\d]{0,4})#(?<dns_request_client_port>\d+).*\squery:\s(?<dns_request_queried_domain>\S+)\s(?<dns_request_class_name>\w+)\s(?<dns_request_type_name>\w+)\s(?<dns_request_setDC>(?:\+|\-)\S*)\s\((?<dns_request_name_serverIP>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|(?:::)?(?:[a-zA-Z\d]{1,4}::?){1,7}[a-zA-Z\d]{0,4})\)
[dns_response]
REGEX = \S+\s+(?<server_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\snamed\[(?<pid>\d+)\]\:\s(?<log_date>\S+)\s(?<log_time>\S+)\sclient\s(?<dns_response_client_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|(?:::)?(?:[a-zA-Z\d]{1,4}::?){1,7}[a-zA-Z\d]{0,4})#(?<dns_response_client_port>\d+)?[\D]*\s(?<dns_response_protocol>\w+):\squery:\s(?<dns_response_queried_domain>\S+)\s(?<dns_response_class_name>\w+)\s(?<dns_response_type_name>\w+)\sresponse:\s(?<dns_response_rcode>\w+)\s(?<dns_response_flags>\S+)\s?(?<dns_response_RR_in_TEXT>[\S+\s+]*)?
And to deal with the multiple records within the response, we can split them out again:
[dns_incepted]
REGEX=(?<dns_record>[^;]+)
SOURCE_KEY=dns_response_RR_in_TEXT
MV_ADD=true
[dns_records_extract]
REGEX = (?<dns_answer_name>\S+)\s(?<dns_answer_ttl>\d+)\s(?<dns_class>\S+)\s(?<dns_type>\S+)\s(?<dns_rdata>\S+)
SOURCE_KEY = dns_record
MV_ADD=true
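For anyone who wants to check the two-stage idea outside Splunk, here is a minimal Python sketch of what [dns_incepted] and [dns_records_extract] are meant to do together: split the answer text on semicolons, then pull the fields out of each record. It assumes dns_response_RR_in_TEXT holds everything after the response flags in the sample event; `parse_records` and the variable names are hypothetical, and the named groups are rewritten in Python's (?P<name>...) spelling.

```python
import re

# Stage 1: same idea as [dns_incepted], one match per ';'-separated record.
RECORD_SPLIT = re.compile(r"[^;]+")

# Stage 2: same fields as [dns_records_extract], in Python named-group syntax.
RECORD_FIELDS = re.compile(
    r"(?P<dns_answer_name>\S+)\s(?P<dns_answer_ttl>\d+)\s"
    r"(?P<dns_class>\S+)\s(?P<dns_type>\S+)\s(?P<dns_rdata>\S+)"
)

# Assumed content of dns_response_RR_in_TEXT for the sample event.
rr_text = (
    "download.cdn.mozilla.net. 100 IN CNAME 2-01-2967-001b.cdx.cedexis.net.; "
    "2-01-2967-001b.cdx.cedexis.net. 59 IN CNAME wildcard.cdn.mozilla.net.edgesuite.net.; "
    "wildcard.cdn.mozilla.net.edgesuite.net. 2490 IN CNAME a1284.g.akamai.net.; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.155; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.122;"
)


def parse_records(text):
    """Split the answer section into records and extract fields from each."""
    records = []
    for chunk in RECORD_SPLIT.findall(text):
        match = RECORD_FIELDS.search(chunk)
        if match:  # skip leftovers such as trailing whitespace
            records.append(match.groupdict())
    return records


records = parse_records(rr_text)
a_record_ips = [r["dns_rdata"] for r in records if r["dns_type"] == "A"]
print(a_record_ips)
```

Filtering on dns_type is one way to get at only the addresses, since dns_rdata holds hostnames for the CNAME records.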
Hope this helps for now; however, with all the changes, it may be better to wait for the update.
Don
What would you want the dest_ip to be in the second example? I put together this updated regex, which will grab the same IP in example one and the 7.122 address in the second example. But I'm not sure if that's what you need and/or how well it will work in other, unidentified scenarios.
https://regex101.com/r/gN4zF0/1
query:\s.+\sIN\s+(\S)\s\+?\s?\(?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\)?
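As a quick check of that behaviour, here is a small Python sketch that runs the regex above against both sample messages. The `extract` helper and variable names are only for illustration, and Python's `re` is assumed to match Splunk's PCRE for this pattern. The greedy ".+" is what pushes the match to the last "IN A" record, which is why the 7.122 address wins.

```python
import re

# Regex copied from the post above; (\S) captures a single-character type.
PATTERN = re.compile(
    r"query:\s.+\sIN\s+(\S)\s\+?\s?\(?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\)?"
)

example_one = "query: www.amazon.com IN A + (54.239.25.200)"
example_two = (
    "query: download.cdn.mozilla.net IN A response: NOERROR + "
    "download.cdn.mozilla.net. 100 IN CNAME 2-01-2967-001b.cdx.cedexis.net.; "
    "2-01-2967-001b.cdx.cedexis.net. 59 IN CNAME wildcard.cdn.mozilla.net.edgesuite.net.; "
    "wildcard.cdn.mozilla.net.edgesuite.net. 2490 IN CNAME a1284.g.akamai.net.; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.155; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.122;"
)


def extract(message):
    """Return (query_type, dest_ip), or None when the pattern misses."""
    match = PATTERN.search(message)
    return match.groups() if match else None


print(extract(example_one))
print(extract(example_two))
```

Only one address comes back per event, so the 7.155 record in the second example is still dropped.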
Interesting regex. It is better than a complete miss, which is what occurs right now. I wonder if it would be possible to turn the results into an array using props/transforms. I have been experimenting with an SPL solution and found:
rex field=_raw max_match=0
max_match
Syntax: max_match=<int>
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields are multivalued fields.
Default: 1, use 0 to mean unlimited.
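Outside Splunk, the effect of unlimited matching can be sketched in Python with `re.findall`, which returns every match rather than just the first. This is only a rough analogy to max_match=0; the A_RECORD_IP pattern and the `all_dest_ips` helper below are my own guesses for illustration, not something taken from the TA, and they only look at IPv4 A records.

```python
import re

# Hypothetical pattern: an IPv4 address after "IN A", with or without
# the "+ (" prefix seen in the single-answer log format.
A_RECORD_IP = re.compile(r"\sIN\sA\s(?:\+\s\()?(\d{1,3}(?:\.\d{1,3}){3})")

single = "query: www.amazon.com IN A + (54.239.25.200)"
multi = (
    "query: download.cdn.mozilla.net IN A response: NOERROR + "
    "download.cdn.mozilla.net. 100 IN CNAME 2-01-2967-001b.cdx.cedexis.net.; "
    "2-01-2967-001b.cdx.cedexis.net. 59 IN CNAME wildcard.cdn.mozilla.net.edgesuite.net.; "
    "wildcard.cdn.mozilla.net.edgesuite.net. 2490 IN CNAME a1284.g.akamai.net.; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.155; "
    "a1284.g.akamai.net. 9 IN A 23.15.7.122;"
)


def all_dest_ips(raw):
    """Collect every A-record address in the event, like a multivalue field."""
    return A_RECORD_IP.findall(raw)


print(all_dest_ips(single))
print(all_dest_ips(multi))
```

The "IN A" in the query portion of the second event is followed by "response:" rather than an address, so it is skipped and only the two answer records are collected.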