I have some real DNS data collected by an IDS. I can retrieve it with the following search:
index=ids sourcetype=suricata event_type=dns | table _time src_ip domain
I have read the "Operationalize Machine Learning" part of the DGA App for Splunk:
Setup notes:
1. Create an index that holds domain names and computed features (we used an index named "dga_proxy")
2. Activate scheduled searches (app menu: More > Alerts) to generate sample data and fill this index.
3. Check the macro domain_input in Settings > Advanced Search if you have custom naming
Following the instructions above, I did the following:
1. Created an index named dga_prod.
2. Created a scheduled search (alert) with the following SPL:
index=ids event_type=dns
|stats latest(_time) as _time,values(src_ip) as src_ip,values(dest_ip) as dest_ip,values(dns.answer{}.rrtype) as type,values(dns.type) as dns_type,values(asset_name) as asset_name count by domain
| `ut_shannon(domain)`
| `ut_meaning(domain)`
| eval ut_digit_ratio = 0.0
| eval ut_vowel_ratio = 0.0
| eval ut_domain_length = max(1,len(domain))
| rex field=domain max_match=0 "(?<digits>\d)"
| rex field=domain max_match=0 "(?<vowels>[aeiou])"
| eval ut_digit_ratio=if(isnull(digits),0.0,mvcount(digits) / ut_domain_length)
| eval ut_vowel_ratio=if(isnull(vowels),0.0,mvcount(vowels) / ut_domain_length)
| eval ut_consonant_ratio = max(0.0, 1.000000 - ut_digit_ratio - ut_vowel_ratio)
| eval ut_vc_ratio = ut_vowel_ratio / ut_consonant_ratio
| apply "dga_ngram"
| apply "dga_pca"
| apply "dga_randomforest" as class
| fields - digits vowels domain_tfidf*
| collect index=dga_prod
This alert works like dga_eventgen and runs every minute to fill the dga_prod index.
3. Edited the domain_input macro, changing the default index=dga_proxy to index=dga_prod.
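To sanity-check individual domains outside Splunk, the ratio features from the search in step 2 can be reproduced by hand. The following is only a minimal Python sketch of that arithmetic, assuming lowercase ASCII domain names. The function name ratio_features is my own and is not part of the DGA app, and it does not cover ut_shannon, ut_meaning, or the trained models.

```python
import re


def ratio_features(domain: str) -> dict:
    """Mirror the eval/rex steps of the scheduled search (hypothetical helper)."""
    # ut_domain_length = max(1, len(domain)) avoids division by zero on empty input
    length = max(1, len(domain))

    # rex max_match=0 collects every match; mvcount() then counts them
    digits = len(re.findall(r"[0-9]", domain))
    vowels = len(re.findall(r"[aeiou]", domain))

    digit_ratio = digits / length
    vowel_ratio = vowels / length

    # The "consonant" ratio is simply the remainder, so dots and hyphens
    # are counted here as well, exactly as in the SPL
    consonant_ratio = max(0.0, 1.0 - digit_ratio - vowel_ratio)

    # SPL eval yields null on division by zero; None stands in for that here
    vc_ratio = vowel_ratio / consonant_ratio if consonant_ratio > 0 else None

    return {
        "ut_domain_length": length,
        "ut_digit_ratio": digit_ratio,
        "ut_vowel_ratio": vowel_ratio,
        "ut_consonant_ratio": consonant_ratio,
        "ut_vc_ratio": vc_ratio,
    }


if __name__ == "__main__":
    for name in ("www.xmind.cn", "http.kali.org"):
        print(name, ratio_features(name))
```

Comparing these values for a domain I consider normal against a known DGA domain should show whether the hand-built features look unusual, or whether the misclassification more likely comes from the other features or the model.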
I have some questions:
1. Am I doing this correctly?
2. How can I deal with false positives? Some perfectly normal domain names are also detected as DGA, for example my company's domain names brower.360.cn, www.xmind.cn, http.kali.org, etc. Do I need to add them to a whitelist, and if so, how?
3. I can't find any more documentation, videos, or manuals for the DGA App for Splunk. I have also only just started learning MLTK.