I'm trying to write a regex to match DNS names with only one level in Windows debug logs. I don't want to index those, since they're all internal hosts.
Here are some samples:
3/19/2014 1:18:39 PM 05A4 PACKET 00000000039E3AE0 UDP Rcv aaa.bbb.ccc.ddd 0c45 Q [0001 D NOERROR] TXT ._nfsv4idmapdomain.
3/19/2014 1:18:37 PM 05A4 PACKET 0000000003CDA2C0 UDP Rcv aaa.bbb.ccc.ddd eb60 Q [0001 D NOERROR] A .gishpcs3.
REGEX = (\s\.[A-Za-z0-9_-]+\.\s|\.ip6\.arpa|IN-ADDR|in-addr\.arpa|\sSnd\s)
So, annoyingly, this matches on a regex test site I found (http://www.regexr.com/), but the records are still being indexed. Anyone got a clue why this doesn't filter them out?
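One thing that may explain the difference between the test site and the indexer, sketched here in Python (its `re` module is close to, but not the same engine as, the PCRE that Splunk uses): the first alternative ends in `\s`, so it needs a whitespace character after the trailing dot. When both samples are pasted into a tester as one block, the newline between them satisfies that `\s`. On a single event with nothing after the final dot, it does not. This assumes the events carry no trailing whitespace, which is a guess and not something confirmed in the thread.

```python
import re

# Regex exactly as posted in the question.
PATTERN = re.compile(
    r"(\s\.[A-Za-z0-9_-]+\.\s|\.ip6\.arpa|IN-ADDR|in-addr\.arpa|\sSnd\s)"
)

# The two sample events, one string per event.
# Assumption: no trailing whitespace after the final dot.
events = [
    "3/19/2014 1:18:39 PM 05A4 PACKET 00000000039E3AE0 UDP Rcv aaa.bbb.ccc.ddd 0c45 Q [0001 D NOERROR] TXT ._nfsv4idmapdomain.",
    "3/19/2014 1:18:37 PM 05A4 PACKET 0000000003CDA2C0 UDP Rcv aaa.bbb.ccc.ddd eb60 Q [0001 D NOERROR] A .gishpcs3.",
]

# Tested one event at a time: nothing follows the last dot,
# so the trailing \s has nothing to match.
per_event = [bool(PATTERN.search(event)) for event in events]

# Tested as one pasted block, the way an online tester sees it:
# the newline after the first event counts as \s.
pasted = "\n".join(events)
pasted_hits = [m.group(0) for m in PATTERN.finditer(pasted)]

print(per_event)
print(pasted_hits)
```

If that is what is happening, ending the alternative with `\s*$` instead of `\s` would let it match at the end of an event as well.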
Please post the full configuration for these events: their stanzas in inputs.conf, props.conf, and transforms.conf.
Full props.conf stanza:
SEDCMD-windns = s/\(\d+\)/./g
EXTRACT-src_ip-fqdn-win = Rcv\s(? [^\s]+).+\]\s(?P [^\s]+)\s+\.(?P [^\s]+)\.$
TRANSFORMS-windns = windnsnull
REGEX = (^[^\d]|\s\.[A-Za-z0-9_-]+\.\s*$|IN-ADDR|in-addr|\sSnd\s|\sR\sQ\s|\.ip6\.arpa|NXDOMAIN|windowsupdate\.com)
DEST_KEY = queue
FORMAT = nullQueue
SHOULD_LINEMERGE = false
The regex you posted in the question and the regex in the transforms.conf stanza are different - make sure you're using the most up-to-date one in the transforms.
Looking at the transforms.conf expression, I'm guessing you're missing a backslash at character 8, unless you really are looking for a literal s rather than whitespace. Adding the backslash makes the regex match your events.
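A quick way to check this outside Splunk is sketched below in Python. It runs the transforms.conf expression against the two sample events, once as written with `\s` and once with that backslash dropped. Python's `re` is not the PCRE engine Splunk uses, so treat it as a rough check only; the variable names are made up for the example.

```python
import re

# transforms.conf expression with the backslash in place.
WITH_BACKSLASH = (
    r"(^[^\d]|\s\.[A-Za-z0-9_-]+\.\s*$|IN-ADDR|in-addr|\sSnd\s|\sR\sQ\s"
    r"|\.ip6\.arpa|NXDOMAIN|windowsupdate\.com)"
)

# Same expression with the backslash removed from the second alternative,
# so it looks for a literal "s" followed by a dot.
WITHOUT_BACKSLASH = WITH_BACKSLASH.replace(r"|\s\.", r"|s\.", 1)

# Sample events from the question, one string per event.
events = [
    "3/19/2014 1:18:39 PM 05A4 PACKET 00000000039E3AE0 UDP Rcv aaa.bbb.ccc.ddd 0c45 Q [0001 D NOERROR] TXT ._nfsv4idmapdomain.",
    "3/19/2014 1:18:37 PM 05A4 PACKET 0000000003CDA2C0 UDP Rcv aaa.bbb.ccc.ddd eb60 Q [0001 D NOERROR] A .gishpcs3.",
]

with_hits = [bool(re.search(WITH_BACKSLASH, event)) for event in events]
without_hits = [bool(re.search(WITHOUT_BACKSLASH, event)) for event in events]

print("with backslash:   ", with_hits)
print("without backslash:", without_hits)
```

Neither sample contains a literal "s" immediately followed by a dot, which is why the version without the backslash finds nothing in them.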
That backslash is in the regex. It's a hassle to add regex to this discussion because you have to escape the backslash character and I missed that one.