I've found some logs in our Splunk environment that appear to be duplicates: they differ only in their srcip field, which means one copy comes directly from a client and the other comes from a syslog server. So far, the only way I've found to confirm that the entries really are duplicates is to export the results into separate files by srcip, remove the srcip field, and diff the resulting files. I'd really like to do this comparison inside Splunk, but I haven't managed to so far. Does anyone have ideas on how to do this?
EDIT:
Here's an example of what I'm dealing with (redacting some stuff, of course).
Aug 19 09:34:36 A.B.C.D srcip=A.B.C.D fac=authpriv pri=notice sudo: USER : TTY=pts/8 ; PWD=/var/log ; USER=root ; COMMAND=/bin/grep ssh messages
Aug 19 09:34:36 A.B.C.D srcip=W.X.Y.Z fac=authpriv pri=notice sudo: USER : TTY=pts/8 ; PWD=/var/log ; USER=root ; COMMAND=/bin/grep ssh messages
These are clearly the same event, but the log is reaching Splunk from both A.B.C.D (the client) and W.X.Y.Z (a syslog server).
I initially hypothesized that everything with facility authpriv was being duplicated, but that doesn't seem to be the case; at least, I haven't been able to verify it.
So, again, what I'm looking for is a way to find events like this. "diff" won't work because the paired events differ slightly (in their srcip field), but I need to find all of our duplicates so I can take steps to cut out the second copy of each log.
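For reference, the offline comparison described above (strip srcip, then compare what is left) can be sketched roughly as follows. This is only an illustration of the logic, not a Splunk solution: it assumes the events have been exported as raw text, one per line, and the function name `find_duplicates` and the regular expressions are made up for this example.

```python
import re
from collections import defaultdict

# Assumption: each exported event carries exactly one "srcip=<value>" token.
SRCIP_VALUE = re.compile(r"\bsrcip=(\S+)")
SRCIP_FIELD = re.compile(r"\bsrcip=\S+\s*")


def find_duplicates(lines):
    """Group events that are identical once the srcip field is removed.

    Returns a dict mapping the srcip-less event text to the sorted list of
    srcip values it was seen with, keeping only events seen from 2+ sources.
    """
    sources_by_event = defaultdict(set)
    for line in lines:
        match = SRCIP_VALUE.search(line)
        if match is None:
            continue  # no srcip field, so nothing to compare on
        # Drop the srcip token so the remaining text can serve as the key.
        key = SRCIP_FIELD.sub("", line.strip(), count=1)
        sources_by_event[key].add(match.group(1))
    return {
        event: sorted(sources)
        for event, sources in sources_by_event.items()
        if len(sources) > 1
    }


if __name__ == "__main__":
    sample = [
        "Aug 19 09:34:36 A.B.C.D srcip=A.B.C.D fac=authpriv pri=notice sudo: USER : TTY=pts/8 ; PWD=/var/log ; USER=root ; COMMAND=/bin/grep ssh messages",
        "Aug 19 09:34:36 A.B.C.D srcip=W.X.Y.Z fac=authpriv pri=notice sudo: USER : TTY=pts/8 ; PWD=/var/log ; USER=root ; COMMAND=/bin/grep ssh messages",
    ]
    for event, sources in find_duplicates(sample).items():
        print(sources, event)
```

What I'm after is the equivalent of this grouping done inside Splunk itself, rather than on exported files.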