I am a total newbie to Splunk and would appreciate expert help creating a query/dashboard.
We have a set of servers writing to the same log file, which lives on an NFS share mounted on all of those servers. I want to count the number of connection resets I have been getting. Unfortunately, because the same events are treated as separate events on the different hosts, I get duplicate results in my output.
This is my Splunk query:
sourcetype="LogFile" "LogEvent level=\"SEVERE\"" | rex "(?<message>.)" | search message="*Connection reset" "StackTrace" | rex field=source "/path/to/the/log/file/(?<company>\w+|\d+)-" | convert timeformat="%Y-%m-%d" ctime(_time) AS date | top company by host,date
If there were 2 connection resets, they are unfortunately counted as 2 on every one of my hosts, which skews my results.
I tried using dedup and cluster but never got either working. Could somebody please help?
I'm afraid Splunk has no way of identifying which host the event is coming from. My guess is that you won't know which host is generating the event, either. To Splunk it will always look like several hosts have an identically named log file.
Now, there are a number of options:
I don't know why you are logging to the same file on NFS, but if at all possible, I would strongly recommend splitting the logging up so that each server has its own log file. It doesn't matter if these files still reside on NFS, as long as the path and/or filename is different for each server.
I would agree with the above. If you have any control over the generation of the logs your solution is there, rather than within Splunk. Centralising your scrutiny of multiple sources is what Splunk is all about.
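If changing the logging is not possible, a search-time workaround might look like the sketch below. It assumes that every host indexes an identical copy of each event (same timestamp and same raw text), so that dedup on _time and _raw collapses the copies into one. The capture group name company is taken from the "top company" clause in the question, and the field name resets is only an illustrative label. The host field is left out of the grouping because, with a shared file, it does not say which server actually wrote the event.

```
sourcetype="LogFile" "LogEvent level=\"SEVERE\"" "Connection reset"
| dedup _time _raw
| rex field=source "/path/to/the/log/file/(?<company>\w+|\d+)-"
| eval date=strftime(_time, "%Y-%m-%d")
| stats count AS resets BY company date
```

Note that dedup on _raw can be slow over large volumes of data, and that two genuinely distinct events with the same timestamp and identical text would also be merged, so treat this as a stopgap rather than a replacement for separate log files.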
Thanks for responding!
A sample log file line
LogEvent level="SEVERE" time="2014-09-17T16:07:32Z" shapename="shape20" shapetype="Connector" shapelabel="" shapeextendedinfo="Master Name of the/process(Nameofthe process Default): cookie-Q2Q93V-cookie-name_of Connector; Name of What Operation">
0 java.net.SocketException: Connection reset
Lot of Stack Trace