We are currently in a situation where we need to forward all kinds of events from a customer's Splunk installation to a LogRhythm solution. For some reason, this forwarding needs to be done via syslog - which is fine for all log data originally sourced from system messages over syslog - but is bad for all log data sourced from Windows event logs.
Why do I think the latter is bad?
Well, we're having problems getting it right:
LogRhythm claims that they cannot receive syslog in CEF, so the Splunk app for CEF is not an option (even though it rid us of the CR/LF/NL problem of forwarding by syslog!).
Forwarding Windows event logs to a syslog server introduces the problem of line breaks (CR/LF) inside an event being interpreted as a split (line breaker), resulting in one actual event ending up as as many one-line events as there are line breaks in the message section of the event being forwarded. Doing this when the external party also uses Splunk is not a big problem, since you can set SHOULD_LINEMERGE to true in props.conf on the receiving side.
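On a receiving Splunk instance, that re-merge might look like the following in props.conf (the WinEventLog sourcetype name is an assumption - adjust it to whatever sourcetype the forwarded events arrive with):

```ini
# props.conf on the receiving Splunk side (sourcetype name is an assumption)
[WinEventLog]
# Re-merge events that the syslog transport split on embedded CR/LF
SHOULD_LINEMERGE = true
```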
An option for removing the issue is of course to use a SEDCMD script in props.conf to replace the line breaks with a space when indexing the data (unfortunately not available at a later stage, as for instance when forwarding 🙂 )
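A minimal sketch of that index-time replacement, again assuming the events carry the sourcetype WinEventLog (the stanza name and the SEDCMD label are assumptions):

```ini
# props.conf at index time (sourcetype name is an assumption)
[WinEventLog]
# Replace runs of embedded CR/LF with a single space before indexing
SEDCMD-flatten_newlines = s/[\r\n]+/ /g
```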
Then it is time to introduce the fact that LogRhythm would also like the Windows events to be in XML format, which adds some additional fun to the situation:
Setting renderXml=1 (or true) is not an issue - it works smoothly.
Setting renderXml=1 reduces the data volume by quite a few bytes per event.
renderXml=1 seems to give us exactly the data you see if you go to the details pane in Windows Event Viewer and select the radio button for "XML view". For instance, the Level text is replaced by the corresponding Level number (0 for info, 3 for warning, 2 for error), as is the failure reason, and not all event text is found as event data.
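For reference, enabling this on the Windows forwarder looks something like the following in inputs.conf (the Security channel is just an illustrative choice):

```ini
# inputs.conf on the Windows universal forwarder
[WinEventLog://Security]
disabled = 0
# Forward the raw XML view of the event instead of the classic rendered text
renderXml = true
```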
LogRhythm is still having trouble parsing the data, even though the syslog forwarding consists of raw XML data for the Windows events.
I guess it should be solvable on the LR side, because as I see it Splunk is only forwarding the XML formatted events as raw data to the LR syslog receiver.
But how is this solvable? What am I missing on the Splunk side? Is there more to be done on the Windows side to get more or better data into the details and the XML? Using XML formatted events must mean that the receiving side can resolve numbered levels, reason codes and whatnot to make it human readable, at least at the alerting level.
So, in essence, the real question here is: Does anyone out there have any experience with forwarding data to LogRhythm from Splunk?
Any help is deeply appreciated.
Such utter stupidity on my part. And I really can't blame it on anyone else!
This is of course an app that uses data models with acceleration, and it was not set correctly for our use. To change it, all that's needed is to select "Edit Acceleration" on the Data Models page for the app's data model (or edit the app's datamodels.conf file), adjust the settings, give it a good amount of time to re-build, and of course make sure there's enough disk space.
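In datamodels.conf, such a change might look like this (the model name and the summary range are assumptions - pick a range that covers the searches the app actually runs):

```ini
# datamodels.conf (model name and summary range are assumptions)
[My_App_Data_Model]
acceleration = 1
# Build accelerated summaries covering the last 3 months instead of the default
acceleration.earliest_time = -3mon
```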
We're having an issue with our Splunk deployment where data older than approximately 8 days is extremely slow to search.
The first part of the search is done in a few seconds, and Splunk says it's between 50 and 60 percent done with the search. The last 40 to 50 percent of the search takes several minutes to complete.
An inspection of the job tells us that the search, amongst other things, spends a healthy amount of time in command.search.filter along with command.search.rawdata and command.search.kv. But that is nowhere near as much time as the whole search uses. The search (from a 3rd party app) uses the tstats command, and this is noted in the inspection as taking 400 to 500 seconds to complete.
The app uses data models, and the inspection never mentions anything about the search being done in indexes.
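One way to test whether the accelerated summaries actually cover the slow time range is to restrict tstats to summaries only and see where the counts drop off (the data model name here is a placeholder):

```
| tstats summariesonly=true count from datamodel=Some_Model where earliest=-30d latest=now by _time span=1d
```

If the daily counts fall to zero beyond roughly 8 days back, the search is most likely falling back to raw-data scans for the older range, which would match the slowdown described above.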
Anyone having any ideas why this is so slow?
We've got one big physical server acting as both an indexer and a search head, as well as two other search heads alongside it.
We can't see anything on the RHEL server that could indicate a performance issue explaining this behaviour.
Any suggestions would help.