At a set interval (30 minutes while testing), a scheduled search computes the min, max, and mean values of some perf counters. Those values are sent to a summary index, and this is where strange things start to happen.
In the live data indexes, those perf counters keep coming in frequently and nothing is missing. If I run the "summary search" manually, I always get the right data, but when it is run by the Splunk scheduler, data gets into the summary index in an erratic manner. Here is the disturbing "pattern":
- initially, summary data was less than a day "late". The samples for the last 30 minutes would show up in the summary index about 18 hours late.
- a few days later, it was around two days late in the summary index
- still later, it was four days late
- all of a sudden, with no change, most data was 4+ days late, and there would be an isolated "peak" which would be only around 15 hours late.
- now it is again, at best, 2 days late.
When I set this up in the lab environment, I had no issue, and it runs just like it is supposed to run. However, when the exact same mechanism is set up in production, we get that strange behavior.
Here, it's not a matter of back filling older data. The "live" data is available.
I've seen several similar issues, including one which recommends that you "delete the summary data for the time frame and then use back filling instead".
One last point. The scheduler also seems to behave erratically: it does not respect the set schedule, neither the frequency nor the time frame in which it should run. Even when it does run almost on schedule, the data inserted into the summary index is still way late.
This is happening on Splunk Enterprise 6.3.3.
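One way to put a number on how "late" the summary data is, offered only as a sketch: compare each summary event's _time with the time it was indexed (_indextime). The index name `summary` is an assumption (substitute the actual summary index), and `lag_hours` / `max_lag_hours` are names made up for illustration.

```spl
index=summary earliest=-7d
| eval lag_hours = round((_indextime - _time) / 3600, 1)
| timechart span=1h max(lag_hours) AS max_lag_hours
```

To see whether the scheduler is skipping or delaying runs, its own log can be searched as well. This assumes the scheduler log is indexed in _internal with the usual field names, and "my summary search" is a placeholder for the real saved search name.

```spl
index=_internal sourcetype=scheduler savedsearch_name="my summary search"
| table _time, scheduled_time, dispatch_time, status, run_time
```

A large gap between scheduled_time and dispatch_time, or a status other than success, would point at the scheduler rather than at the search itself.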
Splunk Enterprise, version 6.1.1, running on Win 2008 OS.
After running into weird permissions issues (no rights to edit files until we reset ownerships, "everyone" appearing on the filesystem), we decided to reset all ACLs from the "root" Splunk folder (Drive:/Program Files/Splunk).
We disabled inheritance from Drive:/Program Files/Splunk and below.
After the reset, the whole Splunk file tree had only the local administrators group, SYSTEM, and the Splunk domain service account, each with appropriate permissions.
We dumped all ACLs using dumpsec to make sure it was all clean (it was).
Then, we started splunkd (running under a domain service account).
As soon as splunkd writes to a file, permissions are modified on the folder holding the file, and the file itself. Inheritance is broken on the folder containing the file written to, "everyone" is added, SYSTEM is removed, and any group different from local administrators is removed.
This is very annoying for the reasons listed below (and probably more which have not come up yet). The longer splunkd runs, the more folders and files get their permissions changed. After a week, it's an endless list of modified folders (all etc/apps/AppName and pretty much everything under it, conf files, lookups, views, etc)
- we do not want "everyone" to be set, on any folder or file, there is absolutely no reason for such permissions. It's all a Windows integrated security context, so "everyone" should not appear on any domain resource ACL.
- it generates loads of events reporting file permission changes
- it prevents appropriate auditing of the filesystem because it generates lots of noise
- it removes legitimate groups from accessing/editing files (such as lookups)
Has anyone run into this type of behavior?
We've seen such issues in older Splunk versions (a reported bug in 4.x on Windows), but they all date back a while.
It's a simple search query. It needs to find events containing a file name which will change every month.
The eval command should return YYmm* (1412*).
This query works.
In the search clause, the eval field signatureVersionCriteria has been replaced (hard-coded) with the value it should hold.
The table output does indeed show "1412*" for the signatureVersionCriteria field.
index=xxx_app_sep | eval signatureVersionCriteria = strftime(now(), "%y%m") + "*" | search signature_version="1412*"| table signature_version, signatureVersionCriteria
This query does not work.
It returns nothing.
index=xxx_app_sep | eval signatureVersionCriteria = strftime(now(), "%y%m") + "*" | search signature_version = signatureVersionCriteria | table signature_version, signatureVersionCriteria
I went through quite a lot of similar posts, but could not figure it out.
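A possible explanation, given as a sketch rather than a confirmed fix: in the `search` command, the right-hand side of `field=value` is treated as a literal string, so the second query looks for events whose signature_version is literally "signatureVersionCriteria". Comparing against another field's value takes `where`, which does not expand `*` wildcards. One option is `like()` with the SQL-style `%` wildcard, so the criteria field below ends in `%` instead of `*`:

```spl
index=xxx_app_sep
| eval signatureVersionCriteria = strftime(now(), "%y%m") . "%"
| where like(signature_version, signatureVersionCriteria)
| table signature_version, signatureVersionCriteria
```

Because `where` runs after the events are retrieved, this filters every event from the index in the chosen time range; on a large index it may be worth narrowing the base search first.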
I installed Splunk (full instance and/or universal forwarder) on French-language operating systems.
Some stuff will obviously work (such as pulling information from the Windows event logs), but for other stuff, it's less "obvious".
For example, I'm thinking of performance counters. In the French version of the OS, the counter names are different and contain accents, which Splunk does not recognize. When starting the UF with a localized perfmon.conf file containing French counter names, it errors out and reports that it cannot find the counters (because it drops the accented letters from the names it looks for, so the counter names do not match).
I've done some research, but have not really found anything on this type of thing, besides character set settings in props.conf.
Any pointers or experience on this matter?
Also, what's the recommended approach for distributing conf files to OSes in different languages? Should there be different server classes based on language in order to distribute the appropriate conf file?
Many thanks for your quick reply and information.
I will indeed keep going through all that doc as well. It is no small task, but that is usual with software of this sort.
I'm rather new to Splunk, so I haven't yet covered all the documented aspects of it, and have not found anything yet on this subject.
In other classical monitoring setups (such as NetIQ's Security Manager or MS SCOM), it is possible to install agents in other Windows domains which are not trusted by the domain where the solution's servers are running. Those agents are often called "unmanaged" agents, and because of this specific context, some functionality is lost (such as updating the agent software, because of credential problems). But in general, as long as the communication ports between the agent and the servers are opened, the agent is able to perform its job, and send its data back to the servers in the other domain.
So, I'm wondering how this type of context is tackled with Splunk. I guess that universal forwarders can be deployed in the untrusted domain(s), either through a regular software distribution solution or "manually", with the appropriate server information so that they can "link" back to the Splunk servers. They would run with their own domain-specific service account, as long as their communication ports are left open between the two domains.
But then, what kind of side effects can we expect? I'm thinking of things like:
- for the deployment monitor app
- for keeping the agent software up to date
- for specific apps, such as the Windows app or the Enterprise Security app (which I have not yet become familiar with)
- Anything else which might come into play
So, has anyone faced this type of setup?
Where can I find information on the different aspects of such a setup?
Many thanks in advance.
Having worked with other SIEM solutions, I find this an absolutely great piece of software, kind of... magical!
Have a nice day,