Hello, I want to set up the Cisco CDR Reporting and Analytics app on Splunk 6.4.2 // Win7. I can mount and access my CDR repository through a shared folder (so that Splunk and the R&A app see the CDRs as local). However:
1) When I try to set up Data Inputs and select "Check Path" for the shared folder (as Z:, or as a link I created with mklink to the shared folder Z:), I get the error: "Either this directory does not exist, or the user as which Splunk is running does not have read access here."
2) In contrast, creating a TEST folder with an empty cdr_ text file does not give me an error.
The shared folder has read permissions for all users, and when I installed Splunk and the Cisco CDR R&A app on Debian, pointing to the same shared folder, it didn't give me any error.
First, it's a REALLY good idea to chat a bit more about your CDR Repository.
For one thing, if that data input wizard had succeeded, it would have deleted all the files it found there in the process of indexing them; I believe its UI has a stern warning to that effect.
"Repository" isn't a word people normally use for data that can be deleted suddenly.
Also, it may be best to just leave that repository completely as-is, follow the docs to set up a different host as an "external billing server" in CUCM (possibly as a second one, which is fine), and have a Universal Forwarder forward from that second host.
HOWEVER, to answer your question in its plainest form... I don't know! I have seen that Windows-specific error before, and in fact that particular exception only ever seems to be hit on Windows.
You could quite probably drop a single empty cdrfoo file and a single empty cmr* file in some other directory, submit that with the wizard, and then manually edit etc/apps/cisco_cdr/local/inputs.conf to change the path to your Z: path. I suspect that would work. Again though, the key question is "are you OK with every file in that 'repository' being deleted?", because that's what the sinkhole input will do.
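For reference, the hand-edited stanza in local/inputs.conf would look roughly like this. This is an illustrative sketch only; the stanza name and path are assumptions, and the wizard's actual output may differ slightly:

```ini
# etc/apps/cisco_cdr/local/inputs.conf -- illustrative sketch only
[batch://Z:\cdr*]
# "sinkhole" means Splunk DELETES each file after indexing it
move_policy = sinkhole
sourcetype = cucmcdr
index = ciscocdr
disabled = 0
```

After submitting the wizard against the dummy directory, you would edit only the path in the stanza header to point at Z: and restart Splunk.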
UPDATE - Actually, that stern warning I mentioned, about how the files will all be deleted, seems to not exist. I've added it back, and there will be one as soon as the next maintenance release is up.
Hello, thanks for your answer
Let me clarify that I deployed the third and last billing server allowed in CUCM to store raw CDRs and share them with more than one tool. That's because I'm not exporting CDRs directly to the Splunk server (as you and the docs recommended).
I was testing the R&A app this weekend on my Win7 Splunk setup, but:
1) I managed to parse the CDRs and get the reports I wanted, but only after downloading them for a specific month. The idea is that Splunk and the R&A app would read the CDRs directly from the CDR repository, so I can avoid downloading CDRs every month. What I still don't understand is: why would R&A delete all the CDRs just by reading or indexing them? (To prevent any tool from deleting CDRs from the repository, I assigned read-only permissions to the shared folder, so it shouldn't be possible to delete them. When I tried with Debian, R&A recognized the CDR path to the same shared folder; the only limitation was that I didn't have enough Splunk licensing on Debian to index the whole CDR repository, approx. 20 GB.) Could this change in the next release?
2) I also ended up a bit concerned about query performance on a standalone Splunk instance (it took too long to read one month of CDRs, about 2.3 GB), so this could be the opportunity to look into a distributed deployment (to speed up query and report generation). I was reading these documents (https://docs.splunk.com/Documentation/Splunk/6.4.3/Admin/OptimizeSplunkforpeakperformance , http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Distributedoverview) but I don't know what would be a good starting point for improving R&A performance. How many forwarders, indexers, etc. would I need? Could I also split the load onto the Win7 CDR repository host by assigning it a forwarder or indexer role?
Hello, I tried to declare the data inputs for the R&A app as "monitor" instead of "batch" (to avoid destroying the files), without success.
The app still shows the message "I am ready to proceed with indexing my CDR data" and doesn't recognize the read-only shared folder (where the CDRs are).
I suspect the problem is that the Python code we use to walk the directories does not work for paths on mounted Windows drives.
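As a quick sanity check of that theory, you could run something like the following against your Z: path in the Python interpreter that ships with Splunk (`splunk cmd python`). This is just a diagnostic sketch; the drive path is a placeholder you would substitute:

```python
import os

def check_readable(path):
    """Report whether a directory exists, is listable, and how many cdr_ files it holds."""
    if not os.path.isdir(path):
        return "not a directory (or not visible to this user)"
    try:
        entries = os.listdir(path)
    except OSError as e:
        return "exists but not readable: %s" % e
    cdr_files = [f for f in entries if f.startswith("cdr_")]
    return "readable, %d entries, %d cdr_ files" % (len(entries), len(cdr_files))

if __name__ == "__main__":
    # Replace with your mounted drive, e.g. "Z:\\"
    print(check_readable("Z:\\"))
```

If this reports the directory as not visible when run as the same user the splunkd service runs as, the problem is at the OS/permissions level rather than in the app.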
I'm curious what happens if you create the data input manually. Go to Settings > Data Inputs > Files and Directories > New.
Enter a path that ends in "cdr*" (for example, Z:\cdr*).
And make absolutely sure that you pick an explicit sourcetype of "cucmcdr", and that you pick the right index (generally "ciscocdr").
Generating the data input this way will use a normal "monitor" input. I'm curious to see whether Splunk can see the mounted drive letter.
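For reference, the stanza that UI flow writes to inputs.conf would look roughly like this (the path is illustrative, based on the Z: drive you mentioned):

```ini
# A plain monitor input -- watches files in place, never deletes them
[monitor://Z:\cdr*]
sourcetype = cucmcdr
index = ciscocdr
disabled = 0
```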
As to the larger issue of why our wizard generates a "sinkhole" input that consumes the files, rather than a monitor input that merely watches them: the answer is that CUCM generates one file every minute for each cluster. Splunk's monitor inputs are designed for cases where there are perhaps hundreds or thousands of files, but when there are tens of thousands or hundreds of thousands (or more) they scale very, very badly. The problem is that the monitor input is not only looking for new files; it is also scanning each and every old file for appended changes. In short, it is constantly running out of inodes and releasing some so it can create more. Within a week of this you'll be using substantial system resources just so Splunk can look at all the files. Within a couple of months your system will be crippled, and Splunk will be spending 90%+ of its resources just statting and scanning files.
Until a few major releases ago we used monitor inputs anyway, and we asked the admin to set up a script to run every night deleting files older than a couple of days. However, in far too many cases these scripts would later break, or be set up improperly, and things would break down unhappily. =/
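For the curious, that kind of nightly cleanup script looked roughly like this. It's a sketch only; the staging path and the two-day cutoff are assumptions, and you'd schedule it via cron or Task Scheduler:

```python
import os
import time

def delete_older_than(directory, days):
    """Delete plain files in `directory` whose mtime is older than `days` days.
    Returns the list of deleted paths."""
    cutoff = time.time() - days * 86400
    deleted = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            deleted.append(path)
    return deleted

if __name__ == "__main__":
    staging = r"C:\cdr_staging"  # illustrative path
    if os.path.isdir(staging):
        delete_older_than(staging, 2)
```

The failure mode described above is exactly what you'd expect: the scheduled task silently stops running, the file count climbs, and the monitor input grinds to a halt.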
For the performance issue, we can take a look. It might be easiest to simply have a webex, since this is a large topic and you could have several different bottlenecks going on. IO speed is among the most important aspects, and it is generally the most overlooked. However, it's possible you're underprovisioned on CPU and RAM as well.
Feel free to email us at email@example.com, although I'm perfectly happy to keep our conversation here on Answers as well.
Hello, thanks again for your comprehensive answer.
I also tried manually adding the data inputs for cdr and cmr pointing to Z:, with the "Continuously Monitor" option, and the result is the same: the R&A app still doesn't recognize the data inputs created manually for the shared folder.
Also, we can schedule the webex to look in more detail at the performance and (hopefully) the shared folder issue. I'm a bit busy in the coming days, but I'll write you an email to schedule it.
OK. Well, if creating the data input through the core Splunk UI doesn't work either, then it's ultimately a deeper problem: with Splunk, with the filesystem, or just with the permissions. Always happy to do a webex - I'll wait to hear from you by email.