All Apps and Add-ons
Highlighted

Is there any way to speed up eStreamer eNcore data collection?

Contributor

Hi,

We are running the eSteamer eNcore app with default settings but it looks like it's not able to catch up with the FMC. Currently we are receiving data with time-stamp of Aug 7th 7:53 PM. We do see the time-stamp moving forward but it's extremely slow.

Tried changing the monitor.period setting under estreamer.conf to 30 seconds and even tried 1 sec just to see what happens but eNcore log files are still generated at the interval of 120 seconds, which is the default setting app came with. Splunk has been restarted after the change but the new setting doesn't seem to take effect.

Period changed to 10 seconds:
alt text

Log monitoring frequency still @ 2 minutes:
alt text

Has anyone been able to successfully modify the default settings for this app?

Thanks,

~ Abhi

Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Path Finder

There is no way to speed up eStreamer eNcore through configuration today.

Your options are:

  1. Limit the events being sent by the server (e.g. turn off connections)
  2. Increase CPU power (if you're running in a VM then give at least 2 cores)
  3. Ensure a good network connection to the FMC
  4. Ensure you have good output disk performance

A few things worth noting

  • monitor.period refers to the frequency with which the monitor thread asks encore how it's doing. You will see the results of this in sourcetype=cisco:estreamer:log. It has nothing to do with performance. In fact the more frequently you ask, the slower encore will be. I would recommend you keep this at 120 seconds.
  • The difference in file timestamp has nothing to do with monitor.period. Each file is written to until it has 10,000 lines in it (configurable, although not in the Splunk UI). What we're seeing in your screenshot is that on average you're writing about 10000 events per 2 minutes - which is 83 events per second.
  • I have managed to get 350 events/sec using just one core of a Celeron CPU on a VM (and around 1200 using an i7 with SSD). At 83ev/s I don't think encore will break a sweat.

The question is: if encore is not catching up... is it falling behind? If it's not falling behind then I am willing to bet that the FMC is just delayed in its reporting from sensors. If it is falling behind then it's definitely worth further investigation to find out where the bottleneck is.

EDIT:
After running the Splunk query: sourcetype="cisco:estreamer:log" (ERROR OR WARNING) there was an error like:

2017-08-10 11:51:30,783 estreamer.subscriber ERROR EncoreException: The server returned an invalid version (0) - expected 1. This is likely to be caused by CSCve44987 which is present on both FMC 6.1.0.3 and 6.2.0.1. For more information see: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve44987. Message={'messageType': 0, 'length': 0, 'version': 0}\nTraceback (most recent call last):\n File "/opt/splunk/etc/apps/TA-eStreamer/bin/encore/estreamer/subscriber.py", line 219, in start\n self.__tryHandleNextResponse()\n File "/opt/splunk/etc/apps/TA-eStreamer/bin/encore/estreamer/subscriber.py", line 158, in __tryHandleNextResponse\n message = self.connection.response()\n File "/opt/splunk/etc/apps/TA-eStreamer/bin/encore/estreamer/connection.py", line 203, in response\n raise estreamer.EncoreException( errMsg.format( version, str(message) ))\nEncoreException: The server returned an invalid version (0) - expected 1. This is likely to be caused by CSCve44987 which is present on both FMC 6.1.0.3 and 6.2.0.1. For more information see: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve44987. Message={'messageType': 0, 'length': 0, 'version': 0}\n

The important bit: EncoreException: The server returned an invalid version (0) - expected 1. This is likely to be caused by CSCve44987 which is present on both FMC 6.1.0.3 and 6.2.0.1. For more information see: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve44987

This is a known FMC bug affecting all eStreamer clients: eStreamer keeps crashing and encore keeps reconnecting and stumbling on as best it can. It's probably running only a few seconds out of every 2 minutes and is therefore very slow on average. TAC should be able to sort you out with the correct patch if you pass them the quoted exception.

View solution in original post

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Contributor

Thanks Sam.

I was interpreting monitor.period completely wrong. It has been changed back to 120.

Server receiving these events from FMC has plenty of resources to spare(Physical server with 32 Cores and 500 G RAM). Logs are being stores on a SAN drive but the connection & speed is good. We do not see any issues there.. Network connection is good as well(Both FMC and Indexer are in same subnet)

While using the older version of this app, we had enabled all the events and were able to index events in real-time without any delay.

But in our case, it looks like eNcore is not even writing 10,000 lines.

[root@ data]# ls -lh
total 25M
-rw-------. 1 root root 1.7M Aug 10 09:45 encore.1502372743.log
-rw-------. 1 root root 1.4M Aug 10 09:47 encore.1502372866.log
-rw-------. 1 root root 2.9M Aug 10 09:49 encore.1502372990.log
-rw-------. 1 root root 1.1M Aug 10 09:51 encore.1502373113.log
-rw-------. 1 root root 328K Aug 10 09:54 encore.1502373237.log
-rw-------. 1 root root 252K Aug 10 09:56 encore.1502373362.log
-rw-------. 1 root root 1.1M Aug 10 09:58 encore.1502373486.log
-rw-------. 1 root root 2.4M Aug 10 10:00 encore.1502373610.log
-rw-------. 1 root root 319K Aug 10 10:02 encore.1502373733.log
-rw-------. 1 root root 319K Aug 10 10:04 encore.1502373857.log
-rw-------. 1 root root 3.6M Aug 10 10:06 encore.1502373981.log
-rw-------. 1 root root 2.2M Aug 10 10:08 encore.1502374105.log
-rw-------. 1 root root 1.3M Aug 10 10:10 encore.1502374228.log
-rw-------. 1 root root 1.3M Aug 10 10:12 encore.1502374353.log
-rw-------. 1 root root 1.2M Aug 10 10:14 encore.1502374476.log
-rw-------. 1 root root 1.2M Aug 10 10:16 encore.1502374600.log
-rw-------. 1 root root 1.3M Aug 10 10:18 encore.1502374723.log
-rw-------. 1 root root 1.2M Aug 10 10:20 encore.1502374847.log
[root@ data]#
[root@ data]# cat encore.1502374847.log | wc -l
720
 [root@ data]# cat encore.1502373981.log | wc -l
1985

I'll try disabling "Connection events" and enable logs that are less verbose but I think issue is something else.
We are getting data with a day old timestamp. https://answers.splunk.com/answers/560550/unable-to-see-all-events-indexed-by-splunk-via-est.html

Have opened a case with Cisco TAC and provided all the logs to them. Waiting for a response.

Thanks,

~ Abhi

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Path Finder

Hi Abhi!

So a few questions / suggestions for you:
1. Is the data falling behind? Or remaining stable - i.e. is it always about 24 hours behind?
2. Can you the following Splunk query and paste the results please? sourcetype="cisco:estreamer:log" (ERROR OR WARNING) - it's very odd that the files are not filling up. My expectation is that would only happen if encore shuts down and starts up again. Now that I think about it, the input script is configured to restart every 2 minutes if it shuts down... so that may be it.
3. Which version of FMC are you running?

Sam

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Contributor

It looks like data is moving forward but extremely slowly. e.g. since restarting eNcore this morning @ 9:45 Eastern, it had successfully logged files until 11:53 (it stopped creating files after that for some reason. We also see this behavior every morning when all old files have been purged and data directory is blank but that’s another issue it seems). Now between 9:45 – 11:53, timestamp has only moved from August 9, 2017 10:00:10 AM to August 9, 2017 10:00:21 AM. That’s just 11 seconds. With this rate, even if it’s lagging behind, eNcore will never catch up.

The only two errors we see being repeated with each execution of monitor(120 Sec) are the following:

2017-08-10 11:53:26,291 estreamer.handler INFO Error state. Clearing queue

2017-08-10 11:51:30,783 estreamer.subscriber ERROR EncoreException: The server returned an invalid version (0) - expected 1. This is likely to be caused by CSCve44987 which is present on both FMC 6.1.0.3 and 6.2.0.1. For more information see: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve44987. Message={'messageType': 0, 'length': 0, 'version': 0}\nTraceback (most recent call last):\n File "/opt/splunk/etc/apps/TA-eStreamer/bin/encore/estreamer/subscriber.py", line 219, in start\n self.__tryHandleNextResponse()\n File "/opt/splunk/etc/apps/TA-eStreamer/bin/encore/estreamer/subscriber.py", line 158, in __tryHandleNextResponse\n message = self.connection.response()\n File "/opt/splunk/etc/apps/TA-eStreamer/bin/encore/estreamer/connection.py", line 203, in response\n raise estreamer.EncoreException( errMsg.format( version, str(message) ))\nEncoreException: The server returned an invalid version (0) - expected 1. This is likely to be caused by CSCve44987 which is present on both FMC 6.1.0.3 and 6.2.0.1. For more information see: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve44987. Message={'messageType': 0, 'length': 0, 'version': 0}\n

We are running Software Version 6.2.0.1

Thanks,

~Abhi

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Contributor

Just for the sake of testing, I disabled all log items on eStreamer(FMC Side). We are still getting logs. timestamp has now moved to August 10, 2017 12:25:09 PM

The three items under Data(eNcore Setup page) for (Packets/Connection/Metadata) are still checked, but I would assume that once we unchecked all log options on FMC side data should immediately stop..

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Path Finder

Hi Abhi,

That's the problem!

From above: EncoreException: The server returned an invalid version (0) - expected 1. This is likely to be caused by CSCve44987 which is present on both FMC 6.1.0.3 and 6.2.0.1. For more information see: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve44987

This is a known FMC bug affecting all eStreamer clients: eStreamer keeps crashing and encore keeps reconnecting and stumbling on as best it can. It's probably running only a few seconds out of every 2 minutes. TAC should be able to sort you out with the correct patch if you pass them the quoted exception. This will resolve your other issue too with your out of date data.

Sam

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Contributor

Thanks Sam.

We are working with TAC on deploying this patch.

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Contributor

Issue got resolved after deploying the patch for CSCve44987.

Thanks,

~ Abhi

0 Karma
Highlighted

Re: Is there any way to speed up eStreamer eNcore data collection?

Path Finder

Great news! Glad to hear it's sorted!

0 Karma