Rapid7 Nexpose Technology Add-On for Splunk: Why is the Nexpose Application Service crashing after the add-on runs cron?

mattspierce
Explorer

Good Morning,
I updated my Splunk 6.5.2 test environment from the old Rapid7 App to the Rapid7 Nexpose Technology Add-On for Splunk last week. Since then, my Nexpose instance (v6.4.22) has been crashing, leaving only the nxpgsql postgres process running. I have a ticket open with Rapid7, but I was wondering if anyone has had a similar issue. API access seems to be working, as I have data in the index I created for this app. The nsc.log doesn't show any errors; it just ends abruptly, and not necessarily at anything that correlates. TA-rapid7_nexpose.log doesn't show any abnormalities that I can see. Some time after the job ends, the app server goes offline.

ps result

nxpgsql  20280  0.0  0.0 164396  4100 ?        S    10:11   0:00 /opt/rapid7/nexpose/nsc/nxpgsql/pgsql/bin/postgres -D /opt/rapid7/nexpose/nsc/nxpgsql/nxpdata

nsc.log
Here is the tail end of the API call for the SQL results.

2017-02-22T10:15:10 [INFO] [Thread: critical-task-executor3] [Silo ID: default] [Report: ad_hoc_6447718972749473] [Report Config ID: 9971] [Started: 2017-02-22T10:11:43] [Duration: 0:03:27.277] Calculated 846831 vulnerability finding matches that resulted in 1104369 solution results.
2017-02-22T10:15:11 [INFO] [Thread: critical-task-executor3] [Silo ID: default] [Report: ad_hoc_6447718972749473] [Report Config ID: 9971] [Started: 2017-02-22T10:10:52] [Duration: 0:04:19.407] Finished preparing the reporting data model version 2.0.1.
2017-02-22T10:15:11 [INFO] [Thread: critical-task-executor3] com.rapid7.sql.export.batch.size is not configured - returning default value 100.
2017-02-22T10:15:11 [INFO] [Thread: critical-task-executor3] [Silo ID: default] [Report: ad_hoc_6447718972749473] [Report Config ID: 9971] Executing query 'SELECT asset_id, da.ip_address, da.mac_address, site_id,                            favf.vulnerability_instances, favf.vulnerability_id,                            fasva.first_discovered, fasva.most_recently_discovered, dv.title, dv.severity, dvc.categories,                            dve.skill_levels, dvr.sources, favf.scan_id,                            dv.cvss_score, dv.date_added, solution_summary, solution_count, solution_types                      from dim_site_asset                     RIGHT OUTER JOIN (select favf.asset_id, favf.vulnerability_instances, favf.vulnerability_id, favf.scan_id FROM fact_asset_vulnerability_finding favf) favf USING (asset_id)                     LEFT OUTER JOIN (select dv.vulnerability_id, dv.title, dv.severity, dv.cvss_score, dv.date_added FROM dim_vulnerability dv) dv USING (vulnerability_id)                     LEFT OUTER JOIN (select dvc.vulnerability_id, (string_agg(DISTINCT '<' || dvc.category_name, '>') || '>') as categories FROM dim_vulnerability_category dvc GROUP BY dvc.vulnerability_id) dvc USING (vulnerability_id)                     LEFT OUTER JOIN (select dve.vulnerability_id, (string_agg(DISTINCT '<' || dve.skill_level, '>') || '>') as skill_levels FROM dim_vulnerability_exploit dve GROUP BY dve.vulnerability_id) dve USING (vulnerability_id)                     LEFT OUTER JOIN (select dvr.vulnerability_id, (string_agg(DISTINCT '<' || dvr.source || ':' || dvr.reference,'>') || '>') as sources FROM dim_vulnerability_reference dvr GROUP BY dvr.vulnerability_id) dvr USING (vulnerability_id)                     LEFT OUTER JOIN (select fasva.asset_id, fasva.vulnerability_id, fasva.first_discovered, fasva.most_recently_discovered FROM fact_asset_vulnerability_age fasva) fasva USING(asset_id, vulnerability_id)                      LEFT OUTER JOIN (select da.asset_id, 
da.ip_address, da.mac_address FROM dim_asset da) da USING (asset_id)                      LEFT OUTER JOIN (select vulnerability_id, (array_agg(summary))[1] as solution_summary, COUNT(solution_id) as solution_count, string_agg(distinct(solution_type),'|') as solution_types from dim_vulnerability_solution                           JOIN (select solution_id, solution_type, summary from dim_solution) dsol USING (solution_id)                           GROUP BY vulnerability_id                       ) dsv USING (vulnerability_id)                      WHERE site_id=21                      GROUP BY asset_id, da.ip_address, da.mac_address, fasva.first_discovered, fasva.most_recently_discovered, site_id, favf.scan_id, favf.vulnerability_id, favf.vulnerability_instances, dv.title, dv.vulnerability_id, dv.severity,                      dvc.categories, dve.skill_levels, dvr.sources, dv.cvss_score, solution_count, dsv.solution_summary, dsv.solution_count, dsv.solution_types, dv.date_added                     '.
2017-02-22T10:15:35 [INFO] [Thread: Thread-859] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:17:07 [INFO] [Thread: Thread-860] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:18:39 [INFO] [Thread: Thread-861] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:20:11 [INFO] [Thread: Thread-862] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:21:01 [INFO] [Thread: Scheduler] Executing job JobID[Auto-Content-update retriever-78BE780D0C1146315BD57A0CE66EC5CE17D29FE1] Content Update
2017-02-22T10:21:01 [INFO] [Thread: Scheduled Execution Thread: Auto-Content-update retriever-78BE780D0C1146315BD57A0CE66EC5CE17D29FE1] Updating the Security Console content.
2017-02-22T10:22:05 [INFO] [Thread: Thread-864] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:22:11 [INFO] [Thread: task-executor4] Done with statistics generation [Started: 2017-02-22T10:22:07] [Duration: 0:00:03.582].
2017-02-22T10:22:35 [INFO] [Thread: Scheduled Execution Thread: Auto-Content-update retriever-78BE780D0C1146315BD57A0CE66EC5CE17D29FE1] Updating content on remote scan engines.
2017-02-22T10:23:37 [INFO] [Thread: Thread-865] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:25:08 [INFO] [Thread: Thread-866] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:26:40 [INFO] [Thread: Thread-867] [172.20.15.253] Scan engine certificate verified.
2017-02-22T10:28:12 [INFO] [Thread: Thread-868] [172.20.15.253] Scan engine certificate verified.

Here is the break in the logs. The following is when I started the app.

2017-02-22T16:09:35 [INFO] [Thread: main]
2017-02-22T16:09:35 [INFO] [Thread: main] OS Information
2017-02-22T16:09:35 [INFO] [Thread: main] ------------------------------------------------------------
2017-02-22T16:09:35 [INFO] [Thread: main] Current directory: /opt/rapid7/nexpose/nsc
2017-02-22T16:09:35 [INFO] [Thread: main] User name:         root
2017-02-22T16:09:35 [INFO] [Thread: main] Computer name:     nexpose.place.com
2017-02-22T16:09:35 [INFO] [Thread: main] Operating system:  CentOS Linux 6.8
2017-02-22T16:09:35 [INFO] [Thread: main] Total memory:      8061512 KBytes
2017-02-22T16:09:35 [INFO] [Thread: main] Available memory:  6942380 KBytes
2017-02-22T16:09:35 [INFO] [Thread: main] CPU speed:         2399MHz
2017-02-22T16:09:35 [INFO] [Thread: main] Number of CPUs:    1
2017-02-22T16:09:35 [INFO] [Thread: main] Super user:        true
2017-02-22T16:09:35 [INFO] [Thread: main] JVM started:       Wed Feb 22 10:09:25 CST 2017
2017-02-22T16:09:35 [INFO] [Thread: main] JVM uptime:        6 seconds
2017-02-22T16:09:37 [INFO] [Thread: main]
2017-02-22T16:09:37 [INFO] [Thread: main] OS Information
2017-02-22T16:09:37 [INFO] [Thread: main] ------------------------------------------------------------

TA-rapid7_nexpose.log

2017-02-22 04:04:11,467 INFO    nx_logger:38 - In AdHoc generate
2017-02-22 04:04:11,468 INFO    nx_logger:38 - Making Query:

2017-02-22 04:06:31,827 INFO    nx_logger:38 - Processing asset report for site(s) <['21']>
2017-02-22 04:06:32,120 INFO    nx_logger:38 - Finished processing asset report for site(s) <['21']>
2017-02-22 04:08:32,475 INFO    nx_logger:38 - Connecting Nexpose client
2017-02-22 04:08:33,054 INFO    nx_logger:38 - Executing vuln query for site(s) <['21']>
2017-02-22 04:08:33,055 INFO    nx_logger:38 - In AdHoc generate
2017-02-22 04:08:33,055 INFO    nx_logger:38 - Making Query:
1 Solution

mattspierce
Explorer

After working with support, we found this error in /var/log/messages:

messages-20170226:Feb 23 04:30:22 nexpose kernel: lowmem_reserve[]: 0 0 0 0
messages-20170226:Feb 23 04:30:22 nexpose kernel: Out of memory: Kill process 25870 (nexserv) score 372 or sacrifice child

This explains the issue, and luckily my children were not sacrificed.
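
For anyone who wants to check their own system for the same symptom, here is a minimal sketch that scans syslog lines for kernel OOM-killer entries. It assumes the message wording matches the two lines quoted above (other kernel versions may phrase it differently), and the function name `find_oom_kills` is hypothetical.

```python
import re

# Matches the OOM-killer wording quoted above; other kernels may differ.
OOM_PATTERN = re.compile(
    r"Out of memory: Kill process (?P<pid>\d+) \((?P<name>[^)]+)\) score (?P<score>\d+)"
)

def find_oom_kills(lines):
    """Return a list of (pid, process_name, score) for each OOM-killer line."""
    kills = []
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match:
            kills.append(
                (int(match.group("pid")), match.group("name"), int(match.group("score")))
            )
    return kills

# The two lines from /var/log/messages quoted above.
sample = [
    "messages-20170226:Feb 23 04:30:22 nexpose kernel: lowmem_reserve[]: 0 0 0 0",
    "messages-20170226:Feb 23 04:30:22 nexpose kernel: Out of memory: Kill process 25870 (nexserv) score 372 or sacrifice child",
]
print(find_oom_kills(sample))  # [(25870, 'nexserv', 372)]
```

To run it against a live system, open `/var/log/messages` (root is usually required) and pass the file handle in place of `sample`.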

jonathan_stewar
Path Finder

I'm glad Support were able to help you out!
Jonathan (Rapid7)
