
Why don't the DBX read count and write count match, and why is the job failing?

BenjaminWyatt
Communicator

I have a DBX 3.1.2 job that fails partway through. I don't get any error messages (all logging is set to DEBUG), just the following entry in the metrics log:

2018-05-03 12:06:37.976 -0400 INFO c.s.dbx.server.task.listeners.JobMetricsListener - action=collect_job_metrics connection=my_db_connection jdbc_url=null record_read_success_count=3444 db_read_time=397794 record_read_error_count=1 hec_upload_time=102 hec_record_process_time=13 format_hec_success_count=3444 hec_upload_bytes=1631645 status=FAILED input_name=my_db_input batch_size=1000 error_threshold=N/A is_jmx_monitoring=false start_time=2018-05-03_12:00:00 end_time=2018-05-03_12:06:37 duration=397965 read_count=3444 write_count=3000 filtered_count=0 error_count=0

As you can see, not every record counted in read_count makes it into write_count. But when I search for error messages related to this input, I find nothing beyond this entry.
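For anyone comparing these counters across many runs, here is a minimal Python sketch that pulls the key=value pairs out of a metrics line and reports the gap. The line is copied from the post above with the logger prefix and some fields shortened; the helper names `parse_metrics` and `count_gap` are made up for illustration and are not part of DB Connect.

```python
import re

# Metrics line from the post above (logger prefix and some fields shortened).
LINE = (
    "2018-05-03 12:06:37.976 -0400 INFO JobMetricsListener - "
    "action=collect_job_metrics connection=my_db_connection "
    "record_read_success_count=3444 record_read_error_count=1 "
    "format_hec_success_count=3444 status=FAILED input_name=my_db_input "
    "batch_size=1000 read_count=3444 write_count=3000 "
    "filtered_count=0 error_count=0"
)


def parse_metrics(line):
    """Return every key=value token in the line as a dict of strings."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def count_gap(fields):
    """Number of records that were read but not written."""
    return int(fields["read_count"]) - int(fields["write_count"])


fields = parse_metrics(LINE)
print(fields["status"], count_gap(fields), fields["record_read_error_count"])
```

On this line the gap is 444 records. Note also that record_read_error_count is 1 while error_count is 0, so the two error counters disagree; that single read error may be worth chasing.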

Has anybody else had this problem? Where did you look?


jcoates
Communicator

Sounds like an HEC performance problem, which usually means the indexers are pushing back. Look at your indexing queues.


Richfez
SplunkTrust

3000 is a suspiciously round number, and it is also an exact multiple of your batch_size of 1000.
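To make the "suspicious multiple" concrete, here is a small Python sketch of the arithmetic using the numbers from the log line. The function name `batch_breakdown` is hypothetical, and the idea that DBX writes in whole batches of batch_size is an assumption, not something the log confirms.

```python
def batch_breakdown(read_count, write_count, batch_size):
    """Split the read and write counts into whole batches plus a tail.

    Assumption: records are sent in chunks of batch_size, so a write
    count that is an exact multiple suggests whole batches only.
    """
    full_batches, tail = divmod(read_count, batch_size)
    written_batches, written_tail = divmod(write_count, batch_size)
    missing = read_count - write_count
    return {
        "full_batches_read": full_batches,
        "tail_records_read": tail,
        "whole_batches_written": written_batches,
        "write_stopped_on_batch_boundary": written_tail == 0,
        "missing_records": missing,
        "missing_equals_tail": missing == tail,
    }


result = batch_breakdown(read_count=3444, write_count=3000, batch_size=1000)
for key, value in result.items():
    print(f"{key}: {value}")
```

The 444 missing records are exactly the size of the final partial batch. That is consistent with the last batch never being written, but it does not prove it.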

Also, that hec_upload_time of 102 is... I hope that's in ms and not seconds. Even then, that seems kind of high for a few thousand records totaling a megabyte and a half.

Have you confirmed whether the right number of records made it into Splunk? I'm pretty sure they didn't, but maybe this is an error in the internal metrics?


BenjaminWyatt
Communicator

I agree that it's a suspicious multiple.

Upload times are in ms as far as I can tell. This is one of the most heavily taxed databases in the environment, so the timings are going to be a bit higher than one would like.

We have confirmed that Splunk is not receiving the expected number of records. We are missing entries.
