All Apps and Add-ons

Splunk DB Connect 2: How to troubleshoot why not all data is being indexed?

jawebb
Explorer

Hello,

I am using DB Connect 2 to pull data from a variety of different SQL databases into unique indexes. Strangely, as of a few days ago, not all data appears to be indexed. On the 26th I have 3800 events, and on the 27th and onward I have 3600 events. When I go into the DB Connect Operations tab, I can verify the data is there in the query preview, but for whatever reason it's not making it into the index. I attempted to change source types and force a re-index this morning, but certain events are still missing. There doesn't seem to be anything unusual about the missing events, except that it's the same events each time.

When I check the health tab in DB Connect, everything seems OK. Resource usage is fine, transactions are occurring at the normal rates, etc.

Any help in where I can look to troubleshoot would be appreciated.

UPDATE

The issue had to do with how the DB import was interpreting the timestamps on a number of events. Even though I specified to use the index time as _time, some events were still being stamped with dates from 4-5 years ago, so when I look at data over a 24h span, many of the events appear to be missing. This was the problem. I was able to fix it by editing props.conf with a setting that ignores timestamps more than 24 hours in the past.
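[Editor's note: the poster doesn't name the setting, but the likely candidate is MAX_DAYS_AGO in props.conf, which tells Splunk to reject extracted timestamps older than the given number of days. A sketch, with a hypothetical sourcetype name:

  [my_dbx_sourcetype]
  MAX_DAYS_AGO = 1

When an extracted timestamp is rejected, Splunk falls back to the timestamp of the previous event or to the current time, depending on the rest of the timestamp configuration.]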

Thanks


jcoates_splunk
Splunk Employee

Hi, from your update it sounds like an ordering problem. This class of problem is just as relevant in SPL as in SQL... you have a pile of data that is too big to take at once, so you have to take it in chunks:

  SELECT TOP N whatever ORDER BY something ASC

Set "something" as the rising column and DBX will do the rest... but the ORDER BY part is important if the data isn't already using a sensible primary key, or if your verification query is different or behaves differently for odd contextual reasons.
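[Editor's note: in DB Connect 2, a rising-column input query uses a `?` checkpoint placeholder that the app fills with the last value it saw. A sketch, with hypothetical table and column names:

  SELECT * FROM drive_inventory
  WHERE id > ?
  ORDER BY id ASC

On each run DBX substitutes the stored checkpoint for `?` and, after indexing, saves the highest `id` it retrieved. If the rising column isn't strictly increasing and unique, rows can be skipped or re-read, which matches the symptoms described above.]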


hettervik
Builder

Hi. You point out that DBX can't handle chunks of data that are too big at once. Is there any known hard limit on how big these chunks can be, in terms of row count or size, before we run into problems?

I'm at a customer that has some huge database records. DBX can normally work its way through the database, but whenever it hits one of these big records it stops traversing and starts re-indexing the same events over and over.


jkat54
SplunkTrust

My guess is that there is a funny character or "bad" value in row 3601 of your database. If you find the rows in the DB table that aren't being indexed, along with the 3 rows before and after them, and paste an example of those rows here, I'll be happy to help you hack the Python code in DB Connect 2 to suit your needs.


jawebb
Explorer

I just checked the rows and it looks pretty random. It begins at row 274, and after that rows 276, 278, 284, 300, 301, etc. are missing.

Here is the data in the surrounding rows, but I don't see anything unusual.

271 SBC20077 271 \.\PHYSICALDRIVE0 100616PCK330VKK6MSPJ Hitachi HTS725032A9A365 1412622639180 2 False
272 CMM01036 272 \.\PHYSICALDRIVE0 5VCGYK22 ST9250317AS 1317335321260 2 False
273 SBC20069 273 \.\PHYSICALDRIVE0 100616PCK330VKK68WNJ Hitachi HTS725032A9A365 1394157394820 2 False
274 SBC10013 274 \.\PHYSICALDRIVE0 100903PCK330GKGEH33K Hitachi HTS725032A9A365 1310679115177 3 False
275 SBC20197 275 \.\PHYSICALDRIVE0 100616PCK330VKK47EWJ Hitachi HTS725032A9A365 1431131118240 2 False
276 SBO20013 276 \.\PHYSICALDRIVE0 100903PCK330GKGE67AK Hitachi HTS725032A9A365 1447354139470 2 False
277 SBC20073 277 \.\PHYSICALDRIVE0 100616PCK330VKK60KRJ Hitachi HTS725032A9A365 1422490991457 3 False
278 CMMC20247 278 \.\PHYSICALDRIVE0 100903PCK330GKGE7T2J Hitachi HTS725032A9A365 1430866533340 2 False


jkat54
SplunkTrust

It looks like it's only taking events that end with "3 False" and not "2 False". What is that "2 or 3" column? And is that true: is it only indexing events that end with "3 False"?


jawebb
Explorer

The numbered column represents the users associated with the trusted drive. The missing events have varying data in those fields, so that's not it.

I may try and alter the query to only include information I need, which is only the name and drive serial. I'll see if that makes a difference.

Thanks for your help.


jawebb
Explorer

UPDATE

I found the missing events. For some reason, although I chose to use the current index time as the _time field, about 250 of the events have varying timestamps. I'm looking at a span of 24 hours or 1 week, while these events carry timestamps from back in 2006-2007. Now I just have to figure out how to address this.

jkat54
SplunkTrust

Ah, so the problem is that the default frozenTimePeriodInSecs is about 6 years: when events get dates that old associated with them, Splunk sooner or later discards them from the index as well.
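[Editor's note: this retention setting lives in indexes.conf; its default is 188697600 seconds, roughly 6 years, which matches the figure above. A sketch, with a hypothetical index name:

  [my_db_index]
  frozenTimePeriodInSecs = 188697600

Events whose _time falls outside this window relative to the newest data in a bucket become eligible for freezing, so events misstamped with 2006-2007 dates can vanish even though they were just indexed.]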

You need to correct the timestamp in props.conf.

Since it doesn't appear that most of these rows have a usable timestamp, maybe use the current time instead:

  DATETIME_CONFIG = CURRENT

If that third-to-last column is a timestamp in epoch milliseconds, then use this:

  MAX_TIMESTAMP_LOOKAHEAD = 128
  TIME_FORMAT = %s%3N

http://docs.splunk.com/Documentation/Splunk/6.2.0/SearchReference/Commontimeformatvariables
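[Editor's note: the advice above hinges on that 13-digit column being epoch milliseconds. A quick sanity check (a Python sketch, not part of DBX) decodes one of the values pasted earlier in the thread:

```python
from datetime import datetime, timezone

# 13-digit value from the pasted row 271; if it is epoch
# milliseconds, dividing by 1000 gives seconds since 1970.
raw = 1412622639180
ts = datetime.fromtimestamp(raw / 1000, tz=timezone.utc)
print(ts.isoformat())  # a 2014 date, consistent with recent data
```

A 2014 result is consistent with TIME_FORMAT = %s%3N (epoch seconds plus 3 subsecond digits); a wildly old decoded date would instead point back at the frozenTimePeriodInSecs problem.]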
