Solved: tstats search slow when using field from automatic...

tah7004 · 12-21-2020

Hello, I'm seeing an issue where tstats search is slow due to an automatic lookup.

I'm running the searches over time ranges where data model acceleration should be 100% complete. We don't receive a constant stream of data; it arrives only on a batch schedule.
DMA is running over a raw index that has automatic lookups. The output fields from the automatic lookups are defined in the DM.
When I run searches that filter on these lookup output fields, the tstats search doesn't appear to find anything in the tsidx summaries and falls back to a normal search.

Example:

| tstats c from datamodel=test_dm where test_dm.output_field_1 = 1

This search is very slow even though test_dm.output_field_1 is part of the data model. According to search.log, it seems to fall back to searching the raw index instead of the summary data, even though the tsidx summaries should contain matching data.

If I don't specify a specific value in the search, it runs fast, as expected:

| tstats c from datamodel=test_dm where test_dm.output_field_1 = *

Also, it runs just as fast if I use summariesonly=t like this:

| tstats summariesonly=t c from datamodel=test_dm where test_dm.output_field_1 = 1

Searches on fields that come from the raw index rather than from an automatic lookup are fine, such as this:

| tstats c from datamodel=test_dm where test_dm.field1 = 1

I'm really confused about why this is happening. Filtering in the WHERE clause works fine with any other field. Only the fields that come from automatic lookups seem to cause these weird fallback searches on the raw index, unless I use summariesonly=t or a wildcard as the value.

Does anyone have any idea?

tah7004 · 01-06-2021

I found the solution to the issue.

I added the lookup output field as an additional accelerated field in the KV store collection.

In my collections.conf, I previously had something like this:

[test_kv]

accelerated_fields.test1 = {"inputfield1": 1, "inputfield2": 1}

That was because the automatic lookups match on those two input fields, like this:

lookup test_kv_lookup inputfield1 inputfield2 OUTPUT outputfield1

I changed it to the following:

accelerated_fields.test1 = {"inputfield1": 1, "inputfield2": 1, "outputfield1": 1}

accelerated_fields.test2 = {"outputfield1": 1, "inputfield1": 1, "inputfield2": 1}

Even though I don't need to look up by the field "outputfield1", this ensures that my searches without summariesonly=true can quickly check whether the value I'm searching for is in the KV store, without having to load the entire lookup, which takes about 20 minutes.
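
For anyone reproducing this, here is a rough sketch of how the pieces might fit together across the three config files. Only the [test_kv] stanza and its accelerated_fields lines come from my actual setup. The transforms.conf stanza is an assumed definition of test_kv_lookup, and the sourcetype name (my_sourcetype) and lookup class name (LOOKUP-test_kv) in props.conf are hypothetical placeholders.

```
# collections.conf (collection name and accelerated fields as described above)
[test_kv]
# Index covering the two input fields the automatic lookup matches on.
accelerated_fields.test1 = {"inputfield1": 1, "inputfield2": 1, "outputfield1": 1}
# Index that leads with the output field. Assumption: field order matters in
# a compound index, so this is the one that helps a filter on outputfield1 alone.
accelerated_fields.test2 = {"outputfield1": 1, "inputfield1": 1, "inputfield2": 1}

# transforms.conf (assumed definition of the lookup used above)
[test_kv_lookup]
external_type = kvstore
collection = test_kv
fields_list = _key, inputfield1, inputfield2, outputfield1

# props.conf (hypothetical sourcetype and lookup class names)
[my_sourcetype]
LOOKUP-test_kv = test_kv_lookup inputfield1 inputfield2 OUTPUT outputfield1
```

In my case the props.conf part would be repeated for each sourcetype that has the automatic lookup defined.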

Maybe there is a better way to do this, but this resolved the issue for me.

tah7004 · 12-22-2020

I think I may understand why this issue is happening.

I have two big KV store lookups, each about 1.2 GB.

When the tstats search is run, Splunk tries to fully load these two big lookups because of "WHERE test_dm.output_field_1=1". I was under the impression that if the ranges I'm searching are 100% accelerated, it wouldn't load the lookups, but it seems to load them anyway.

Since I have 40+ sourcetypes with automatic lookups defined, the loading time seems to accumulate. It looks like it's trying to load the two 1.2 GB lookups 40 times over.