Hello!
As part of data separation activities I am migrating summary indexes between Splunk deployments. Some of these summary indexes have been collected with sourcetype=stash, while others have their sourcetype set to a specific one, let's say "specific_st".
The data is very simple, here is one event:
2023-06-10-12:43:00;FRONT;GBX;OK
The sourcetype is set as follows:
[specific_st]
DATETIME_CONFIG =
NO_BINARY_CHECK = true
TIME_FORMAT = %Y-%m-%d-%H:%M:%S
TZ = UTC
category = Custom
pulldown_type = 1
disabled = false
SHOULD_LINEMERGE = false
FIELD_DELIMITER = ;
FIELD_NAMES = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"
TIMESTAMP_FIELDS = TIMESTAMP_EVENT
OK. You wrote that you copied the buckets between indexers, but what are the definitions on the search heads? Indexers handle index-time operations (which are not performed again on data that is already indexed), but your extractions are search-time, so they need to be defined at the search-head level.
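For illustration only, here is a minimal sketch of what a search-time, delimiter-based extraction for this sourcetype could look like on the search head. It assumes the fields are not already indexed with the events, and the transform stanza name specific_st_semicolon_fields is a made-up placeholder, not something from your configuration. The field names are taken from the stanza you posted.

```
# props.conf on the search head (sketch; adjust app context and permissions as needed)
[specific_st]
# "specific_st_semicolon_fields" is a hypothetical transform name
REPORT-specific_st_fields = specific_st_semicolon_fields

# transforms.conf on the search head
[specific_st_semicolon_fields]
# Split each event on ";" and name the resulting values in order
DELIMS = ";"
FIELDS = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"
```

If the fields already show up in searches on the source deployment, it is worth comparing what is actually defined there before adding anything like this, since defining the same fields twice can produce duplicate values.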
The default sourcetype for data that is generated for a summary index is stash. So the fact that it has a different sourcetype (specific_st) and you found a configuration stanza for it seems to imply it is not summary-index data.
Keep in mind that the Splunk docs refer a lot to a "summary index", but you can have real-time event data and summary data in the same index. It is, however, a best practice to keep them separate, because raw data typically has an ingestion/search pattern that is wildly different from that of summary data, so over the long term you tune those indexes differently. That is why the docs often refer to it as a separate index.
But based on what you've provided so far, it sounds like someone configured that sourcetype to go into what others think is a "summary data only" index. What are the source values for those events? Are they Splunk servers, or are they part of an application/webserver/database pool of servers?