Why is a sourcetype not being applied to a summary index that was migrated?

andrewtrobec · 10-27-2023

Hello!

As part of data separation activities I am migrating summary indexes between Splunk deployments. Some of these summary indexes have been collected with sourcetype=stash, while others have their sourcetype set to a specific one, let's say "specific_st".

The data is very simple, here is one event:

2023-06-10-12:43:00;FRONT;GBX;OK

The sourcetype is set as follows:

[specific_st]
DATETIME_CONFIG = 
NO_BINARY_CHECK = true
TIME_FORMAT = %Y-%m-%d-%H:%M:%S
TZ = UTC
category = Custom
pulldown_type = 1
disabled = false
SHOULD_LINEMERGE = false
FIELD_DELIMITER = ;
FIELD_NAMES = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"
TIMESTAMP_FIELDS = TIMESTAMP_EVENT

After collecting on the source Splunk instance, I run the search on the summary index and the fields are extracted correctly. After migrating the index to the new Splunk instance, the sourcetype does not seem to work and the fields are not extracted. The event correctly lists the sourcetype as "specific_st".

To migrate, I copied the db folder via SCP from the source indexer (single) to the target indexer, which is part of a cluster. I made sure to rename any buckets, and when I brought the indexer back up the index was correctly recognized and replicated. The sourcetype is defined on all indexers as well as on the search head.

Has anybody had this problem before? Do I maybe need to update the sourcetype in some way?

Thank you and best regards,

Andrew

PickleRick · 10-28-2023

OK. You wrote that you copied the buckets between indexers. But what are the definitions on the search heads? Indexers handle index-time operations (which are obviously not performed again if the data is already indexed), but your extractions are search-time, so you should define them at the SH level.
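
For illustration only, here is a minimal sketch of what a search-time, delimiter-based extraction for this sourcetype could look like on the search head. It assumes the events are plain semicolon-separated values like the sample above. The transform stanza name specific_st_delims and the class name specific_st_fields are made up for the example, and the app in which these files would live is not known from the thread.

```ini
# props.conf on the search head (hypothetical app location)
[specific_st]
# Search-time field extraction: points at the transforms.conf stanza below.
# "specific_st_fields" is an arbitrary class name chosen for this sketch.
REPORT-specific_st_fields = specific_st_delims

# transforms.conf on the search head (same hypothetical app)
[specific_st_delims]
# Split each event on ";" and name the values in order.
DELIMS = ";"
FIELDS = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"
```

If something like this is in place and the fields still do not appear, the next things to check would be whether the app is shared to the context the search runs in, and whether the search head actually sees the stanza.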

_JP · 10-27-2023

The default sourcetype for data that is generated for a summary index is stash. So the fact that it has a different sourcetype (specific_st) and you found a configuration stanza for it seems to imply it is not summary-index data.

Keep in mind that the Splunk docs refer a lot to a "summary index", but you can have real-time event data and summary index data in the same index. It is, however, a best practice to keep them separate, because raw data typically has a pattern of ingestion and search that is wildly different from summary data, so in the long term you tune those indexes differently. That is why the docs often refer to it as a separate index.

But based on what you've provided so far, it sounds like someone configured that sourcetype to go into what others think is a "summary data only" index. What are the source values for those events? Are they Splunk servers, or are they part of an application/webserver/database pool of servers?
