Why is a sourcetype not being applied to a summary index that was migrated?

andrewtrobec
Motivator

Hello!

As part of data separation activities, I am migrating summary indexes between Splunk deployments. Some of these summary indexes were collected with sourcetype=stash, while others have a specific sourcetype set, let's say "specific_st".

The data is very simple, here is one event:

2023-06-10-12:43:00;FRONT;GBX;OK

The sourcetype is set as follows:

[specific_st]
DATETIME_CONFIG = 
NO_BINARY_CHECK = true
TIME_FORMAT = %Y-%m-%d-%H:%M:%S
TZ = UTC
category = Custom
pulldown_type = 1
disabled = false
SHOULD_LINEMERGE = false
FIELD_DELIMITER = ;
FIELD_NAMES = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"
TIMESTAMP_FIELDS = TIMESTAMP_EVENT
 
After collecting on the source Splunk instance, I can run a search on the summary index and the fields are extracted correctly. After migrating the index to the new Splunk instance, the sourcetype does not seem to take effect and the fields are not extracted, even though each event correctly lists its sourcetype as "specific_st".
 
To migrate, I copied the db folder via SCP from the source indexer (a single instance) to the target indexer, which is part of a cluster. I made sure to rename any buckets, and when I brought the indexer back up the index was correctly recognized and replicated. The sourcetype definition is present on all indexers as well as on the search head.
 
Has anybody had this problem before?  Do I maybe need to update the sourcetype in some way?
 
Thank you and best regards,
 
Andrew

PickleRick
SplunkTrust

OK. You wrote that you copied the buckets between indexers, but what are the definitions on the search heads? Indexers handle index-time operations, which are not performed again on data that is already indexed. Your extractions are search-time, so they need to be defined at the search-head level.
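
As a rough sketch of what a search-time definition on the search head could look like, assuming a delimiter-based extraction is what you are after: the field names and the delimiter are taken from your post, while the transform name specific_st_delims and the REPORT class name are made up for illustration. DELIMS and FIELDS are the transforms.conf settings for delimiter-based search-time extraction.

```ini
# transforms.conf on the search head
# (stanza name "specific_st_delims" is hypothetical)
[specific_st_delims]
# Split each event on the semicolon
DELIMS = ";"
# Assign the resulting values to these fields, in order
FIELDS = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"

# props.conf on the search head
[specific_st]
# Attach the transform above to the sourcetype at search time
# (class name "specific_st_fields" is hypothetical)
REPORT-specific_st_fields = specific_st_delims
```

If you try something like this, also check that the app holding these files shares its knowledge objects with the app you are searching from, otherwise the extraction will not be visible there.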


_JP
Contributor

The default sourcetype for data generated for a summary index is stash. So the fact that your events have a different sourcetype (specific_st), and that you found a configuration stanza for it, seems to imply this is not summary-index data.

Keep in mind that the Splunk docs refer a lot to a "summary index", but you can have real-time event data and summary index data in the same index. It is, however, a best practice to keep them separate, because raw data typically has an ingestion/search pattern that is wildly different from summary data, so in the long term you tune those indexes differently. That is why the docs often refer to it as a separate index.

But based on what you've provided so far, it sounds like someone configured that sourcetype to go into what others think of as a "summary data only" index. What are the source values for those events? Are they Splunk servers, or are they part of an application/webserver/database pool of servers?
