Why is a sourcetype not being applied to a summary index that was migrated?

andrewtrobec
Motivator

Hello!

As part of data separation activities, I am migrating summary indexes between Splunk deployments. Some of these summary indexes were collected with sourcetype=stash, while others have their sourcetype set to a specific one, let's say "specific_st".

The data is very simple; here is one event:

2023-06-10-12:43:00;FRONT;GBX;OK

The sourcetype is set as follows:

[specific_st]
DATETIME_CONFIG = 
NO_BINARY_CHECK = true
TIME_FORMAT = %Y-%m-%d-%H:%M:%S
TZ = UTC
category = Custom
pulldown_type = 1
disabled = false
SHOULD_LINEMERGE = false
FIELD_DELIMITER = ;
FIELD_NAMES = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"
TIMESTAMP_FIELDS = TIMESTAMP_EVENT
 
After collecting on the source Splunk instance, I run the search on the summary index and the fields are extracted correctly. After migrating the index to the new Splunk instance, the sourcetype does not seem to be applied and the fields are not extracted, even though the event correctly lists the sourcetype as "specific_st".
 
To migrate, I copied the db folder via SCP from the source indexer (single) to the target indexer, which is part of a cluster. I made sure to rename any buckets, and when I brought the indexer back up, the index was correctly recognized and replicated. The sourcetype is defined on all indexers as well as the search head.
 
Has anybody had this problem before? Do I maybe need to update the sourcetype in some way?
 
Thank you and best regards,
 
Andrew

PickleRick
SplunkTrust

OK. You wrote that you copied the buckets between indexers. But what are the definitions on the search heads? Indexers handle index-time operations (which are not performed again on data that is already indexed), but your extractions are search-time, so they should be defined at the search head level.
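
For illustration only, a search-time version of the delimiter-based extraction might look roughly like the sketch below, placed in an app on the search head. The transform name specific_st_delims and the REPORT class name are made up for this example; DELIMS and FIELDS are standard transforms.conf settings, and the field names are copied from the stanza in the question. Whether this matches how the fields were actually being extracted on the source instance is an assumption.

```ini
# props.conf (on the search head)
[specific_st]
# Attach a search-time, delimiter-based extraction to the sourcetype.
# "specific_st_delims" is a hypothetical transform name.
REPORT-specific_st_fields = specific_st_delims

# transforms.conf (on the search head)
[specific_st_delims]
# Split each event on ";" and assign the values to these fields in order.
DELIMS = ";"
FIELDS = "TIMESTAMP_EVENT","SECTOR","CODE","STATUS"
```

To check what the search head actually sees for this sourcetype, `splunk btool props list specific_st --debug` prints the merged on-disk settings along with the file each one comes from.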

_JP
Contributor

The default sourcetype for data that is generated for a summary index is stash. So the fact that it has a different sourcetype (specific_st) and you found a configuration stanza for it seems to imply it is not summary-index data.

Keep in mind that the Splunk docs refer a lot to a "summary index", but you can have real-time event data and summary index data in the same index. However, it is a best practice to keep them separate, because raw data typically has a pattern of ingestion/search that is wildly different from summary data, so long term you tune those indexes differently. That is why the docs often refer to it as a separate index.

But based on what you've provided so far, it sounds like someone configured that sourcetype to go into what others think is a "summary data only" index. What are the source values for those events? Are they Splunk servers, or are they part of an application/webserver/database pool of servers?
