Deployment Architecture

Does Splunk keep a copy of the indexed data? What happens if indexed files are later deleted?

gerryha
Explorer

I guess my real question is how do I move Splunk from one company to another, including some but not all of the data and the indexes for the selected data?

I see I can copy config and indexes from the $SPLUNK_HOME, but indexes are (I guess) just metadata, referencing other data. So, a search will read the index, then use that to get the data to return and display.

I am going to guess Splunk will make a copy of the indexed data, because data sources can disappear for various reasons and that would not be ideal for later searches.

 


gcusello
SplunkTrust

Hi @gerryha,

As @richgalloway said, and as you can read at https://docs.splunk.com/Documentation/Splunk/9.0.1/Indexer/HowSplunkstoresindexes , indexed data is stored in many folders (buckets) under a main folder for each index.

In each bucket you can find both the raw data (compressed) and the metadata.

When you remove or discard a bucket, both the raw data and the metadata are discarded, so you cannot keep only the metadata.
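
To make the bucket layout a bit more concrete, here is a minimal Python sketch that reads the time range out of a bucket folder name. It assumes the warm/cold naming convention `db_<newest_time>_<oldest_time>_<id>` (epoch seconds), with `rb_` for replicated copies and an optional GUID suffix in a cluster. The function names and the sample folder names are hypothetical, and hot buckets are simply skipped.

```python
import re
from typing import NamedTuple, Optional

# Assumed bucket folder naming: db_<newest_epoch>_<oldest_epoch>_<local_id>
# Clustered buckets may append _<guid>; replicated copies start with rb_.
BUCKET_RE = re.compile(r"^(db|rb)_(\d+)_(\d+)_(\d+)(?:_([0-9A-Fa-f-]+))?$")


class Bucket(NamedTuple):
    kind: str            # "db" (originating copy) or "rb" (replicated copy)
    newest: int          # newest event time in the bucket, epoch seconds
    oldest: int          # oldest event time in the bucket, epoch seconds
    local_id: int
    guid: Optional[str]  # present only for clustered buckets


def parse_bucket_name(name: str) -> Optional[Bucket]:
    """Return the parsed bucket, or None if the name does not match."""
    m = BUCKET_RE.match(name)
    if m is None:
        return None
    kind, newest, oldest, local_id, guid = m.groups()
    return Bucket(kind, int(newest), int(oldest), int(local_id), guid)


def overlaps(bucket: Bucket, start: int, end: int) -> bool:
    """True if the bucket's time range intersects [start, end]."""
    return bucket.oldest <= end and bucket.newest >= start
```

A bucket that overlaps the wanted period still contains every event in its own time range, raw data and metadata together, so this only helps to select whole buckets, not individual events.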

If you have an indexer cluster, your data is also replicated to the other indexers.

About your question, I agree with Rich: first analyze the legal situation before you start the job. After that, the only way to take only part of the data (not the full index) is to run a search and export the results as raw data.

If you have to extract a lot of data, remember that this job must be planned, because there is a limit of 10,000 results for each search, so you have to plan and run many searches to extract all your data.
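
One way to plan the many searches is to split the time range into fixed windows and run one export search per window. Below is a minimal Python sketch of that idea. It assumes the REST export endpoint `/services/search/jobs/export` with the `search`, `earliest_time`, `latest_time` and `output_mode` parameters, and it assumes each window is treated as half-open (earliest inclusive, latest exclusive). The index name and window size are hypothetical, and the code only builds the request parameters, it does not send anything.

```python
from typing import Dict, Iterator, List, Tuple


def time_windows(start: int, end: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield consecutive (lo, hi) windows in epoch seconds covering [start, end)."""
    if step <= 0:
        raise ValueError("step must be positive")
    lo = start
    while lo < end:
        hi = min(lo + step, end)  # the last window may be shorter
        yield lo, hi
        lo = hi


def export_requests(index: str, start: int, end: int, step: int) -> List[Dict[str, str]]:
    """Build one set of export parameters per time window (nothing is sent)."""
    return [
        {
            "search": f"search index={index}",  # add your own filters here
            "earliest_time": str(lo),
            "latest_time": str(hi),
            "output_mode": "raw",               # export the raw events
        }
        for lo, hi in time_windows(start, end, step)
    ]


if __name__ == "__main__":
    # Hypothetical example: one day of index "main" in one-hour windows.
    for params in export_requests("main", 1700000000, 1700086400, 3600):
        print(params["earliest_time"], params["latest_time"])
```

Each parameter set could then be POSTed to the export endpoint with any HTTP client. Choose a window small enough that each search stays under whatever result limit applies in your environment.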

Ciao.

Giuseppe

gerryha
Explorer

Thanks, I had been looking for that documentation but couldn't find it.

gcusello
SplunkTrust

Hi @gerryha,

In the documentation linked above, you can find how Splunk handles indexing.

About your questions, what is still not clear?

Ciao.

Giuseppe

richgalloway
SplunkTrust

When data is sent to Splunk, it is written to disk in what Splunk calls an "index".  This is the copy of which you speak.  Additional (replicated) copies may be created in an indexer cluster.  Splunk also creates assorted metadata to help it search and manage the indexed data.

Moving indexed data from one company to another may be possible, but also may be unlawful and/or a violation of the original company's policies.  Please tell us more about the problem you are trying to solve.

---
If this reply helps you, Karma would be appreciated.