Does Splunk keep a copy of the indexed data? What happens if indexed files are later deleted?

gerryha
Explorer

I guess my real question is how do I move Splunk from one company to another, including some but not all of the data and the indexes for the selected data?

I see I can copy config and indexes from the $SPLUNK_HOME, but indexes are (I guess) just metadata, referencing other data. So, a search will read the index, then use that to get the data to return and display.

I am going to guess Splunk will make a copy of the indexed data, because data sources can disappear for various reasons and that would not be ideal for later searches.

 

gcusello
SplunkTrust

Hi @gerryha,

As @richgalloway said, and as you can read at https://docs.splunk.com/Documentation/Splunk/9.0.1/Indexer/HowSplunkstoresindexes , indexed data is stored in a set of subfolders (buckets) under a main folder for each index.

In each bucket you can find both the raw data (compressed) and the metadata.

When you remove or discard a bucket, both the raw data and the metadata are discarded, so you cannot end up with only the metadata.
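To see this on disk, a minimal sketch like the following can list what each bucket folder contains. It assumes the index's bucket folders live under a path such as `$SPLUNK_HOME/var/lib/splunk/<index>/db` and that the compressed raw data sits in a `rawdata` subfolder of each bucket; verify both against your own installation. The function name `summarize_buckets` is made up for illustration, and treating every subfolder as a bucket is a simplification.

```python
import os


def summarize_buckets(index_db_path):
    """Report, per subfolder, whether a rawdata folder is present.

    Assumption: every subfolder of index_db_path is treated as a bucket,
    which is a simplification; a real db folder may hold other folders too.
    """
    summary = {}
    for name in sorted(os.listdir(index_db_path)):
        bucket_path = os.path.join(index_db_path, name)
        if not os.path.isdir(bucket_path):
            continue  # skip plain files sitting next to the buckets
        entries = sorted(os.listdir(bucket_path))
        summary[name] = {
            # compressed raw events (assumed location)
            "has_rawdata": "rawdata" in entries,
            # everything else in the bucket: index and metadata files
            "other_entries": [e for e in entries if e != "rawdata"],
        }
    return summary


# Hypothetical usage; the path is an assumption, adjust it to your system:
# print(summarize_buckets("/opt/splunk/var/lib/splunk/defaultdb/db"))
```

Run it against a copy of the index folder rather than a live indexer if you are unsure; it only reads directory listings.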

If you have a cluster, your data is also replicated to the other indexers.

About your question, I agree with Rich: first analyze the legal situation before you start the job. After that, the only way to take just part of the data (not the full index) is to run a search and export the results as raw data.

If you have to extract a lot of data, remember that this job must be planned: there is a limit of 10,000 results per search, so you will have to plan and run many searches to extract all your data.
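One way to plan those searches is to split the time range into smaller windows and run one export search per window. The sketch below only builds the search strings; it does not talk to Splunk. The function names, the index name `main` and the one-hour step are illustrative assumptions: pick a step small enough that each window stays under your result limit, and check how your version treats events that fall exactly on a window boundary.

```python
from datetime import datetime, timedelta, timezone


def time_windows(start, end, step):
    """Yield (window_start, window_end) pairs that cover start..end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    current = start
    while current < end:
        window_end = min(current + step, end)  # last window may be shorter
        yield current, window_end
        current = window_end


def build_export_search(index, window_start, window_end):
    """Build one search string using epoch-second earliest/latest bounds."""
    return "search index={} earliest={} latest={}".format(
        index, int(window_start.timestamp()), int(window_end.timestamp())
    )


# Hypothetical example: three one-hour windows on an index called "main".
start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
searches = [
    build_export_search("main", ws, we)
    for ws, we in time_windows(start, end, timedelta(hours=1))
]
```

Each string in `searches` can then be run and exported separately, for example from the search bar or a script, and the per-window result counts compared against the limit before moving on.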

Ciao.

Giuseppe

gerryha
Explorer

Thanks, I had been looking for that documentation but couldn't find it.

gcusello
SplunkTrust

Hi @gerryha,

The documentation linked above explains how Splunk indexing works.

As for your questions, which part is still unclear?

Ciao.

Giuseppe

richgalloway
SplunkTrust

When data is sent to Splunk, it is written to disk in what Splunk calls an "index".  This is the copy of which you speak.  Additional (replicated) copies may be created in an indexer cluster.  Splunk also creates assorted metadata to help it search and manage the indexed data.

Moving indexed data from one company to another may be possible, but also may be unlawful and/or a violation of the original company's policies.  Please tell us more about the problem you are trying to solve.

---
If this reply helps you, Karma would be appreciated.