Deployment Architecture

Where exactly and how is the data stored in Splunk?

harshal_chakran
Builder
  1. If I am receiving data from a stream, say a TCP stream, where is that data stored? As I understand it, the data gets uploaded and indexed, but can I retrieve that data from Splunk later, or do I have to keep another copy for my own reference?

  2. What is the limit on data storage in Splunk? Suppose 1 TB of data is uploaded to Splunk every day, so that within a short time span, say a week or a month, my storage fills up. What happens in that case? Will the data be overwritten, or is there an option to copy the older data to some other location?

  3. In what format does Splunk store the data? If I send raw data from a TCP/UDP stream, is it stored as .txt files or in some DB?

1 Solution

Ayn
Legend

You should read up on what the documentation has to say about how data is stored. http://docs.splunk.com/Documentation/Splunk/6.0/Indexer/HowSplunkstoresindexes

Short answers based on that information:
1. All data is always stored in Splunk's index, no matter where it originally came from. You can extract this data in a number of ways: either search for the subset of data you're interested in and export it, or grab all data from an index and extract it using tools such as Splunk's exporttool.
2. This is not a limit in Splunk itself; it is a storage limit on your system. In short, if you don't have enough storage, add more storage. 🙂 Splunk will trigger warnings if you're low on disk space. If you're out of disk space, Splunk stops indexing until disk space becomes available.
3. Splunk stores data in its indexes (which you could say is a kind of database).
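
To illustrate the "search for a subset and export it" route from point 1, here is a minimal Python sketch that builds a request against Splunk's REST search export endpoint (`/services/search/jobs/export` on the management port). The host, credentials, and the index name `tcp_stream` are made up for the example, and the helper `build_export_request` is hypothetical, not part of any Splunk library. The request is only constructed here, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_export_request(base_url, username, password, spl, output_mode="csv"):
    """Build (but do not send) a POST request to Splunk's search export endpoint."""
    # The search string is passed as a form field, together with the output format.
    body = urllib.parse.urlencode(
        {"search": spl, "output_mode": output_mode}
    ).encode("utf-8")
    # Basic authentication header built from the supplied credentials.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        headers={"Authorization": "Basic " + token},
        method="POST",
    )


# Hypothetical host, credentials, and index name; replace with your own.
request = build_export_request(
    "https://splunk.example.com:8089",
    "admin",
    "changeme",
    "search index=tcp_stream earliest=-24h",
)

# Sending the request would stream the matching events back, for example:
# with urllib.request.urlopen(request) as response:
#     for line in response:
#         print(line.decode("utf-8"), end="")
```

The search string starts with the `search` command because the endpoint expects a full query rather than just the filter terms.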

dimoobraznii
Path Finder

Guys, can you take a deeper dive into the 2nd point? Maybe link to the docs.

pasanmk
Engager

See 'Set a retirement and archiving policy' in the Splunk docs:
http://docs.splunk.com/Documentation/Splunk/6.2.2/Indexer/Setaretirementandarchivingpolicy
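
For the 1 TB/day scenario in the original question, a rough back-of-the-envelope calculation can help when choosing retention values. The sketch below is only an estimate under stated assumptions: the `on_disk_ratio` of 0.5 (indexed data taking about half the size of the raw input) is a rule of thumb that varies a lot with the data, and both helper functions are hypothetical. `frozenTimePeriodInSecs` and `maxTotalDataSizeMB` are the per-index settings in indexes.conf that control retirement by age and by size.

```python
def days_until_full(disk_capacity_gb, daily_raw_gb, on_disk_ratio=0.5):
    """Estimate how many days of data fit on disk before it fills up."""
    # on_disk_ratio is an assumed compression factor, not a measured value.
    daily_on_disk_gb = daily_raw_gb * on_disk_ratio
    return disk_capacity_gb / daily_on_disk_gb


def retention_settings(retention_days, daily_raw_gb, on_disk_ratio=0.5):
    """Suggest indexes.conf values that keep roughly retention_days of data."""
    return {
        # Events older than this many seconds are rolled to frozen.
        "frozenTimePeriodInSecs": retention_days * 86400,
        # Upper bound on the index size, in megabytes.
        "maxTotalDataSizeMB": int(retention_days * daily_raw_gb * on_disk_ratio * 1024),
    }


# 10 TB of disk, 1 TB (1024 GB) of raw data per day.
print(days_until_full(10240, 1024))
# Values for keeping about 30 days of that data.
print(retention_settings(30, 1024))
```

Whichever limit is reached first triggers the roll to frozen; by default frozen data is deleted unless an archive location or script is configured.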

ConnorG
Path Finder

Nice answer Ayn.

I'm curious. Since you can't delete events from an index, would it be possible to instead export all the events except the ones you want removed? Sort of an alternative method of deletion.

halr9000
Motivator

You actually can delete data from indexes, but by default nobody (not even admin) has this permission. You would have to give a user the "can_delete" role; only then will that user be able to use the "delete" search command. It acts on the events that are streamed to it, so you can be very granular: create a search which outputs only what you want deleted, then add "| delete" and hit enter. Note that this makes the events unsearchable; it does not reclaim the disk space they occupy.
