Deployment Architecture

Where exactly and how is data stored in Splunk?

  1. If I am receiving data from a stream, say a TCP stream, where is the data stored? As I understand it, the data gets uploaded and indexed, but can I get that data back out of Splunk later? Or do I have to keep another copy for my own reference?

  2. What is the limit on data storage in Splunk? Suppose I upload 1 TB of data to Splunk every day; within a short time span, say a week or a month, my storage will fill up. What happens then? Will the data be overwritten, or is there an option to copy the older data somewhere else?

  3. In what format does Splunk store the data? If I send raw data from a TCP/UDP stream, is it stored as .txt files or in some kind of DB?


Legend

You should read up on what the documentation has to say about how data is stored. http://docs.splunk.com/Documentation/Splunk/6.0/Indexer/HowSplunkstoresindexes

Short answers based on that information:
1. All data is always stored in Splunk's index, no matter where it came from originally. You can extract this data in a number of ways - either search for a subset of data that you're interested in and export it, or grab all data from an index and extract it using tools such as Splunk's exporttool.
2. This is not a limit in Splunk itself; it is a storage limit of your system. In short, if you don't have enough storage, add more storage. 🙂 Splunk will trigger warnings if you're low on disk space. If you're out of disk space, Splunk stops indexing until disk space is available again.
3. Splunk stores data in its indexes (which you could say is a kind of database).
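To illustrate the "search for a subset and export it" route from point 1, here is a minimal sketch that builds (but does not send) a request to Splunk's REST search export endpoint on the management port. The host name, index, and sourcetype are made up for the example, and authentication is left out on purpose; treat this as a starting point, not a tested integration.

```python
import urllib.parse
import urllib.request


def build_export_request(base_url, search, earliest="-24h", output_mode="csv"):
    """Build (without sending) a POST request for /services/search/jobs/export."""
    # The REST API expects the query to start with "search" or a leading pipe.
    if not search.lstrip().startswith(("search ", "|")):
        search = "search " + search
    body = urllib.parse.urlencode({
        "search": search,
        "earliest_time": earliest,
        "output_mode": output_mode,
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        method="POST",
    )


# Hypothetical host, index, and sourcetype, for illustration only.
req = build_export_request(
    "https://splunk.example.com:8089",
    "index=main sourcetype=tcp_raw",
)
print(req.full_url)
print(req.data.decode())
```

To actually run the export you would add credentials (for example an Authorization header) and pass the request to `urllib.request.urlopen`, then stream the response to a file.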


Path Finder

Guys, can you go deeper on the 2nd point? Maybe link to the docs.

Engager

See 'Set a retirement and archiving policy' in the Splunk docs:
http://docs.splunk.com/Documentation/Splunk/6.2.2/Indexer/Setaretirementandarchivingpolicy
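As a rough sketch of what that policy looks like in practice, the helper below renders an indexes.conf stanza using the `frozenTimePeriodInSecs`, `maxTotalDataSizeMB`, and `coldToFrozenDir` settings, and estimates how long a disk lasts at a given ingest rate. The index name, archive path, and sizes are invented for the example, and the 0.5 on-disk ratio is only an assumed rule of thumb; the real ratio depends heavily on your data.

```python
SECONDS_PER_DAY = 86_400


def retention_stanza(index_name, retention_days, max_total_mb, frozen_dir=None):
    """Render an indexes.conf stanza with age-based and size-based limits."""
    lines = [
        f"[{index_name}]",
        # Buckets whose newest event is older than this are frozen.
        f"frozenTimePeriodInSecs = {retention_days * SECONDS_PER_DAY}",
        # Oldest buckets are frozen once the index grows past this size.
        f"maxTotalDataSizeMB = {max_total_mb}",
    ]
    if frozen_dir:
        # With this set, frozen buckets are archived here instead of deleted.
        lines.append(f"coldToFrozenDir = {frozen_dir}")
    return "\n".join(lines)


def days_until_full(disk_gb, daily_raw_gb, on_disk_ratio=0.5):
    """Estimate days of ingest a disk can hold (on_disk_ratio is an assumption)."""
    return disk_gb / (daily_raw_gb * on_disk_ratio)


# Hypothetical index name, sizes, and archive path.
print(retention_stanza("tcp_raw", 30, 15_000_000, "/archive/tcp_raw"))
print(days_until_full(disk_gb=10_000, daily_raw_gb=1_000))
```

Whichever limit is reached first (age or total size) causes the oldest buckets to be frozen, and frozen data is deleted unless an archive location or script is configured.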

Path Finder

Nice answer Ayn.

I'm curious. Since you can't delete events from an index, would it be possible to instead export all the events except the ones you want removed? Sort of an alternative method of deletion.


Motivator

You actually can delete data from indexes, but by default nobody (not even admin) has this permission. You would have to give a user the "can_delete" role; only then can that user run the "delete" search command. It acts on the events that are streamed to it, so you can be very granular: create a search that outputs only what you want deleted, then add "| delete" and run it. Note that "delete" only makes the events unsearchable; it does not free up disk space.
