Deployment Architecture

Where exactly and how is the data stored in Splunk?

harshal_chakran
Builder
  1. If I am getting data from some stream, say a TCP stream, where is the data stored? As I understand it, the data gets uploaded and indexed, but can I get that data back out of Splunk some day, or do I have to keep another copy for my own reference?

  2. What is the limit of data storage in Splunk? Suppose I upload 1 TB of data to Splunk every day, so that within a short time span, say a week or a month, my storage fills up. What happens in that case? Will the data be overwritten, or will we have the option to copy the older data to some other location?

  3. In what format does Splunk store the data? If I am sending raw data from a TCP/UDP stream, in which format is it stored: .txt, or some DB?


Ayn
Legend

You should read up on what the documentation has to say about how data is stored. http://docs.splunk.com/Documentation/Splunk/6.0/Indexer/HowSplunkstoresindexes

Short answers based on that information:
1. All data is always stored in Splunk's index, no matter where it came from originally. You can extract this data in a number of ways: either search for the subset of data that you're interested in and export it, or grab all the data from an index and extract it using tools such as Splunk's exporttool.
2. This is not a limit in Splunk itself; it is a storage limit on your system. In short, if you don't have enough storage, add more storage. 🙂 Splunk will trigger warnings if you're low on disk space. If you're out of disk space, Splunk stops indexing until disk space becomes available.
3. Splunk stores data in its indexes (which you could say is a kind of database).
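
To put rough numbers on point 2, here is a minimal sizing sketch. It assumes that indexed data takes up about half the size of the incoming raw data, which is only a rule of thumb and varies a lot with the type of data. The function name and the default ratio are illustrative, not anything Splunk provides.

```python
def days_until_full(disk_tb: float, daily_raw_tb: float, on_disk_ratio: float = 0.5) -> float:
    """Estimate how many days of incoming data fit on the available disk.

    on_disk_ratio is an ASSUMED ratio of on-disk index size to raw input size;
    measure it on your own data before relying on the result.
    """
    if daily_raw_tb <= 0 or on_disk_ratio <= 0:
        raise ValueError("daily_raw_tb and on_disk_ratio must be positive")
    return disk_tb / (daily_raw_tb * on_disk_ratio)


# 1 TB/day of raw data landing on 30 TB of index storage
print(days_until_full(disk_tb=30, daily_raw_tb=1))  # 60.0
```

With the numbers from the question (1 TB per day), 30 TB of storage would last about two months under this assumption, so a retention policy or more disk is needed well before then.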


dimoobraznii
Path Finder

Guys, can you go deeper into the 2nd point? Maybe link to the docs.

pasanmk
Engager

See 'Set a retirement and archiving policy' in the Splunk docs:
http://docs.splunk.com/Documentation/Splunk/6.2.2/Indexer/Setaretirementandarchivingpolicy
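
As a rough illustration of what such a policy can look like, the sketch below renders an indexes.conf stanza from a retention period given in days. frozenTimePeriodInSecs, maxTotalDataSizeMB and coldToFrozenDir are indexes.conf settings; the index name, the sizes, the archive path and the helper function itself are made up for the example. Note that data which reaches the frozen state is deleted unless an archive directory or script is configured.

```python
from typing import Optional

SECONDS_PER_DAY = 86400


def retention_stanza(index: str, retention_days: int, max_size_mb: int,
                     archive_dir: Optional[str] = None) -> str:
    """Render an indexes.conf stanza for one index (hypothetical helper)."""
    lines = [
        f"[{index}]",
        # Events older than this are rolled to frozen
        f"frozenTimePeriodInSecs = {retention_days * SECONDS_PER_DAY}",
        # Oldest buckets are also frozen once the index grows past this size
        f"maxTotalDataSizeMB = {max_size_mb}",
    ]
    if archive_dir:
        # Without this (or a coldToFrozenScript), frozen data is deleted
        lines.append(f"coldToFrozenDir = {archive_dir}")
    return "\n".join(lines)


# Hypothetical index name and archive path
print(retention_stanza("tcp_stream", 90, 500000, "/archive/tcp_stream"))
```

Whichever limit is hit first, age or size, triggers the roll to frozen, so both values need to match the daily volume you expect.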


ConnorG
Path Finder

Nice answer Ayn.

I'm curious. Since you can't delete events from an index, would it be possible to instead export all the events except the ones you want removed? Sort of an alternative method of deletion.


halr9000
Motivator

You actually can delete data from indexes, but by default nobody (not even admin) has this permission. You have to assign a user the "can_delete" role; only then can that user run the "delete" search command. It acts on the events which are streamed to it, so you can be very granular by creating a search which outputs only what you want deleted, then adding "| delete" and hitting enter. Note that delete only makes the events unsearchable; it does not free up disk space.
