Deployment Architecture

Where does my data get stored in Splunk?

Contributor

Splunk receives raw data. The Splunk indexer then indexes that data into a series of events.
Both the raw data and the indexed data are present in Splunk afterwards.

Questions:
1. Where does this data get stored?
2. Why do we need to store the raw data once it has been indexed?

1 Solution

Champion

The chances of the same question being posted by multiple people seem pretty unlikely, and given how often this happens, why post under many usernames?

Anyway, to the problem at hand. Did you try the documentation? A quick search reveals:
http://docs.splunk.com/Documentation/Splunk/5.0.1/Indexer/HowSplunkstoresindexes

This is very comprehensive and there's little point in me summarising it here; if you read it, you'll have a complete understanding of how and where Splunk stores this data.

The rawdata is needed to rebuild the index metadata should the buckets ever become corrupted or unreadable by Splunk. This is also important in a clustered environment, where you can choose how many copies of the raw data are kept for recovery purposes.
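As a sketch of that recovery path: because the rawdata journal is the authoritative copy, a bucket's index files can be regenerated from it with the `splunk rebuild` CLI command. The bucket directory name below is a made-up example (real names encode the bucket's time range and ID), so substitute a path from your own deployment.

```shell
# Rebuild a bucket's index files from its rawdata journal.
# The bucket path here is hypothetical - list your index's db/
# directory to find real bucket names.
splunk rebuild "$SPLUNK_HOME/var/lib/splunk/defaultdb/db/db_1389687998_1389100000_2"
```

This requires a stopped or idle Splunk instance with access to that bucket; it is a recovery-time operation, not something run routinely.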

EDIT: Oh, and a final consideration: if you are indexing events from local log files, then the original data will also remain on disk, depending on the retention/rolling policies already in place.


New Member

"The chances of the same question being posted by multiple people seem pretty unlikely, and given how often this happens, why post under many usernames?"

Maybe the person thought they stood a better chance of receiving support by posting it more than once? I don't know; it's a thought.

So, I read the link you gave and I'm sorry, I'm still overwhelmed trying to find my data and where it's located. I hope this user was helped; as for me, I'm still wondering where the data went. Thanks anyway.


Explorer

So that data is just stored in text files directly on the Splunk servers?

Splunk Employee

gurinderbhatti: I suggest you post that as a new question, with some additional detail about your deployment and usage of DB Connect.


Path Finder

Ironically, this post relates to my question. I have an intermediate server with a heavy forwarder installed. I am using the DB Connect app on the intermediate server to get MSSQL DB logs, and I want to forward them to an indexer. What path should I monitor in inputs.conf on the heavy forwarder? "/var/splunkhot/splunk/var/lib/splunk"?


Legend

Data is stored in $SPLUNK_HOME/var/lib/splunk, one directory per index ($SPLUNK_HOME being where Splunk was installed). The files in the respective directories hold the data in the indexes. The data in these files is not meant to be read directly; it would be very much like trying to read MySQL's database files directly and expecting to make sense of them.

EDIT: Upon reading the link, this is already explained there. Where is it you get confused? What are you trying to do and why?
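To make that layout concrete, here is a minimal mock of the directory structure, built in a throwaway temp directory purely for illustration. The index directory names ("defaultdb" for the main index, "_internaldb" for _internal) are the usual defaults, not anything read from a live install.

```shell
# Mock of the default Splunk index layout - for illustration only.
SPLUNK_HOME=$(mktemp -d)                               # stand-in for e.g. /opt/splunk
mkdir -p "$SPLUNK_HOME/var/lib/splunk/defaultdb/db"    # the "main" index
mkdir -p "$SPLUNK_HOME/var/lib/splunk/_internaldb/db"  # the "_internal" index

# One directory per index lives under var/lib/splunk:
ls "$SPLUNK_HOME/var/lib/splunk"
```

On a real installation you would point `ls` at your actual $SPLUNK_HOME instead; inside each index directory, the db/ subdirectory holds the buckets containing the rawdata journal and index files discussed above.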
