Deployment Architecture

How to set up volumes for a Splunk deployment?

nickgleed
Explorer

OK, basically I think I'm confusing myself. I have a Helm deployment on Kubernetes and originally had volumes for etc and var. I want separate volumes for hot/warm, cold, frozen and thawed. I created a PVC/volume for each, e.g. mapped to var/cold, var/hot, etc., but is this correct? I know that in indexes.conf you set paths per index, but can a path be .... var/hotwarm/index1/? Is it OK to have 3-4 volumes, one per bucket state, and put all the indexes on each, or do I need a volume per index? Any help appreciated. I'm also guesstimating the volume sizes: we don't use Splunk much at the moment, but I suspect usage is going to grow rapidly!

E.g. my helm script includes this:

volumeMounts:
        - name: splunk-etc
          mountPath: /opt/splunk/etc
        - name: splunk-var
          mountPath: /opt/splunk/var
        - name: splunk-var-hotwarm
          mountPath: /opt/splunk/var/log/splunk/hotwarm
        - name: splunk-var-cold
          mountPath: /opt/splunk/var/log/splunk/cold
        - name: splunk-var-frozen
          mountPath: /opt/splunk/var/log/splunk/frozen

vliggio
Communicator

I would keep all Splunk core directories in the same location. Splunk writes a lot of temp stuff, and performance can be hindered if you are splitting up your install into lots of small storage volumes.

The answer depends on your underlying storage configuration. If you're mounting everything from the same storage pool (i.e., one big SAN), there's not much to be gained from separate volumes, and you end up copying data between mount points unnecessarily. If you have tiered storage, you can split hot and cold to tier the data.

Unless you can accurately determine data volume (i.e., quantity), splitting indexes into multiple locations is a royal pain in the rear. You're best off with one big hot volume that all indexes go into, and one big cold volume if you have tiered storage. (There are exceptions depending on lots of things, which I can get into if you need.)

What I did was create /opt/splunk/data and in there do hot and cold volumes in their own directories.
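As a rough illustration of that layout, here is a minimal indexes.conf sketch in which two indexes share the same hot and cold mounts, so no per-index volume is needed. The index names, the subdirectory names under /opt/splunk/data, and the thawed location are assumptions made for the example, not the setup described above.

```ini
# Hypothetical sketch: index names and directory layout are assumptions.
# /opt/splunk/data/hot and /opt/splunk/data/cold are assumed to be
# two separate mounts (for example, two PVCs).

[index1]
homePath   = /opt/splunk/data/hot/index1/db
coldPath   = /opt/splunk/data/cold/index1/colddb
thawedPath = /opt/splunk/data/thawed/index1/thaweddb

[index2]
homePath   = /opt/splunk/data/hot/index2/db
coldPath   = /opt/splunk/data/cold/index2/colddb
thawedPath = /opt/splunk/data/thawed/index2/thaweddb
```

Each index gets its own subdirectory, but all of them sit on the same one or two underlying volumes.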

Don't put it in /opt/splunk/var/log. That's where Splunk keeps its own logs, and someone could inadvertently clear the data.

Note that even if you create a separate data directory, Splunk will still store some index-related data in /opt/splunk/var/lib. I put in an RFE for Splunk to write all data to a single directory so the application could be split from the data more easily. Right now it's a mess of locations that can cause serious performance issues depending on your underlying storage (for example, putting Splunk on a root volume with relatively low performance can cause problems even if you're writing your index data to other storage, or if it's a search head and Splunk is writing out a lot of temp files during searches).

I wouldn't bother with a volume for frozen unless you often thaw large amounts of data. I don't think I did that once in five years, but that's just the way we were doing our storage...
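For context, a small indexes.conf fragment showing the settings that control freezing. By default Splunk deletes buckets when they freeze, and they are only archived if you set coldToFrozenDir or coldToFrozenScript. The index name, retention value, and archive path below are assumptions for illustration.

```ini
# Hypothetical sketch: index name, retention period, and path are assumptions.
[index1]
# Age at which buckets roll to frozen (example value: 90 days in seconds).
frozenTimePeriodInSecs = 7776000
# With no coldToFrozenDir or coldToFrozenScript, frozen buckets are deleted.
# Uncomment to archive them to a directory instead:
# coldToFrozenDir = /opt/splunk/data/frozen/index1
```

If you never set an archive location, there is nothing to store on a frozen volume in the first place.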

Note: Splunk also has its own concept of volumes. Look at the [volume:<name>] stanzas in indexes.conf.
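A minimal sketch of what Splunk volumes might look like in indexes.conf, assuming one mount for hot/warm and one for cold. The volume names, paths, and size caps are made-up example values, so size them to your own PVCs.

```ini
# Hypothetical sketch: volume names, paths, and sizes are assumptions.
[volume:hotwarm]
path = /opt/splunk/data/hot
# Cap on the total space used by buckets on this volume, in MB.
maxVolumeDataSizeMB = 400000

[volume:cold]
path = /opt/splunk/data/cold
maxVolumeDataSizeMB = 900000

[index1]
homePath = volume:hotwarm/index1/db
coldPath = volume:cold/index1/colddb
# thawedPath cannot reference a volume, so it uses a plain path.
thawedPath = $SPLUNK_DB/index1/thaweddb
```

The appeal of this approach is that the size cap applies to the volume as a whole, so you don't have to guess a size for every index individually.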


mattymo
Splunk Employee

Hey nickgleed!

Throwing the main doc items here for good measure...

There are some good examples in the docs here:
https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Configureindexstoragesize#Configure_inde...

And the spec file is always a good read:
https://docs.splunk.com/Documentation/Splunk/7.1.2/Admin/Indexesconf#PER_INDEX_OPTIONS

You don't need a volume for each index. You're fine to create a PVC for each bucket state and then reference them in your indexes.conf.

As per the spec file, this also takes token replacement now too!

* It is recommended that you specify the path with the following syntax:
     homePath = $SPLUNK_DB/$_index_name/db
  At runtime, Splunk expands "$_index_name" to the name of the index. For example,
  if the index name is "newindex", homePath becomes "$SPLUNK_DB/newindex/db".
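To make the token replacement concrete, here is a short sketch that applies the same syntax to separately mounted hot/warm and cold paths. The mount points /opt/hotwarm and /opt/cold are example locations, and the stanza name is borrowed from the spec excerpt above.

```ini
# Hypothetical sketch: mount points are assumptions, adjust to your PVC mountPaths.
[newindex]
# $_index_name expands to "newindex" at runtime.
homePath   = /opt/hotwarm/$_index_name/db
coldPath   = /opt/cold/$_index_name/colddb
thawedPath = $SPLUNK_DB/$_index_name/thaweddb
```

Using the token keeps the three path lines identical across index stanzas, which cuts down on copy-paste mistakes in the paths.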

I think the only items I would re-consider would be:

I probably wouldn't put it in /opt/splunk/var/log/, but that's probably just my experience bias: that is where Splunk stores its own logging, and mixing buckets in there is something I would avoid. Not saying it won't work, but I'd go with something simple, like /opt/hotwarm and /opt/cold. Less chance of a bungle in the path to your buckets too...

Also, have you tried deploying a container with this config? I'm just wondering whether k8s will let you mount a volume inside a mounted volume, i.e. you are already mounting /opt/splunk/var/, so will it even let you mount another PVC inside that mount? I have done it with ConfigMaps and it works... I just haven't tried mounting a PVC inside a PVC...

- MattyMo

nickgleed
Explorer

Yeah, you can have a PVC for the root var and then have another PVC at var/newpvc, etc.


mattymo
Splunk Employee

Nice! Then yeah, I would just cook up the PVCs and make sure your indexes reference them accordingly in indexes.conf.

Let me know how it goes, or if you get stuck!

- MattyMo