Disk partitioning information

allamiro · 06-07-2012

Does anyone know the best practices for partitioning a Linux OS (i.e. Red Hat / CentOS) for a Splunk application server?

Let's say I have 250 GB. Can someone advise on this?

dwaddle · 06-07-2012

This is a highly subjective question. Much of how you answer it depends on the scale of the installation. However, there are some good general principles.

Use Logical Volume Manager
Follow a standard "server" layout with different filesystems for /, /boot, /usr, /var, /tmp, and /opt
Put Splunk into its own filesystem on /opt/splunk
Put Splunk's index buckets into their own filesystem - at a minimum, /opt/splunk/var/lib/splunk should be in its own filesystem
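
To check an existing server against the layout above, here is a minimal sketch. It assumes the default paths named in the list; `SPLUNK_PATHS` and `mount_report` are hypothetical names chosen for illustration. It relies on `os.path.ismount`, which may not detect a bind mount that sits on the same filesystem as its parent.

```python
import os

# Paths taken from the layout above; adjust if Splunk is installed elsewhere.
SPLUNK_PATHS = ["/opt", "/opt/splunk", "/opt/splunk/var/lib/splunk"]


def mount_report(paths):
    """Map each path to True (mount point), False (not one), or None (missing)."""
    report = {}
    for path in paths:
        if not os.path.exists(path):
            report[path] = None
        else:
            report[path] = os.path.ismount(path)
    return report


if __name__ == "__main__":
    labels = {
        True: "own filesystem",
        False: "shares its parent's filesystem",
        None: "path does not exist",
    }
    for path, status in mount_report(SPLUNK_PATHS).items():
        print(f"{path}: {labels[status]}")
```

A path reported as sharing its parent's filesystem is a candidate for its own logical volume.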

It's also important to consider the number of drives (spindles) you have, and what types they are. For optimum performance, Splunk needs RAID-10 and 800+ IOPS for hot buckets. This could mean an 8-drive (4+4P) RAID-10 array. Obviously, your base OS does not need this level of performance, but it could piggyback off of it. Cold buckets can live on RAID-5 on separate spindles and still give adequate performance (if you have the drive slots to do it), which would necessitate a different filesystem for hot/warm vs. cold.
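
To see roughly how an 8-drive array relates to the 800+ IOPS figure, here is a back-of-the-envelope sketch. It uses the common rule of thumb that RAID-10 has a write penalty of 2 and RAID-5 a write penalty of 4. The per-drive figure of 200 IOPS is an assumption (about what a fast spinning disk delivers), not a number from this thread, and the estimate ignores controller cache and workload mix.

```python
def raid_iops(drives, per_drive_iops, write_penalty):
    """Return (read_iops, write_iops) using the usual write-penalty rule of thumb."""
    raw = drives * per_drive_iops
    return raw, raw / write_penalty


# Assumption: about 200 IOPS per spindle; substitute your drives' real figure.
PER_DRIVE_IOPS = 200

if __name__ == "__main__":
    for name, penalty in (("RAID-10", 2), ("RAID-5", 4)):
        reads, writes = raid_iops(8, PER_DRIVE_IOPS, penalty)
        print(f"{name}: ~{reads} read IOPS, ~{writes:.0f} write IOPS")
```

Under these assumptions, eight drives in RAID-10 land at about 800 write IOPS, while the same drives in RAID-5 manage about half that for writes.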

NOTE: The above advice about a RAID-5 volume just for cold buckets is somewhat out of date with the introduction of Splunk 5.0 and indexer clustering. The documentation says:

On a non-clustered indexer, by specifying separate partitions for hot/warm buckets and cold buckets, you can designate different types of storage for each. This is useful because cold buckets are typically accessed less frequently than hot/warm buckets and therefore can be located on slower disk arrays. Also, Splunk doesn't usually need to perform index processing on cold buckets. See "Use multiple partitions for index data" for details on this.

On a cluster, however, this approach is not recommended. The storage used for the coldPath location should have the same performance characteristics as that used for homePath storage. This is because all replicated copies of buckets reside in the peers' coldPath directories. It doesn't matter whether they're hot, warm, or cold. If you use slower storage for the coldPath location, it will slow the overall performance of your cluster.

Clusters require strongly performing storage for the coldPath location in order to handle the needs of cluster operations. For example, some of the buckets in the coldPath location will be replicated hot bucket copies still being written to. Other buckets will be replicated warm copies, and the search head might be accessing them frequently. In addition, depending on how the cluster is configured and what occurs subsequently (in terms of peers going offline, etc.), the peer might need to convert bucket copies from non-searchable to searchable, entailing a considerable amount of processing on the coldPath data.

allamiro · 06-07-2012

Disk partitioning information

part / --size=60000
part /boot --size=1000
part /home --size=60000
part /var --size=15000
part /tmp --size=20000
part /usr --size=20000
part /opt --size=50000
part /isos --size=7000

Would that be OK?
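
As a quick sanity check on the sizes above, here is a minimal sketch that totals them against the 250 GB disk. It assumes the sizes are in MB and treats 250 GB as 250000 MB. It ignores swap, LVM metadata, and filesystem overhead. `PROPOSED_MB` and `summarize` are hypothetical names.

```python
# Sizes copied from the proposed layout above, assumed to be in MB.
PROPOSED_MB = {
    "/": 60000,
    "/boot": 1000,
    "/home": 60000,
    "/var": 15000,
    "/tmp": 20000,
    "/usr": 20000,
    "/opt": 50000,
    "/isos": 7000,
}

# Assumption: a 250 GB disk treated as 250000 MB.
DISK_MB = 250 * 1000


def summarize(layout, disk_mb):
    """Return (allocated_mb, unallocated_mb) for a mount-point-to-size mapping."""
    allocated = sum(layout.values())
    return allocated, disk_mb - allocated


if __name__ == "__main__":
    allocated, free = summarize(PROPOSED_MB, DISK_MB)
    print(f"allocated: {allocated} MB, unallocated: {free} MB")
    for mount, size in sorted(PROPOSED_MB.items(), key=lambda kv: -kv[1]):
        print(f"{mount:8s} {size:6d} MB  {100 * size / disk_total:5.1f}%"
              if (disk_total := DISK_MB) else "")
```

The layout fits on the disk, but it has no separate filesystem for /opt/splunk or for the index buckets, so Splunk and its indexes would share the 50000 MB given to /opt.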

Inayath_khan · 01-06-2020

Hi @allamiro, are these standard values? Can you please give me a reference for where you took these values from?

Thanks in advance
