Deployment Architecture

Use of FLASH SAN vs SSD

flyingbeefhead
New Member

I have a situation with how our Splunk storage is configured. We have multiple indexers with local SSD drives configured for the hot/warm mount point, and we use our FLASH SAN, attached over Fibre Channel, for the cold mount points. We need to expand the hot/warm mount point, and doing that with local SSD will require purchasing a fair amount of additional hardware. We could easily put the additional capacity on the SAN instead and move the hot/warm data there. What are the recommendations? Can we use the SAN instead of local SSD for hot/warm?
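
For context, the relevant part of indexes.conf for one of our indexes looks roughly like this (the index name and mount points are illustrative placeholders, not the exact paths):

[main]
# hot/warm buckets live on the local SSD mount
homePath   = /ssd/splunk/main/db
# cold buckets live on the FLASH SAN mount (Fibre Channel)
coldPath   = /san/splunk/main/colddb
thawedPath = /san/splunk/main/thaweddb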

Here are the bonnie++ and fio outputs for both the SAN and SSD.

nohup /usr/local/bin/bonnie++ -d /ssd/splunk -x 1 -u root:root -q -f > /ssd_disk_io_test.csv 2> /ssd_disk_io_test.err < /dev/null
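
(For reference: -x 1 runs a single pass, -q suppresses extra output, and -f is fast mode, which skips the per-character tests, so those results are omitted below.)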

 

FLASH SAN:

SERVER, file size 505G
Sequential Output - Block: 528475 K/sec (99% CPU), Rewrite: 252234 K/sec (75% CPU)
Sequential Input  - Block: 539415 K/sec (83% CPU)
Random Seeks      - 11333.2 /sec (69% CPU)
Sequential Create (16 files) - Create: 12646 /sec (99% CPU), Read: +++++, Delete: +++++
Random Create     (16 files) - Create: 14164 /sec (99% CPU), Read: +++++, Delete: +++++

(+++++ means the operation completed too quickly for bonnie++ to report a meaningful figure.)

 

SSD:

SERVER, file size 505G
Sequential Output - Block: 505646 K/sec (99% CPU), Rewrite: 218685 K/sec (62% CPU)
Sequential Input  - Block: 504329 K/sec (74% CPU)
Random Seeks      - +++++
Sequential Create (16 files) - Create: 12085 /sec (99% CPU), Read: +++++, Delete: +++++
Random Create     (16 files) - Create: 12162 /sec (99% CPU), Read: +++++, Delete: +++++
 

nohup /usr/local/bin/bonnie++ -d /ssd/splunk -s 516696 -u root:root -fb > /ssd_bonnie-seth.csv 2> /ssd_bonnie-seth.err < /dev/null
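
(This second run adds -s 516696 to force a 516696 MB test file and -b to disable write buffering, i.e. fsync() after every file operation in the create tests, which is presumably why the create/delete rates are so much lower in the tables below.)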

NON-SSD:

SERVER, file size 516696M
Sequential Output - Block: 509275 K/sec (97% CPU), Rewrite: 241009 K/sec (74% CPU)
Sequential Input  - Block: 538182 K/sec (81% CPU)
Random Seeks      - 7549.2 /sec (57% CPU)
Sequential Create (16 files) - Create: 616 /sec (17% CPU), Read: +++++, Delete: 780 /sec (6% CPU)
Random Create     (16 files) - Create: 556 /sec (12% CPU), Read: +++++, Delete: 884 /sec (7% CPU)

 

SSD:

SERVER, file size 516696M
Sequential Output - Block: 501271 K/sec (98% CPU), Rewrite: 218604 K/sec (62% CPU)
Sequential Input  - Block: 523329 K/sec (75% CPU)
Random Seeks      - 8428.1 /sec (60% CPU)
Sequential Create (16 files) - Create: 501 /sec (13% CPU), Read: +++++, Delete: 570 /sec (4% CPU)
Random Create     (16 files) - Create: 499 /sec (15% CPU), Read: +++++, Delete: 579 /sec (4% CPU)
 

Here is the fio output for the SSD:

4k_benchmark: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=128
...
fio-3.5
Starting 12 processes
4k_benchmark: Laying out IO file (1 file / 102400MiB)

4k_benchmark: (groupid=0, jobs=12): err= 0: pid=29545: Thu May 3 00:53:33 2018
read: IOPS=1307, BW=1308MiB/s (1371MB/s)(38.4GiB/30063msec)
slat (usec): min=178, max=445349, avg=9130.87, stdev=15882.69
clat (msec): min=17, max=2401, avg=1139.06, stdev=330.41
lat (msec): min=25, max=2420, avg=1148.20, stdev=331.67
clat percentiles (msec):
| 1.00th=[ 292], 5.00th=[ 592], 10.00th=[ 718], 20.00th=[ 869],
| 30.00th=[ 978], 40.00th=[ 1070], 50.00th=[ 1150], 60.00th=[ 1217],
| 70.00th=[ 1318], 80.00th=[ 1418], 90.00th=[ 1552], 95.00th=[ 1670],
| 99.00th=[ 1905], 99.50th=[ 2022], 99.90th=[ 2232], 99.95th=[ 2299],
| 99.99th=[ 2366]
bw ( KiB/s): min= 4096, max=329728, per=8.27%, avg=110678.16, stdev=45552.32, samples=700
iops : min= 4, max= 322, avg=107.93, stdev=44.42, samples=700
lat (msec) : 20=0.01%, 50=0.03%, 100=0.20%, 250=0.62%, 500=2.10%
lat (msec) : 750=8.87%, 1000=20.50%
cpu : usr=0.20%, sys=7.40%, ctx=53123, majf=0, minf=1233
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.5%, 32=1.0%, >=64=98.1%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=39314,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
READ: bw=1308MiB/s (1371MB/s), 1308MiB/s-1308MiB/s (1371MB/s-1371MB/s), io=38.4GiB (41.2GB), run=30063-30063msec

Disk stats (read/write):
dm-10: ios=78343/9, merge=0/0, ticks=4005205/515, in_queue=4059709, util=99.69%, aggrios=78628/7, aggrmerge=0/3, aggrticks=4012840/24755, aggrin_queue=4037190, aggrutil=99.66%
sdb: ios=78628/7, merge=0/3, ticks=4012840/24755, in_queue=4037190, util=99.66%

 

Here is the fio output for the FLASH SAN:

4k_benchmark: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=128
...
fio-3.5
Starting 12 processes
4k_benchmark: Laying out IO file (1 file / 102400MiB)

4k_benchmark: (groupid=0, jobs=12): err= 0: pid=54555: Wed Apr 18 10:54:55 2018
read: IOPS=2462, BW=2463MiB/s (2582MB/s)(72.2GiB/30007msec)
slat (usec): min=243, max=178818, avg=4850.89, stdev=9427.35
clat (usec): min=623, max=1639.7k, avg=604472.15, stdev=172064.11
lat (usec): min=1986, max=1751.5k, avg=609326.97, stdev=173108.88
clat percentiles (msec):
| 1.00th=[ 239], 5.00th=[ 393], 10.00th=[ 435], 20.00th=[ 481],
| 30.00th=[ 514], 40.00th=[ 550], 50.00th=[ 575], 60.00th=[ 609],
| 70.00th=[ 651], 80.00th=[ 709], 90.00th=[ 818], 95.00th=[ 969],
| 99.00th=[ 1150], 99.50th=[ 1200], 99.90th=[ 1318], 99.95th=[ 1334],
| 99.99th=[ 1401]
bw ( KiB/s): min= 4104, max=416193, per=8.42%, avg=212218.39, stdev=58840.35, samples=701
iops : min= 4, max= 406, avg=206.70, stdev=57.47, samples=701
lat (usec) : 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.04%, 50=0.11%
lat (msec) : 100=0.19%, 250=0.70%, 500=24.14%, 750=60.18%, 1000=10.62%
cpu : usr=0.27%, sys=15.52%, ctx=101062, majf=0, minf=1234
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.3%, 32=0.5%, >=64=99.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=73895,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
READ: bw=2463MiB/s (2582MB/s), 2463MiB/s-2463MiB/s (2582MB/s-2582MB/s), io=72.2GiB (77.5GB), run=30007-30007msec

Disk stats (read/write):
dm-11: ios=146961/5, merge=0/0, ticks=276683/6, in_queue=277077, util=99.19%, aggrios=73895/3, aggrmerge=0/0, aggrticks=138910/3, aggrin_queue=139108, aggrutil=99.17%
dm-2: ios=0/4, merge=0/0, ticks=0/2, in_queue=2, util=0.01%, aggrios=0/3, aggrmerge=0/1, aggrticks=0/2, aggrin_queue=2, aggrutil=0.01%
dm-0: ios=0/3, merge=0/1, ticks=0/2, in_queue=2, util=0.01%, aggrios=0/2, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
sdc: ios=0/2, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
sdf: ios=0/2, merge=0/0, ticks=0/1, in_queue=1, util=0.00%
dm-3: ios=147790/3, merge=0/0, ticks=277820/4, in_queue=278214, util=99.17%, aggrios=147790/3, aggrmerge=0/0, aggrticks=279775/4, aggrin_queue=279008, aggrutil=99.11%
dm-1: ios=147790/3, merge=0/0, ticks=279775/4, in_queue=279008, util=99.11%, aggrios=73895/1, aggrmerge=0/0, aggrticks=129019/0, aggrin_queue=128825, aggrutil=98.68%
sdd: ios=73894/2, merge=0/0, ticks=125568/0, in_queue=125371, util=98.66%
sdg: ios=73896/1, merge=0/0, ticks=132470/0, in_queue=132279, util=98.68%
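
For completeness, the fio output above came from a read job along these lines, reconstructed from the report header (12 jobs, 1 MiB blocks, libaio, iodepth 128, ~30 s runtime, 100 GiB file); the target directory, --direct=1, and the exact file layout are assumptions rather than the literal invocation, and the SAN run pointed at the SAN mount point instead:

fio --name=4k_benchmark --directory=/ssd/splunk --rw=read --bs=1024k \
    --ioengine=libaio --iodepth=128 --numjobs=12 --size=100g \
    --runtime=30 --group_reporting --direct=1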

 
