Deployment Architecture

Use of FLASH SAN vs SSD

flyingbeefhead
New Member

I have a situation with how our Splunk storage is configured. We have multiple indexers with local SSD drives configured for the hot/warm mount point, and we use our FLASH SAN, attached over Fibre Channel, for the cold mount points. We need to expand the hot/warm mount point, and doing that with local SSD will require purchasing a fair amount of additional hardware. We could easily put the additional capacity on the SAN instead and move the hot/warm data there. What are the recommendations? Can we use the SAN instead of local SSD for hot/warm?
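
For context, the relevant part of indexes.conf for one of our indexes looks roughly like this (the index name and mount points are illustrative placeholders, not the exact paths):

[main]
# hot/warm buckets live on the local SSD mount
homePath   = /ssd/splunk/main/db
# cold buckets live on the FLASH SAN mount (Fibre Channel)
coldPath   = /san/splunk/main/colddb
thawedPath = /san/splunk/main/thaweddb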

Here are the bonnie++ and fio outputs for both the SAN and SSD.

nohup /usr/local/bin/bonnie++ -d /ssd/splunk -x 1 -u root:root -q -f > /ssd_disk_io_test.csv 2> /ssd_disk_io_test.err < /dev/null
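
(For reference: -x 1 runs a single pass, -q suppresses extra output, and -f is fast mode, which skips the per-character tests, so those results are omitted below.)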

 

FLASH SAN:

SERVER, file size 505G
Sequential Output - Block: 528475 K/sec (99% CPU), Rewrite: 252234 K/sec (75% CPU)
Sequential Input  - Block: 539415 K/sec (83% CPU)
Random Seeks      - 11333.2 /sec (69% CPU)
Sequential Create (16 files) - Create: 12646 /sec (99% CPU), Read: +++++, Delete: +++++
Random Create     (16 files) - Create: 14164 /sec (99% CPU), Read: +++++, Delete: +++++

(+++++ means the operation completed too quickly for bonnie++ to report a meaningful figure.)

 

SSD:

SERVER, file size 505G
Sequential Output - Block: 505646 K/sec (99% CPU), Rewrite: 218685 K/sec (62% CPU)
Sequential Input  - Block: 504329 K/sec (74% CPU)
Random Seeks      - +++++
Sequential Create (16 files) - Create: 12085 /sec (99% CPU), Read: +++++, Delete: +++++
Random Create     (16 files) - Create: 12162 /sec (99% CPU), Read: +++++, Delete: +++++
 

nohup /usr/local/bin/bonnie++ -d /ssd/splunk -s 516696 -u root:root -fb > /ssd_bonnie-seth.csv 2> /ssd_bonnie-seth.err < /dev/null
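
(This second run adds -s 516696 to force a 516696 MB test file and -b to disable write buffering, i.e. fsync() after every file operation in the create tests, which is presumably why the create/delete rates are so much lower in the tables below.)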

NON-SSD:

SERVER, file size 516696M
Sequential Output - Block: 509275 K/sec (97% CPU), Rewrite: 241009 K/sec (74% CPU)
Sequential Input  - Block: 538182 K/sec (81% CPU)
Random Seeks      - 7549.2 /sec (57% CPU)
Sequential Create (16 files) - Create: 616 /sec (17% CPU), Read: +++++, Delete: 780 /sec (6% CPU)
Random Create     (16 files) - Create: 556 /sec (12% CPU), Read: +++++, Delete: 884 /sec (7% CPU)

 

SSD:

SERVER, file size 516696M
Sequential Output - Block: 501271 K/sec (98% CPU), Rewrite: 218604 K/sec (62% CPU)
Sequential Input  - Block: 523329 K/sec (75% CPU)
Random Seeks      - 8428.1 /sec (60% CPU)
Sequential Create (16 files) - Create: 501 /sec (13% CPU), Read: +++++, Delete: 570 /sec (4% CPU)
Random Create     (16 files) - Create: 499 /sec (15% CPU), Read: +++++, Delete: 579 /sec (4% CPU)
 

Here is the fio output for the SSD:

4k_benchmark: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=128
...
fio-3.5
Starting 12 processes
4k_benchmark: Laying out IO file (1 file / 102400MiB)

4k_benchmark: (groupid=0, jobs=12): err= 0: pid=29545: Thu May 3 00:53:33 2018
read: IOPS=1307, BW=1308MiB/s (1371MB/s)(38.4GiB/30063msec)
slat (usec): min=178, max=445349, avg=9130.87, stdev=15882.69
clat (msec): min=17, max=2401, avg=1139.06, stdev=330.41
lat (msec): min=25, max=2420, avg=1148.20, stdev=331.67
clat percentiles (msec):
| 1.00th=[ 292], 5.00th=[ 592], 10.00th=[ 718], 20.00th=[ 869],
| 30.00th=[ 978], 40.00th=[ 1070], 50.00th=[ 1150], 60.00th=[ 1217],
| 70.00th=[ 1318], 80.00th=[ 1418], 90.00th=[ 1552], 95.00th=[ 1670],
| 99.00th=[ 1905], 99.50th=[ 2022], 99.90th=[ 2232], 99.95th=[ 2299],
| 99.99th=[ 2366]
bw ( KiB/s): min= 4096, max=329728, per=8.27%, avg=110678.16, stdev=45552.32, samples=700
iops : min= 4, max= 322, avg=107.93, stdev=44.42, samples=700
lat (msec) : 20=0.01%, 50=0.03%, 100=0.20%, 250=0.62%, 500=2.10%
lat (msec) : 750=8.87%, 1000=20.50%
cpu : usr=0.20%, sys=7.40%, ctx=53123, majf=0, minf=1233
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.5%, 32=1.0%, >=64=98.1%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=39314,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
READ: bw=1308MiB/s (1371MB/s), 1308MiB/s-1308MiB/s (1371MB/s-1371MB/s), io=38.4GiB (41.2GB), run=30063-30063msec

Disk stats (read/write):
dm-10: ios=78343/9, merge=0/0, ticks=4005205/515, in_queue=4059709, util=99.69%, aggrios=78628/7, aggrmerge=0/3, aggrticks=4012840/24755, aggrin_queue=4037190, aggrutil=99.66%
sdb: ios=78628/7, merge=0/3, ticks=4012840/24755, in_queue=4037190, util=99.66%

 

Here is the fio output for the FLASH SAN:

4k_benchmark: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=128
...
fio-3.5
Starting 12 processes
4k_benchmark: Laying out IO file (1 file / 102400MiB)

4k_benchmark: (groupid=0, jobs=12): err= 0: pid=54555: Wed Apr 18 10:54:55 2018
read: IOPS=2462, BW=2463MiB/s (2582MB/s)(72.2GiB/30007msec)
slat (usec): min=243, max=178818, avg=4850.89, stdev=9427.35
clat (usec): min=623, max=1639.7k, avg=604472.15, stdev=172064.11
lat (usec): min=1986, max=1751.5k, avg=609326.97, stdev=173108.88
clat percentiles (msec):
| 1.00th=[ 239], 5.00th=[ 393], 10.00th=[ 435], 20.00th=[ 481],
| 30.00th=[ 514], 40.00th=[ 550], 50.00th=[ 575], 60.00th=[ 609],
| 70.00th=[ 651], 80.00th=[ 709], 90.00th=[ 818], 95.00th=[ 969],
| 99.00th=[ 1150], 99.50th=[ 1200], 99.90th=[ 1318], 99.95th=[ 1334],
| 99.99th=[ 1401]
bw ( KiB/s): min= 4104, max=416193, per=8.42%, avg=212218.39, stdev=58840.35, samples=701
iops : min= 4, max= 406, avg=206.70, stdev=57.47, samples=701
lat (usec) : 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.04%, 50=0.11%
lat (msec) : 100=0.19%, 250=0.70%, 500=24.14%, 750=60.18%, 1000=10.62%
cpu : usr=0.27%, sys=15.52%, ctx=101062, majf=0, minf=1234
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.3%, 32=0.5%, >=64=99.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=73895,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
READ: bw=2463MiB/s (2582MB/s), 2463MiB/s-2463MiB/s (2582MB/s-2582MB/s), io=72.2GiB (77.5GB), run=30007-30007msec

Disk stats (read/write):
dm-11: ios=146961/5, merge=0/0, ticks=276683/6, in_queue=277077, util=99.19%, aggrios=73895/3, aggrmerge=0/0, aggrticks=138910/3, aggrin_queue=139108, aggrutil=99.17%
dm-2: ios=0/4, merge=0/0, ticks=0/2, in_queue=2, util=0.01%, aggrios=0/3, aggrmerge=0/1, aggrticks=0/2, aggrin_queue=2, aggrutil=0.01%
dm-0: ios=0/3, merge=0/1, ticks=0/2, in_queue=2, util=0.01%, aggrios=0/2, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
sdc: ios=0/2, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
sdf: ios=0/2, merge=0/0, ticks=0/1, in_queue=1, util=0.00%
dm-3: ios=147790/3, merge=0/0, ticks=277820/4, in_queue=278214, util=99.17%, aggrios=147790/3, aggrmerge=0/0, aggrticks=279775/4, aggrin_queue=279008, aggrutil=99.11%
dm-1: ios=147790/3, merge=0/0, ticks=279775/4, in_queue=279008, util=99.11%, aggrios=73895/1, aggrmerge=0/0, aggrticks=129019/0, aggrin_queue=128825, aggrutil=98.68%
sdd: ios=73894/2, merge=0/0, ticks=125568/0, in_queue=125371, util=98.66%
sdg: ios=73896/1, merge=0/0, ticks=132470/0, in_queue=132279, util=98.68%
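
For completeness, the fio output above came from a read job along these lines, reconstructed from the report header (12 jobs, 1 MiB blocks, libaio, iodepth 128, ~30 s runtime, 100 GiB file); the target directory, --direct=1, and the exact file layout are assumptions rather than the literal invocation, and the SAN run pointed at the SAN mount point instead:

fio --name=4k_benchmark --directory=/ssd/splunk --rw=read --bs=1024k \
    --ioengine=libaio --iodepth=128 --numjobs=12 --size=100g \
    --runtime=30 --group_reporting --direct=1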

 
