I have a question about how our Splunk storage is configured. We have multiple indexers, each with local SSD drives configured for the hot/warm mount point, and we use our FLASH SAN, attached over FibreChannel, for the cold mount points. We need to expand the hot/warm mount point, and doing that on local SSD would require purchasing a fair amount of hardware. We could easily provision the space on the SAN instead and move hot/warm there. What are the recommendations? Can we use the SAN instead of local SSD for hot/warm?
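For context, the hot/warm vs. cold split is just the homePath/coldPath settings in indexes.conf; our layout looks roughly like this (the paths are illustrative placeholders, not our actual mount points):

# indexes.conf -- illustrative sketch; paths are placeholders
[main]
# hot/warm buckets on the local SSD mount
homePath = /ssd/splunk/main/db
# cold buckets on the FibreChannel-attached FLASH SAN
coldPath = /san/splunk/main/colddb
thawedPath = /san/splunk/main/thaweddb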
Here are the bonnie++ and fio outputs for both the SAN and the SSD. In the bonnie++ tables below, the file-test columns are operations/sec with %CPU in parentheses, +++++ means the test completed too quickly for bonnie++ to report a meaningful figure, and the per-character columns are absent because the runs used -f, which skips those tests.
nohup /usr/local/bin/bonnie++ -d /ssd/splunk -x 1 -u root:root -q -f > /ssd_disk_io_test.csv 2> /ssd_disk_io_test.err < /dev/null
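(The command above is the SSD run; the FLASH SAN numbers came from an equivalent run pointed at the SAN-backed mount, along these lines. The /san/splunk path and output filenames are placeholders:

nohup /usr/local/bin/bonnie++ -d /san/splunk -x 1 -u root:root -q -f > /san_disk_io_test.csv 2> /san_disk_io_test.err < /dev/null)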
Device | Size | Block Output K/s (%CPU) | Rewrite K/s (%CPU) | Block Input K/s (%CPU) | Random Seeks /s (%CPU) | Files | Seq Create | Seq Read | Seq Delete | Rand Create | Rand Read | Rand Delete
FLASH SAN | 505G | 528475 (99) | 252234 (75) | 539415 (83) | 11333.2 (69) | 16 | 12646 (99) | +++++ | +++++ | 14164 (99) | +++++ | +++++
SSD | 505G | 505646 (99) | 218685 (62) | 504329 (74) | +++++ | 16 | 12085 (99) | +++++ | +++++ | 12162 (99) | +++++ | +++++
nohup /usr/local/bin/bonnie++ -d /ssd/splunk -s 516696 -u root:root -fb > /ssd_bonnie-seth.csv 2> /ssd_bonnie-seth.err < /dev/null
(This second run adds -b, which disables write buffering and fsync()s after every operation; that is why the create/delete rates below are so much lower than in the first run.)
Device | Size | Block Output K/s (%CPU) | Rewrite K/s (%CPU) | Block Input K/s (%CPU) | Random Seeks /s (%CPU) | Files | Seq Create | Seq Read | Seq Delete | Rand Create | Rand Read | Rand Delete
FLASH SAN (non-SSD) | 516696M | 509275 (97) | 241009 (74) | 538182 (81) | 7549.2 (57) | 16 | 616 (17) | +++++ | 780 (6) | 556 (12) | +++++ | 884 (7)
SSD | 516696M | 501271 (98) | 218604 (62) | 523329 (75) | 8428.1 (60) | 16 | 501 (13) | +++++ | 570 (4) | 499 (15) | +++++ | 579 (4)
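The fio outputs below are consistent with a job file along these lines, reconstructed from the parameters fio echoes back (12 jobs, libaio, iodepth 128, sequential 1024 KiB reads over a single ~100 GiB file, capped at 30 seconds). The direct=1 setting and the shared filename are assumptions, and the job name 4k_benchmark is kept from the output even though the block size is 1 MiB:

[global]
# parameters echoed in the output header: libaio, iodepth 128,
# sequential reads at a 1024 KiB block size, 12 jobs in one group
ioengine=libaio
iodepth=128
rw=read
bs=1024k
numjobs=12
group_reporting
# both runs stopped at ~30 s (run=30063msec / run=30007msec)
runtime=30
# matches "Laying out IO file (1 file / 102400MiB)"
size=100g
# assumption: O_DIRECT, usual for storage benchmarks
direct=1

[4k_benchmark]
# assumption: one shared test file; point this at the mount being tested
filename=/ssd/splunk/fio.testfile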
Here is the fio output for the SSD:
4k_benchmark: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=128
...
fio-3.5
Starting 12 processes
4k_benchmark: Laying out IO file (1 file / 102400MiB)
4k_benchmark: (groupid=0, jobs=12): err= 0: pid=29545: Thu May 3 00:53:33 2018
read: IOPS=1307, BW=1308MiB/s (1371MB/s)(38.4GiB/30063msec)
slat (usec): min=178, max=445349, avg=9130.87, stdev=15882.69
clat (msec): min=17, max=2401, avg=1139.06, stdev=330.41
lat (msec): min=25, max=2420, avg=1148.20, stdev=331.67
clat percentiles (msec):
| 1.00th=[ 292], 5.00th=[ 592], 10.00th=[ 718], 20.00th=[ 869],
| 30.00th=[ 978], 40.00th=[ 1070], 50.00th=[ 1150], 60.00th=[ 1217],
| 70.00th=[ 1318], 80.00th=[ 1418], 90.00th=[ 1552], 95.00th=[ 1670],
| 99.00th=[ 1905], 99.50th=[ 2022], 99.90th=[ 2232], 99.95th=[ 2299],
| 99.99th=[ 2366]
bw ( KiB/s): min= 4096, max=329728, per=8.27%, avg=110678.16, stdev=45552.32, samples=700
iops : min= 4, max= 322, avg=107.93, stdev=44.42, samples=700
lat (msec) : 20=0.01%, 50=0.03%, 100=0.20%, 250=0.62%, 500=2.10%
lat (msec) : 750=8.87%, 1000=20.50%
cpu : usr=0.20%, sys=7.40%, ctx=53123, majf=0, minf=1233
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.5%, 32=1.0%, >=64=98.1%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=39314,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128
Run status group 0 (all jobs):
READ: bw=1308MiB/s (1371MB/s), 1308MiB/s-1308MiB/s (1371MB/s-1371MB/s), io=38.4GiB (41.2GB), run=30063-30063msec
Disk stats (read/write):
dm-10: ios=78343/9, merge=0/0, ticks=4005205/515, in_queue=4059709, util=99.69%, aggrios=78628/7, aggrmerge=0/3, aggrticks=4012840/24755, aggrin_queue=4037190, aggrutil=99.66%
sdb: ios=78628/7, merge=0/3, ticks=4012840/24755, in_queue=4037190, util=99.66%
Here is the fio output for the FLASH SAN:
4k_benchmark: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=128
...
fio-3.5
Starting 12 processes
4k_benchmark: Laying out IO file (1 file / 102400MiB)
4k_benchmark: (groupid=0, jobs=12): err= 0: pid=54555: Wed Apr 18 10:54:55 2018
read: IOPS=2462, BW=2463MiB/s (2582MB/s)(72.2GiB/30007msec)
slat (usec): min=243, max=178818, avg=4850.89, stdev=9427.35
clat (usec): min=623, max=1639.7k, avg=604472.15, stdev=172064.11
lat (usec): min=1986, max=1751.5k, avg=609326.97, stdev=173108.88
clat percentiles (msec):
| 1.00th=[ 239], 5.00th=[ 393], 10.00th=[ 435], 20.00th=[ 481],
| 30.00th=[ 514], 40.00th=[ 550], 50.00th=[ 575], 60.00th=[ 609],
| 70.00th=[ 651], 80.00th=[ 709], 90.00th=[ 818], 95.00th=[ 969],
| 99.00th=[ 1150], 99.50th=[ 1200], 99.90th=[ 1318], 99.95th=[ 1334],
| 99.99th=[ 1401]
bw ( KiB/s): min= 4104, max=416193, per=8.42%, avg=212218.39, stdev=58840.35, samples=701
iops : min= 4, max= 406, avg=206.70, stdev=57.47, samples=701
lat (usec) : 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.04%, 50=0.11%
lat (msec) : 100=0.19%, 250=0.70%, 500=24.14%, 750=60.18%, 1000=10.62%
cpu : usr=0.27%, sys=15.52%, ctx=101062, majf=0, minf=1234
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.3%, 32=0.5%, >=64=99.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=73895,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128
Run status group 0 (all jobs):
READ: bw=2463MiB/s (2582MB/s), 2463MiB/s-2463MiB/s (2582MB/s-2582MB/s), io=72.2GiB (77.5GB), run=30007-30007msec
Disk stats (read/write):
dm-11: ios=146961/5, merge=0/0, ticks=276683/6, in_queue=277077, util=99.19%, aggrios=73895/3, aggrmerge=0/0, aggrticks=138910/3, aggrin_queue=139108, aggrutil=99.17%
dm-2: ios=0/4, merge=0/0, ticks=0/2, in_queue=2, util=0.01%, aggrios=0/3, aggrmerge=0/1, aggrticks=0/2, aggrin_queue=2, aggrutil=0.01%
dm-0: ios=0/3, merge=0/1, ticks=0/2, in_queue=2, util=0.01%, aggrios=0/2, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
sdc: ios=0/2, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
sdf: ios=0/2, merge=0/0, ticks=0/1, in_queue=1, util=0.00%
dm-3: ios=147790/3, merge=0/0, ticks=277820/4, in_queue=278214, util=99.17%, aggrios=147790/3, aggrmerge=0/0, aggrticks=279775/4, aggrin_queue=279008, aggrutil=99.11%
dm-1: ios=147790/3, merge=0/0, ticks=279775/4, in_queue=279008, util=99.11%, aggrios=73895/1, aggrmerge=0/0, aggrticks=129019/0, aggrin_queue=128825, aggrutil=98.68%
sdd: ios=73894/2, merge=0/0, ticks=125568/0, in_queue=125371, util=98.66%
sdg: ios=73896/1, merge=0/0, ticks=132470/0, in_queue=132279, util=98.68%