Retrieving Summary Index Data

Explorer

Hi

I am trying to retrieve data from a summary index, and it takes 300 seconds to retrieve 140000 events from 4 search peers:
index=summaryindex earliest=-7d@d latest=now (240000 events take more than 300 seconds)

From the same search head, it takes less than 15 seconds to retrieve the same number of raw events from a normal index (4 search peers):
index=cataloglogs earliest=-1hr@hr latest=now (around 240000 events take less than 15 seconds)

Why does it take 300 seconds just to retrieve data from the summary index? Everything resides on the same disk. Is there anything I can tune to improve performance when retrieving summarized data?

Please advise.

Thanks
Praveen

Explorer

Limit the fields to just the ones you need, since a summary index by default carries more summary and time fields than a normal index.

Explorer

Thanks for your reply, Chris. Below is the search that runs every 10 minutes and populates the summary index:

index=catalogs earliest=-12min@min latest=-2min@min
| bucket _time span=5min
| eval status200=if(httpcode=="200", 1,0)
| eval status4xx=if((httpcode>"399" AND httpcode<"500"),1,0)
| eval status5xx=if((httpcode>499 AND httpcode<600),1,0)
| eval rsppoint5=if(rsptime<500,1,0)
| eval rsppoint5to1=if((rsptime>500 AND rsptime<1001),1,0)
| eval rsp1to2=if((rsptime>1000 AND rsptime<2001),1,0)
| eval rsp2to5=if((rsptime>2000 AND rsptime<5001),1,0)
| eval rspg60=if((rsptime>60000),1,0)
| eval rsp5to10=if((rsptime>5000 AND rsptime<10001),1,0)
| eval rsp10to60=if((rsptime>10000 AND rsptime<60001),1,0)
| sistats sum(status200) as twox, sum(status4xx) as fourx, sum(status5xx) as fivex,
    count as "Requests per minute", avg(rsptime) as "Average Response time",
    dc(clinetips) as uniquestbs, sum(rsppoint5) as rspp5, sum(rsppoint5to1) as rspp5to1,
    sum(rsp1to2) as rspp1to2, sum(rsp2to5) as rspp2to5, sum(rsp5to10) as rspp5to10,
    sum(rsp10to60) as rspp10to60, sum(rspg60) as rsppg60
    by _time, cataloghosts

Below is the search that retrieves data from the summary index:

index=summ_catalog_dropdowns earliest=-7day@day latest=-60min@min
| bucket _time span=1day
| stats sum(status200) as twox, sum(status4xx) as fourx, sum(status5xx) as fivex,
    count as "Requests per minute", avg(rsptime) as "Average Response time",
    dc(catalog_clinetips) as uniquestbs, sum(rsppoint5) as rspp5, sum(rsppoint5to1) as rspp5to1,
    sum(rsp1to2) as rspp1to2, sum(rsp2to5) as rspp2to5, sum(rsp5to10) as rspp5to10,
    sum(rsp10to60) as rspp10to60, sum(rspg60) as rsppg60
    by _time, cataloghosts

command.stats.execute_input is taking a long time. Is there any way I can reduce the time taken by the stats command? Please advise.

Duration (seconds)  Component                                                Invocations  Input count  Output count
262.381             command.stats.execute_input                              259
66.298              dispatch.stream.remote                                   249          -            2,080,370,874
17.397              dispatch.stream.remote.che-splunk-index03                65           -            545,221,560
16.971              dispatch.stream.remote.che-splunk-index01.echodata.tv    63           -            542,609,628
16.466              dispatch.stream.remote.che-splunk-index04                61           -            518,057,416
15.459              dispatch.stream.remote.che-splunk-index02.echodata.tv    56           -            474,474,006
4.227               dispatch.writeStatus                                     213          -            -
0.204               startup.handoff                                          1            -            -
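
To put those Job Inspector numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch under a few assumptions: the total runtime is taken as the approximate 300 seconds quoted in the question, the event count as the approximate 240000 figure (the question also mentions 140000), and the output count of dispatch.stream.remote is read as bytes sent by the search peers.

```python
# Figures copied from the Job Inspector output in this thread.
total_seconds = 300.0          # assumption: approximate runtime quoted in the question
stats_seconds = 262.381        # command.stats.execute_input
remote_bytes = 2_080_370_874   # dispatch.stream.remote output count, read as bytes
event_count = 240_000          # assumption: approximate event count from the question

# Share of the total runtime spent feeding events into stats.
stats_share = 100.0 * stats_seconds / total_seconds

# Average amount of data streamed back from the peers per summary event.
bytes_per_event = remote_bytes / event_count

print(f"stats share of runtime: {stats_share:.1f}%")
print(f"average bytes streamed per event: {bytes_per_event:,.0f}")
```

Under these assumptions, stats accounts for roughly 87% of the runtime and each summary event averages more than 8 KB on the wire, which would be consistent with the summary events carrying many fields each, as the earlier reply suggests.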

Motivator

300s is a long time. How are you generating the data that goes to the summary index? Are you forwarding the summary data to the 4 search peers/indexers? (You have a search head and 4 indexers, right, or do you mean 4 systems that send data to Splunk?) What does the Job Inspector look like, and where is most of the search time spent?
