Splunk Search

How to achieve Optimized Search time in Splunk

Contributor

This has become a serious issue and I need some expert advice.
Scenario:
Splunk 5.0.2
Data Input : TCP
License: Splunk Enterprise 5,120 MB
Over TCP we receive events every 15 seconds for:
1. App usage (EventID=3)
2. CPU usage (EventID=4)
For app usage, the JSON events look like:

{"BoxID":"222333","EID":3,"TS":"Fri May 10 02:49:36 2013", "MU":969632}

For CPU usage, the JSON events look like:

{"BoxID":"111222","EID":4,"TS":"Fri May 10 02:16:00 2013","CPUusage":4.5}

We have been collecting data for the last 6 months, so by now we have a lot of it (one event every 15 seconds for 6+ months). On the UI we display charts, and for that my searches are as follows.
(Since MU and CPUusage are in bytes, I need to convert them to MB.)

sourcetype="myagent"
| spath path="EID" output=EventID | search EventID=3
| spath path="BoxID" output=UID
| spath path="MU" output=mu | eval mu=round(mu/1024, 2) | fields mu, UID
| timechart limit=0 first(mu) by UID

Now on the UI, even if I select just the last 24 hours, it takes forever!
I am not very experienced with Splunk; I am still learning as the requirements come up, and I believe Splunk is very powerful for processing data. So I want to ask:
1. What can I do to make the charts display faster on the UI?
2. Is there any way to cache the result of this search every few minutes, so that when a user selects "Last 24 hours" I could just fetch the result from the cache, with the calculated fields already present, and display the chart? That would be very fast.
Please suggest and show me the way. This is critical.

1 Solution

Champion

I must say this is a tough question to answer and a big topic. Keep in mind this is my understanding. I would also recommend reading Exploring Splunk (SPL).

First, I would start by adding KV_MODE = json to my props.conf so Splunk automatically knows it's JSON (personal preference). After that, I would look at my base search: to maximize search performance you want to be as specific as possible, to limit the number of results being returned. Always specify index, source, and/or sourcetype, and if possible keywords within your data. Low-cardinality fields always result in quicker searches. Filter unnecessary fields as soon as possible. Do stats or evals after unnecessary events and fields have been discarded. Use bucket (bin) or span where possible.
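For example, a minimal props.conf stanza for this might look like the following sketch (the stanza name assumes your sourcetype is myagent):

```
# props.conf (search-time setting, so place it on the search head)
[myagent]
# Tell Splunk the events are JSON so fields like EID, BoxID and MU
# are extracted automatically at search time, without needing spath
KV_MODE = json
```

With this in place you can filter on EID=3 directly instead of piping through spath.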


# base search, filtering on the raw JSON text before field extraction
index=someindex sourcetype=myagent "\"EID\":3" | fields MU, BoxID, _time | eval mu=round(MU/1024, 2) | rename BoxID as UID | timechart span=1h limit=0 first(mu) by UID


index=someindex sourcetype=myagent EID=3 | fields MU, BoxID, _time | eval mu=round(MU/1024, 2) | rename BoxID as UID | timechart span=1h limit=0 first(mu) by UID

Also consider using summary indexing and report acceleration. I highly recommend doing this.

About summary indexing

  • Make searches as specific as possible
  • Limit the time range where possible
  • Filter out unneeded fields
  • Do as much filtering as possible before using eval or doing calculations
  • Use advanced charting, not the timeline view
  • Turn off field discovery
  • Use a summary index for large data sets that span days or months
  • Refrain from running sparse or rare searches
  • Use the Search Job Inspector to find where your search is taking the longest

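As a sketch of the summary-indexing approach (the summary index name here is an assumption, not something from your environment): schedule a search like this to run every 10 minutes over the last 10 minutes, writing pre-aggregated results into a summary index:

```
# scheduled search (e.g. every 10 minutes over the last 10 minutes),
# writing pre-aggregated results into a summary index
index=someindex sourcetype=myagent EID=3
| bin _time span=10m
| stats first(MU) as MU by _time, BoxID
| rename BoxID as UID
| collect index=summary_myagent
```

The dashboard then searches the much smaller summary index, which stays fast even over 24 hours or more:

```
# dashboard search over the summary index
index=summary_myagent
| eval mu=round(MU/1024, 2)
| timechart span=1h limit=0 first(mu) by UID
```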
Hope this helps or gets you started. Don’t forget to vote and accept answers that help.

Cheers,



Champion

If you want to do calculations automagically, consider using EVAL- in your props.conf. With regards to caching, report acceleration and summary indexing are going to be the best answers; you could also try building a lookup table. @disha, sorry if that doesn't help.
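A sketch of such a calculated field (assuming the sourcetype is myagent and that MU is in bytes, as stated above; mu_mb is a hypothetical field name):

```
# props.conf — calculated field, evaluated automatically at search time
[myagent]
# Expose MU converted to MB as a new field mu_mb.
# Adjust the divisor if MU turns out to be KB rather than bytes.
EVAL-mu_mb = round(MU / 1024 / 1024, 2)
```

After this, every search against the sourcetype sees mu_mb without any explicit eval in the search string.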

Contributor

Thank you, this will help; I am working on it. Can you please tell me how I can calculate mu in MB in advance, and where I can store it? I am asking specifically about point #2 of my question.
