All Posts

1. 500GB/day is not that big.
2. There are some general rules of thumb (which @livehybrid already covered), but the search - to be effective - must be well built from scratch. Sometimes it simply can't be "fixed" if you have bad data (not "wrong", just inefficiently formed).
3. And there is no replacement for experience, unfortunately. Learn the SPL commands, understand how they work, and sometimes rethink your problem to fit better into SPL processing.
4. Use and love the Job Inspector.
Right. If you don't have the values, you can use

| fillnull '<1' '1<2' '2<5' '5<48' '>48'
| addtotals col=f row=t

Alternatively, you can fiddle with @ITWhisperer's approach - use timechart to get the data already filled with zeros and then retransform its results. You can change the order of the fields by adding a fields command at the end.
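A minimal sketch of how those two commands could attach to the tail end of the stats/xyseries pipeline described further down the thread (untested; the quoting of the bucket field names in fillnull may need adjusting, and fieldname=total just names the per-row total column):

... | xyseries timesource timeperiod count
| eval _time=mvindex(split(timesource,"|"),0), msgsource=mvindex(split(timesource,"|"),1)
| fields - timesource
| fillnull value=0 "<1" "1<2" "2<5" "5<48" ">48"
| addtotals col=f row=t fieldname=total "<1" "1<2" "2<5" "5<48" ">48"
| table _time msgsource total "<1" "1<2" "2<5" "5<48" ">48"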
Sorry, I only know the very basics of Splunk. I don't think I was able to formulate the query you suggested, as it returns no output. Here is the query I ran:

my query
| rex field=_raw "Time=(?<NewTime>\d{4}\.\d+)"
| eval TimeMilliseconds=(NewTime*1000)
| eval timeperiod=case(TimeMilliseconds<1,"<1s",TimeMilliseconds>=1 AND TimeMilliseconds<2,"1-2s",TimeMilliseconds>=2 AND TimeMilliseconds<5,"2-5s",1=1,">5s")
| untable _time msgsource count
| eval group=mvindex(split(msgsource,": "),0)
| eval msgsource=mvindex(split(msgsource,": "),1)
| eval _time=_time.":".msgsource
| xyseries _time group count
| eval msgsource=mvindex(split(_time,":"),1)
| eval _time=mvindex(split(_time,":"),0)
| table _time msgsource total *
Sorry, I didn't mention before that I only know very basic Splunk, so I couldn't follow everything you said. Here is the query that I tried and the result:

| rex field=_raw "Time=(?<NewTime>\d{4}\.\d+)"
| eval TimeMilliseconds=(NewTime*1000)
| eval timeperiod=case(TimeMilliseconds<1,"<1",TimeMilliseconds<2,"1<2",TimeMilliseconds<5,"<5",TimeMilliseconds<48,"5<48",1=1,">48")
| bin _time span=1d
| stats count by _time timeperiod msgsource
| eval timesource=_time . "|" . msgsource
| xyseries timesource timeperiod count
| eval _time=mvindex(split(timesource,"|"),0),msgsource=mvindex(split(timesource,"|"),1)
| fields - timesource

The output looks like this, which is not what is expected:

5<48  >48   _time       msgsource
1     439   2025-07-20  createAPI
1     1943  2025-07-20  RetrieveAPI

I was expecting an output something like this:

_time  msgsource  total  <1s  1-2s  2-5s  >5s
Hi @zaks191

Do all your servers meet the minimum recommendations (16GB RAM / 16 CPU cores)? If so, then your indexer configuration should suffice for a 500GB/day ingestion.

It sounds like this is the sort of task that would be better with the support of a Splunk Partner or Splunk Professional Services, but if tackling it yourself then I would start with the following non-exhaustive list of query optimization techniques:

Limit queries to only the timerange required
Ensure scheduled searches are not running more frequently than necessary
Ensure dashboards utilise base searches where possible
Ensure dashboards do not refresh/reload faster than necessary
Use techniques such as tstats queries where possible (a brief sketch follows this post)
Add TERM(<string>) values to your searches to help indexers find data faster
Avoid wildcards in base searches; use specific terms or tags.

Did this answer help you? If so, please consider:
Adding karma to show it was useful
Marking it as the solution if it resolved your issue
Commenting if you need any clarification
Your feedback encourages the volunteers in this community to continue contributing
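To illustrate the tstats and TERM() points above, a minimal sketch - the index, sourcetype, and search term here are made up, so adjust them to your own data.

A tstats count split by day, which works from the index (tsidx) files rather than the raw events:

| tstats count where index=app_logs sourcetype=app:json by _time span=1d

A raw search that includes an indexed term, so the indexers can filter on the full token before touching raw events:

index=app_logs sourcetype=app:json TERM(ERROR)
| stats count by host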
Hi Splunk Community,

I'm new to Splunk and working on a deployment where we index large volumes of data (approximately 500GB/day) across multiple sources, including server logs and application metrics. I've noticed that some of our searches are running slowly, especially when querying over longer time ranges (e.g., 7 days or more).

Here's what I've tried so far:
Used summary indexing for some repetitive searches.
Limited the fields in searches using the fields command.
Ensured searches are using indexed fields where possible.

However, performance is still not ideal, and I'm looking for advice on:
Best practices for optimizing search performance in Splunk for large datasets.
How to effectively use data models or accelerated reports to improve query speed.
Any configuration settings (e.g., in limits.conf) that could help.

My setup:
Splunk Enterprise 9.2.1
Distributed deployment with 1 search head and 3 indexers
Data is primarily structured logs in JSON format

Any tips, configuration recommendations, or resources would be greatly appreciated! Thanks in advance for your help.
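For reference, accelerated data models are usually queried with tstats. A sketch, assuming the CIM Web data model is installed and accelerated (summariesonly=true restricts the search to the pre-built summaries, which is where the speed-up comes from):

| tstats summariesonly=true count from datamodel=Web where Web.status=404 by _time span=1d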
How is the action field populated, as these events don't have "started", "blocked", or "success"?
Hello, pardon my lack of proper vocab. I hope I responded properly to your request for additional info. These are the first few events from the data (.txt) file.

Thu Mar 31 2021 00:15:02 www1 sshd[4747]: Failed password for invalid user jabber from 118.142.68.222 port 3187 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[4111]: Failed password for invalid user db2 from 118.142.68.222 port 4150 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[5359]: Failed password for invalid user pmuser from 118.142.68.222 port 3356 ssh2
Thu Mar 31 2021 00:15:02 www1 su: pam_unix(su:session): session opened for user root by djohnson(uid=0)
Thu Mar 31 2021 00:15:02 www1 sshd[2660]: Failed password for invalid user irc from 118.142.68.222 port 4343 ssh2
Hello, pardon my lack of proper vocab. Yes, for the second search I opened the "action" field in "Interesting Fields" and then clicked on the desired value (blocked, started, and success). The only one that produces found events is success. Below please find a sample of the practice data used:

Thu Mar 31 2021 00:15:02 www1 sshd[4747]: Failed password for invalid user jabber from 118.142.68.222 port 3187 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[4111]: Failed password for invalid user db2 from 118.142.68.222 port 4150 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[5359]: Failed password for invalid user pmuser from 118.142.68.222 port 3356 ssh2
Hello, pardon my lack of proper vocab. For the second search, I opened the "action" field in "Interesting Fields" and then clicked on the desired value (blocked, started, and success). The only one that produces found events is success. Below please find a sample of the practice data used:

Thu Mar 31 2021 00:15:02 www1 sshd[4747]: Failed password for invalid user jabber from 118.142.68.222 port 3187 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[4111]: Failed password for invalid user db2 from 118.142.68.222 port 4150 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[5359]: Failed password for invalid user pmuser from 118.142.68.222 port 3356 ssh2
Thu Mar 31 2021 00:15:02 www1 su: pam_unix(su:session): session opened for user root by djohnson(uid=0)
Thu Mar 31 2021 00:15:02 www1 sshd[2660]: Failed password for invalid user irc from 118.142.68.222 port 4343 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[1705]: Failed password for happy from 118.142.68.222 port 4174 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[1292]: Failed password for nobody from 118.142.68.222 port 1654 ssh2
Hello, pardon my lack of proper vocab. For the second search, I opened the "action" field in "Interesting Fields" and then clicked on the desired value (blocked, started, and success). The only one that produces found events is success. Below please find a sample of the practice data used:

Thu Mar 31 2021 00:15:02 www1 sshd[4747]: Failed password for invalid user jabber from 118.142.68.222 port 3187 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[4111]: Failed password for invalid user db2 from 118.142.68.222 port 4150 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[5359]: Failed password for invalid user pmuser from 118.142.68.222 port 3356 ssh2
Thu Mar 31 2021 00:15:02 www1 su: pam_unix(su:session): session opened for user root by djohnson(uid=0)
Thu Mar 31 2021 00:15:02 www1 sshd[2660]: Failed password for invalid user irc from 118.142.68.222 port 4343 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[1705]: Failed password for happy from 118.142.68.222 port 4174 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[1292]: Failed password for nobody from 118.142.68.222 port 1654 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[1560]: Failed password for invalid user local from 118.142.68.222 port 4616 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[59414]: Accepted password for myuan from 10.1.10.172 port 1569 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[1876]: Failed password for invalid user db2 from 118.142.68.222 port 1151 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[3310]: Failed password for apache from 118.142.68.222 port 4343 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[2149]: Failed password for nobody from 118.142.68.222 port 1527 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[2766]: Failed password for invalid user guest from 118.142.68.222 port 2581 ssh2
Thu Mar 31 2021 00:15:02 www1 sshd[3118]: pam_unix(sshd:session): session opened for user djohnson by (uid=0)
The typical issue when working in the ingest pipeline is that you don't have search-time fields extracted at that point. You must work on the raw event contents.
As often, there is more than one way to do things in SPL.

query
| rex field=_raw "Time=(?<NewTime>\d{4}\.\d+)"

This one is pretty OK. I'd _not_ go into multiplying it at the moment. You can multiply the aggregate at the end. The performance difference is minuscule but let's optimize it anyway.

| eval timeperiod=case(NewTime<1,"<1",NewTime<2,"1<2",NewTime<5,"2<5",NewTime<48,"5<48",1=1,">48")

This way you have a "classifier" field by which you can do your count. Now instead of timechart you can do

| bin _time span=1d
| stats count by _time timeperiod msgsource

So now we can "pack"

| eval timesource=_time . "|" . msgsource

and table

| xyseries timesource timeperiod count

and split

| eval _time=mvindex(split(timesource,"|"),0), msgsource=mvindex(split(timesource,"|"),1)
| fields - timesource

Warning: I'm writing this "on paper" because I don't have my Splunk instance at hand, so it might contain some small syntax mistakes. The idea is there, however.
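Putting those pieces together, the whole pipeline would look roughly like this (a sketch assembled from the steps above, untested; msgsource and the Time= extraction come from the original question):

query
| rex field=_raw "Time=(?<NewTime>\d{4}\.\d+)"
| eval timeperiod=case(NewTime<1,"<1",NewTime<2,"1<2",NewTime<5,"2<5",NewTime<48,"5<48",1=1,">48")
| bin _time span=1d
| stats count by _time timeperiod msgsource
| eval timesource=_time . "|" . msgsource
| xyseries timesource timeperiod count
| eval _time=mvindex(split(timesource,"|"),0), msgsource=mvindex(split(timesource,"|"),1)
| fields - timesource
| table _time msgsource "<1" "1<2" "2<5" "5<48" ">48"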
Your description is not quite clear. First you talk about "ALLLOWED1" : "NONE", but then it suddenly turns out to be \"ALLOWEDFIELD\": \"NONE\". Make up your mind. Additionally, do you have your fields extracted, or do you have to dynamically pull the data from raw events?
How did you come up with the second search? Is that the same as the first one just with one additional condition? What does your data look like?
Hi @LS1 , did you try clicking on the value in Interesting Fields to add it to the search? This way, you can see the exact syntax that you can add to your main search. Ciao. Giuseppe
It is a bit difficult to figure out what might be going on without some sample data. Please post some anonymised raw (unformatted) events in a code block using the </> format button above so we can see what you are dealing with.
This looks like it might be JSON data? If so, please post some sample data (anonymised appropriately) in raw format in a code block using the </> option, to preserve the formatting of your event.
Try something like this

| untable _time msgsource count
| eval group=mvindex(split(msgsource,": "),0)
| eval msgsource=mvindex(split(msgsource,": "),1)
| eval _time=_time.":".msgsource
| xyseries _time group count
| eval msgsource=mvindex(split(_time,":"),1)
| eval _time=mvindex(split(_time,":"),0)
| table _time msgsource total *
I was able to write a query that groups by api (msgsource) to show the response times, but I am trying to see if I can extract the result in a different format. Here is the query I used:

query
| rex field=_raw "Time=(?<NewTime>\d{4}\.\d+)"
| eval TimeMilliseconds=(NewTime*1000)
| timechart span=1d count as total, count(eval(TimeMilliseconds<=1000)) as "<1sec", count(eval(TimeMilliseconds>1000 AND TimeMilliseconds<=2000)) as "1sec-2sec", count(eval(TimeMilliseconds>2000 AND TimeMilliseconds<=5000)) as "2sec-5sec", count(eval(TimeMilliseconds>48000)) as "48sec+" by msgsource

Here is the output that I get today:

_time       total: retrieveApi  total: createApi  <1sec: retrieveApi  <1sec: createApi  1sec-2sec: retrieveApi  1sec-2sec: createApi  2sec-5sec: retrieveApi  2sec-5sec: createApi
2025-07-13  1234                200               1200                198               34                      1                     0                       1
2025-07-14  1000                335               990                 330               8                       5                     2                       0

This is what I would like to see, the results grouped by `_time` and `msgsource` both:

_time       msgsource    total  <1sec  1sec-2sec  2sec-5sec
2025-07-13  retrieveApi  1234   1200   34         0
2025-07-13  createApi    200    198    1          1
2025-07-14  retrieveApi  1000   990    8          2
2025-07-14  createApi    335    330    5          0
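An alternative to the xyseries trick suggested in the replies above, which produces that row-per-msgsource layout directly, is to move the bucket counts into a stats call split by both _time and msgsource. A sketch (untested; the bucket boundaries mirror the ones in the question):

query
| rex field=_raw "Time=(?<NewTime>\d{4}\.\d+)"
| eval TimeMilliseconds=(NewTime*1000)
| bin _time span=1d
| stats count as total,
    count(eval(TimeMilliseconds<=1000)) as "<1sec",
    count(eval(TimeMilliseconds>1000 AND TimeMilliseconds<=2000)) as "1sec-2sec",
    count(eval(TimeMilliseconds>2000 AND TimeMilliseconds<=5000)) as "2sec-5sec",
    count(eval(TimeMilliseconds>5000)) as ">5sec"
  by _time msgsource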