Hi everyone!
I am working on building a dashboard that captures all the firewall, Web proxy, EDR, WAF, Email, and DLP blocks for the last 6 months in a table format, which should look like this -
I am able to write a query that gives me the count for each parameter, and I then append all the individual queries into one, which makes the final query run slowly and take forever to complete. Here is the final query -
| tstats `security_content_summariesonly` count as Blocked from datamodel=Network_Traffic
where sourcetype IN ("cp_log", "cisco:asa", "pan:traffic") AND All_Traffic.action="blocked" earliest=-6mon@mon latest=now by _time
| eval Source="Firewall"
| tstats `security_content_summariesonly` count as Blocked from datamodel=Web
where sourcetype IN ("alertlogic:waf","aemcdn","aws:*","azure:firewall:*") AND Web.action="block" earliest=-6mon latest=now by _time
| eval Source="WAF"
| append [search index=zscaler* action=blocked sourcetype="zscalernss-web" earliest=-6mon@mon latest=now
| bin _time span=1mon
| stats count as Blocked by _time
| eval Source="Web Proxy"]
| append [| tstats summariesonly=false dc(Message_Log.msg.header.message-id) as Blocked from datamodel=pps_ondemand where (Message_Log.filter.routeDirection="inbound") AND (Message_Log.filter.disposition="discard" OR Message_Log.filter.disposition="reject" OR Message_Log.filter.quarantine.folder="Spam*") earliest=-6mon@mon latest=now by _time
| eval Source="Email"]
| append [search index=crowdstrike-hc sourcetype="CrowdStrike:Event:Streams:JSON" "metadata.eventType"=DetectionSummaryEvent metadata.customerIDString=* earliest=-6mon@mon latest=now
| bin _time span=1mon
| transaction "event.DetectId"
| search action=blocked NOT action=allowed
| stats dc(event.DetectId) as Blocked by _time
| eval Source="EDR"]
| append [search index=forcepoint_dlp sourcetype IN ("forcepoint:dlp","forcepoint:dlp:csv") action="blocked" earliest=-6mon@mon latest=now
| bin _time span=1mon
| stats count(action) as Blocked by _time
| eval Source="DLP"]
| eval MonthNum=strftime(_time, "%Y-%m"), MonthName=strftime(_time, "%b")
| stats sum(Blocked) as Blocked by Source MonthNum MonthName
| xyseries Source MonthName Blocked
| addinfo
| table Source [| makeresults count=7
| streamstats count as month_offset
| eval start_epoch=relative_time(now(),"-6mon@mon"), end_epoch=now()
| eval start_month=strftime(start_epoch, "%Y-%m-01")
| eval month_epoch = relative_time(strptime(start_month, "%Y-%m-%d"), "+" . (month_offset-1) . "mon")
| where month_epoch <= end_epoch
| eval month=strftime(month_epoch, "%b")
| stats list(month) as search
]
I figured out the issue is with the firewall query-
| tstats `security_content_summariesonly` count as Blocked from datamodel=Network_Traffic
where sourcetype IN ("cp_log", "cisco:asa", "pan:traffic") AND All_Traffic.action="blocked" earliest=-6mon@mon latest=now by _time
| eval Source="Firewall"
Can someone guide me on how to fix this issue? I have been stuck on this for 2 weeks 😞
Multiple appends: Fundamentally, this is the main performance killer. You're running 6 distinct "heavy" searches.
Another big cost is the transaction in the EDR search: the transaction "event.DetectId" command is notoriously resource-intensive, especially over a 6-month period. It should be avoided if at all possible, or its scope drastically limited - see the sketch below.
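As a rough sketch only (verify the numbers match your current logic first, since action is checked per raw event here rather than on a stitched-together transaction), the EDR subsearch could drop transaction entirely and let dc() handle the de-duplication:
index=crowdstrike-hc sourcetype="CrowdStrike:Event:Streams:JSON" "metadata.eventType"=DetectionSummaryEvent metadata.customerIDString=* action=blocked earliest=-6mon@mon latest=now
| bin _time span=1mon
| stats dc(event.DetectId) as Blocked by _time
| eval Source="EDR"
Pushing action=blocked into the base search also lets the indexers do the filtering instead of the search head.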
Regards,
Prewin
I removed the transaction command. Let's make it simple.
I need a table that plots the blocked counts for Email, Firewall, DLP, EDR, Web proxy, and WAF for the last 6 months, showing the count for each month and its total, similar to this.
What I did was modify each query to return data for the last 6 months for each parameter, and then I simply append them all into one table, which is not a good practice. Hence I am here asking for help from the experts.
I can share the individual queries if that helps-
Email - | tstats summariesonly=false dc(Message_Log.msg.header.message-id) as Blocked from datamodel=pps_ondemand where (Message_Log.filter.routeDirection="inbound") AND (Message_Log.filter.disposition="discard" OR Message_Log.filter.disposition="reject" OR Message_Log.filter.quarantine.folder="Spam*") earliest=-6mon@mon latest=now by _time -- here, from the pps_ondemand datamodel, I am simply counting the inbound emails that were discarded, rejected, or quarantined as spam
DLP- index=forcepoint_dlp sourcetype IN ("forcepoint:dlp","forcepoint:dlp:csv") action="blocked" earliest=-6mon@mon latest=now
| bin _time span=1mon
| stats count(action) as Blocked by _time
Web Proxy- index=zscaler* action=blocked sourcetype="zscalernss-web" earliest=-6mon@mon latest=now
| bin _time span=1mon
| stats count as Blocked by _time
EDR- index=crowdstrike-hc sourcetype="CrowdStrike:Event:Streams:JSON" "metadata.eventType"=DetectionSummaryEvent metadata.customerIDString=* earliest=-6mon@mon latest=now
| bin _time span=1mon
| search action=blocked NOT action=allowed
| stats dc(event.DetectId) as Blocked by _time
WAF- | tstats `security_content_summariesonly` count as Blocked from datamodel=Web
where sourcetype IN ("alertlogic:waf","aemcdn","aws:*","azure:firewall:*") AND Web.action="block" earliest=-6mon@mon latest=now by _time -- web is an accelerated datamodel in my environment
`security_content_summariesonly` expands to summariesonly=false allow_old_summaries=true fillnull_value=null
Lastly,
Firewall- | tstats `security_content_summariesonly` count as Blocked from datamodel=Network_Traffic
where sourcetype IN ("cp_log", "cisco:asa", "pan:traffic") AND All_Traffic.action="blocked" earliest=-6mon@mon latest=now by _time
How did you figure out that the problem was with that firewall search?
You are doing lots of appends, which is always going to slow down any search since you are pushing everything to the search head, but you are also doing an unbounded 6-month transaction, which again is going to be pretty slow and potentially unpredictable.
What are your data volumes for each individual search? The separate times for your 6 individual searches will not sum up to give you the expected cost of the overall search and I would not actually expect the tstats searches per se to be the source of any performance problem. If you are searching summaries only from accelerated datamodels, they should be the least of your worries.
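For comparison, since you said the Web datamodel is accelerated, a summaries-only version of the WAF search would look roughly like this - but only if the acceleration actually covers the full 6-month window, otherwise summariesonly=true will silently drop whatever falls outside the summary:
| tstats summariesonly=true count as Blocked from datamodel=Web where sourcetype IN ("alertlogic:waf","aemcdn","aws:*","azure:firewall:*") AND Web.action="block" earliest=-6mon@mon latest=now by _time span=1mon
| eval Source="WAF"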
Your CrowdStrike search is getting ALL the data for 6 months and running transaction on that, which if you have any kind of volume there, is unlikely to be reliable, because transaction will silently drop results when it hits limits.
Are you sure this is your literal search? Because you cannot pipe to a tstats command unless it's with prestats=t append=t.
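For reference, the only supported way to stack two tstats calls in one pipeline is the prestats/append form, roughly like this (the index names below are placeholders, not from your environment):
| tstats prestats=t count where index=firewall_placeholder by _time span=1mon
| tstats prestats=t append=t count where index=proxy_placeholder by _time span=1mon
| timechart span=1mon count
As posted, piping an eval into a bare tstats would simply error out.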
Also, what does your security_content_summariesonly macro expand to?
Also also - you're appending two "full" searches. Are you sure you're not hitting subsearch limits?
And back to the point - that's what job details and job log are for - see the timeline, see where Splunk spends its time. Check the scanned results vs. returned results...
@PickleRick ,
`security_content_summariesonly` expands to
summariesonly=false allow_old_summaries=true fillnull_value=null
I re-arranged the parameters a bit and it now seems to load in around 6 minutes. I still need to optimize it. Here is the expanded view of the query -
| tstats summariesonly=false dc(Message_Log.msg.header.message-id) as Blocked from datamodel=pps_ondemand where (Message_Log.filter.routeDirection="inbound") AND (Message_Log.filter.disposition="discard" OR Message_Log.filter.disposition="reject" OR Message_Log.filter.quarantine.folder="Spam*") earliest=-6mon@mon latest=now by _time
| eval Source="Email"
| append
[ search index=forcepoint_dlp sourcetype IN ("forcepoint:dlp","forcepoint:dlp:csv") action="blocked" earliest=-6mon@mon latest=now
| bin _time span=1mon
| stats count(action) as Blocked by _time
| eval Source="DLP"]
| append
[ search index=zscaler* action=blocked sourcetype="zscalernss-web" earliest=-6mon@mon latest=now
| bin _time span=1mon
| stats count as Blocked by _time
| eval Source="Web Proxy"]
| append
[ search index=crowdstrike-hc sourcetype="CrowdStrike:Event:Streams:JSON" "metadata.eventType"=DetectionSummaryEvent metadata.customerIDString=* earliest=-6mon@mon latest=now
| bin _time span=1mon
| transaction "event.DetectId"
| search action=blocked NOT action=allowed
| stats dc(event.DetectId) as Blocked by _time
| eval Source="EDR"]
| append
[| tstats summariesonly=false allow_old_summaries=true fillnull_value=null count as Blocked from datamodel=Web where sourcetype IN ("alertlogic:waf","aemcdn","aws:*","azure:firewall:*") AND Web.action="block" earliest=-6mon latest=now by _time
| eval Source="WAF"]
| append
[| tstats summariesonly=false allow_old_summaries=true fillnull_value=null count as Blocked from datamodel=Network_Traffic where sourcetype IN ("cp_log", "cisco:asa", "pan:traffic") AND All_Traffic.action="blocked" earliest=-6mon@mon latest=now by _time
| eval Source="Firewall"]
| eval MonthNum=strftime(_time, "%Y-%m"), MonthName=strftime(_time, "%b")
| stats sum(Blocked) as Blocked by Source MonthNum MonthName
| xyseries Source MonthName Blocked
| addinfo
| table Source Dec Jan Feb Mar Apr May Jun
The goal is to get a table like this -
This is one huge search. Check each of the "component searches" on their own and see how they fare. Since some of them are raw event searches over half a year's worth of data, possibly over a significant subset of that data, I expect them to be slow simply because you have to plow through all those events (and one of those subsearches has a very ugly field=* condition, which makes Splunk parse every single event!).
If you need that literal functionality from those searches, I see no other way than using some acceleration techniques - the searches themselves don't seem to be very "optimizable". You might try changing some of them to tstats with PREFIX/TERM, but only if your data fits the prerequisites.
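For example (illustration only - the index and sourcetype come from your Zscaler search, but whether action=blocked exists as a single indexed key=value term depends entirely on how those events are written, so check with | walklex or by testing TERM() in a plain event search first), the Web Proxy count could become something like:
| tstats count as Blocked where index=zscaler* sourcetype="zscalernss-web" TERM(action=blocked) earliest=-6mon@mon latest=now by _time span=1mon
| eval Source="Web Proxy"
If the data doesn't fit, the more traditional route is a scheduled search that writes the monthly counts into a summary index, with the dashboard searching that summary instead.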