About hettervik

isoutamo

Please accept that answer as the solution so that other users can find it later.

lalithasegu

If you find a solution for this, please share it with me. It would be really helpful.

hettervik

By Martin Hettervik, Senior Consultant and Team Leader at Accelerate at Iver, Splunk MVP

The stats command is commonly used and well known. The tstats command is also well known, but many associate it only with searching data models. What if I told you that tstats can be used to gain the performance benefits of searching in tsidx files, without even using accelerated data models? In this post, we’ll explore how to super-optimize your Splunk searches using tstats, TERM, PREFIX, and a deeper understanding of how Splunk handles data under the hood.

How does stats work, and why it’s not always optimal

The stats command works by retrieving raw data from the indexers (stored in compressed journal files), performing relevant field extractions at search time, discarding the leftover raw data, and formatting a table. The flow is illustrated in the diagram below.

As an example, you could search WinEventLog and create a table of the most common computer names. In the picture below, the raw data that you would have to dig through is on the left, while the table you end up with is on the right. As you can see, most of the data in the raw event is not needed. Searching through all that unnecessary text makes searches slower than they need to be.

What alternatives do we have for optimization?

There are some commonly used tricks in Splunk for creating faster searches. Let’s have a quick look at some of them, and the pros and cons of each.

Accelerated data model
- Good if your search can use data that maps to standard CIM data models
- Can be rebuilt if data or fields change
- Complex to create a new data model for a small single use case, e.g. a single dashboard
- Adds continuous resource usage 24/7

Summary index
- Easy to understand and set up
- Not flexible for later changes in data or fields
- Adds continuous resource usage 24/7

Accelerated report
- Easy to set up
- Not reusable for other searches
- Adds continuous resource usage 24/7

One thing all the options above have in common is that they add continuous resource usage on Splunk. They run regular background jobs to create “summarized” data. This is not a problem if several searches can use the same summarized data, or if the searches are expected to run often enough that the cost is worth it performance-wise. However, for a single dashboard or report that is only expected to be used occasionally, the cost in resource usage could become higher than the value added by the search optimization.

To avoid this continuous cost, can we use tstats directly on the data? Yes. The tstats command only searches in indexed metadata (tsidx files), not raw data. Even so, you don’t necessarily need indexed fields or an accelerated data model to use tstats. Whether it works depends on the minor and major breakers in the raw data. Learn how to use TERM and PREFIX together with tstats for amazing results. Let’s look into it!

Enter tsidx files and breakers

The tsidx files contain indexed fields and segmented keywords (terms) from the raw data. The keywords point to the events in the raw data where the fields or keywords are present. Splunk uses this mechanism under the hood for search optimization. The diagram below illustrates how the pointers work.

Which segmented keywords end up in the tsidx file depends on the major and minor breakers in the raw data. Major breakers separate keywords. Examples of major breakers are space, tab, brackets and quotation marks. Minor breakers create different segment combinations from keywords. Examples of minor breakers are period, slash, colon and the equal sign. You can look up all the breakers in the Splunk documentation.
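The breaker behaviour described above can be sketched in a few lines of Python. This is a deliberately simplified model, not Splunk's actual segmentation: the breaker sets only contain the examples named in the text, lowercasing the terms is an assumption, the event string is made up, and the function names `split_on` and `segment` are hypothetical.

```python
# Simplified illustration of major/minor breaker segmentation.
# Assumption: breaker sets are limited to the examples named in the post;
# the real lists are longer and documented by Splunk.
MAJOR_BREAKERS = " \t[](){}<>\"'"
MINOR_BREAKERS = "./:="


def split_on(text: str, breakers: str) -> list[str]:
    """Split text on any of the given breaker characters, dropping empty pieces."""
    table = str.maketrans({ch: " " for ch in breakers})
    return text.translate(table).split()


def segment(raw: str) -> set[str]:
    """Return the set of keywords this toy model would store for one event."""
    terms: set[str] = set()
    # Assumption: terms are stored in lowercase.
    for major in split_on(raw.lower(), MAJOR_BREAKERS):
        terms.add(major)                               # whole segment between major breakers
        terms.update(split_on(major, MINOR_BREAKERS))  # pieces between minor breakers
    return terms


if __name__ == "__main__":
    # Hypothetical event text, not taken from real data.
    event = "ComputerName=win-hp-25431 Logon Account: pburdytt2i"
    for term in sorted(segment(event)):
        print(term)
```

In this model the key value pair joined by an equal sign survives as one keyword, while a key and value separated by a space end up as unrelated keywords.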
As an example, see the event below, which shows how a key value pair in the raw data is stored as keywords in the tsidx file (notice the minor breakers).

So that means we can use tstats directly on raw data?

In some cases, yes! Here is where the magic of TERM comes into play. Use TERM to encapsulate keywords in tsidx files that contain minor breakers. So, instead of writing a normal stats search like this:

    index=dc_logs source=WinEventLog:Security ComputerName=win-hp-25431 | stats count

Try writing a tstats search like this:

    | tstats count where index=dc_logs source=WinEventLog:Security TERM(ComputerName=win-hp-25431)

Note how we do not need to use TERM on the fields index and source, since they are already indexed fields. For the computer name keyword, we need to encapsulate the key value pair in TERM because it contains minor breakers.

How about if you wanted to create a timechart? The timechart command adds empty time slots to the results (useful in visualizations), which plain stats omits. Can we re-create this effect with tstats? Yes. Instead of writing a normal timechart search like this:

    index=dc_logs source=WinEventLog:Security ComputerName=win-hp-25431 | timechart span=1d count

Try writing a tstats search like this:

    | tstats count where index=dc_logs source=WinEventLog:Security TERM(ComputerName=win-hp-25431) by _time span=1d | timechart span=1d sum(count) as count

(The final timechart sums the per-day counts produced by tstats. Using count there would only count the result rows, one per day.)

Sweet! How about if we want to count by a field? Is this also possible with tstats? Indeed, we can use something called PREFIX. Instead of writing a stats search like this:

    index=dc_logs source=WinEventLog:Security | stats count by ComputerName

Try writing a tstats search like this:

    | tstats count where index=dc_logs source=WinEventLog:Security by PREFIX(computername=)

(Note that PREFIX always uses lowercase in the encapsulation.)

Super nice, but when will using tstats not work?
The trick of combining tstats with TERM and PREFIX will not work if there are major breakers in the key value pairs in the raw events. Look at the example below. Note that the logon account is separated between key and value by a space (major breaker). We will therefore not find the key value pair “logon account: pburdytt2i” in the tsidx file. We will, however, find the username “pburdytt2i”, but we cannot know the key context in which the username exists. Unfortunately, this tstats trick only works if the data is formatted in the “correct” way, but when it does work, the efficiency boost is huge!

Final notes

Optimizing your Splunk searches with tstats, TERM, and PREFIX can dramatically improve performance and reduce resource usage in Splunk. While not every search can be converted, understanding how Splunk segments and stores data gives you the knowledge to write faster and more optimized queries. If you haven’t done so already, have a look at your dashboards, reports and alerts, and see which searches you can super optimize.

Happy splunking!

splunkmarroko · ‎04-25-2025

Try this: in the app.conf file, add the stanza below:

    [ui]
    is_visible = true
    show_app_dark_theme = true

hettervik · ‎01-08-2025

The link seems to be broken again. Does anyone know the new link to reach the Splunk Core Consultant Labs? There is a guide here, but it does not contain any information on how to find the labs: https://www.splunk.com/en_us/pdfs/training/core-consultant-labs-course-description.pdf There is another guide here, but it also does not contain any information on where to find the labs: https://www.splunk.com/en_us/pdfs/training/splunk-core-certified-consultant-track.pdf

yeahnah · ‎12-12-2024

Thought I'd add to this post regarding using a curl command to push a lookup file to a Splunk instance, as other Splunk users may find it useful. It's not a replacement for @mthcht's excellent Python scripts, but curl commands are often handy when testing and validating things.

Here's a worked example that creates a simple lookup file (tested against a Cloud stack and Lookup Editor v4.0.4):

    curl -sk --request POST https://localhost:8089/services/data/lookup_edit/lookup_contents \
      -H "Authorization: Splunk $MYTOKEN" \
      -H "Content-Type: application/x-www-form-urlencoded" \
      -d timeout=10 \
      -d namespace=search \
      -d lookup_file=lookupfilename.csv \
      -d contents=[[\"field1\",\"field2\"],[\"value1\",\"value2\"]] \
      -d owner=nobody # n.b. owner is only needed when creating new lookups - a 'user' name creates the new lookup file with private permissions, whereas 'nobody' results in it being shared globally

Note that the 'contents' format must be a 2D JSON array. To make this easier, 'contents' can also be supplied via a file, like this:

    $ cat <<EOF > myLocalLookup.json
    contents=[["field1","field2"],["value1","value2"]]
    EOF
    $ curl -sk --request POST https://localhost:8089/services/data/lookup_edit/lookup_contents \
      -H "Authorization: Splunk $MYTOKEN" \
      -H "Content-Type: application/x-www-form-urlencoded" \
      -d timeout=10 \
      -d namespace=search \
      -d lookup_file=lookupfilename.csv \
      -d @myLocalLookup.json \
      -d owner=nobody

Now, to really make this useful, existing CSV files need to be formatted as JSON. There are multiple ways this could be done, but here is a simple Python one-liner (*nix tested) that reads in a CSV file on stdin and outputs it as JSON:

    (python -c $'import sys;import csv;import json;\nwith sys.stdin as f: csv_array=list(csv.reader(f)); print("contents="+json.dumps(csv_array))' > myLocalLookup.json) < myLocalLookup.csv

Hopefully others may find this useful too.
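For anyone who prefers a small script over the one-liner, the CSV-to-'contents' conversion could look roughly like the sketch below. The function names are hypothetical, and the URL-encoded variant is an assumption on my part: curl's -d option sends data as given, so values containing characters such as & might need encoding, but I have not verified how the lookup editor endpoint treats either form.

```python
import csv
import io
import json
from urllib.parse import urlencode


def csv_to_rows(csv_text: str) -> list[list[str]]:
    """Parse CSV text into a 2D list (header row first), as 'contents' expects."""
    return list(csv.reader(io.StringIO(csv_text)))


def contents_payload(csv_text: str) -> str:
    """Build the same 'contents=<2D JSON array>' string as the one-liner in the post."""
    rows = csv_to_rows(csv_text)
    return "contents=" + json.dumps(rows, separators=(",", ":"))


def form_encoded_payload(csv_text: str) -> str:
    """Assumed variant: the same field, but URL-encoded as a form value."""
    rows = csv_to_rows(csv_text)
    return urlencode({"contents": json.dumps(rows, separators=(",", ":"))})


if __name__ == "__main__":
    sample = "field1,field2\nvalue1,value2\n"
    # Write this output to myLocalLookup.json and pass it with -d @myLocalLookup.json
    print(contents_payload(sample))
```

Using the csv module rather than splitting on commas keeps quoted values that contain commas intact.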

hettervik · ‎10-21-2024

We have different lookup inputs into the Splunk ES asset list framework. Some values for assets change over time, for example due to DHCP and DNS renaming. When an asset gets a new IP due to e.g. DHCP, the lookup used as input into the asset framework is updated accordingly, but the merged asset lookup "asset_lookup_by_str" will contain both the new and the old IP. So the new IP is appended to the asset; it does not replace the old IP. Due to "merge magic" that runs under the hood in the asset framework, over time this creates strange assets with many DNS names and many IPs.

My question is, how long are asset list field values stored in the Splunk ES asset list framework? Are there any hidden values that keep track of, say, an IP, and will Splunk eventually remove the IP from the asset in the merged list? Or will the IP stay there forever, so that these "multivalue assets" just grow with more and more DNS names and IPs until the mv field limits are reached?

And, if I reduce the asset list mv field limits, how does Splunk prioritize which values are included? Do the values already on the merged list have priority, or do new values have priority?

I tried looking for answers in the documentation but could not find answers to my questions there. Hoping someone will share some insights here. Thanks!

AbhishekD · ‎10-17-2024

To send specific notable events from the Enterprise Security Incident Review page for investigation, you can use an add-on called the ServiceNow Security Operations Add-on. This add-on allows Splunk ES analysts to create security-related incidents and events in ServiceNow. It supports on-demand creation of a single ServiceNow event or incident from a Splunk event, as well as scheduled alerts that create single or multiple ServiceNow events and incidents. For detailed integration steps, refer to the add-on's documentation. The reverse integration between ServiceNow and Splunk for incident management can be achieved using an out-of-the-box method. If this reply is helpful, karma would be appreciated 🙂.

hettervik · ‎10-04-2024

We also had some inconsistencies with these field extractions. Figured out that we needed to push the new limits configuration to the indexers, as well as the search head. Only pushing to the search head will work if you have a centralizing command before the spath field extraction, but not for streaming field extractions.

hettervik · ‎05-27-2024

These dashboards are part of an app I made to visualize Nessus security scans in Splunk. The idea is somewhat inspired by the existing Tenable App for Splunk from Tenable, but I wanted to take the visualizations to the next level, and make the data easier to understand and navigate.

The first dashboard is an overview dashboard. The picture below does not show the whole dashboard, but you get the point. It shows data from all vulnerability scans, with color coding differentiating the level of vulnerability severity. It's an easy way of seeing which environments and hosts have the most vulnerabilities, and which types of vulnerabilities are most widespread. Also note that it shows what period there is scan data from (which might not be the same as the time picker) and how many networks have been scanned (out of the total number of networks).

This next picture shows one of many drilldown dashboards in the app. It allows for a more detailed view of vulnerabilities per host, and you can get more information about a specific host by clicking on the top table. This table also uses the same color coding as the overview dashboard. The bottom table links directly to the Tenable website, with more information about the specific vulnerability ID clicked on.

All dashboards allow for various types of filtering, for example only showing vulnerabilities with a minimum severity, e.g. at least medium. The dashboards also use the Splunk ES asset list to get more information about the hosts, so that it's possible to sort on vulnerabilities per business group or environment, among other things. Also, there is a lookup of "ignored vulnerabilities", to which users can add vulnerabilities to ignore them in the dashboards, e.g. by editing it in the Splunk App for Lookup File Editing.
Summary of functionality used in the dashboards:
- Color coding of vulnerability severity
- Drilldowns to other dashboards with more detailed information
- Drilldowns to external URLs with information on severity IDs
- Various filtering options on the dashboards
- Host enrichment from Splunk ES asset list
- Dynamic whitelisting of vulnerabilities through lookup file
- Correlation with other sources to show meta-information about vulnerability scans

hettervik · ‎05-14-2024

Thanks! We see now, after some digging, that the bug is probably caused by a notable event being too big. The error message is "events are not displayed in the search results because _raw fields exceed the limit". It seems this one oversized event caused bugs in the "Incident Review - Main" search, which in turn caused other incidents to fail to load. We are deleting the event and fixing the correlation search now, adding a fail-safe to avoid creating such large notable events in the future. Hope this fixes the issue!

jbillings21 · ‎03-21-2024

$SPLUNK_HOME/etc/system/local/web.conf is the one you would want to adjust the splunkdConnectionTimeout in.

hettervik · ‎03-14-2024

I also have the same question. The new Outlook client makes links not clickable, I guess for security reasons. I want to make it so that my link in the Splunk email alert becomes clickable again, but cannot find any way of doing so?

naliniasb · ‎03-08-2024

Have you fixed this issue? If so, please share the solution.

roberto_baggio · ‎02-08-2024

Hey, so did you find the solution? We are stuck with the same issue, and it seems no one knows how to fix it.

raz_gp · ‎10-26-2023

Same issue.

    +0000 ERROR ModularInputs [18816 TcpChannelThread] - Argument validation for scheme=proofpoint_tap_siem: killing process, because executing it took too long (over 30000 msecs).

For me, this turned out to be an OS issue. On Ubuntu the input works, but on the Red Hat boxes it doesn't.

2cB2sVJCSp · ‎10-19-2023

I would say this partly covers shipping systemd journal logs to splunk. What I would really love is for splunk to be able to accept data sent by systemd-journal-upload ( https://www.freedesktop.org/software/systemd/man/latest/systemd-journal-upload.service.html ). That way you'd not need a forwarder on any popular systemd distribution anymore. You could just use systemd.

hettervik · ‎08-16-2023

I changed the calculated eval field used in the lookup in the datamodel to an extracted field, and rebuilt the datamodel, and now it works. It very much looks like there is a bug with the calculated eval and lookup fields in datamodels, where the lookup field can get some sort of "mixup" if it is using a calculated eval field. Perhaps some sort of race condition between the calculation of the eval field and the lookup field.

hettervik · ‎08-01-2023

I think the way to go for combining tstats searches without limits is using "prestats=t" and "append=true". Note that you may have to rewrite the searches quite a bit to get the desired results, but it should be possible. Also note that you may have to create a table before using the normal stats command to merge the tstats searches (I don't know why, but it worked for me; perhaps it has something to do with having all the data on the search head instead of distributed on the indexers). See this thread as well: https://community.splunk.com/t5/Splunk-Search/How-do-I-join-two-data-models-in-a-TSTATS-without-using-JOIN-or/m-p/479132

hettervik · ‎08-01-2023

I think the way to go for combining tstats searches without limits is using "prestats=t" and "append=true". Note that you may have to rewrite the searches quite a bit to get the desired results, but it should be possible. Something like so:

    | tstats summariesonly=true prestats=t latest(_time) as _time count AS "Count of Web" dc(Web.src) AS "Distinct Count of src" from datamodel=Web where (nodename = Web) groupby Web.src, Web.dest, Web.url, Web.http_user_agent, sourcetype, Web.user
    | tstats prestats=t append=true count AS "Count of Allowed Traffic" dc(All_Traffic.src_ip) AS "Distinct Count of src_ip" from datamodel=Network_Traffic where (nodename = All_Traffic.Traffic_By_Action.Allowed_Traffic) (All_Traffic.dest_ip!="10.*") (All_Traffic.direction="out*") groupby All_Traffic.src_ip, All_Traffic.dest_ip, sourcetype, All_Traffic.action, _time
    | rename Web.* as *, All_Traffic.* as *
    | table _time index dest url src_ip
    | stats latest(_time) as _time index values(dest) as dest_ip values(url) as url count by src_ip

marand · ‎03-06-2023

Just what I was looking for 🙂

ManjunathN · ‎01-12-2023

Hi @tauliang, @hettervik, was this fixed by any chance? We are having the same kind of issue, with no format information found on the 2016 servers. Can someone help with this, please? Thanks!

xori22 · ‎01-04-2023

Hi, I'm stuck on the same issue here. It works fine for a specific time range, but when we go beyond roughly 50 days it breaks. Any help?

hettervik_new · ‎11-23-2022

Yes, I know, but say someone does indeed run the delete command on all data; it could still cause a fair amount of downtime before the Splunk admins are able to figure out what's wrong and restore all the data. If somebody deletes data just before a weekend or a holiday, the downtime would be even greater. Also, I'm aware that admin rights are normally needed to access the delete command, but in my Splunk environment the delete command is basically never needed, so it adds no benefit, only risk. I'm guessing this is the case for a lot of other customers as well. Thus, removing the option completely from the search head would be the best and most secure solution.

opoplawski · ‎11-18-2022

I voted for it here: https://ideas.splunk.com/ideas/EID-I-611

Posts	268
Solutions	21
Karma Given	111
Karma Received	51
Member Since	‎09-25-2015
