To give further examples, a distributable streaming command that can run on an indexer can also run on the search head, so take this example:

index=_audit
``` This eval runs on the indexer ```
| eval isAdmin=if(user="admin", 1, 0)
``` This lookup runs on the indexer ```
| lookup actions.csv action OUTPUT action_name
``` This stats runs on both the indexer and the search head, i.e. each indexer generates its own stats and passes them to the search head along with the stats from all other indexers, and the final counters are merged on the search head ```
| stats count by user action_name isAdmin
``` This lookup runs on the search head, as the data now exists on the SH. Once the data is on the SH, it will not go back to the indexers. ```
| lookup users.csv user OUTPUT user_name
``` So now this eval runs on the search head ```
| eval do_alert=if(isAdmin, 1, 0)

As you can see, it contains some eval, lookup and stats commands. This search will be sent from the SH to the "search peers", which are the indexers it can use to search against. Each indexer will run this same search on the set of data it owns. The key point here is that once execution hits the stats command, that is the trigger for the indexers to return their dataset to the search head.

If you look at the job properties of any search that runs a stats command, you will see in the phase0 detail something like the following for a simple "index=_audit | stats count by user":

litsearch index=_audit | addinfo type=count label=prereport_events track_fieldmeta_events=true | fields keepcolorder=t "prestats_reserved_*" "psrsvd_*" "user" | prestats count by user

This shows that the indexer will return some "prestats", which is its own reduced dataset that it sends to the search head. In the example above, the first lookup runs on the indexer and the second on the SH. So when the documentation talks about where a command is 'invoked', it's really about where the data happens to be at that point in the execution of the entire SPL pipeline.

As you can see, as soon as you use a dataset processing command or a transforming command, the data is shifted from the indexers to the search head, so you immediately lose parallelism; it is therefore best to put those types of commands as far down the SPL pipeline as possible. If you look at the command types table, you can see that some commands work differently depending on how they are called, e.g. fillnull is a dataset processing command with no parameters, but distributable streaming when used with a field name, so be aware of these subtle distinctions when considering search performance.
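A minimal sketch of that fillnull distinction (the index and field names here are placeholders, not from the thread):

``` No field list: fillnull must consider every field across all events, so the data is pulled to the search head before stats can reduce it ```
index=my_index | fillnull | stats count by status

``` Explicit field list: each indexer can fill nulls in 'status' locally, and only the stats phase triggers the handoff to the search head ```
index=my_index | fillnull value="unknown" status | stats count by status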
Do you have the same issue when referencing the lookup definition itself instead of the CSV file? Example:

<base_search> | lookup <lookup definition pointing to networks.csv> ip as src_ip OUTPUT category

I think that the advanced settings may only be applied when referencing the definition.
Hello, I have a search that's coming back with 'src', which is the source IP of a client, and I have a lookup file called "networks.csv" that has a column with the header 'ip', which is a list of CIDR networks. I have gone into the lookup definitions and set "CIDR(ip)" under the advanced options for that lookup file, and I can see the headers being automatically extracted in that UI. However, when I run the search and try to pull the category for the network that the source IP belongs to, it does not work:

basesearch | lookup networks.csv ip as src_ip OUTPUT category

I have validated that it's a CIDR issue by doing a "...| rex mode=sed field=src_ip" to place a literal CIDR entry in the field, and then the category comes out. Thank you for your help!
Oh dang, good catch with the trailing comma! So the revised search just tries to limit the initially selected events as much as possible in the base search, with the additional filter of (disposition.disposition="TERMINATED" OR "connections{}.left.facets{}.number"=*), and then limits the stats aggregations to just the fields that are required for your downstream analysis and display. Glad it's running faster!
Hi, is it possible to create a tab on a dashboard that redirects to a new dashboard when the tab is clicked, without having to clone the dashboard? Thanks in advance!
Ahh I see. Note: this response assumes usage of classic Splunk dashboards (XML).

For panel_1 (used to gather the top source IP), you can add a <done> tag and set a token based on the value of Source_Network_Address. Example of Search_1:

index="windows_logs" LogName="Security" Account_Domain=EXCH OR Account_Domain="-" EventCode="4625" OR EventCode="4740" user="john@doe.com" OR user="johndoe"
| where NOT cidrmatch("192.168.0.0/16", Source_Network_Address)
| stats count as count, values(Account_Domain) as Account_Domain, values(EventCode) as EventCode, values(user) as user by Source_Network_Address
| sort 1 -count

This token can then be referenced in panel_2:

index="iis_logs" sourcetype="iis" s_port="443" sc_status=401 cs_method!="HEAD" c_ip=$ip$

In the XML this would look something like this:

. . .
<search>
  <query>
    index="windows_logs" LogName="Security" Account_Domain=EXCH OR Account_Domain="-" EventCode="4625" OR EventCode="4740" user="john@doe.com" OR user="johndoe"
    | where NOT cidrmatch("192.168.0.0/16", Source_Network_Address)
    | stats count as count, values(Account_Domain) as Account_Domain, values(EventCode) as EventCode, values(user) as user by Source_Network_Address
    | sort 1 -count
  </query>
  <earliest>-24h@h</earliest>
  <latest>now</latest>
  <done>
    <set token="ip">$result.Source_Network_Address$</set>
  </done>
</search>
. . .

Notice the <done><set token="ip">$result.Source_Network_Address$</set></done> nested in the <search> tags. This takes the final result's value of the field Source_Network_Address and assigns it to a token named $ip$, which can then be referenced by panel_2.
Ok, I had to make some minor changes for things I had left out of my original JSON to simplify, but YES, this did run faster! What is the reason for it, though? It seems like it avoids the (costly?) "| stats values(*) as * by guid" or passing in the guids from a subsearch. I'll have to stare at that 'max(eval)' statement for a while to get it (see the sketch after the SPL below)! Here are my updated SPL and the changes I made (some renaming, mvdedup for the multivalue field stuff, and removing a trailing comma you had after guid_terminated=1):

index="my_data" resourceId="enum*" (disposition.disposition="TERMINATED" OR "connections{}.left.facets{}.number"=*)
| rename "connections{}.left.facets{}.number" as sourcenumber
| eval sourcenumber=mvdedup(sourcenumber)
| rename disposition.disposition as disposition
| stats max(eval(if(disposition=="TERMINATED", 1, 0))) as guid_terminated, values(sourcenumber) as sourcenumber by guid
| where 'guid_terminated'==1
| stats count as count by sourcenumber
| sort -count
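For anyone else puzzling over max(eval): the inner eval maps each event in the group to 1 or 0, and max() then acts as a per-group "did any event match" flag. A minimal sketch on made-up data (makeresults; the field names are illustrative, not from the thread):

| makeresults count=4
| streamstats count as n
``` Two groups (g=A, g=B); only group A contains a TERMINATED event ```
| eval g=if(n<=2, "A", "B"), disposition=if(n==1, "TERMINATED", "ACTIVE")
``` max(eval(...)) yields 1 for group A (at least one TERMINATED) and 0 for group B ```
| stats max(eval(if(disposition=="TERMINATED", 1, 0))) as any_terminated by g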
Hi @gcusello and @dtburrows3, thanks for getting back to me. Sorry if my question wasn't 100% clear. My current goal is to create a dashboard. In one panel I have a base search of:

index="windows_logs" LogName="Security" Account_Domain=EXCH OR Account_Domain="-" EventCode="4625" OR EventCode="4740" user="john@doe.com" OR user="johndoe"

This is to grab the reason an account was locked out, and it also shows the source IP for that information. I essentially need to grab the IP information from this initial search so I can use it in the following search:

index="iis_logs" sourcetype="iis" s_port="443" sc_status=401 cs_method!="HEAD" c_ip=<source IP information from initial search>

I tried to use a subsearch, but because I am pulling from an index with IIS logs, it's too large a search and times out before it can complete.
Hi @cooldude1812

>>> it looks like all versions of Splunk (up to 9.1.2) will work on Linux 3.x and 4.x kernels.

Yes, right. In fact, 5.x is also good (Linux x86 64-bit kernels 3.x, 4.x and 5.x all work with Splunk 9.1.2).

>>> My concern about the 8.2.x is that Splunk clearly says this is no longer supported since 9/30/23.

Yes, you are right. Splunk Enterprise version 8.2.x is no longer supported as of September 30, 2023. To have Splunk support and to avoid security issues on older versions, customers should upgrade from 8.2.x to 9.x.x. Hope you understand the issue better now, thanks.
Hi, some additional comments on "depending on where in the search the command is invoked": I understand this to mean that the position of the command in the SPL query determines whether it runs in parallel on the indexers, as normal, or on the SH. The key question is whether any transforming commands are positioned in the SPL query before the streaming command. Once the search process has returned to the SH, it never goes back to parallel mode on the indexers. An example:

.... | fields a b | stats values(a) by b

vs.

.... | table a b | stats values(a) by b

In the first example stats runs on the indexers, and in the second on the SH (fields is distributable streaming, while table forces the data to the SH first).

r. Ismo
So I think on your streamstats example, the first usage of streamstats should be replaced with a "| stats" command instead. Something like this:

| inputlookup direct_deposit_changes_v4_1_since_01012020.csv
| eval _time=strptime(_time,"%Y-%m-%d")
| stats count as daily_count by _time
| sort 0 +_time
| streamstats window=28 list(daily_count) as values_added_together, sum(daily_count) as sum_daily_count
| table _time, daily_count, values_added_together, sum_daily_count
Just out of curiosity, how fast does a search like this run?

index="my_data" resourceId="enum*" guid=* (disposition="TERMINATED" OR sourcenumber=*)
| stats max(eval(if(disposition=="TERMINATED", 1, 0))) as guid_terminated, values(sourcenumber) as sourcenumber by guid
| where 'guid_terminated'==1,
| stats count as count by sourcenumber

I'm going to try and simulate your scenario with a dataset I have locally and see what can be done to speed it up.
@dtburrows3 Did I set this up correctly? Note: I posted 5 days to simplify the use case, but I need 28-day sums.

| inputlookup direct_deposit_changes_v4_1_since_01012020.csv
| eval _time = strptime(_time,"%Y-%m-%d")
| sort 0 _time
| streamstats count as daily_count by _time
| streamstats window=28 list(daily_count) as values_added_together, sum(daily_count) as sum_daily_count
| table _time, daily_count, values_added_together, sum_daily_count

I ended up with over 3 million rows when it should have been around 1,460, because it wasn't grouped by _time (%Y-%m-%d).

However, the foreach code produced the results I was looking for:

| inputlookup direct_deposit_changes_v4_1_since_01012020.csv
| eval _time = strptime(_time,"%Y-%m-%d")
| stats count as daily_count by _time
| mvcombine daily_count
| eval cnt=0
| foreach mode=multivalue daily_count
    [| eval summation_json=if(
        mvcount(mvindex(daily_count,cnt,cnt+27))==28,
            mvappend(
                'summation_json',
                json_object(
                    "set", mvindex(daily_count,cnt,cnt+27),
                    "sum", sum(mvindex(daily_count,cnt,cnt+27))
                    )
                ),
            'summation_json'
        ),
        cnt='cnt'+1
    ]
| rex field="summation_json" "sum\"\:(?<sum_daily_count>\d+)\}"
| fields sum_daily_count
| mvexpand sum_daily_count

I confirmed these were correct using Excel. Now I must add _time (%Y-%m-%d) to the results.

Thanks and God bless,
Genesius
I have an index that is receiving JSON data from a HEC, but with 2 different data sets and about 2M events per day:

DS1: {guid:"a1b2",resourceId="enum",sourcenumber:"55512345678"}
DS2: {guid:"a1b2",resourceId="enum",disposition:"TERMINATED"}

Now, counting terminated is easy and fast; this runs in 1s for all calls yesterday:

index="my_data" resourceId="enum*" disposition="TERMINATED"
| stats count

But counting the TOP 10 TERMINATED is not so fast; this takes almost 10m on the same interval (yesterday):

index="my_data" resourceId="enum*"
| stats values(*) as * by guid
| search disposition="TERMINATED"
| stats count by sourcenumber

I found some help before using subsearches and found this | format thing to pass in more than 10k values, but this still takes ~8m to run:

index="my_data" resourceId="enum*" NOT disposition=*
    [ search index="my_data" resourceId="enum*" disposition="TERMINATED"
    | fields guid
    | format ]
| stats count by sourcenumber
| sort -count

The issue is that I need data from DS1 when it matches a guid from DS2, but I've learned that join isn't very good for Splunk (it's not SQL!). Thoughts on the most optimized way to get a Top 10 of data in DS1 matching certain conditions of DS2?

NOTE - I asked a similar question here, but can't figure out how to get the same method to work, since it's not excluding, it's more 'joining' the data: https://community.splunk.com/t5/Splunk-Search/What-s-best-way-to-count-calls-from-main-search-excluding-sub/m-p/658884

As always, thank you!!!
I'm not exactly sure what I need here. I have a multiselect:

<input type="multiselect" token="t_resource">
  <label>Resource</label>
  <choice value="*">All</choice>
  <prefix>IN(</prefix>
  <suffix>)</suffix>
  <delimiter>,</delimiter>
  <fieldForLabel>resource</fieldForLabel>
  <fieldForValue>resource</fieldForValue>
  <search base="base_search">
    <query>| dedup resource | table resource</query>
  </search>
</input>

Table visual search:

| search status_code $t_code$ resource $t_resource$ HourBucket = $t_hour$
| bin _time span=1h
| stats count(status_code) as StatusCodeCount by _time, status_code, resource
| eventstats sum(StatusCodeCount) as TotalCount by _time, resource
| eval PercentageTotalCount = round((StatusCodeCount / TotalCount) * 100, 2)
| eval 200Flag = case(
    status_code=200 AND PercentageTotalCount < 89, "Red",
    status_code=200 AND PercentageTotalCount < 94, "Yellow",
    status_code=200 AND PercentageTotalCount <= 100, "Green",
    1=1, null)
| eval HourBucket = strftime(_time, "%H")
| table _time, HourBucket, resource, status_code, StatusCodeCount, PercentageTotalCount, 200Flag

I also have a table; sample data below:

_time      resource
1/10/2024  Red
1/10/2024  Green

When the user opens the multiselect dropdown and selects "All" (which is the default), the resource column should aggregate all the resources and display the resource as "All". But if the user selects individual resources, such as "Red" and "Green", these should be shown and broken down by resource.
I have a dashboard created in Dashboard Studio and have added a simple dropdown to select "Production", "UAT", "SIT", "Development"; it sets a corresponding value that I use in the $api_env$ token as shown below. This works correctly and results in CA03430-cmsviewapi-prodox as I expect.

I want to use the value in the $api_env$ token to programmatically change the index between wf_wb_cbs and wf_cb_cbs_np. How do I do that? I tried adding eval idx=if() at the front of my query, but when it gets to the existing index= portion it flags an error: "Unknown search command 'index'". Thanks for any assistance!

Here is the query as it now shows in my dashboard:

"ds_search_1_new_new": {
    "type": "ds.search",
    "options": {
        "query": "index=wf_wb_cbs CA03430 sourcetype=\"cf:logmessage\" cf_app_name=\"CA03430-cmsviewapi-$api_env$\"| spath \"msg.customerIdType\" \r\n| eval eventHour = strftime(_time,\"%H\") | where eventHour >= \"07\" and eventHour < \"20\" \r\n| stats count by \"msg.customerIdType\"",
        "queryParameters": {
            "earliest": "$global_time.earliest$",
            "latest": "$global_time.latest$"
        }
    },
    "name": "cmsviewapi_activitybyrole"
},

And here is my input:

"input_w8NFtYlK": {
    "options": {
        "items": [
            { "label": "Production", "value": "prodox" },
            { "label": "UAT", "value": "uathra" },
            { "label": "SIT", "value": "sit" },
            { "label": "Development", "value": "dev" }
        ],
        "token": "api_env",
        "defaultValue": ""
    },
    "title": "Environment",
    "type": "input.dropdown",
    "dataSources": {}
}
Oh wow okay, you are using Dashboard Studio, which is pretty different from legacy XML dashboards. I can try your use case in Dashboard Studio and see if I can get it figured out.
Splunk uses the map/reduce model for searching data. The search head is the map/reduce coordinator. The SH sends the query to each indexer for execution against the data stored on that indexer. Each indexer sends its results back to the SH, where they are combined and presented. Some SPL commands, however, must be executed against the full result set and so must be performed by the SH. When such a command is reached, the indexer stops its portion of the search and returns the current results to the SH. The SH then completes the query. For example, the stats command must be performed by the SH. A distributable streaming command that follows stats will be performed by the SH; otherwise it will be performed by the indexer.
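As a sketch of where that boundary falls in a concrete query (the index, sourcetype, and field names are placeholders, not from the thread):

index=my_index sourcetype=my_sourcetype
``` Distributable streaming: runs on each indexer in parallel ```
| eval is_error=if(status>=500, 1, 0)
``` Transforming: each indexer returns its partial results here, and the SH performs the final merge ```
| stats sum(is_error) as errors, count by host
``` This eval follows stats, so the data is already on the SH and it runs there ```
| eval error_rate=round(errors/count*100, 2)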
Oh interesting! I thought I could reference "$data_source.result$", but I see that you need to tokenize it to make the result available. I also thought there was some kind of automatic binding where an update to one component would be available through a binding to another component that references a token. So I need to use the done handler to create the token, as you show with XML. Unfortunately, my page markup is in JSON and my attempt at translation does not work; the JSON will not save:

"ds_lo0yYuYR": {
    "type": "ds.chain",
    "options": {
        "query": "| where type=\"SHIP_FROM_STORE\"\n| stats count as sfsCount",
        "done": {
            "set": {
                "token": {
                    "Numerator": "$result.count$"
                }
            }
        },
        "extend": "ds_z3a1i32Y",
        "enableSmartSources": true
    },
    "name": "SFS_Count"
},
I agree with @gcusello here. I did notice the use of | top limit=1 Source_Network_Address in the original subsearch, which I think implies that you are trying to scope the search down to the single IP address that shows up most often in the windows_logs index and is not in the 192.168.0.0/16 range. I think that can be done with a couple of additional lines, like this:

(index="iis_logs" sourcetype="iis" s_port="443" sc_status=401 cs_method!="HEAD") OR (index="windows_logs" LogName="Security" Account_Domain=EXCH OR Account_Domain="-" EventCode="4625" OR EventCode="4740" user="john@doe.com" OR user="johndoe")
| eval c_ip=coalesce(Source_Network_Address,c_ip)
| stats dc(index) AS index_count, count(eval('index'=="windows_logs")) as win_log_count, values(*) AS * BY c_ip
| where index_count=2 AND NOT cidrmatch("192.168.0.0/16", c_ip)
| sort 1 -win_log_count