About skawasaki_splun

skawasaki_splun · ‎06-21-2018

You have an extra " in front of your REGEX in transforms.conf.

skawasaki_splun · ‎05-10-2018

So does the namespace parameter in GET or POST just not work, or does it not do what you expect it to do?

skawasaki_splun · ‎05-08-2018

Just like with stats, be careful when a, b, or z is null, since that row then won't show up at all. If you need to account for null values, use either fillnull or coalesce() before stats. Since you can't do that with tstats (tstats has to be the first command), you'll have to modify your data model and add an evaluated field, something like a=coalesce(a, "null").
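To make the null behavior concrete, here is a minimal Python sketch (not Splunk code) that mimics the grouping described above. The function name sum_bytes_by, the fill parameter, and the sample rows are all made up for illustration; it only approximates how a by clause treats missing keys.

```python
from collections import defaultdict

def sum_bytes_by(rows, keys, fill=None):
    """Sum 'bytes' per unique key combination, roughly like `stats sum(bytes) by ...`.

    Rows where any key is None are skipped, unless `fill` is given, in which
    case None is replaced first (similar in spirit to coalesce(a, "null")).
    """
    totals = defaultdict(int)
    for row in rows:
        values = [row.get(k) for k in keys]
        if fill is not None:
            values = [fill if v is None else v for v in values]
        if any(v is None for v in values):
            continue  # the row silently disappears from the results
        totals[tuple(values)] += row["bytes"]
    return dict(totals)

# Hypothetical sample data: the third row has no value for "a".
rows = [
    {"a": "x", "b": "y", "z": "1", "bytes": 10},
    {"a": "x", "b": "y", "z": "1", "bytes": 5},
    {"a": None, "b": "y", "z": "1", "bytes": 7},
]
without_fill = sum_bytes_by(rows, ["a", "b", "z"])
with_fill = sum_bytes_by(rows, ["a", "b", "z"], fill="null")
```

Without the fill, the 7 bytes from the third row are lost from the totals; with the fill, they show up under a "null" group.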

skawasaki_splun · ‎05-08-2018

I'm pretty confused about what you're trying to do. Is it | tstats sum(bytes) from datamodel=foo by a b z? This will give you the sum of bytes for every unique combination of a, b, and z.

skawasaki_splun · ‎05-08-2018

This won't give exactly the same thing, since the first one will give the field values(y) while the latter gives just y. Aside from that, I think those two can give the same results if, for every unique z, there is one unique y. Just to make sure: you understand that values() gives all distinct values, right? So it will remove duplicate values. It also sorts the distinct values alphabetically.
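As a rough illustration of what "distinct and sorted" means here, this is a small Python sketch, not Splunk code. The helper name values and the sample list are made up, and Python's default string ordering is only an approximation of how Splunk orders the results.

```python
def values(items):
    """Return the distinct values in sorted order, ignoring missing (None) values."""
    return sorted({v for v in items if v is not None})

# Hypothetical field values for y within a single group of z.
ys = ["pear", "apple", "pear", "banana", "apple"]
distinct = values(ys)  # duplicates removed, then sorted
```

Five input values collapse to three, so a count of the original events cannot be recovered from the output.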

skawasaki_splun · ‎05-03-2018

It turns out you have to use epoch time. I found this out when I actually opened the fill_summary_index.py script and saw:

    Usage: splunk cmd python fill_summary_index.py [OPTIONS]

    ***Note: <boolean> options accept the values "1", "t", "true", or "yes" for true and "0", "f", "false", or "no" for false

    -et <string>   Earliest time (required). Either a UTC time (integer since unix epoch) or a Splunk search relative time string [1].
    -lt <string>   Latest time (required). Either a UTC time (integer since unix epoch) or a Splunk search relative time string [1].
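If you need to produce those epoch integers, a minimal Python sketch could look like the following. The helper name to_epoch and the example dates are assumptions for illustration; only the -et and -lt options come from the usage text above.

```python
from datetime import datetime, timezone

def to_epoch(year, month, day, hour=0, minute=0, second=0):
    """Convert a UTC calendar time to an integer number of seconds since the unix epoch."""
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(moment.timestamp())

# Hypothetical backfill window: one full UTC day.
earliest = to_epoch(2018, 5, 1)
latest = to_epoch(2018, 5, 2)
print(f"-et {earliest} -lt {latest}")
```

Pinning the timezone to UTC avoids the value shifting with the local timezone of whatever machine runs the conversion.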

skawasaki_splun · ‎11-30-2017

If you're gonna group everything under 5% as "Other", then do you really want to limit to the top 100? Your percentages won't add up to 100%.
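Here is a small Python sketch of the arithmetic behind that point (not Splunk code). The function name percent_with_other and the sample counts are made up; the idea is only that folding the small values into "Other" keeps the total at 100%, while cutting the list off at a top N does not.

```python
def percent_with_other(counts, threshold=5.0):
    """Convert counts to percentages and fold everything below `threshold` into "Other"."""
    total = sum(counts.values())
    result = {}
    other = 0.0
    for name, count in counts.items():
        pct = 100.0 * count / total
        if pct < threshold:
            other += pct
        else:
            result[name] = pct
    if other:
        result["Other"] = other
    return result

# Hypothetical counts per value.
counts = {"a": 50, "b": 30, "c": 12, "d": 4, "e": 3, "f": 1}
grouped = percent_with_other(counts)

# Keeping only the top 3 instead of grouping the rest: the percentages fall short of 100.
total = sum(counts.values())
top_only = sum(100.0 * c / total for c in sorted(counts.values(), reverse=True)[:3])
```

Here grouped sums to 100, while top_only is 92 because the three smallest values were dropped rather than folded into "Other".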

skawasaki_splun · ‎11-17-2017

I've done all of the steps above and my generic S3 input is constantly stuck on:

    2017-11-18 05:56:44,694 level=INFO pid=71734 tid=Thread-4 logger=splunk_ta_aws.modinputs.generic_s3.aws_s3_data_loader pos=aws_s3_data_loader.py:_do_index_data:95 | datainput="irs_990" bucket_name="splunk4good-irs-form-990" | message="The last data ingestion iteration hasn't been completed yet."

skawasaki_splun · ‎03-16-2017

Actually your dataset works but it's probably not what you wanted:

skawasaki_splun · ‎03-14-2017

Great! I'll be updating the app a lot to do minor fixes and updates so be sure to get the latest version.

skawasaki_splun · ‎03-14-2017

All of your rows are the same. Every row should be unique. That's why my example uses ... | stats count by foo bar.

skawasaki_splun · ‎03-14-2017

Can you upload your table search results in CSV format somewhere on the web? Do you have all of the required field columns?

skawasaki_splun · ‎03-14-2017

I just updated the app recently since it was missing the doc page. Are you saying the app "crashes" with a 404 Page Not Found (doc not found) or the actual visualization crashes?

skawasaki_splun · ‎12-15-2016

The short answer: You have caching issues.

To fix server caching: Restart Splunk (this will force Splunk to look for new JS files or check for modifications in your app's appserver/static/).

To fix client caching: Either reload while clearing the cache (ctrl-shift-r or command-shift-r) or delete all caches in your browser (but this will affect all other websites).

If you don't want to restart Splunk for every JS modification (I wouldn't), then add this to web.conf:

    [settings]
    minify_js = False
    minify_css = False
    js_no_cache = True
    cacheEntriesLimit = 0
    cacheBytesLimit = 0
    enableWebDebug = True

This will disable server caching in Splunk. NOTE that you probably don't want to do this on your production instances since caching helps performance.

Source: https://docs.splunk.com/Documentation/Splunk/6.5.1/AdvancedDev/CustomVizTutorial#Development_mode_settings

For client caching, you can use the keyboard shortcut, but I like to use the option to disable the cache while the dev tools are open (Chrome Developer Tools has this option; not sure about other browsers).

skawasaki_splun · ‎11-15-2016

Isn't it just ... | eval bytes=len(r1)+len(r2)? https://docs.splunk.com/Documentation/Splunk/6.5.0/SearchReference/CommonEvalFunctions

skawasaki_splun · ‎03-30-2016

@mikev If it's helpful then just upvote the answer :-).

skawasaki_splun · ‎03-30-2016

See my .conf 2014 talk:
https://conf.splunk.com/session/2014/conf2014_SatoshiKawasaki_Splunk_WhatsNew.mp4
https://conf.splunk.com/session/2014/conf2014_SatoshiKawasaki_Splunk_WhatsNew.pdf

Full list here: https://conf.splunk.com/speakers/2014.html

skawasaki_splun · ‎03-30-2016

At a minimum it just needs 2 columns: lat and lon. Just make sure your field names match the values of lat_field and lon_field:

    <div id="globe_search"
         class="splunk-manager"
         data-require="splunkjs/mvc/searchmanager"
         data-options='{
             "search": "| inputlookup sample_geo.csv",
             "preview": true,
             "earliest_time": "0",
             "latest_time": "now"
         }'>
    </div>

    <div id="globe"
         class="splunk-view"
         data-require="app/custom_vizs/components/globe/globe"
         data-options='{
             "managerid": "globe_search",
             "world_image_path": "app/custom_vizs/components/globe/world_nature.jpg",
             "lat_field": "lat",
             "lon_field": "lon",
             "group_by_field": {
                 "type": "token_safe",
                 "value": "$$grouping$$"
             },
             "spin_speed": 1
         }'>
    </div>

skawasaki_splun · ‎03-30-2016

Have you tried passing tokens to another page using something like my_dashboard?form.foo=value? Here form.foo is an input that sets the token foo. Then you can use $foo$ in the other Gantt chart.
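If the token values contain spaces or other special characters, the link needs URL encoding. The following Python sketch shows one way to build such a link; the helper name dashboard_link and the sample token value are made up, and only the my_dashboard?form.foo=value pattern comes from the post above.

```python
from urllib.parse import urlencode

def dashboard_link(dashboard, tokens):
    """Build a relative dashboard URL that presets form inputs via form.<token>=<value>."""
    query = urlencode({f"form.{name}": value for name, value in tokens.items()})
    return f"{dashboard}?{query}"

# Hypothetical token value containing a space.
link = dashboard_link("my_dashboard", {"foo": "web server"})
print(link)  # my_dashboard?form.foo=web+server
```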

skawasaki_splun · ‎03-30-2016

Nice analysis! Quick tip: It's better to use | makeresults instead of | stats count :-).

skawasaki_splun · ‎03-30-2016

Here is the code that calculates the IQR (not written by me):

boxplot.js

    // Returns a function to compute the interquartile range.
    function iqr(k) {
      return function(d, i) {
        var q1 = d.quartiles[0],
            q3 = d.quartiles[2],
            iqr = (q3 - q1) * k,
            i = -1,
            j = d.length;
        while (d[++i] < q1 - iqr);
        while (d[--j] > q3 + iqr);
        return [i, j];
      };
    }

d3.box.js

    box.quartiles = function(x) {
      if (!arguments.length) return quartiles;
      quartiles = x;
      return box;
    };

and

    function boxQuartiles(d) {
      return [
        d3.quantile(d, .25),
        d3.quantile(d, .5),
        d3.quantile(d, .75)
      ];
    }

So it looks like the lower and upper whiskers are measured from the first and third quartiles, respectively: they stop at the most extreme data points that are still within k times the IQR of those quartiles. https://en.wikipedia.org/wiki/Quartile

These files are found under $SPLUNK_HOME/etc/apps/custom_vizs/appserver/static/components/boxplot
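To check the whisker logic by hand, here is a rough Python port of the iqr(k) function quoted above. It is a sketch under assumptions: the names quantile and whisker_indices are mine, the input must already be sorted, k=1.5 is just a commonly used multiplier, and the quantile helper uses linear interpolation, which is how I understand d3.quantile to behave.

```python
def quantile(sorted_values, p):
    """Quantile of pre-sorted data using linear interpolation between neighbors."""
    h = (len(sorted_values) - 1) * p
    lo = int(h)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (h - lo)

def whisker_indices(sorted_values, k=1.5):
    """Indices of the lowest and highest points within k * IQR of the quartiles."""
    q1 = quantile(sorted_values, 0.25)
    q3 = quantile(sorted_values, 0.75)
    spread = (q3 - q1) * k
    i = 0
    while sorted_values[i] < q1 - spread:
        i += 1  # skip low outliers
    j = len(sorted_values) - 1
    while sorted_values[j] > q3 + spread:
        j -= 1  # skip high outliers
    return i, j

# Hypothetical sorted sample with one low and one high outlier.
data = [1, 10, 11, 12, 13, 14, 15, 16, 40]
low, high = whisker_indices(data)
```

For this sample the quartiles are 11 and 15, so the fences sit at 5 and 21; the whiskers land on 10 and 16, and the values 1 and 40 are left outside as outliers.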

skawasaki_splun · ‎03-30-2016

    "search": {
        "type": "token_safe",
        "value": "index=foo sourcetype=$$sourcetype$$ | stats count"
    }

The answer is in the Tutorial page of the Custom Visualization app:

Use the "token_safe" and the double-dollar sign syntax ($$token$$) to use tokens inside "data-options". This also applies to the search query. This is because if a regular $token$ is used then the <html> panel will clear everything in it if that $token$'s value changes.

    "earliest_time": {
        "type": "token_safe",
        "value": "$$earliest$$"
    },

skawasaki_splun · ‎03-17-2016

Thanks. I've added INDEXED_EXTRACTIONS.

skawasaki_splun · ‎10-20-2015

tstats is faster than stats since tstats only looks at the indexed metadata (the .tsidx files in the buckets on the indexers), whereas stats works off the data (in this case the raw events) produced before that command. Since tstats can only look at the indexed metadata, it can only search fields that are in the metadata. By default, this only includes index-time fields such as sourcetype, host, source, _time, etc.

You can do this:

    | tstats count by index sourcetype source

But you can't do this:

    | tstats count where status>200 by username

since status and username are not index-time fields (they are search-time).

tstats can run on index-time fields that come from any of the following:

- An accelerated data model
- A namespace created by the tscollect search command
- Index-time fields created manually via fields.conf, props.conf, and transforms.conf
- INDEXED_EXTRACTIONS in props.conf for structured data like CSV

Generally, I recommend using accelerated data models.

References:
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/tstats
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/tscollect
http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Acceleratedatamodels
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Indextimeversussearchtime
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configureindex-timefieldextraction
http://docs.splunk.com/Splexicon:Tsidxfile

============================ EDIT #2 =============================

For even more in-depth material, see:

2016 talk by David Veuve: http://conf.splunk.com/sessions/2016-sessions.html#search=tstats&
2017 talk, again by David Veuve: http://conf.splunk.com/sessions/2017-sessions.html#search=tstats&
2017 talk by me: http://conf.splunk.com/sessions/2017-sessions.html#search=speed%20up&

============================ EDIT #1 =============================

Since this seems to be a popular answer, I'll go into even more detail.

For our example, let's use the out-of-the-box data model called "Splunk's Internal Server Logs - SAMPLE" at http://127.0.0.1:8000/en-US/app/search/data_model_editor?model=%2FservicesNS%2Fnobody%2Fsearch%2Fdatamodel%2Fmodel%2Finternal_server

First, run a simple tstats on the DM (it doesn't have to be accelerated) to make sure it's working and you get some results:

    | tstats count from datamodel=internal_server

Note that I use the DM filename internal_server (i.e. the Object ID), not the "pretty" name. If the DM isn't accelerated, then tstats is translated into a normal search, so the above command will run:

    index=_internal source=*scheduler.log* OR source=*metrics.log* OR source=*splunkd.log* OR source=*license_usage.log* OR source=*splunkd_access.log* | stats count

The translation is defined by the base search of the DM (under "Constraints"). You can verify that you get the exact same count from both the tstats and the normal search. Make sure you use the same fixed time range (i.e. from X to Y). Don't use "Last X minutes", since the time range will be different each time you run the search ad hoc.

If the data model is accelerated, then new *.tsidx index files are created on the indexers at $SPLUNK_DB$/<index_name>/datamodel_summary/<bucket_id>_<indexer_guid>/<search_head_guid>/DM_<app>_<data_model_name>. If you want to "see" what's inside these tsidx files, you can do something like:

    ./splunk cmd walklex /opt/splunk/var/lib/splunk/_internaldb/datamodel_summary/53_FC88CEC5-A07C-40AB-AC9A-6098C3336C42/FC88CEC5-A07C-40AB-AC9A-6098C3336C42/DM_search_internal_server/1457686104-1457046881-3297202533729992781.tsidx "*"

Anyway, tstats basically accesses and searches these special, DM-created tsidx files. You tell tstats which DM to use with the from datamodel=internal_server clause.

Now accelerate the internal_server DM if you haven't already. Pick a big enough window, like 7 days, and search the last 24 hours for testing. Once the DM has finished accelerating to 100%, try running this:

    | tstats count where index=_internal by sourcetype source

And note how much faster this is compared to:

    index=_internal | stats count by sourcetype source

If you don't see a big difference, then either try increasing the time range/acceleration window or create your own DM on a much chattier data source/index.

Remember that everything has a cost. In this case, you're trading disk space for faster searches. That's why the DM status in the UI tells you how much disk space the DM acceleration takes up. The size is how big the .tsidx files are across all indexers. But this is still a great trade. Disks are cheap; CPUs are not.

As mentioned before, another drawback of tstats is that you can only access either the default index-time fields or custom index-time fields created by DM acceleration. And if you forgot to add a field to the DM, then you must stop/delete the DM acceleration, modify the DM, then re-accelerate. On a big DM (like one from a large Splunk Enterprise Security environment), this could take hours or days depending on how much data needs to be accelerated and how busy the servers are.

Note that tstats is like stats but more "SQL-like". It can take a where and a by clause too. For example:

    | tstats count from datamodel=internal_server where source=*scheduler.log

which happens to be the same as

    | tstats count from datamodel=internal_server where nodename=server.scheduler

because this DM has a child node under the Root Event. The name, once again, comes from the "Object ID", not the pretty name label (i.e. use summaryindexing, not "Summary Indexing Searches").

Also note that every field you want to reference in the DM must be prefixed with the node name's Object ID, server. The exceptions are the 4 default fields in the DM: _time, host, source, sourcetype. So this won't work:

    | tstats count from datamodel=internal_server by name current_size_kb

name and current_size_kb aren't among the 4 default DM fields, so they must be written as server.name and server.current_size_kb. This is also the main reason I choose very short (usually one-letter) node names, since it can become very annoying to write server. all the time.

One last thing worth mentioning is tstats performance. We all know tstats on an accelerated DM is fast since it's mostly reading from disk and minimizing computation, but tstats isn't very good at returning many, many results. So although you can do this:

    | tstats count from datamodel=foo by a.apple a.pear a.orange _time span=1s

you really shouldn't. tstats can't return raw events, and trying to "trick" it into returning raw events by using span=1s goes against its design principle. You can be clever and get raw events if you use tstats inside a subsearch like this:

    index=data [| tstats count from datamodel=foo where a.name="hobbes" by a.id a.user | rename a.* as * | fields - count]

So basically tstats is really good at aggregating values and reducing rows. tstats will perform as badly as a normal search (or worse) if your search isn't trying to reduce. For example, if you have 10 million rows in a DM and your tstats is grouping everything by _time span=1s and returning 8 million rows, then that's going to be a slow search even if your tstats is searching an accelerated DM. But if your tstats is doing something like avg(all.foo) by all.group and returning only 1000 rows (but still searching 10 million events), then it'll be blazing fast since it's reducing.

Also note that if you do by _time in tstats, then tstats will automatically bucket _time based on the search time range, similar to timechart (i.e. if you search the last 24 hours then the bucket/group size will be 30 minutes). You also can't go any more granular than 1 second, so anything finer than that (such as microseconds) will be grouped together.

Lastly, tstats has a prestats=t option, but that's another lesson for another day (prestats is like the si- commands in Splunk). If you really want to know more, then check out my 2017 .conf talk "Speed up your search!".

Hopefully that is enough to get you started. If you get stuck, troubleshoot your tstats by removing extra clauses (like the by and where clauses) until you get results again. Eventually you'll end up with the most basic tstats command that will give you results:

    | tstats count from datamodel=foo

Then work backward again, adding clauses back one at a time, until you spot the one where the results disappear.

Common pitfalls include:

- Typos (check your datamodel name)
- Using the pretty name instead of the Object ID
- Not including the prefix on non-default DM fields (i.e. you need server.cpu_seconds or all.foo, not just cpu_seconds or foo, but you can just use sourcetype or source)
- Your by clause includes null values (a common pitfall in stats too); one way to remedy that is to create an evaluated field in the DM and do something like foo=coalesce(foo, "NULL")

Good luck and may your searches be fast!
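Since the missing node prefix is such a common pitfall, here is a small Python sketch of the prefixing rule described in this answer. It is only an illustration: qualify_fields and DEFAULT_DM_FIELDS are hypothetical names of mine, not part of Splunk, and the sketch just builds the search string rather than running anything.

```python
# The 4 default data model fields that do not take a node prefix.
DEFAULT_DM_FIELDS = {"_time", "host", "source", "sourcetype"}

def qualify_fields(fields, node):
    """Prefix every non-default field with the node's Object ID, e.g. name -> server.name."""
    qualified = []
    for field in fields:
        if field in DEFAULT_DM_FIELDS or field.startswith(node + "."):
            qualified.append(field)  # already usable as-is
        else:
            qualified.append(f"{node}.{field}")
    return qualified

by_clause = " ".join(qualify_fields(["name", "current_size_kb", "source"], "server"))
search = f"| tstats count from datamodel=internal_server by {by_clause}"
print(search)
```

This prints a search grouped by server.name, server.current_size_kb, and source, which is the corrected form of the by clause that doesn't work above.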

skawasaki_splun · ‎08-25-2015

That's not how you use SimpleXML for custom visualization at all. Please read the Tutorial page of the Custom Visualization app. The SimpleXML should look like:

    <panel>
      <html>
        <h2>Top 100 Most Common Terms in an Ad-hoc Search Query</h2>
        <div id="tagcloud_search"
             class="splunk-manager"
             data-require="splunkjs/mvc/searchmanager"
             data-options='{
                 "preview": true,
                 "search": "index=_audit NOT REST: search=* | regex search_id=\"'\\d+\\.\\d+'\" | rex field=search max_match=0 \"(?<terms>\\w+)\" | top limit=100 terms | eval r=random() | sort r",
                 "earliest_time": {
                     "type": "token_safe",
                     "value": "$$earliest$$"
                 },
                 "latest_time": {
                     "type": "token_safe",
                     "value": "$$latest$$"
                 }
             }'>
        </div>
        <div id="tagcloud"
             class="splunk-view"
             data-require="app/custom_vizs/components/tagcloud/tagcloud"
             data-options='{
                 "minFontSize": 14,
                 "maxFontSize": 55,
                 "managerid": "tagcloud_search",
                 "valueField": "count",
                 "labelField": "terms"
             }'>
        </div>
      </html>
    </panel>

Posts	53
Solutions	12
Karma Given	210
Karma Received	134
Member Since	‎08-09-2013

Online Status	Offline
Date Last Visited	‎06-05-2020 02:04 AM

Re: Multivalue field extraction

Re: REST API with namespace?

Re: Tstats - What factors come into play when deci...

Re: Tstats - What factors come into play when deci...

Re: Tstats - What factors come into play when deci...

Re: Entering earliest and latest time for backfill...

Re: Percentage - sum all values below 5% into a va...

Re: How do I re-index an indexed S3 bucket?

Re: Halo - Custom Visualization: When trying to us...

Re: Halo - Custom Visualization: When trying to us...

Re: Halo - Custom Visualization: When trying to us...

Re: Halo - Custom Visualization: When trying to us...

Re: Halo - Custom Visualization: When trying to us...

Re: Why are Javascript files not loading after upd...

Re: Can you use transpose to eval fields for colum...

Re: Box Plot isn't calculating outliers correctly

Re: Integrating D3js Diagrams into Splunk

Re: WebGL Globe... how to make it work?

Re: Custom Visualizations: How to drill down from ...

Re: Box Plot isn't calculating outliers correctly

Re: Box Plot isn't calculating outliers correctly

Re: Custom Visualizations app: How to pass paramet...

Re: What is tstats and why is so much faster than ...

Re: What is tstats and why is so much faster than ...

Re: Unable to plot custom Tag Cloud chart in splun...