Activity Feed
- Got Karma for Re: What Capabilities do I need to enable so a user can change sharing permission on their searches?. 01-20-2023 08:00 AM
- Got Karma for Re: When using the rangemap command, how do I determine the order in which ranges display in tables and charts?. 01-19-2023 04:58 PM
- Got Karma for Re: Datamodel and Pivot export with App. 04-27-2022 05:08 AM
- Got Karma for Re: What Capabilities do I need to enable so a user can change sharing permission on their searches?. 01-11-2022 03:07 AM
- Got Karma for Re: What Capabilities do I need to enable so a user can change sharing permission on their searches?. 11-25-2021 04:56 AM
- Got Karma for Re: What Capabilities do I need to enable so a user can change sharing permission on their searches?. 09-27-2021 02:00 PM
- Got Karma for Re: deprecated 'stats' command syntax notification on a sample Splunk search. 08-18-2020 05:00 PM
- Got Karma for Re: deprecated 'stats' command syntax notification on a sample Splunk search. 08-18-2020 04:55 PM
- Posted Re: deprecated 'stats' command syntax notification on a sample Splunk search on Splunk Search. 08-18-2020 04:06 PM
- Tagged Re: deprecated 'stats' command syntax notification on a sample Splunk search on Splunk Search. 08-18-2020 04:06 PM
- Karma Re: Why won't reports with high data cardinality get much out of report acceleration? for DalJeanis. 06-05-2020 12:49 AM
- Karma Re: Why won't reports with high data cardinality get much out of report acceleration? for kiril123. 06-05-2020 12:49 AM
- Karma Re: How to format a metrics_csv file for a metrics index? for bsonposh. 06-05-2020 12:49 AM
- Karma Re: Is there documentation on how to migrate a search head deployer and a cluster master to new servers? for Steve_G_. 06-05-2020 12:49 AM
- Karma Re: What causes unioned data sets to be truncated? for jsinnott_. 06-05-2020 12:49 AM
- Karma Re: How can I customize a dashboard and reports with colors? for ChrisG. 06-05-2020 12:49 AM
- Karma Re: Is the Monitoring Console (MC) tool really worth the effort of implementing? for jlaw. 06-05-2020 12:49 AM
- Karma Re: Do you have normal documentation? for richgalloway. 06-05-2020 12:49 AM
- Got Karma for Re: Is there a User/Developer "Rules for Good Splunking" Style Guide?. 06-05-2020 12:49 AM
- Got Karma for Re: Is there a User/Developer "Rules for Good Splunking" Style Guide?. 06-05-2020 12:49 AM
08-18-2020 04:06 PM (2 Karma)
Turns out this particular example is a bug. Splunk versions 8.0.0 through 8.0.6 generate this "info message" when you use sparkline without a span argument (such as sparkline(count) or sparkline(count(cpu))). This isn't supposed to happen. The bug is fixed in upcoming versions of Splunk. As for supporting deprecated syntax--sometimes we do that for backwards compatibility, when people are upgrading from older versions of the product.
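To illustrate (the index and split-by field here are arbitrary), a search like this triggers the spurious message on an affected version, even though the syntax is fine:

index=_internal | stats sparkline(count) by sourcetype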
---
01-04-2018 02:17 PM (9 Karma)
Hi Simpkins -
The answer to your question depends on the data you are working with and what you are trying to do. Data models are in fact collections of hierarchically arranged datasets--you might want to create a data model if you are working with a large dataset that can be divided into lots of very specific subsets. A data model allows you to see the overall data model dataset hierarchy and then work with specific elements (datasets) within that hierarchy. You can run searches on specific data model datasets. You can also use the Pivot tool to build visualizations based on specific data model datasets.
Also, when you accelerate a data model, you can potentially accelerate all of the datasets within that data model (see the Knowledge Manager Manual docs for information about data model acceleration restrictions). This means that searches and dashboards that use datasets in that data model can return results more quickly than they would without acceleration.
To create a data model, you need to have a pretty solid understanding of your data and have a clear idea of how you'd like to subdivide it into smaller datasets. The Data Model Builder is not really designed for data exploration. It requires that you have a decent understanding of the search processing language (SPL).
On the other hand, you can also create table datasets with the Datasets Add-on. You might do this if you just want to work with a simple dataset that represents the results of a simple search. You can use the Table Editor to refine and focus the boundaries of that dataset without interacting with SPL. You can also use it to better understand the contents of a particular dataset. This might be a better solution if any of the following are true:
- You are working with a relatively small dataset
- You do not know your data very well
- You do not know SPL well
- You do not want to spend time designing complicated searches of a dataset
- You do not need to design a collection of hierarchically related datasets
You can accelerate table datasets in much the same way that you accelerate data models. You can also use Pivot to design visualizations based on specific table datasets.
You can also use the from command to create table datasets that "extend" other datasets. This creates a hierarchical relationship--a change to a dataset can also affect any datasets that extend it--but currently the Splunk platform's ability to show you table dataset dependencies is pretty limited.
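As an aside, running a search against a specific dataset looks like this (a sketch--the model, dataset, and field names are invented):

| from datamodel:"Retail_Sales.Online_Purchases"
| stats count by customer_id

The from command pulls events from just that dataset, so anything you pipe it to operates within the dataset's boundaries rather than on raw indexed data.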
I think this sums it up. Hopefully this helped you more than it confused you. Let me know if you have more questions.
---
11-29-2017 03:10 PM (2 Karma)
This was a great example of a metrics input from a CSV file! We've added it to the metrics documentation:
http://docs.splunk.com/Documentation/Splunk/7.0.0/Metrics/GetMetricsInOther#Example_of_a_CSV_file_metrics_input
We have also updated this topic to clarify the CSV file format requirements for this kind of metrics input.
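For readers who land here before following the link, the general shape of such a CSV is as follows (a sketch--the metric and dimension names are made up; the linked topic has the authoritative column requirements):

metric_timestamp,metric_name,_value,host,region
1511900000,cpu.usage.user,42.1,www1,us-west-1
1511900060,cpu.usage.user,39.8,www1,us-west-1

Each row is one measurement: a timestamp, a metric name, a numeric _value, and any number of dimension columns.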
---
11-09-2017 06:37 PM (7 Karma)
Hi jsinnott_
At this time, union behaves either like multisearch (for distributable streaming subsearches) or like append (for subsearches that are not distributable streaming). This is not adequately explained in the doc topic for the union command at present, and I'll see what I can do to fix that.
(For more information about the types of streaming search commands, see Command types in the Splunk Enterprise Search Manual.)
Let's take your first search:
| union maxout=10000000
[ search index=union_1 ]
[ search index=union_2 ]
[ search index=union_3 ]
| stats count by index
In this case, all of the searches are distributable streaming, so they are all unioned with multisearch. This is why you see 60k in each.
Your second search uses the head command for one of the subsearches. Because head is centralized streaming rather than distributable streaming, it causes the subsearches that follow it to use the append command. "Under the hood," the search is converted to:
| search index=union_1
| head 60000
| append
[ search index=union_2 ]
| append
[ search index=union_3 ]
| stats count by index
When union is used in conjunction with a search that is not distributable streaming, the default for the maxout argument applies: 50k events. This is mentioned in the doc topic for the union command.
Your third search also ends up being an append search, because the second subsearch is not distributable streaming due to the head command. Here's how it looks "under the hood":
| search index=union_1
| append
[ search index=union_2 | head 60000 ]
| append
[ search index=union_3 ]
| stats count by index
Again, the maxout argument default applies here, limiting the results of the appended searches to 50k events.
In your last example, the first two subsearches are distributable streaming, so they are unioned with multisearch. But the final subsearch has the head command, so it gets unioned with append at the end.
| multisearch
[ search index=union_1 ]
[ search index=union_2 ]
| append
[ search index=union_3 | head 60000 ]
| stats count by index
The maxout argument applies to that last subsearch because it is not distributable streaming due to the head command. So it returns 50k events rather than 60k events.
Note that multisearch has to be the first command. If your union search unpacks in a way that puts append first, you won't get multisearch to follow it.
Kindest regards,
Matt (Splunk Docs Team)
---
At the end of the day, the only way to know for sure is to accelerate the report and see if you get a performance improvement that makes the cost of doing the acceleration* worth it. This guideline in the docs is mainly there to explain why some report accelerations give you larger performance gains than others.
*Cost = the cost of having a scheduled search run in the background to build the summary, and the cost in terms of disk storage space for the summary itself.
---
Hi kiril123--
To answer this question, it's probably helpful to revisit what report acceleration does. Say you have an unaccelerated search that runs over the last month, and when it does, it searches through an average of 500k events to return maybe 300 events, which means it has low cardinality--only a few events from the original set are returned. This search takes a while to complete because it has to look through all 500k events to find those 300 events.
So you accelerate that search. This starts a search running in the background. It runs on a schedule using the same criteria as the original search, and builds a summary of just the results the original search was looking for. This summary will be far smaller than 500k events--it may contain just a few thousand events at any given time. The next time you run your accelerated search, it runs against that summary. Because the summary is far smaller than the original search set, the search should complete much faster than it did before.
Ok, now imagine you have a second unaccelerated search that, like the first search, runs over the past month, and when you run it, it also searches through an average of 500k events. But this search has far higher cardinality than the first search: each time you run it, it matches at least 50k events, maybe more. This second search is slow to complete as well, because it also has to look through 500k events each time it runs.
However, when you accelerate the second search, it ends up with a much larger summary than the first search. This is because the second search has high cardinality. Each time its background search runs, it adds at least 50k events to the summary. If the summary has a range of 3 months, that's an average of 150k events at any given time. When you run the accelerated search, it runs against that summary. It will complete faster than it did before, but not that much faster, because it's still running over a lot of events. Its report acceleration summary just isn't much of a summary.
So the lesson here is--when you must search across large volumes of data, try to design low cardinality searches--searches that return significantly fewer events than the total amount of events searched. Then, when you accelerate them, you'll get searches that are actually accelerated.
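To make that concrete (a hypothetical example--the index, sourcetype, and field are invented), a report like this reads many raw events but boils them down to a handful of summary rows, so its acceleration summary stays small and the accelerated search stays fast:

index=web sourcetype=access_combined | stats count by status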
I'll see if I can rewrite the documentation to clarify this point.
---
08-23-2017 01:14 PM (2 Karma)
There's some overlap between the sort of information you're looking for here and the contents of the new Inherit a Splunk Enterprise Deployment manual. That manual is specifically designed to help admins who find themselves in command of a Splunk deployment that has been up and running for some time.
You might find the final topic in that manual of particular interest, as it includes some of the items in your list and covers other subjects that are similar to those items. It's called Investigate knowledge object problems. It includes:
- Knowledge object naming conflicts
- Object permissions
- Object interdependency considerations
- Finding and reassigning orphaned objects
- Scheduled searches and search concurrency
- Report and data model acceleration considerations
---
05-16-2017 10:33 AM (2 Karma)
In 6.5 there was a terminology change. The term "data model object" was replaced by "data model dataset." Nothing about the functionality actually changed. Splunk is just clarifying that data models are in fact made up of hierarchies of datasets.
Nothing has changed for Pivot with regard to how it works with data models except for this terminology change. Previous to 6.5, you had to select a data model object and open it in Pivot. Now you select a data model dataset and open it in Pivot.
The concept of datasets was formally introduced in 6.5. You can use the new datasets functionality to work with three new dataset types. Two of these types already existed as knowledge objects: lookups and data model datasets. The third type, table datasets, is new to Splunk.
For more information about dataset types, see http://docs.splunk.com/Documentation/Splunk/6.5.0/Knowledge/Aboutdatasets
One difference for Pivot now is that any dataset type can be opened in Pivot. You can do this from the Datasets listing page. For more information see http://docs.splunk.com/Documentation/Splunk/6.5.0/Knowledge/Workwithdatasets#Open_a_dataset_in_Pivot
---
If you want definitive proof that a schedule window is being applied to a search, inspect scheduler.log and see if a window_time field is associated with the search.
The true test, of course, is whether the schedule window is effective. You should only apply it to searches that seem to be causing other searches to skip their scheduled runs. If you apply it to a scheduled search and find that the skip frequency for the other searches decreases, that is a good indication that the window is doing its job.
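A sketch of that scheduler.log check (the saved search name is a placeholder):

index=_internal sourcetype=scheduler savedsearch_name="Your Scheduled Report"
| table _time, savedsearch_name, status, window_time

If window_time shows up with a value for the runs of that search, the schedule window is being applied.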
---
02-03-2017 10:30 AM
Ok. The same restriction applies to user-based search filters, unfortunately. The plain truth is that no search filters whatsoever can be applied to accelerated data models or their objects. I'll update the documentation to reflect this.
The fact that the filter isn't working for ordinary indexed data is puzzling, however, and I don't have any immediate suggestions to resolve it. If I do, I'll respond here.
---
02-02-2017 01:49 PM
You're partially correct about role-based search filters not being applied to tstats searches. By default they are applied to tstats searches of ordinary indexed data. But they are not applied to tstats searches of accelerated data models and accelerated data model objects. There is a tstats setting that you can use in limits.conf to change this default.
This is discussed in the documentation of the tstats command:
http://docs.splunk.com/Documentation/Splunk/6.5.1/SearchReference/Tstats#Selecting_data
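If I recall correctly, the setting is apply_search_filter in the [tstats] stanza (verify the exact name against the limits.conf.spec for your version):

# limits.conf
[tstats]
# true (the default) applies role-based search filters to tstats searches of
# normal indexed data; filters are never applied to accelerated data models
apply_search_filter = true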
---
11-10-2016 11:53 AM (1 Karma)
This method is documented more clearly here: http://docs.splunk.com/Documentation/Splunk/6.5.0/Knowledge/Workwithdatasets#Delete_datasets
I'll update the documentation so that the two sets of instructions correspond with each other better.
Matt Ness,
Splunk Documentation
---
See also Dataset types and usage in the Knowledge Manager Manual.
Data model objects are now classified as a type of dataset. You can open any data model dataset in Pivot from the Datasets listing page (which replaced the Pivot page).
---
Can your users see any other Settings page links? If they cannot see other admin-only Settings pages that you have access to, it could be that you need to expand their access in the local.meta file as described here: http://docs.splunk.com/Documentation/Splunk/6.2.2/Security/Addmanagementaccesstocustomroles
Note that in this topic, what is referred to as "Manager" is now called "Settings" in the product.
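As a rough illustration of the kind of metadata change that topic walks through (the stanza and role names below are placeholders--use the actual page and role names given in the topic):

# local.meta
[manager/some_settings_page]
access = read : [ admin, your_custom_role ]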
---
Report acceleration summary verification is documented here: http://docs.splunk.com/Documentation/Splunk/6.2.2/Knowledge/Manageacceleratedsearchsummaries#Verify_a_summary
Report acceleration summaries fail verification when the data they contain is inconsistent, meaning that the newer data in the summary has fundamental differences from the older data in the summary. This can happen when (sometimes quite subtle) changes are made to components of the base search, such as a change to the definition of a tag or event type. In your case it could be that the wildcarded source=*license_usage.log is finding data from a wider range of log files now than it did originally.
If you're certain that the base search is working properly you can rebuild the summary and then verify the rebuilt summary. You may get better results.
As for skipped buckets: the verification process skips buckets that are hot, as well as buckets that are in the process of being built.
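One way to inspect summary status from the search bar is the summarization REST endpoint (endpoint name from memory--confirm it against the REST API Reference for your version):

| rest /services/admin/summarization splunk_server=local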
---
11-12-2014 11:06 AM
For more information about reusing eval statements, see this topic on calculated fields: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/definecalcfields
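As a quick sketch (the sourcetype and field names are invented), a reusable eval statement becomes a calculated field like this:

# props.conf
[access_combined]
EVAL-response_time_sec = response_time_ms / 1000

After that, searches against the sourcetype can reference response_time_sec as if it were an extracted field.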
---
Not in 6.0, unfortunately. That's why the additional capability was added in v6.1--with that you can take away report acceleration and still schedule searches.
---
Apologies for your confusion, Ben--that was a typo. You are correct that the accelerate_search capability does not exist in v 6.0.5. It was introduced in v6.1, and the documentation should have been fixed so it gave the correct information for the correct version at that time. In v6.0 the only capability required for report acceleration is schedule_search. If you remove schedule_search from the role, the ability to accelerate reports should be removed from users with that role.
In v6.1 and later, the ability to accelerate reports requires both schedule_search and accelerate_search.
I've updated the topic so it shows the correct capability requirements for 6.0.x and for 6.1 and later.
---
Try taking advantage of the tstatsHomePath setting in indexes.conf by defining a different one for each of your indexes. From indexes.conf.spec:
tstatsHomePath = <path on index server>
* Location where datamodel acceleration TSIDX data for this index should be stored
* If specified, MUST be defined in terms of a volume definition (see volume section below)
* If not specified it defaults to volume:_splunk_summaries/$_index_name/datamodel_summary, where $_index_name is the name of the index
* CAUTION: Path must be writable.
* Must restart splunkd after changing this parameter; index reload will not suffice.
This subtopic, which covers setup of size-based retention across indexers, provides an example configuration.
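For instance (the volume and index names are placeholders), a per-index override might look like this:

# indexes.conf
[volume:fast_summaries]
path = /mnt/fast_disk/splunk_summaries
maxVolumeDataSizeMB = 100000

[web_index]
tstatsHomePath = volume:fast_summaries/web_index/datamodel_summary

Note that the path is expressed in terms of a volume definition, as the spec requires.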
---
07-28-2014 02:45 PM (3 Karma)
Martin is right in that you can't build a line or area chart without at least one split row element in the table. Split row elements provide the x-axis for the line/area chart, while column value elements provide the y-axis values for the line/area chart. You already have a column value element (the "Count of " column that you get when you first enter the Pivot).
Unfortunately, there's another limitation with search-based objects: Line and area charts in Pivot require that _time be auto-extracted as an attribute. Currently, search-based objects do not extract _time, because they are designed to return table rows for transforming searches. If you are basing this pivot on a root search object, this is probably why the line and area chart types are unavailable to you.
Try to base your pivot on an event-based object if possible. Event-based objects are far more versatile. You really only need to use search-based objects if you have to base your pivot on a transforming search that does not return events but rather tables of statistical information.
For more information on the pros and cons of search-based objects, see: http://docs.splunk.com/Documentation/Splunk/6.1.2/Knowledge/Designdatamodelobjects#Add_a_root_search_object_to_a_data_model
---
06-12-2014 09:23 AM
I documented it in a note in the "Configure a filter element" subtopic: http://docs.splunk.com/Documentation/Splunk/latest/Pivot/UsingthePivotvisualizationeditor#Configure_a_filter_element
---
06-04-2014 02:21 PM
This isn't an answer exactly, so I won't frame it as such--but you probably won't get a better idea of what's going on until you investigate the logs of the summary creation search. It sounds like you have a problem bucket that is causing the summary creation search to stall or crash. If you're not sure how to perform this analysis, contact Splunk Support.
---
03-17-2014 10:55 AM (3 Karma)
This is in response to your comment to rsennett....
First off, if you haven't tried the Data Model and Pivot Tutorial, you probably should. I think it would answer many of your questions. If you have tried it and you're still confused, here's another brief tutorial that hopefully will clarify things.
You should be able to build a fairly simple data model that delivers the results you need, given the example you've provided.
You'd start by creating a data model that consists of one event-based object.
- Name the object "Sales."
- Give it a constraint that isolates the events where a revenue-generating sale occurred, such as Sale_Point = * (I'm assuming this field provides the location or store name where something was sold).
- Assuming the four fields are automatically extracted, add them to the object as auto-extracted attributes: Customer, Amount, and Sale_Point. (Date should be added automatically as _time unless it's different from the event timestamp.) Make sure that the attribute types are correctly identified: Customer and Sale_Point are strings and Amount is a number.
- That should be all you need to do. Save your changes.
To accelerate the data model, go to the Data Model Manager page (it says "Data Models" at the top and has an Actions column; you get to it from the Data Model Editor page by clicking "Back to Data Models").
- Click Edit and select Edit Permissions. Share the object with the App or All Apps. (Only shared objects can be accelerated.)
- Click Edit again and click Edit Acceleration.
- In the Edit Acceleration dialog select Accelerate and then select a Summary Range. Summary range is the amount of time that you need to be accelerated. The bigger the range, the more space the acceleration summary will take up on disk and the longer it will take to create, so don't choose a range that is longer than you need it to be. For example, if you don't plan to search over more than the last week or two, select a range of 1 Month.
- Save your acceleration changes. Your model is now accelerated.
Now open the object in Pivot. When you go in, straight away it will give you a count of your events with a Sale_Point value over all time (the column will be titled "Count of Sales"). You can adjust the time filter to search over a shorter time range if you don't need all that data (ideally you should cut it down to within your acceleration Summary Range).
In Pivot you can fiddle around to get the charts you're interested in. For example, you could add a Split Row element for the Sale_Point attribute and then replace the "Count of Sales" Column Value element with one that sums up Amount . This would give you the total sum of sales by sale point. Or you could set up a Split Row element for the Customer attribute instead and get the total sum of sales by customer. And then you could use the Filter element to filter out all results but those for a specific sale point or customer (if you wanted to).
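Once the model is accelerated, you can also query it directly with tstats instead of going through Pivot. A sketch, assuming the data model is named "Store_Sales" (the model name is invented; "Sales" is the object from the steps above):

| tstats sum(Sales.Amount) AS total_sales from datamodel=Store_Sales by Sales.Sale_Point

This returns the same total-sum-of-sales-by-sale-point table that the Pivot example above produces.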
Hopefully this helps!
---
Glad you feel that Splunk's new Pivot feature is off to a good start! Keep in mind that we're just getting rolling with Pivot and will be expanding its range over time.
The filter limit attribute list currently only displays the top values in your dataset. This is a bug--a performance-related limitation--that we intend to clear up in an upcoming release.
And with regard to subsearches and similar Search features--you can include these and many other advanced search mechanics in root search objects. Root search objects are designed to use just about any kind of search string as long as it returns statistical results in table format (uses transforming commands, in other words). You just have to keep in mind that object hierarchies based on root search objects cannot take advantage of persistent data model acceleration (but they will be covered by ad hoc acceleration). So at least for now there's a trade-off if you want to use a complex search as the foundation of a data model.
---