sideview's Topics

1) If I run a regular timechart command against normal rows:

* | timechart span=1h count by sourcetype limit=500

then for time buckets and sourcetypes where no data existed, the timechart command fills in a "0" for me. All is well.

2) If I use timechart with data that has already been aggregated, though (forgive the artificial example here):

* | bin _time span=1h | stats count by _time sourcetype | timechart span=1h sum(count) by sourcetype limit=500

or

| tstats prestats=t count WHERE index=* GROUPBY sourcetype _time span=1h | stats count by _time sourcetype | timechart span=1h sum(count) as count by sourcetype limit=500

then it gets weird. I get the same chart, obviously, but the timechart command fails to fill in the zeros. Instead I get null values everywhere there should be a zero. This is throwing things off for me, and I'm curious whether anyone knows the root cause, or whether there's any way to work around the problem so I get my zeros back.

Notes:
* I actually need the zeros back in the actual search result data, rather than just having them graphed as zeros in the charting stuff.
* If I didn't have a split-by field I could just do fillnull, but I don't know all of the split-by field's values in advance.

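One workaround sketch I'm considering, assuming it's acceptable to zero-fill after the fact: since timechart has already created a column for every sourcetype that shows up at least once in the timerange, a bare fillnull with no field list should turn the remaining nulls into zeros. It won't conjure up columns for sourcetypes that had no events anywhere in the range, though.

| tstats prestats=t count WHERE index=* GROUPBY sourcetype _time span=1h
| stats count by _time sourcetype
| timechart span=1h sum(count) as count by sourcetype limit=500
| fillnull value=0
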
Sometimes I want to run reports that calculate things about time buckets that have no data in them. The timechart command is awesome here because it knows that even if no data occurred in a given time bucket, it goes ahead and creates a row with that _time value, with 0 counts and null statistics as necessary. In other words,

| bin _time span="1h" | stats count by _time clientip

is in many ways similar to

| timechart span="1h" count by clientip

except that the former won't have any buckets representing times when no data was found. However, sometimes you need to do further calculations on the rows, and you need the flexibility of the stats output format combined with the "fill in my blank buckets" behavior of timechart.

Here's what I have to do today, and I don't like it very much. Say I have 400 hosts, and I want a search that runs over 7 days and returns the subset of hosts for which any consecutive 24-hour period had zero data in it. To get this done I have to pile all 400 hosts into the split-by part of the timechart command, and then use the untable command to unpack them all.

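Roughly, the timechart-plus-untable shape I'm describing looks like the sketch below. It assumes untable preserves the zero-count cells that timechart filled in; host and the hourly span come from the example above, and the streamstats window is one way to express "24 consecutive empty hours" (the partial windows at the start of the range would need extra handling).

| timechart span=1h count by host limit=0 useother=f
| untable _time host count
| streamstats global=f window=24 sum(count) as count_24h by host
| where count_24h==0
| stats values(host) as hosts_with_a_dead_day
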
So quite often I end up in a situation where I have four fields. Let's say they're _time, clientip, method and count. I want to end up with a report that gives me, for each unique combination of _time and clientip, the counts for each value of method. The desired results look like this:

| _time     | clientip     | HEAD | GET | POST |
| 1/17/2014 | 216.34.12.21 | 2    | 213 | 12   |
| 1/17/2014 | 216.34.12.25 | 12   | 59  | 11   |
| etc....   |              |      |     |      |

In other words, I want the _time and clientip fields to be my "group-by" fields, grouped the way the stats command groups them, but I want the method field to be my "split-by" field, split out by value the way the chart and timechart commands handle their "by" field. If I could do without _time, or without clientip, this would just be chart count over _time by method, or chart count over clientip by method. But I want each row to be a unique combination of both _time and clientip. (To be honest, I quite often want 3 group-by fields, not just 2.)

I feel like I want to be able to either:

a) tell the chart command | chart count over _time clientip by method, even though the Splunk charting framework won't be able to graph any meaningful visualization from it. That is fine. OR

b) have some way for xyseries to handle "chartifying" the stats output, something like | stats count by _time clientip method | xyseries groupby=_time,clientip splitby=method cellvalue=count

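The closest workaround sketch I have is to glue the group-by fields into a single key, let xyseries pivot on that, and then split the key back apart afterwards. The field name "pair" and the pipe delimiter are arbitrary, and missing combinations come back as nulls rather than zeros unless you fillnull at the end.

| bin _time span=1h
| stats count by _time clientip method
| eval pair=_time."|".clientip
| xyseries pair method count
| eval _time=mvindex(split(pair,"|"),0), clientip=mvindex(split(pair,"|"),1)
| fields - pair
| fillnull value=0
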
I'm having to use wildcarded stanzas for a lot of my sourcetypes in props.conf, and although I'd like to have the core config appear just once in the file, I'm finding that some keys simply do not function in wildcarded stanzas - they only work when present in a plain old [actualSourcetypeName] stanza.

So far I've found that CHECK_FOR_HEADER, SHOULD_LINEMERGE and pulldown_type really have to be in a plain stanza and do not work in wildcarded props stanzas. At the other extreme, all EVAL-*, LOOKUP-* and REPORT-* keys seem to work fine in wildcarded stanzas. I'm still testing my way through this and have yet to test TIME_FORMAT, TIME_PREFIX, BREAK_ONLY_BEFORE_DATE, MAX_TIMESTAMP_LOOKAHEAD and initCrcLength, but it's feeling like these won't work in wildcarded stanzas either.

Does anyone know of a reference in the docs that comes out and says which attributes work this way and which don't?

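To illustrate the split I think I'm seeing (sourcetype names made up, and this is just my working theory rather than anything the docs confirm): the search-time keys behave in the wildcarded stanza, while the index-time and manager-UI keys appear to need a literal stanza.

# props.conf
[acme_app_*]
# search-time keys: these appear to work fine in a wildcarded stanza
EVAL-is_error = if(status >= 500, 1, 0)
REPORT-acme_fields = acme_field_extractions

[acme_app_access]
# these seem to require a plain old literal sourcetype stanza
SHOULD_LINEMERGE = false
CHECK_FOR_HEADER = false
pulldown_type = 1
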
First, the answer here may be to simply not use span=1h at all, but rather to use bins=500 or some similar number in the timechart command, and let timechart itself figure out which span best fits that total number of bins.

But let's say I have a timechart and the user can pick last 3 hours, last 48 hours or last month. For last 3 hours I want span=30min, for last 48 hours I want span=4h, and for last month I want span=1d. Is there a way I can make my dashboard itself pick the right argument to the timechart command based only on the timerange that the user picked?

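One sketch I've seen suggested, and haven't fully verified, is to have a subsearch compute the span from the search's own timerange via addinfo and hand it back to timechart with return. Here index=foo is a placeholder and the case() thresholds are in seconds (3 hours and 48 hours).

index=foo
| timechart
    [ search index=foo | head 1
      | addinfo
      | eval range=info_max_time-info_min_time
      | eval span=case(range<=10800, "30m", range<=172800, "4h", 1==1, "1d")
      | eval search="span=".span
      | return $search ]
    count
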
Does anyone know any way to tickle outputMode json in the Splunk REST API such that it will actually give back multivalue field values, appropriately separated?

outputMode xml divides up the individual values nicely. Here's what you get for a single result row (coming off index=_internal group="per_*_thruput" | head 1000 | stats values(group) as group ):

<result offset="0">
  <field k="groups">
    <value>
      <text>per_host_thruput</text>
    </value>
    <value>
      <text>per_index_thruput</text>
    </value>
    <value>
      <text>per_source_thruput</text>
    </value>
    <value>
      <text>per_sourcetype_thruput</text>
    </value>
  </field>
</result>

(indentation added)

But outputMode json just joins the values together on space and gives back one giant string:

[{"groups": "per_host_thruput per_index_thruput per_source_thruput per_sourcetype_thruput"}]

which seriously limits my options. (If I knew that the values didn't themselves contain spaces then I could split on space, but this is framework code and that's not an option.)

Apparently there's an outputMode=json_cols, at least in 5.0 and up, and the same results in that mode are:

{
  "preview":false,
  "init_offset":0,
  "messages":[
    {"type":"DEBUG","text":"base lispy: [ AND index::_internal per source::*metrics.log thruput ]"},
    {"type":"DEBUG","text":"search context: user=\"admin\", app=\"sideview_utils\", bs-pathname=\"C:\\Program Files\\Splunk\\etc\""}
  ],
  "fields":["groups"],
  "columns":[
    [
      ["per_host_thruput","per_index_thruput","per_source_thruput","per_sourcetype_thruput"]
    ]
  ]
}

(indentation added)

As you can see, it does split up the multivalue values nicely. However, when you use this outputMode on SplunkWeb, the server still sends back a content type of text/xml, which of course triggers parse errors in any client that pays attention to the response headers... which doesn't inspire a lot of confidence.

So is there any known workaround for someone wanting to use the REST API and retrieve multivalue field values? Is json_cols considered a better and more stable outputMode than outputMode json, or is json_cols still experimental? I don't want to port all my code to use a new output format just for this if that new output format isn't fully baked or supported.

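For what it's worth, here's a tiny sketch of how I'd expect json_cols to be consumed if it does turn out to be supported. It just zips the per-field columns back into per-row dicts, assuming the response shape shown above.

import json

def json_cols_to_rows(response_body):
    """Convert an outputMode=json_cols payload into a list of row dicts."""
    parsed = json.loads(response_body)
    fields = parsed["fields"]
    # "columns" holds one list per field; zip them back together to get rows.
    # Multivalue cells come through as lists, single values as plain strings.
    return [dict(zip(fields, row)) for row in zip(*parsed["columns"])]
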
I'm trying to get a python endpoint on a custom module that can basically take an existing savedsearch and make some simple edits to it. I actually had such a thing working for a long time, but it seems that at some point (possibly 5.0) the getEntity/setEntity methods started behaving inconsistently.

When I run this code in 5.0, if the user does not have the ability to create alerts, then even if it's a search that they themselves saved, when the same entity comes back through setEntity it generates an error that "action.email" is not a valid argument. The error message implies that the user has attempted to save an alert, but it's no more an alert than it was when it came out of getEntity.

# Copyright (C) 2010-2013 Sideview LLC. All Rights Reserved.

import cherrypy, logging
import controllers.module as module
import splunk.auth as auth
import splunk.entity as entity
import urllib, json
import splunk

#logger = logging.getLogger('splunk.modules.CustomRESTForSavedSearch.foo')

SAVED_SEARCHES_PATH = 'saved/searches'

class CustomRESTForSavedSearch(module.ModuleHandler):

    def generateResults(self, app, savedSearchName, serializedContext, editView, **args):
        response = {}
        currentUser = auth.getCurrentUser()['name']
        sessionKey = cherrypy.session['sessionKey']
        try:
            ss = entity.getEntity(SAVED_SEARCHES_PATH, savedSearchName, namespace=app,
                                  owner=currentUser, sessionKey=sessionKey)
        except Exception, e:
            response["hypothesis"] = "saved search name incorrect"
            response["message"] = str(e)
            response["success"] = False
            return json.dumps(response)

        ss["search"] = ss["search"]
        ss["request.ui_context"] = serializedContext
        ss["request.ui_edit_view"] = editView
        try:
            response["success"] = str(entity.setEntity(ss))
        except Exception, e:
            response["message"] = str(e)
            response["success"] = False
        return json.dumps(response)

I see this sort of thing looks very easy in the Python SDK, and there's a good set of examples: http://dev.splunk.com/view/SP-CAAAEK2 . Unfortunately it seems that the way you connect to Splunk in the Python SDK requires hardcoding a username and password, which won't work ( http://dev.splunk.com/view/SP-CAAAEE4 ).

Can anybody shed some light on a nice simple way to pull down an existing savedsearch, make an edit and save it? Or can anyone point me in the right direction on how to make my existing code work? I'm really sick of the Entity class and I'd be happy to get rid of it, but if I can make it work I'll also happily stick with it. Thanks in advance.

PS. In entity.py there's this line:

logger.debug("entity.setEntity() is deprecated")

but unfortunately it doesn't leave anyone the wiser as to what to use instead of setEntity.

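One direction I'm considering, sketched under the assumption that the Python SDK will accept an existing session key via its token argument rather than requiring a hardcoded username/password. I haven't verified this end to end; the host/port are placeholders, and depending on the SDK version the key may need to be passed as "Splunk <sessionKey>".

import cherrypy
import splunklib.client as client

def update_saved_search(app, owner, savedSearchName, serializedContext, editView):
    # Reuse the session key splunkweb already has, instead of hardcoding credentials.
    sessionKey = cherrypy.session['sessionKey']
    service = client.connect(host='localhost', port=8089,
                             token=sessionKey,
                             owner=owner, app=app, sharing='user')
    saved = service.saved_searches[savedSearchName]
    # Only send the keys we actually want to change.
    saved.update(**{
        "request.ui_context": serializedContext,
        "request.ui_edit_view": editView,
    })
    saved.refresh()
    return saved
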
This is a question that comes up once in a while, so I thought I'd write it up as a sort of master-answer.

Say I have several different views in my app that all have the same set of Pulldown (or TextField) modules at the top. When a user picks a value in one or more of these form elements, I want that value to stick with them even if they just randomly click away to one of the other views using the app navigation. Can this be done?

In other words, I'm not talking about maintaining the selection through a drilldown click; rather, if the user uses the app navigation menu to go to a different page in the app, and that new page has these same form elements, I want the same values to be selected when the next page loads. The question is - how to do this?

I haven't tested the setup.xml workflows in my apps in a while, but for some reason they all seem to be broken now, even with configurations that I know used to work. I also tried the example from the docs page, but I couldn't get that one to work anymore either: http://docs.splunk.com/Documentation/Splunk/latest/Developer/SetupXML

Long story short:
* In some of them I get the 500 error "KeyError: Elements" that at least some people have run into in April/May/June of this year: http://splunk-base.splunk.com/apps/22296/real-time-license-usage
* In other configurations the setup buttons/links redirect to a URL that returns a 404.
* In yet other configurations the setup screen loads fine for a few seconds, then redirects to a page that says I don't have permission to access this configuration.

Before I investigate further, does anyone know of any recent changes around setup.xml? There also doesn't seem to be any way to debug this problem, or at least I don't see anything informative in web_service.log... Thanks in advance.

UPDATE: I figured out the redirect-to-permissions-error issue; that was a bug in my own code. The other, more serious problems - the "KeyError: Elements" and the broken setup links - still have me a little mystified. Unless someone has an answer reassuring me that all this stuff didn't just break, my plan right now is to remove setup.xml from all of my apps, which is a pain.

I know that many Splunk apps include their own datagen tools to generate nice sample data for demos and POCs, and that when you look at these datagen tools across all the apps they ship with, they often share a common core of datagen code. However, it looks like all of those apps are released under the Splunk Master Software License Agreement, which by my reading doesn't technically allow any pieces to be copied out and reused in other third-party apps.

So my question is: is there a plan to release any datagen tools to the public under more open licensing? If so, is there any ETA on that?

I've had it on my list to evolve my own simple datagen tools for my various apps into something I could package and release with the apps themselves. However, from what I hear, Splunk's generic datagen toolset is vastly better than whatever I would end up with on my own. Thanks in advance. I'd also love to hear about other open tools that people in the community have put together or used.

Let's say you've got a custom application log that has a lot of sensibly named fields, but in addition to the sensible ones there are a few fields with generic names like "field1", "field2", "field3", etc. The semantics of these fields vary with the value of another field called "event_type_id". For example, if we have an event with event_type_id="download", the field1 value might be the filename and the field2 value might be the size in bytes. Likewise, with event_type_id="client_error", the field1 value might be an error code and the field2 value might be a description.

My question is -- what's the cleanest way to repair this, so that the generic fields get renamed to the relevant sensible names automatically in all my searches?

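One direction that seems plausible, assuming search-time calculated fields are acceptable (the sourcetype and target field names below are made up for illustration): a set of EVAL- keys in props.conf that copy field1/field2 into sensible names based on event_type_id, so every search sees the renamed fields automatically.

# props.conf
[my_custom_app_log]
EVAL-filename   = if(event_type_id=="download", field1, null())
EVAL-bytes      = if(event_type_id=="download", field2, null())
EVAL-error_code = if(event_type_id=="client_error", field1, null())
EVAL-error_desc = if(event_type_id=="client_error", field2, null())
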
I have a macro that implements a conversion algorithm. At one point in that algorithm I have to add leading zeros to make sure that a hex value has 8 digits, and not fewer. What I've done is a little clunky, and it only zero-pads from 7 digits to 8, but in my particular case it's sufficient:

eval myInt=if(len(myInt)==7,"0".myInt,myInt)

Anyone have any suggestions? I see that eval does not have anything obvious here like a zeropad function. I wonder whether that would be a good thing to add, or whether there's some cleaner way of doing this - ideally one that adds the correct number of zeros and not just one - with eval or with another command. Any help appreciated.

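One sketch that might generalize to any input length, assuming substr's negative start offset behaves the way I think it does: prepend eight zeros unconditionally and then keep only the last eight characters.

| eval myInt=substr("00000000".myInt, -8)
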
I have an interesting situation where I want to display a little summary table showing a few statistics about a small number of fields, as calculated from a restricted set of events. Basically I want it to look like this:

field     avg        min       max
avg_age   0.000000   0.000000  0.000000
eps       0.935385   0.064514  2.836625
ev        34.869565  2         86
kb        6.600976   0.244141  16.830078

The closest I've gotten is this search:

foo | fields avg_age eps ev kb | fields - _* | stats values(*) as * | transpose | rename column as field "row 1" as value | eval value=split(value, " ") | stats avg(value) as avg min(value) as min max(value) as max by field

which looks like it works. However, I think the use of multivalue fields here is going to lead to truncation, and thus the statistics aren't going to be trustworthy. Can anyone help? I have this feeling that there's something simple I'm missing - as if there were a 'summary' command:

summary stats="min,max,avg" fields="avg_age eps ev kb"

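Here's a sketch of one way around the multivalue step, assuming wildcard renames in stats behave as expected: compute the aggregates first with per-field suffixes, then transpose and pivot them back into a field/avg/min/max table.

foo
| fields avg_age eps ev kb
| fields - _*
| stats avg(*) as *_avg min(*) as *_min max(*) as *_max
| transpose
| rename column as metric, "row 1" as value
| rex field=metric "^(?<field>.+)_(?<stat>avg|min|max)$"
| xyseries field stat value
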
props.conf has a boolean setting called "pulldown_type". If you set it to true, then the name of your sourcetype will appear in end-users' sourcetype dropdowns in Manager. For app developers, who are often defining their own custom sourcetypes as part of their shipping app, this key is obviously quite nice for their end users.

In the documentation, however, pulldown_type is listed under "internal fields", with a note saying "Not yours - do not set". My question is -- is there some hidden pitfall to setting that key? Is that warning really warranted? If not, I'm going to start setting it to true in all my apps that define one or more custom sourcetypes.

I'm trying to create a scripted lookup, and I'm finding it a little frustrating because any time there's a Python exception, the lookup just throws an error in the UI:

Script for lookup table 'call_quality' returned error code 1. Results may be incorrect.

and nothing gets written to splunkd.log or python.log. So I have to figure out what the Python exception is on my own, and in practice this is a pretty painful way to develop something. Am I just missing something obvious? Thanks.

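In the meantime, here's a sketch of the workaround I'm leaning on, assuming the usual CSV-on-stdin / CSV-on-stdout scripted lookup contract. The log path is a placeholder and the real lookup logic is elided; the point is just that any traceback lands in a log file I control.

import csv
import logging
import sys
import traceback

logging.basicConfig(filename='/tmp/call_quality_lookup.log', level=logging.DEBUG)

def main():
    # Scripted lookups receive a CSV on stdin and must write a CSV to stdout.
    reader = csv.DictReader(sys.stdin)
    writer = csv.DictWriter(sys.stdout, fieldnames=reader.fieldnames)
    writer.writerow(dict((f, f) for f in reader.fieldnames))  # header row
    for row in reader:
        # ... real lookup logic would fill in the output fields here ...
        writer.writerow(row)

if __name__ == '__main__':
    try:
        main()
    except Exception:
        # Whatever Splunk swallows, at least the traceback ends up in our own log.
        logging.error(traceback.format_exc())
        raise
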
Say you have a huge volume of events, and they come in big batches. Each batch is a discrete unit, and mixing information from the most recent batch with the previous batch is unacceptable.

More givens: the events within a particular batch are spread out over a few minutes. We do have control over the data, so we could write a particular event at the start and at the end of each batch if necessary. We could even create a start/end event with a different source or sourcetype.

Given all this, is there a good, clean way to construct a custom search or a custom view that is guaranteed to operate only on the events of the most recent batch?

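A sketch of one direction, assuming we go ahead and write a marker event at the start of each batch under its own sourcetype (the sourcetype and field names here are invented, and I haven't verified the earliest handoff): a subsearch finds the most recent batch-start marker and returns its timestamp as the earliest bound for the outer search.

sourcetype=my_batch_data
  [ search sourcetype=my_batch_marker marker=start
    | head 1
    | return earliest=_time ]
| stats count by some_field
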
Just curious whether this is on the roadmap. It's more than a little inconvenient that when people use WMI, the sourcetypes are all "WMI:foo", and when they don't, the sourcetype is just "foo". I find myself having to write macros that cover both cases, and either duplicate stanzas in props.conf or pull what I can out into transforms.conf. This all seems silly since the underlying data is otherwise identical. It looks like the sort of thing that is a 'less-than-best' practice, i.e. abusing the sourcetype field to sneak in some other metadata.

Also, has anyone tried sourcetype renaming to make this problem go away?

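On the renaming idea: I believe props.conf has a search-time rename key, sketched below with a Windows event log sourcetype as a made-up example. I haven't tested how this interacts with existing props/transforms stanzas keyed on the old name.

# props.conf
[WMI:WinEventLog:Security]
rename = WinEventLog:Security
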
In complex reporting views I often use the FlashTimeline module near the top, to let the user regenerate the FlashCharts and other reports for just the timerange they click or drag on the FlashTimeline. Unfortunately, when the reports in such a view get converted to pull data from a summary index, the y-axis scale on the FlashTimeline becomes confusing, because each individual 'event' in the summary data actually represents N events, but nobody tells the FlashTimeline this.

One approach I've taken elsewhere is to strip the FlashTimeline down so that it has no y-axis; the bars are all the same height and it becomes effectively a big 'navigation strip'. However, I feel like there's maybe some cruel and unusual search language that can turn my summary rows with count=5 back into 5 rows. If I could get count=5 turned into count=5,5,5,5,5, then I could split and mvexpand the rows, and if I did the foo NOT foo | append [] trick, I could theoretically get the FlashTimeline's y-axis correct again. Probably, with all the duct tape I'm throwing around here, this isn't a great idea, but if anyone could point me in the right direction I'd like to at least evaluate it.

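The row-multiplying part, at least, sketches out like this, assuming mvrange and mvexpand are fair game and that count is the summary field holding the per-row event count (the index and source are placeholders): mvrange builds a multivalue field with count entries, and mvexpand duplicates the row once per entry.

index=summary source="my summary search"
| eval n=mvrange(0, count)
| mvexpand n
| fields - n
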
I'm trying to write instructions for some people to set up an app while onsite, and one of the steps involves backfilling a lot of summary index data. I've followed the steps to use the script Splunk provides for this (fill_summary_index.py): http://www.splunk.com/base/Documentation/4.2.1/Knowledge/Managesummaryindexgapsandoverlaps

But this process is incredibly slow, much slower than I would expect. One big 'stats count by foo bar' over my entire test dataset takes only about 30 seconds, but running the backfill script against the same data is going to take an hour or more for each saved search at this rate, which is crazy. I expected the backfill to take a little longer than one giant search, but not thousands of times longer. This is a big problem, because if it takes hours on this tiny dataset it'll take days on bigger data, which isn't OK at all.

So now I'm thinking maybe advanced users aren't supposed to use the python script - that with the old-school collect command, a bit of stats count by foo bar, a dash of bin to get the timestamps, maybe a dash of addinfo to add the search time, and a backgrounded search, I could probably generate the entire run of backfilled events with one long-running search: http://www.splunk.com/base/Documentation/latest/SearchReference/Collect

At this point I'm sure someone's way ahead of me, which is what brings me here. Anyone have an emerging best practice they care to share? Or have I just completely missed a piece of documentation? Thanks.

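Here's the rough shape of the one-shot, collect-based backfill I have in mind. Everything here is a placeholder (index names, the hourly span, the marker value), and I haven't confirmed that it produces rows identical to what the scheduled saved search plus fill_summary_index.py would have written.

index=foo earliest=-7d@h latest=@h
| bin _time span=1h
| stats count by _time foo bar
| collect index=my_summary_index marker="search_name=my_hourly_summary"
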
I've got a lot of CSV data that I'm indexing, and for one of the fields in the CSV, the values are themselves big jumbles of different fields joined together, e.g.:

MLQK=3.5000;MLQKav=3.7370;MLQKmn=3.2782;MLQKmx=5.3000;ICR=0.0000;CCR=0.0003;ICRmx=0.0027;CS=1;SCS=0;MLQKvr=0.93

The extract command (aka kv) springs to mind, but extract only runs against _raw as far as I know. Are there any good tricks for using extract with fields other than _raw? http://www.splunk.com/base/Documentation/latest/SearchReference/Extract

Right now I'm thinking of:

<my search> | rename _raw as _actualRaw | eval _raw=myCrazyField | extract | eval _raw=_actualRaw

but it seems really clunky, and I thought maybe there's a better way.

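The alternative I keep circling back to is a search-time REPORT extraction pointed at the field instead of _raw via SOURCE_KEY, sketched below with a made-up sourcetype name and with myCrazyField standing in for the real CSV column. The $1::$2 FORMAT should turn each key=value pair in the jumble into its own field.

# transforms.conf
[extract_mlqk_subfields]
SOURCE_KEY = myCrazyField
REGEX = (\w+)=([^;]+)
FORMAT = $1::$2

# props.conf
[my_csv_sourcetype]
REPORT-mlqk_subfields = extract_mlqk_subfields
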