About bochmann

bochmann · ‎03-25-2024

Well, Splunk doesn't treat inf and -inf, mentioned in that same section, as a number either. Anyways, I need to add additional logic to sanitize inputs that might have fields with the text "NaN" (does occasionally happen when the source is a SQL query) either way - for most purposes it just isn't a number, and tends to cause problems in further processing.

bochmann · ‎03-25-2024

So, I've been talking to Splunk support, which directed me to the documentation at SearchReference/Eval that kind of mentions that NaN is special, and also pointed to typeof() as alternative. Initially, this seemed like a good idea, but unfortunately typeof() is even more interesting: | makeresults | eval t=typeof("NaN") | eval num="NaN" | eval tnum=typeof(num) ...returns t = String tnum = Number Oh well....?

bochmann · ‎03-01-2024

Has anyone run into the interesting effect that isnum() thinks that "NaN" is a number? So isnum("NaN") is true "NaN" * 2 = "NaN" but tonumber("NaN") is NULL Are there any other odd, uh, numbers besides Not a Number? I made up the following silly query as an illustration: | makeresults | eval num="blubb;NaN;100;0.5;0,5;-0;NULL;" | makemv delim=";" allowempty=true num | mvexpand num | eval isnum=if(isnum(num),"true","false") | eval isint=if(isint(num),"true","false") | eval isnull=if(isnull(num),"true","false") | eval calcnum=num*2 | eval isnumcalcnum=if(isnum(calcnum),"true","false") | eval isnullcalcnum=if(isnull(calcnum),"true","false") | eval numnum=tonumber(num) | eval isnumnum=if(isnum(numnum),"true","false") | eval isnullnumnum=if(isnull(numnum),"true","false") | table num,isnum,isint,isnull,calcnum,isnumcalcnum,isnullcalcnum,numnum,isnumnum,isnullnumnum which results in num isnum isint isnull calcnum isnumcalcnum isnullcalcnum numnum isnumnum isnullnumnum blubb false false false false true false true NaN true false false NaN true false false true 100 true true false 200 true false 100 true false 0.5 true false false 1 true false 0.5 true false 0,5 false false false false true false true -0 true true false -0 true false -0 true false NULL false false false false true false true false false false false true false true (Post moved over from the Splunk Enterprise group.)

bochmann · ‎10-01-2021

Huh. Reading the documentation for coalesce, I can see how this happens to work for specific cases where you want to keep the original value of x if it's not NULL, and fill in something else if it is. ...which is not what I showed in my example above, but exactly what happens in the dashboard I'm looking at, and where the third parameter is just bogus. Ouch.

bochmann · ‎10-01-2021

Hi - I have a few dashboards that use expressions like eval var=ifnull(x,"true","false") ...which assigns "true" or "false" to var depending on x being NULL Those dashboards still work, but I notice that ifnull() does not show up in any of the current documentation, and it seems the current way to get the same result would be eval var=if(isnull(x),"true","false") Did I miss some kind of deprecation of that syntax ages ago (must have been before 6.3.0), and it just happens to still be parsed?

bochmann · ‎07-08-2021

To use the Splunk Add On for NetApp Data ONTAP, you need a Data Collection Node running the SA-Hydra framework. Traditionally you would install that on a dedicated Heavy Forwarder, or use the VMware DCN, https://splunkbase.splunk.com/app/3216/ The current documentation is somewhat confusing, as it seems that the NetApp Addon is being disentangled from the old VMware addon infrastructure - I have not yet fully understood what has happened in the latest release (but I'm about to find out, since we're planning for an update). Depending on the size your NetApp installation, this will collect a lot of data from the NetApp performance APIs.

bochmann · ‎08-27-2020

We missed that that the Splunk_TA_ontap App is incompatible with Splunk 8 in our upgrade planning. Apparently the configuration dashboards use Advanced XML, so they're not usable anymore. The existing app infrastructure seems unaffected though, so data ingest continues and the accompanying splunk_app_netapp and our own dashboards still work (as long as we don't need to make any changes to the data collection configuration).

bochmann · ‎10-01-2015

I'm getting vector::_M_range_check messages on our Splunk Enterprise 6.3 instance, too. The affected queries use another string function, though: convert ... ctime() I have the impression that this primarily happens with long-running queries in dashboards, but I haven't had the time for any debugging up to now. The same queries were fine before the update (in 6.2.4).

bochmann · ‎01-05-2015

Okay, I think I've found the primary problem in my scheduler configuration: In splunk-launch.conf, I had settings for http_proxy and https_proxy to make Splunk connect to apps.splunk.com through our proxy server. This seems to have confused some of the NetApp app scripts on the scheduler - I have removed the proxy configuration, and now my data collector picks up perf data from our filer.

bochmann · ‎12-22-2014

Throwing strace at the "HttpListener - Socket error from 127.0.0.1 while accessing /services/hydra/hydra_gatekeeper/hydra_gateway: Broken pipe" message on the data collector seems to suggest this happens in a splunkd thread where it tries to write data to an incoming connection (fd 4 is splunkd listening on port 8089): 13:10:31.083367 accept(4, {sa_family=AF_INET, sin_port=htons(48132), sin_addr=inet_addr("127.0.0.1")}, [16]) = 75 13:10:31.083504 setsockopt(75, SOL_TCP, TCP_NODELAY, [1], 4) = 0 [..] 13:10:31.092568 read(75, "7\10M\n\23\233\204\333I\tfx~qp*\251"..., 352) = 352 [..] 13:10:37.913122 write(75, "\27\3\3\3\20\232\200v\363ob:I\24^\230HV"..., 789) = -1 EPIPE (Broken pipe) 13:10:37.913348 --- SIGPIPE (Broken pipe) @ 0 (0) --- 13:10:37.913605 write(3, "12-22-2014 13:10:37.913 +0100 WARN HttpListener - Socket error from 127.0.0.1 while accessing /services/hydra/hydra_gatekeeper/"..., 155) = 155 13:10:37.913785 epoll_ctl(44, EPOLL_CTL_DEL, 75, {EPOLLRDNORM|EPOLLRDBAND|EPOLLWRBAND|EPOLLERR|EPOLLHUP|0xb1fd800, {u32=32761, u64=801602388203962361}}) = 0 13:10:37.913910 close(75) = 0 [edit] From strings exchanged early in the connection I assume the other endpoint is one of the ta_ontap_collection_worker.py processes. But that one gets terminated a few seconds before receiving this answer (which explains the broken pipe above): 13:10:31.092227 write(4, "\27\3\3\1`7\10M\n\23\233\204\333I\tfx~qp*\251"..., 357) = 357 13:10:31.092437 poll([{fd=4, events=POLLIN}], 1, 30000) = ? ERESTART_RESTARTBLOCK (To be restarted) 13:10:32.260084 --- SIGTERM (Terminated) @ 0 (0) --- Guess that's far as I get right now, christmas is coming up fast 🙂

bochmann · ‎12-22-2014

Thanks for your answer and the dash hint. Current documentation for the NetApp app still sais: "To build a data collection node: Install a CentOS or RedHat Enterprise Linux version that is supported by Splunk version 5.0.4." (http://docs.splunk.com/Documentation/NetApp/2.0.2/DeployNetapp/InstalltheSplunkAppforNetAppDataONTAP#Create_a_data_collection_node) I see nothing listening on port 8080 on the DCN. Which process is supposed to use that?

bochmann · ‎12-20-2014

I've been trying to whip up a quick proof-of-conecpt installation of the NetApp app on our existing Splunk enterprise instance (running on Debian/wheezy)... Unfortunately, the data collector doesn't actually seem to want to connect to our cDOT systems - I do see connections from the search head as I add ONTAP collection targets in the app settings, though - presumably checking credentials. (I have the Splunk App for NetApp Data ONTAP 2.0.2 on Splunk 6.2.1) I set up an additional Splunk heavy forwarder on a CentOS box for the data collector. The scheduler still runs on our Debian search head. Right now, on the CentOS data collector, there is a Socket error visible in splunkd.log: 12-20-2014 14:46:21.238 +0100 INFO ExecProcessor - New scheduled exec process: python /opt/splunk/etc/apps/Splunk_TA_ontap/bin/ta_ontap_collection_worker.py 12-20-2014 14:46:21.238 +0100 INFO ExecProcessor - interval: run once 12-20-2014 14:46:21.239 +0100 INFO ExecProcessor - New scheduled exec process: /opt/splunk/bin/splunkd instrument-resource-usage 12-20-2014 14:46:21.239 +0100 INFO ExecProcessor - interval: 0 ms 12-20-2014 14:46:21.925 +0100 WARN HttpListener - Socket error from 127.0.0.1 while accessing /services/hydra/hydra_gatekeeper/hydra_gateway: Broken pipe I can't find out where that orginates from, though. At the same time, I see the following messages in hydra_scheduler_ta_ontap_collection_scheduler_nidhogg.log on the scheduler: 2014-12-20 14:46:21,286 INFO [ta_ontap_collection_scheduler://nidhogg] [HydraWorkerNode] New meta data is distributed: Owner: admin, Namespace: Splunk_TA_ontap, Name: metadata, Id: /servicesNS/nobody/Splunk_TA_ontap/configs/conf-hydra_metadata/metadata. 2014-12-20 14:46:21,286 DEBUG [ta_ontap_collection_scheduler://nidhogg] [HydraWorkerNodeManifest] checking the status of all nodes 2014-12-20 14:46:21,293 DEBUG [ta_ontap_collection_scheduler://nidhogg] [HydraWorkerNodeManifest] checking health of node=https://172.16.123.12:8089 2014-12-20 14:46:21,331 DEBUG [ta_ontap_collection_scheduler://nidhogg] [HydraWorkerNode] no heads regrown after they cried for help on node=https://172.16.123.12:8089 2014-12-20 14:46:21,331 DEBUG [ta_ontap_collection_scheduler://nidhogg] Updated status of active nodes 2014-12-20 14:46:21,331 DEBUG [ta_ontap_collection_scheduler://nidhogg] Checked status of dead nodes 2014-12-20 14:46:21,331 DEBUG [ta_ontap_collection_scheduler://nidhogg] [HydraWorkerNodeManifest] checking the status of all nodes 2014-12-20 14:47:16,119 ERROR [ta_ontap_collection_scheduler://nidhogg] [HydraWorkerNode] node=https://172.16.123.12:8089 is likely dead, could not get info on current job count, msg : <urlopen error Tunnel connection failed: 502 cannotconnect> Traceback (most recent call last): File "/opt/splunk/etc/apps/SA-Hydra/bin/hydra/hydra_scheduler.py", line 933, in getActiveJobInfo job_info = self.gateway_adapter.get_job_info() File "/opt/splunk/etc/apps/SA-Hydra/bin/hydra/hydra_common.py", line 199, in get_job_info resp = self.opener.open(req) File "/opt/splunk/lib/python2.7/urllib2.py", line 404, in open response = self._open(req, data) File "/opt/splunk/lib/python2.7/urllib2.py", line 422, in _open '_open', req) File "/opt/splunk/lib/python2.7/urllib2.py", line 382, in _call_chain result = func(*args) File "/opt/splunk/lib/python2.7/urllib2.py", line 1222, in https_open return self.do_open(httplib.HTTPSConnection, req) File "/opt/splunk/lib/python2.7/urllib2.py", line 1184, in do_open raise URLError(err) URLError: <urlopen error Tunnel connection failed: 502 cannotconnect> Any hints where I can look to further debug this error? Possibly unrelated, I have noticed that there still are two ta_ontap_collection_scheduler.py - processes lingering around on the scheduler even after I stop Splunk. Does the scheduler maybe have problems on a Debian host, too (the install docs just note that the data collector has to run on RHEL or CentOS)?

bochmann · ‎11-27-2014

Thanks - I think I'll try that - we set a sourcetype directly on our inputs, so I don't usually need to force a different sourcetype based on a regex. The other possibility would be to exclude data that already has certain sourcetypes set. Also a general thank you for publishing this app - it has already been useful to us 🙂

bochmann · ‎11-26-2014

I have noticed that the eventtype cisco_ios-diag in TA-cisco_ios matches on some of the log file entries generated by our Juniper switches (primarily log lines matching facility=KERN). As far as I have traced this through TA-cisco_ios, force_sourcetype_for_cisco_ios in the TA's transforms.conf matches on some JunOS log entries (and overwrites our custom source type). I don't have a good idea on how to exclude these log lines yet, though - especially in a way that could be included into the TA so we don't have to apply a local fix after every update... Did anyone run into this yet and has adapted the TA-cisco_ios ruleset? Example Juniper log entry: Nov 26 14:09:48 jp45-xxx /kernel: %KERN-5-KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/33 - CD state - ready to carry traffic

bochmann · ‎11-15-2013

Thanks. I hope it successfully gets past the consideration phase 🙂

bochmann · ‎11-13-2013

In earlier versions of Splunk, it was possible to CTRL-click on statistics results, and have a search with the appropriate field opened in a new browser window. In Splunk 6, the new search is always opened in the same window, and I have to rely on the browser back button to go back to the statistics view. That's a major hassle when trying to quickly go through event details from statistics - especially as going back will always end up on the first page of the statistics, not the one you were coming from. Simple example: I have a search in the line of host=server | top limit=100 id. When I click on one of the results in the id row of the statistics page, Splunk 6 will start a new search with host=server id=result in the same browser window, losing the statistics overview. Is there any way to get the old behaviour back?

bochmann · ‎06-26-2012

For now I just need the maximum of concurrent sessions for each day (which is a metric for the licensing on the system - so I'd like to get a view on how that develops over time, and get an early warning when I might need additional licenses). I thought that's what I would get by charting max(activeusers) in this case. A more fine resolution would be nice, but is not required.

bochmann · ‎06-26-2012

I try to count the maximum of concurrent sessions on a system where the data I have are login and logout events. I'm using eval() - something I found in another post here, incrementing a counter when someone logs in, and decrementing on a logout. Looks like this: eval count=if (id="login",1,-1) | sort + _time | accum count as activeusers | timechart span=1d max(activeusers) This seems to work in general, but obviously the data is slightly inconsistent - I don't have 0 users at the end of the day. The errors add up over time so that the end result is rather useless. I know that having consistent data would be more useful, but I can't correct that at this time. So I'm looking for a way to have my counter start at zero on each day. Any idea how to do that (or solve the problem somehow completely different in a better way)?

Posts	19
Solutions	1
Karma Given	3
Karma Received	9
Member Since	‎08-04-2011

Online Status	Offline
Date Last Visited	‎03-25-2024 05:00 AM

isnum() vs. NaN

what happened to eval ifnull(,,)?

Splunk app for NetApp, hydra_gateway: Broken pipe

TA-cisco_ios picking up events from Juniper switch...

Splunk 6 web interface - not possible to open even...

resetting a counter for each day

Re: isnum() vs. NaN

Re: isnum() vs. NaN

isnum() vs. NaN

Re: what happened to eval ifnull(,,)?

what happened to eval ifnull(,,)?

Re: DATA Configuration for Splunk Add-on for NetAp...

Re: NetApp Data ONTAP support for Splunk Enterpris...

Re: Why is my search with the strcat command faili...

Re: Splunk app for NetApp, hydra_gateway: Broken p...

Re: Splunk app for NetApp, hydra_gateway: Broken p...

Re: Splunk app for NetApp, hydra_gateway: Broken p...

Splunk app for NetApp, hydra_gateway: Broken pipe

Re: TA-cisco_ios picking up events from Juniper sw...

TA-cisco_ios picking up events from Juniper switch...

Re: Splunk 6 web interface - not possible to open ...

Splunk 6 web interface - not possible to open even...

Re: resetting a counter for each day

resetting a counter for each day