About kmarx

kmarx · ‎07-27-2022

Just a note: I had to use gzip.open() with mode='rt' for text instead of binary

kmarx · ‎02-10-2020

I was seeing a similar error in trying to customize simple XML drilldown. error:321 - Masking the original 404 message: 'Unexpected query string parameters: q' with 'Page not found!' for security reasons Wasted a fair bit of time before finding this post. As OP says: seems broken in edit mode. Works after saving the dashboard.

kmarx · ‎07-22-2019

Excellent. It does clearly state An index called “nmon” must be created manually by Splunk administrators to use the default TA-nmon indexing parameters. (this can be tuned) I know you didn't just add that now, but I was so sure I searched the doc for stuff like this. I'm seeing the indexed data now. Thanks very much!

kmarx · ‎07-22-2019

Hi @guilmxm and thanks for the reply. I didn't realize that I had to create the nmon index manually. I can do this but can you point me to doc on what settings I need to specify? E.g., events vs. metrics, anything else? I assume the App should be "NMON Performance by Octamis". Thank you!

kmarx · ‎07-18-2019

I have a 50G dev license sandbox where I've installed NMON on the indexer and TA_nmon on one of the universal forwarders (manually since my dev instance doesn't seem to allow a deployment server). But I never see data arrive at the indexer. On the forwarder, I can see csv files cyclically come and go in /opt/splunkforwarder/var/log/nmon/var/csv_repository/ But nothing ever shows up on the indexer. E.g., index=mon or index=*mon* show no results. [Note that the above us under .../var/log/ on my install and not .../var/run/ per the trouble shooting article] If I search on index=_internal host=myUFHost *nmon* I see lots of results saying things like: WatchedFile - WatchedFile - Checksum for seekptr didn't match, will re-read entire file='/opt/splunkforwarder/var/log/nmon/var/csv_repository/dev-app01_57_VM.nmon.csv'. and WatchedFile - File too small to check seekcrc, probably truncated. Will re-read entire file='/opt/splunkforwarder/var/log/nmon/var/csv_repository/dev-app01_11_VM.nmon.csv'. If I constrain the search for a given file= , I can see that at least some these messages repeat roughly hourly for a given file name. (I'm guessing the numbers are minute w/in the given hour?) I did some searching on these messages and saw some suggestion that perhaps the UF tries to read the file before it's populated? Or perhaps it's getting deleted before processing completes? With some help from folks on the Splunk Slack#getting-data-in channel I blithely tried index=_internal "drop" "index" and got a few hits like this on sourcetype=mongod: 2019-07-18T22:01:01.226Z I STORAGE [conn967] dropCollection: s_nmon1Dpb033BBAauqdcA1GXmim53_kv_nmoyLxvM60i16Ei2OkLQ@wn5GLC.c (7bdb7e61-4fa5-48ff-bf30-2fe97841eaa6) - index namespace 's_nmon1Dpb033BBAauqdcA1GXmim53_kv_nmoyLxvM60i16Ei2OkLQ@wn5GLC.c.$_UserAndKeyUniqueIndex' would be too long after drop-pending rename. Dropping index immediately. Any guidance would be greatly appreciated. Platform: - Splunk Enterprise 7.0.3 - Linux RHEL5 64bit (2.6.18-419.el5) Places I've looked: - https://answers.splunk.com/answers/400165/nmon-performance-monitor-for-unix-and-linux-system-5.html - http://nmonsplunk.wikidot.com/documentation:userguide:troubleshoot:troubleguide - https://answers.splunk.com/answers/126878/what-more-can-i-do-to-solve-file-too-small-to-check-seekcrc-probably-truncated-will- Thanks!

kmarx · ‎04-29-2019

Thanks for the feedback. I admit to not following all your suggestions. I fixed up my #1 above and added a #4, which is background to the below approach in which I split the match field and assign fields manually. (I was obviously hoping that there was some magical way of having the TH tags become in essence the CVS header, and the TD tags the values. I suppose I could make a custom command in python to do this as well, but here's what I hacked for now. Not sure if which of your suggestions it's closest to (if any): | webscrape selector="#ContentTbl tr" url="http:/server1.ourcompany.com/cgi-bin/reportStatus.cgi" depth_limit=25 text_separator="|" empty_matches=0 | fields match | rex field=match mode=sed "s/\|\|/|/g" | rex field=match mode=sed "s/^\|//" | rex field=match mode=sed "s/\|$//" | mvexpand match | eval flds=split(match, "|") | fields - match | eval Seq=mvindex(flds,0) | eval Time=mvindex(flds,1) | eval Source=mvindex(flds,2) | eval Event=mvindex(flds,3) | fields - flds The goal is to eventually do this for multiple servers 1, 2, 3..etc. Then splunkify somehow to do reporting against them. For this, I expect I'd have to refetch the data on a schedule and then dedup before indexing or adding to a lookup. I'm still a splunk newbie/lame-o, so feeling my way here.

kmarx · ‎04-26-2019

We have several servers that support an HTTP request that presents a page of activity in a simple HTML table. I'd like to GET the page, find the table, and use the values as Splunk fields, and then the s as the values, with one Splunk event per table row. Website input seems like the right app for this, but I can't quite get it working. I defined the input to produce a preview like this: | webscrape selector="#myTable tr" url="http://server1.ourcompany.com/cgi-bin/reportStatus.cgi" depth_limit=25 text_separator="|" empty_matches=0 And I use a sourcetype of, say, "server_status". The results include the html table such as: <table id="myTable" border="1"> <tbody> <tr> <th>Seq</th> <th>Time</th> <th>Source</th> <th>Event</th> </tr> <tr style="background: rgb(224, 224, 224);"> <td>#1</td> <td>2012_02_07 03:02:08</td> <td>some-process_log</td> <td>file received</td> </tr> <tr>...</tr> </table> I'm aiming for Splunk events that look like: Seq=#1 Time=2012_02_07 03:02:08 Source=some-process_log Event=file receive What I see is: The header row with <TH>'s isn't captured as part of the the match field, I've played with other CSS selectors and, while results differ, they're still not right In the above preview query I can sort of get close by piping to ** | fields match | mvexpand match |...** where I could then probably further split on the '|' separator, however When I do a Splunk query such as sourcetype="server_status" I get a single event in which match is single valued and I have nothing seemingly to split on The match field actually gets leading/trailing and doubled up "|" delimiters, presumably because of the start/end <> tags? A la |39||#45||2013_08_09 22:20||some processing source||Sending confirmation email| This all seems like I'm going down the wrong rabbit hole. Can someone advise the right way to get this done? Thanks!

kmarx · ‎11-29-2018

I'm have a custom command that parses an input field in each given record and emits 0 to N records as its output. I'm doing this to avoid a bunch of mvzip/mvexp and logic in the calling SPL. It does seem to work fine, but I'd like some reasurance that this is really supported in the SDK. The SDK doc for StreamingCommand of the PythonSDK (http://docs.splunk.com/Documentation/PythonSDK) says (bold highlighting mine), Streaming commands typically filter, augment, or update, search result records. Splunk will send them in batches of up to 50,000 records.... This (and the rest of the article and others like it that I've found) don't really seem to specify how many records can be returned. The above seems to suggest that it's really supposed to be 1:1 and not 1:n. Here's an edited down version of my code. (I'm also a Python newbie, so apologies for any ugliness there.) import sys from mytokeninfo import Info sys.path.append("splunk_sdk-1.6.5-py2.7.egg") from splunklib.searchcommands import dispatch, StreamingCommand, Configuration @Configuration(local=True) # Per doc on "stateful" streaming commands class ExStatefulCommand(StreamingCommand): def stream(self, records): for record in records: tokens = self.parseRecordForTokens(record) for token in tokens: info = self.processToken(token) record['newField1'] = info.field1 # Application specifics simplified here for clarity (hopefully) record['newField2'] = info.field2 # ...etc yield record So, for each record, I'm augmenting it one or more times and also yield'ing it each time. If so, I'd love to see the doc for it. If not, can I get an explanation as to why and also suggestions for how best to deal with this in a clean and proper manner?

kmarx · ‎11-29-2018

I'm trying to optimize execution of a custom command by caching information it processes, but just for the duration of the currently execution SPL. My custom command does something like this for each input record Parse a field from an application event in Splunk into 1 or more tokens for further processing For each token, do some expensive proccessing on the token and cache the results. This cache is really only valid for the scope of the SPL query since the results of the processing can differ each time. E.g., a restful API query for information that can be updated over time The SDK doc for StreamingCommand of the PythonSDK (http://docs.splunk.com/Documentation/PythonSDK) says (bold highlighting mine), Streaming commands typically filter, augment, or update, search result records. Splunk will send them in batches of up to 50,000 records. Hence, _a search command must be prepared to be invoked many times during the course of pipeline processing. _ Each invocation should produce a set of results independently usable by downstream processors. My question here is: How can I maintain my cache of expensive processing results for the full scope of the SPL query and only for the query duration? That is, maintain the cached information over multiple command invocations for a given SPL query. I do see multiple invocations causing certain tokens to be needlessly processed multiply. My current cache is simply a Python dict() but I'm not picky. I do, however, need to know SPL start/end somehow so I can init and delete the chache. Such as a pre/post query callback. Or, for that matter, some way of hooking up my stateful command data to the query so that I can fetch it again via, say, the self.service.token or some such. Here's an edited down version of my code. (I'm also a Python newbie, so apologies for any ugliness there.) import sys from mytokeninfo import Info sys.path.append("splunk_sdk-1.6.5-py2.7.egg") from splunklib.searchcommands import dispatch, StreamingCommand, Configuration # A global cache of already processed "token" results to avoid # doing more than absolutely necessary knownTokens = None @Configuration(local=True) # Per doc on "stateful" streaming commands class ExStatefulCommand(StreamingCommand): def stream(self, records): for record in records: token = self.parseRecordForToken(record) if token not in knownTokens: self.processAndCache(token) # Call one or more restful APIs for this token and save results info = knownTokens[token] record['newField'] = info.field # Application specifics simplified here for clarity (hopefully) yield record Also, in the working code, I've put in some logging and definitely see the knownTokens size go back to zero in the search log. So, to restate question, how to I keep my cache populated, but just until the end of the calling SPL?

Posts	9
Solutions	0
Karma Given	3
Karma Received	0
Member Since	‎11-26-2018

Online Status	Offline
Date Last Visited	‎06-22-2023 06:51 PM

NMON Performance Monitor for Unix and Linux System...

Input a simple html table to splunk fields/values ...

Custom Search Command - Can I emit multiple record...

Custom Search Command - How can I maintain state f...

Re: How to pass alert results to custom alert acti...

Re: Bug? Drilldown search causes "error:113", in d...

Re: NMON Performance Monitor for Unix and Linux Sy...

Re: NMON Performance Monitor for Unix and Linux Sy...

NMON Performance Monitor for Unix and Linux System...

Re: Input a simple html table to splunk fields/val...

Input a simple html table to splunk fields/values ...

Custom Search Command - Can I emit multiple record...

Custom Search Command - How can I maintain state f...