While it is okay to continue reading from a file that has been deleted (the OS guarantees that an already-open file handle remains readable), this is not "safe": your Splunk instance could be restarted (or could crash) before the data is indexed, especially if indexing is held up by hardware errors or similar. In that case, you will have deleted your source file and will have no way to index the missing data.
"lukejadamec" 's suggestion that data is living in the index queue upon return 0 from "splunk add oneshot" is incorrect - this would imply that, for a 5 GB file, we load all 5 GB into memory before returning from the command. Instead, the file is read & indexed in a streaming fashion.
The best way to tell whether a file has been fully indexed is to verify that the event count for the file is correct in the index (in other words, run a search like source=foo.log | stats count, or a metasearch, or similar). However, this is obviously difficult in the case of multiline events and/or incorrect event parsing settings.
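For example, a count search from the CLI might look like the line below. This is only a sketch: it assumes the file landed in the default index, that its source field is exactly "foo.log" rather than a full path, and placeholder admin:changeme credentials.

    $ splunk search 'source="foo.log" | stats count' -auth admin:changeme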
Therefore, the most reliable way to tell whether a oneshot file has been indexed is the following type of heuristic:
1) $ splunk add oneshot foo.log
2) Query the REST API at /services/data/inputs/oneshot and observe the status of the item named foo.log (Bytes Indexed vs. Size); see the polling sketch after this list.
3) Eventually the file will be fully read and mostly indexed, with the remaining bits sitting in various queues awaiting indexing. Once this condition is hit, foo.log will no longer appear in the REST API listing.
At this point, the data should finish indexing quickly. However, various issues could still prevent proper indexing, such as running out of disk space, a downed network connection to a downstream indexer, an influx of data from other sources, etc. Therefore, the last step is:
4) Run timed searches (perhaps every 30 seconds) checking the event count for foo.log, until it stabilizes, meaning the count hasn't changed for a few minutes. At this point, it is reasonable to consider the data fully indexed (the sketch below automates this check as well).
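Here is a rough shell sketch of steps 2 through 4. It is not an official procedure, just an illustration under a few assumptions that are not part of the original steps: a local instance on the default management port 8089, admin:changeme credentials, and the file indexed with source="foo.log". Adjust host, credentials, search, and timing to your environment.

    #!/bin/sh
    # Steps 2-3: poll the oneshot endpoint until foo.log no longer appears.
    # (Assumes local instance, port 8089, and admin:changeme credentials.)
    while curl -sku admin:changeme \
        "https://localhost:8089/services/data/inputs/oneshot?output_mode=json" \
        | grep -q 'foo.log'; do
      sleep 5
    done

    # Step 4: re-run the count search every 30 seconds until the result
    # stops changing for several consecutive checks (~3 minutes here).
    prev=-1; stable=0
    while [ "$stable" -lt 6 ]; do
      count=$(splunk search 'source="foo.log" | stats count' -auth admin:changeme | tail -1)
      if [ "$count" = "$prev" ]; then
        stable=$((stable + 1))
      else
        stable=0
      fi
      prev=$count
      sleep 30
    done
    echo "foo.log looks fully indexed: $count events"

The "stable for 6 consecutive 30-second checks" threshold is arbitrary; pick whatever window gives you confidence given your indexing latency.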