Technically there is no limit on the number of files that one Splunk instance can monitor. We will read from disk as fast as the underlying storage allows (assuming maxKBps = 0 and a healthy, unsaturated indexing tier). That said, the real-world limiting factor is how fast we can read from disk and/or forward data to the indexing tier. If these files never grow past 20MB, the tailing processor will always read them. Once they grow beyond 20MB they are deferred to the batch reader. In this case, if all 100k files are being constantly updated, we can run into a situation where the tailing processor or batch reader gets stuck reading a single file. This is because, by default, both will read until EOF and then wait 3 seconds (time_before_close) before switching to the next file in the queue. So you can very easily end up stuck reading one file for a long time and failing to read other files before they are rotated or deleted. You can use the tailing processor REST endpoint to determine which files are currently being read and which are in the queue.
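For reference, checking that endpoint looks roughly like this (a sketch assuming the default management port 8089 and admin credentials; adjust for your environment):

    curl -k -u admin:changeme https://localhost:8089/services/admin/inputstatus/TailingProcessor:FileStatus

The output lists each file tailing knows about, along with details like its current read position.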
(pre 6.3 - I will add more details about the new features in 6.3 shortly)
Based on experience, 100k files being monitored by a single instance will always lead to these kinds of issues and/or high indexing latency. I would highly recommend that the customer split the monitoring workload across at least two instances.
Also, check maxKBps in limits.conf and make sure it is tuned to an acceptable value. We log warning messages about hitting the maxKBps limit, as this indirectly throttles how fast we read files.
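For illustration (example value only, not a recommendation), the thruput cap lives in limits.conf on the forwarder:

    [thruput]
    # 0 disables the throttle; any positive value is KB per second
    maxKBps = 0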
Additional comments from jrodman
Typically, it's true that at very large numbers of files, the aggregate data rate becomes a problem first: how much data the forwarder can digest per second (CPU limited), how fast it can transmit over the network (sometimes CPU limited, sometimes network limited), or, most commonly, how fast the indexing tier can push to disk in aggregate (tricky, and can involve contention with searches). This means the ability of tailing to read from large NUMBERS of files typically isn't relevant, as very large installations hit the above problems first.
This means that when data is "spotty", it almost always means that the system as a whole is not able to keep up with the aggregate data rate, not that there is a problem with tailing (file monitoring) itself. Of course, we need to look at the system diagnostics to confirm this, but file count would not be the first guess.
Aside: As of Splunk Enterprise 6.3 (and other downstream splunkd products), Splunk has implemented parallel ingestion pipelines, allowing the use of more CPU to process data. This should lower some of those classic CPU bottlenecks, at the price of a higher total core count in use for incoming data.
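If it's useful, the feature is controlled by a setting along these lines in server.conf (the value is only an example; it should be sized to the cores actually available):

    [general]
    # number of independent ingestion pipeline sets
    parallelIngestionPipelines = 2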
Regardless of those truths, tailing can still get into trouble in two sorts of situations:
Situation 1, uncached metadata in the hundreds of thousands:
In brief, if the files are stored somewhere that the current size and time can remain "warm", like in caches, then 10k to 100k files can work okay. But if the files are somewhere this cannot be done, like NFS, the metadata requests are likely too demanding at file counts around 100k.
Situation 2, millions of files:
Into the millions of monitored files, either the per-file memory cost of tracking the files or the I/O cost of retrieving the current size and time of the files will become too large.
Situation 1 in detail:
Tailing checks frequently for whether files have changed, in order to queue those files to be read promptly. Because the goal is to achieve near-realtime in well-maintained conditions, the maximum intended wait time between checks is around a second. (There are other timers for error conditions, files currently open, etc.) This means a minimum intended 10,000 to 100,000 (relative to file count) stat() calls on unix, or GetFileInfo calls on Windows, every second. If the backing information (the file size and time) is in memory, this isn't a big problem. However, if the files are stored on a storage system that cannot cache that information locally (some types of networked or clustered filesystems), the result may be an unworkably high number of I/O operations per second.
The file monitoring code will gracefully degrade if it cannot achieve this intended schedule of file checks. The files will still be checked, and changed files will still be read. However, not meeting the intended rate of file checks typically means that the storage system is being significantly taxed by the random IOPS of metadata retrieval. This lessens its capacity to serve the actual file data, and on a shared-function storage system it can introduce contention with other applications as well.
Situation 2 in detail:
It may be surprising, but it is unavoidably necessary for the file monitoring component to keep some amount of memory in use for every discovered file (even files you rule out with controls like ignoreOlderThan). This means that as the number of files grows, the RAM needed by file monitoring grows too. In practice, 100k files will use a significant amount of RAM (perhaps hundreds of megabytes), and millions of files will easily reach multiple gigabytes. That can be accommodated (with some grumbling, perhaps), but at some point, e.g. 100 million files, it will not work.
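For context, ignoreOlderThan is set per monitor stanza in inputs.conf; a sketch with a placeholder path and threshold:

    [monitor:///var/log/myapp]
    # skip files not modified in the last 7 days
    # (they are still discovered and still cost some memory to track)
    ignoreOlderThan = 7d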
More likely to be a problem is the same issue from Situation 1. With millions of files, it becomes more likely that the operating system's caching logic will not keep all the file status information in RAM, or simply that the system calls to request that information will become too much of a CPU bottleneck for the system to perform smoothly.
If there is a true need (especially a growing one) to monitor multiple millions of files from one installation, please present that case very actively through both support and sales channels.