It would take a major catastrophe to bring down an entire AWS region. Spreading an app across availability zones in a single region is sufficient in most cases; it all depends on your risk tolerance, of course. Indexer replication is key to data protection, but it doesn't have to replicate to another region. The copies can live in other zones.

I understand the goal of minimizing outside dependencies. Splunk, however, is not fully HA and doesn't try to be. For instance, there can be only one cluster manager (CM), and there is no built-in mechanism for a hot CM to keep a cold CM current. Fortunately, that's not a problem, since a fresh CM can easily rebuild its state from information supplied by the indexers. The indexers just need to know where the new CM is, and that can be done using DNS (or other networking tricks).

Patient: Doctor, it hurts when I do this.
Doctor: Well, don't do that.

If a Splunk component cannot communicate with another, necessary Splunk component, then something is wrong with the architecture. Firewall rules or other changes need to be made so components can talk to each other as intended. Requiring forwarders to send data only to local indexers is reasonable and commonplace; it works well as long as the local indexers can replicate data to remote indexers. Requiring forwarders to talk only to a local deployment server/license manager/cluster manager (DS/LM/CM) is also common, mainly because most customers have only one. If the DS/LM/CM fails, the forwarder continues to function using the most recent configuration it has until the server is restored.

I like your idea #2. Avoid using intermediate forwarders as in your idea #3; that adds complexity and can hamper performance. Stick with your multi-site cluster to ensure your data exists in two places.
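As a sketch of the DNS approach mentioned above, each indexer can point its clustering stanza at a stable DNS alias instead of a specific host, so standing up a replacement CM only requires updating the CNAME. The hostname here is hypothetical, and on newer Splunk versions the equivalent settings are `mode = peer` and `manager_uri`:

```ini
# server.conf on each indexer (cluster peer)
# "cm.example.com" is a hypothetical CNAME that you repoint
# at the replacement CM after a failure
[clustering]
mode = slave
master_uri = https://cm.example.com:8089
pass4SymmKey = <your cluster secret>
```

Because the peers retain the bucket data, the fresh CM rebuilds its view of the cluster from the peers as they re-register through the alias.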
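The local-indexers-only pattern can be sketched in the forwarder's outputs.conf along these lines (hostnames are hypothetical). The forwarder sends only to its own site's indexers, and multisite replication carries the copies to the remote site:

```ini
# outputs.conf on a site-1 forwarder
[tcpout]
defaultGroup = site1_indexers

[tcpout:site1_indexers]
# load-balance across the local indexers only;
# cross-site copies come from indexer replication, not the forwarder
server = idx1.site1.example.com:9997, idx2.site1.example.com:9997
# request indexer acknowledgment so data is re-sent if an indexer fails mid-stream
useACK = true
```

This keeps the forwarder's network dependencies local while the cluster handles geographic redundancy.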