Greetings, fellow delvers of the deep data...
We recently made some changes to indexes.conf because we were not sure the config was doing what we wanted it to do.
That poorly considered decision moved 2.6 TB of data out of hot/warm into cold, and prematurely aged another 400 GB out to frozen.
Let me preface this by saying for our size, we have a lot of SSD on our new indexers.
Current disk footprint:
/dev/mapper/splunkdatacold 8.8T 4.4T 4.5T 49% /splunkdatacold
/dev/mapper/splunkdatafrozen 15T 2.5T 13T 17% /splunkdatafrozen
/dev/mapper/vg02-lv01 650G 414G 237G 64% /splunkdatamodels
/dev/mapper/vg02-lv00 11T 933G 9.1T 10% /splunkdatahot
The result of that grand data shuffle is that I have an 11 TB SSD volume for hot with 9.1 TB of unused space. I also have 5 TB unallocated in the volume group that I could add.
I'm told that amount of SSD is somewhat uncommon, but we were going for a reasonably future proof configuration. I am about to cluster another indexer with the same space footprint into this, so I want to make sure we are utilizing the SSD effectively.
For the purpose of this question, I will stick to one of our big indexes.
We had these two settings in place, and turning them off is what caused the multi-terabyte shift out of hot:
maxHotIdleSecs = 86400
maxWarmDBCount = 6800
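For reference, this is my understanding of what those two parameters control (the comments are my own summary of the indexes.conf spec, not official wording):

```ini
# maxHotIdleSecs: a hot bucket that receives no new data for this many
# seconds is rolled to warm. 86400 = roll idle hot buckets after one day.
maxHotIdleSecs = 86400

# maxWarmDBCount: the maximum number of warm buckets kept in homePath.
# Once the count is exceeded, the oldest warm buckets roll to cold.
maxWarmDBCount = 6800
```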
Here is the definition of one of our bigger, and fairly typical, indexes:
[networks]
homePath = volume:hot/networks/db
coldPath = volume:cold/networks/colddb
thawedPath = $SPLUNK_DB/networks/thaweddb
maxTotalDataSizeMB = 5083636
homePath.maxDataSizeMB = 3389260
coldPath.maxDataSizeMB = 1694376
#explicit path to frozen directory
coldToFrozenDir = /splunkdatafrozen/networks
We had thought we could get a 2/3 to 1/3 ratio between hot+warm and cold by specifying maxTotalDataSizeMB, then setting homePath.maxDataSizeMB and coldPath.maxDataSizeMB so that those two values added up to the maxTotal value.
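The arithmetic we were following can be sketched like this (a hypothetical helper, just to illustrate the split we were aiming for; the function name is my own):

```python
# Derive a 2/3 : 1/3 home (hot+warm) / cold split from a total index
# budget. All values are in MB, the unit indexes.conf uses here.
def split_budget(max_total_mb, home_fraction=2 / 3):
    home_mb = int(max_total_mb * home_fraction)
    cold_mb = max_total_mb - home_mb  # remainder is the cold budget
    return home_mb, cold_mb

# The networks index budget from the stanza above.
home, cold = split_budget(5083636)
print(home, cold)
```

The two values this produces are within rounding distance of the 3389260 / 1694376 pair we configured, and they always sum exactly to maxTotalDataSizeMB.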
So the question is: how do you go about engineering a 2/3 to 1/3 split between hot and cold? We want to utilize the SSDs as much as possible, which is why we have been playing with the maxWarmDBCount and maxHotIdleSecs parameters.
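One approach I have seen suggested is to cap each volume as a whole with maxVolumeDataSizeMB, in addition to the per-index caps. A sketch, assuming the hot and cold volumes referenced in the stanzas above, with hypothetical sizes chosen for a roughly 2/3 to 1/3 split of a 15 TB budget:

```ini
[volume:hot]
path = /splunkdatahot
# cap the whole hot/warm volume (hypothetical value, ~10 TB)
maxVolumeDataSizeMB = 10000000

[volume:cold]
path = /splunkdatacold
# cap the whole cold volume (hypothetical value, ~5 TB)
maxVolumeDataSizeMB = 5000000
```

When a volume hits its cap, Splunk rolls the oldest warm buckets in it to cold (or freezes the oldest cold buckets), which enforces the ratio at the volume level rather than per index.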
Today we turned both of those back on and changed the networks sizing a bit, just to see what actually happens, because the previous configuration never pushed hot past 30% usage.
Current parameters in place:
maxHotIdleSecs = 86400
maxWarmDBCount = 6800
[networks]
homePath = volume:hot/networks/db
coldPath = volume:cold/networks/colddb
thawedPath = $SPLUNK_DB/networks/thaweddb
maxTotalDataSizeMB = 5083636
#explicit path to frozen directory
coldToFrozenDir = /splunkdatafrozen/networks
homePath.maxDataSizeMB = 5083636