Solved: Documentation Clarification

ChristianF · ‎08-01-2023

Hey Splunk community,

I've been getting turned around in the docs, as some things are meant for folks running a single instance and others for a distributed environment. I'm currently running an environment with the following:

	Search Head (Windows Server 2019)	Indexer 1 (CentOS Stream 9)	Indexer 2 (CentOS Stream 9)	Indexer 3 (CentOS Stream 9)
CPU	48 Cores	24 Cores	24 Cores	24 Cores
Disk Space	4 TB	500 GB (expandable)	500 GB (expandable)	500 GB (expandable)
Roles	Deployment Server, Cluster Manager, License Manager, Search Head	Indexer	Indexer	Indexer

I have a Syslog server running Syslog-ng (the service won't start, but that's for another post).

Now to the main part of the post: I am principally trying to do two things right now. I have forwarders installed on my file servers and one of my domain controllers, but the documentation is not clear on which route I need to take to ingest file data and AD data. Do I deploy an app to my forwarders that will "automagically" ingest the data I am looking for, or do I create an inputs.conf file to monitor the events I am looking for? Specifically, I want file reads, modifications and related data. I would also like to monitor my AD for logins and administration data.

Any help would be appreciated.

gcusello · ‎08-01-2023

Hi @ChristianF,

First of all, the Community isn't the best place to design an architecture like yours, because it requires a deep and detailed requirements analysis and design by a Splunk Architect!

Anyway, looking at your hardware (without knowing your requirements), I see some incorrect configurations:

- On the Search Head, you don't need 4 TB of storage: SHs usually contain only Splunk and apps, so usually between 100 and 200 GB, not more.
- 500 GB on each indexer is probably too little; storage should be sized using the following data:
  - daily indexing rate,
  - retention (in days),
  - presence of a Cluster; if there's a Cluster, the Search Factor and Replication Factor are also needed.
- The Cluster Master cannot be on the same server as the Search Head; it requires a dedicated server with the minimum reference hardware (12 CPU, 12 GB RAM, 100 GB disk).
- The Deployment Server cannot be on the same server as the Search Head and, if it has to manage more than 50 clients, it requires a dedicated server with the minimum reference hardware (12 CPU, 12 GB RAM, 100 GB disk).
- The License Master can be on the Search Head, but I usually prefer to locate it on the Cluster Master.
- The CPUs depend on the users and scheduled searches you plan to manage and on the presence of Premium Apps like Enterprise Security or ITSI, so I cannot say whether they are correct or not.
- You didn't describe the RAM in your servers.

Here you can find some information on reference hardware: https://docs.splunk.com/Documentation/Splunk/9.1.0/Capacity/Referencehardware , but I suggest hiring a Splunk Architect!

Then, where do you want to locate the syslog-ng server? I usually prefer rsyslog, but it's the same thing, and I prefer to locate it on a Universal Forwarder.

It's better to have two Universal Forwarders with syslog-ng and a Load Balancer in front of them, so you have a redundant architecture and don't lose syslogs during maintenance or failures.
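As a rough sketch of the Universal Forwarder side, assuming syslog-ng writes one directory per sending host under /var/log/remote (a hypothetical path, and the index name is also made up), an inputs.conf stanza could look like this:

```ini
# inputs.conf on each syslog Universal Forwarder (illustrative values only)
# Assumes syslog-ng writes files as /var/log/remote/<host>/<file>.log
[monitor:///var/log/remote]
disabled = 0
sourcetype = syslog
# take the 4th path segment (/var/log/remote/<host>) as the host field
host_segment = 4
# hypothetical index name; the index must already exist on the indexers
index = network
```

Adjust the path and host_segment to match whatever directory layout your syslog-ng destinations really use.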

To take logs from Domain Controllers or file servers, you have to install a Universal Forwarder on each of them and deploy apps to them using the Deployment Server.

Apps will ingest the logs and send them to the indexers.

You don't need to create an inputs.conf, you can take the one in the apps to deploy (e.g. the Splunk_TA_Windows) and enable the inputs you want; you only need to write your own inputs for custom Add-Ons.
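For example, a minimal sketch of enabling one input, assuming your version of the add-on ships a [WinEventLog://Security] stanza that is disabled by default (check its default/inputs.conf) and that an index called wineventlog exists (a hypothetical name):

```ini
# Splunk_TA_windows/local/inputs.conf (illustrative)
[WinEventLog://Security]
# the stanza is assumed to ship disabled in default/inputs.conf
disabled = 0
# hypothetical index name; omit this line to use the default index
index = wineventlog
```

Logon events and, if object access auditing is enabled on the file servers, file access events are both written to the Windows Security event log, so this single input is a reasonable starting point for both needs.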

I suggest following some training on Splunk architecture and administration, or at least watching some videos on the Splunk YouTube channel.

Ciao.

Giuseppe

gcusello · ‎08-01-2023

Hi @ChristianF ,

good for you, see you next time!

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉

ChristianF · ‎08-01-2023

Hi Giuseppe, the main search head server is actually a holdover from an older SIEM system that I repurposed for Splunk. I did speak with a Splunk Architect about that, who said that for the deployment size in its current state it would be fine to host all these roles on that server because of how powerful it is. I can't convert it to a VMware ESXi host as I would like, because some of its hardware is incompatible with ESXi. Each of the indexers has 64 GB of RAM and the Search Head has 128 GB of RAM.

As for the Syslog-ng server, I chose it after some research purely because of the higher quality of the documentation on their website; I don't really have a preference otherwise. I do have a Universal Forwarder installed on it, monitoring the destination locations, but the Syslog-ng service refuses to start (no error code beyond a generic failure message), and the journald and systemctl logs don't offer any more insight. Despite my extensive modifications, the syntax checks out: the "syslog-ng --syntax-only" command came back clean.

I do have Universal Forwarders installed on the domain controllers and file servers, but the clarification you added ("You don't need to create an inputs.conf, you can take the one in the apps to deploy (e.g. the Splunk_TA_Windows) and enable the inputs you want") was actually what I was looking for!

Could you provide a documentation link on how to implement a load balancer, or would I need third-party software for this?

gcusello · ‎08-01-2023

Hi @ChristianF,

About the hardware: a server like this is wasted as a Search Head. Usually SHs are virtual machines with very little disk, not 4 TB; this kind of machine is used as an Indexer, because the first requirement for an indexer is 800 IOPS.

About the syslog server: rsyslog or syslog-ng, it's the same.

About Load Balancer, here you can find some information: https://docs.splunk.com/Documentation/Splunk/9.1.0/Forwarding/Setuploadbalancingd
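That page covers the forwarder's built-in load balancing across indexers, which is configured in outputs.conf. A minimal sketch, assuming the three indexers listen on port 9997 and using made-up host names and a made-up group name:

```ini
# outputs.conf on the forwarders (host names and group name are placeholders)
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
# how often, in seconds, the forwarder switches to another indexer in the list
autoLBFrequency = 30
```

This only spreads traffic from forwarders to indexers; it does not balance syslog senders across the syslog receivers.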

Anyway, for syslog you need a virtual IP that continuously checks the availability of the destination servers (the UFs). Just one caveat: it must run in transparent mode, so that the source IP is preserved.

About the Windows TA, you should copy the inputs.conf that you can find in it from default to local, analyze it and enable the inputs you need.

Obviously you have to do this on the Deployment Server, so you can deploy the configurations to the UFs.
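A sketch of what that could look like on the Deployment Server, assuming the add-on has been copied into deployment-apps and that its default inputs.conf contains the stanzas named below (stanza names can differ between add-on versions, so treat them as assumptions and check your copy):

```ini
# $SPLUNK_HOME/etc/deployment-apps/Splunk_TA_windows/local/inputs.conf
# Only the settings that differ from default/inputs.conf need to appear here.

# Active Directory change monitoring; often enabled on a single domain
# controller only, to avoid collecting the same changes several times
[admon://default]
disabled = 0
monitorSubtree = 1

# Windows System event log
[WinEventLog://System]
disabled = 0
```

After editing, running `splunk reload deploy-server` on the Deployment Server should make the clients pick up the change at their next check-in.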

About the multiple roles on the SH: the problem isn't whether it has the power to run all the roles; some roles aren't compatible. Search Head and Master Node cannot be on the same server.

In addition, the DS requires a dedicated machine if it has to manage more than 50 clients.

Ciao.

Giuseppe

richgalloway · ‎08-01-2023

You want to use a combination of both methods. Create an app that contains an inputs.conf file. The file will specify the inputs you wish to ingest. Deploy the app to your forwarders using the DS.

You may want to use multiple apps with different inputs.conf files. One might be used for AD inputs and would be deployed only to AD servers. Other apps might be deployed to other servers.
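A minimal serverclass.conf sketch of that idea, assuming two hypothetical apps (ad_inputs and fileserver_inputs) under deployment-apps and made-up host name patterns:

```ini
# serverclass.conf on the deployment server (all names are placeholders)

# Domain controllers receive only the AD inputs app
[serverClass:domain_controllers]
whitelist.0 = dc*.example.com

[serverClass:domain_controllers:app:ad_inputs]
restartSplunkd = true

# File servers receive only the file-audit inputs app
[serverClass:file_servers]
whitelist.0 = fs*.example.com

[serverClass:file_servers:app:fileserver_inputs]
restartSplunkd = true
```

Each app carries its own inputs.conf, so a forwarder only runs the inputs meant for its server class.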

---
If this reply helps you, Karma would be appreciated.
