Getting Data In

Request for Best Practices on Collecting Essential Data from Endpoints to Splunk

kn450
Explorer

Dear Splunk Community,

I am currently working on a project focused on identifying the essential data that should be collected from endpoints into Splunk, with the goal of avoiding data duplication and ensuring efficiency in both performance and storage.

Here’s what has been implemented so far:

  • The Splunk_TA_windows add-on has been deployed.

  • The inputs.conf file has been configured to include all available data.

  • Sysmon has been installed on the endpoints.

  • The Sysmon event log path has been added to inputs.conf and is collected using the default configuration from the Splunk_TA_windows add-on.

  • In addition, we are currently collecting data from firewalls and network switches.
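For reference, the Sysmon channel is enabled with a stanza along these lines (this is a typical example rather than our exact configuration; the index name is a placeholder):

```
[WinEventLog://Microsoft-Windows-Sysmon/Operational]
disabled = 0
renderXml = 1
index = endpoint
```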

I’ve attached screenshots showing the volume of data collected from one endpoint over a 24-hour period. The data volume is quite large, especially in the following categories:

  • WinRegistry

  • Service

Upon reviewing the data, I noticed that some information gathered from endpoints may be redundant or unnecessary, especially since we are already collecting valuable data from firewalls and switches. This has led me to consider whether we can reduce the amount of endpoint data being collected without compromising visibility.

I would appreciate your input on the following:

  1. What are Splunk's best practices for collecting data from endpoints?

  2. What types of data are considered essential for security monitoring and analysis?

  3. Is relying solely on Sysmon generally sufficient in most security environments?

  4. Is there a recommended framework or guideline for collecting the minimum necessary data while maintaining effective monitoring?

I appreciate any suggestions, experiences, or insights you can share. Looking forward to learning from your expertise.


livehybrid
Super Champion

Hi @kn450 

I'll try and keep this brief! 

Best practice for endpoint data collection varies by organisation, industry, size, and threat profile (e.g., insider threats, advanced attackers, commodity malware). The right approach is to define your critical security use-cases (such as lateral movement, privilege escalation, or unauthorised access) and then determine what data to collect to support those cases. Relying on tuned Windows event logs provides robust coverage for most behavioral detection; additional Splunk_TA_windows inputs should only be enabled if tied directly to your specific use-cases.

Essential Windows event types commonly used for security monitoring:

  • Logon/Logoff events: 4624 (logon), 4634 (logoff), 4625 (failed logon)
  • Process creation: 4688 (Windows Security log), Sysmon Event ID 1 (with command-line)
  • Network connections: Sysmon Event ID 3
  • File creation/modifications: Sysmon Event ID 11
  • Registry changes: Sysmon Event ID 13 (collect selectively, only if relevant to a use case)
  • Security group/user changes: 4720-4732 (account and group modifications)
  • Service creation/modifications: 7045 (filtered for suspicious activity), Sysmon Event ID 7
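If license volume is a concern, the Security-log IDs above can be selected at the input layer rather than collecting everything. The whitelist syntax is standard for WinEventLog inputs; the specific ID list below is illustrative and should be tuned to your own use-cases:

```
[WinEventLog://Security]
disabled = 0
# Collect only the event IDs tied to the use-cases above (illustrative list)
whitelist = 4624,4625,4634,4688,4720-4732
```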
 

The free Splunk Security Essentials app can help you map your currently collected data to known security use-cases and identify gaps, aligning your ingestion strategy with what provides actionable security value. I would highly recommend installing this and having a look!


Avoid enabling all data sources by default. Regularly review event types and volumes, filtering or disabling sources that do not support your prioritised detection requirements, to control storage and license use.
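A quick way to review which sourcetypes are consuming license when deciding what to trim is the standard license usage search (adjust the time range as needed; `st` is the sourcetype field in license_usage.log):

```
index=_internal source=*license_usage.log* type="Usage"
| stats sum(b) AS bytes BY st
| eval GB=round(bytes/1024/1024/1024, 2)
| sort - GB
```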

Are you using Splunk Enterprise Security, or are you planning to create your own rules? There is a lot we haven't covered here, such as making sure data is CIM-compliant/mapped, hardware sizing, etc. This is the sort of thing that would usually be architected out at the start to ensure you have the right data and resources available.

I'd recommend checking out the following links too:

Data source planning for Splunk Enterprise Security

Lantern - Getting data onboarded to Splunk Enterprise Security

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing


PickleRick
SplunkTrust

There is no single "best practice" here. It all depends on what you want to achieve. If you have specific use cases and detections, you might want to limit your data to only the data directly contributing to those detections and nothing else. Generally, the extent of the data gathered will differ depending on your detections.

But if you want to have data for subsequent forensics or threat hunting, then you might want to have as much data as possible. As long as you know what your data is about (and that's the most important thing with data onboarding - don't onboard unknown data just because "it might come in handy one day"), you want as much data as you can afford given your environment size and license constraints.


kiran_panchavat
Influencer

@kn450 

Configuring Windows event logs for Enterprise Security use

https://lantern.splunk.com/Security/Product_Tips/Enterprise_Security/Configuring_Windows_event_logs_...

Ensure that data sent from endpoints to Splunk is encrypted using SSL/TLS.

Configure Splunk indexing and forwarding to use TLS certificates - Splunk Documentation
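On the forwarder side, the TLS settings live in outputs.conf. A minimal sketch, assuming a certificate has already been provisioned - the hostname, port, and paths are placeholders, and exact setting names vary by Splunk version, so check the linked documentation:

```
[tcpout:secure_indexers]
server = indexer.example.com:9997
clientCert = $SPLUNK_HOME/etc/auth/client.pem
sslPassword = <certificate password>
sslVerifyServerCert = true
```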

 

Did this help? If yes, please consider giving kudos, marking it as the solution, or commenting for clarification — your feedback keeps the community going!

kiran_panchavat
Influencer

@kn450 

Utilize heavy forwarders (HFs) to filter and route data based on event types, reducing unnecessary data ingestion.

Route and filter data - Splunk Documentation

Data collection architecture - Splunk Lantern
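A common pattern for dropping a noisy event type on the heavy forwarder is a nullQueue transform. A sketch, assuming classic (non-XML) rendering - the sourcetype and regex are illustrative and should match what you actually want to discard:

```
# props.conf
[WinEventLog:Security]
TRANSFORMS-drop_noise = drop_4663

# transforms.conf
[drop_4663]
REGEX = EventCode=4663
DEST_KEY = queue
FORMAT = nullQueue
```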


kiran_panchavat
Influencer

@kn450 

  • Security log: Critical for detecting login attempts, privilege escalation, and account changes (e.g., Event IDs 4624, 4648, 4672, 4720). Filter out noisy events like 4663 (file access audits) unless specifically needed.

  • System log: Useful for system-level events like service changes or crashes (e.g., Event IDs 7036, 7045). Limit to high-value events to reduce volume.
 
Avoiding Redundancy:
 
Firewall logs provide network traffic visibility (e.g., source/destination IPs, ports, protocols). Avoid collecting redundant network data from endpoints (e.g., excessive DNS or connection logs) unless it provides unique context, like process-level details from Sysmon.

https://lantern.splunk.com/Data_Descriptors/Firewall_data 

WinRegistry and Service: These are high-volume sources. Limit to specific keys (e.g., Run keys, AppInit_DLLs) and events (e.g., new service creation) to avoid collecting redundant or low-value changes.
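Limiting registry collection to those keys is done in the Sysmon configuration itself, so the noise never reaches Splunk. A sketch of the relevant filter (the schema version and key list are illustrative; adjust to your deployed Sysmon version):

```
<Sysmon schemaversion="4.90">
  <EventFiltering>
    <!-- Event IDs 12/13/14: only registry paths tied to persistence -->
    <RegistryEvent onmatch="include">
      <TargetObject condition="contains">CurrentVersion\Run</TargetObject>
      <TargetObject condition="contains">AppInit_DLLs</TargetObject>
    </RegistryEvent>
  </EventFiltering>
</Sysmon>
```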

kiran_panchavat
Influencer

@kn450 

Clearly define the security use cases (e.g., threat detection, incident response, compliance) to determine which data sources are necessary. Avoid collecting all available data without a purpose, as this increases storage and processing overhead. For example, focus on data that supports MITRE ATT&CK tactics like Execution, Persistence, or Credential Access. 
 
 