All Posts

There is no single "best practice" here. It all depends on what you want to achieve. If you have specific use cases and detections, you might want to limit your data only to the data directly contributing to those detections and nothing else. And generally the extent of the data gathered will differ depending on your detections. But if you want to have data for subsequent forensics or threat hunting, then you might want to have as much data as possible. As long as you know what your data is about (and that's the most important thing with data onboarding - don't onboard unknown data just because "it might come in handy one day"), you want as much data as you can afford given your environment size and license constraints.
@kn450
Configuring Windows event logs for Enterprise Security use: https://lantern.splunk.com/Security/Product_Tips/Enterprise_Security/Configuring_Windows_event_logs_for_Enterprise_Security_use
Ensure that data sent from endpoints to Splunk is encrypted using SSL/TLS: Configure Splunk indexing and forwarding to use TLS certificates - Splunk Documentation
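As a rough illustration of the forwarding-over-TLS piece, here is a minimal sketch (not a drop-in config) of outputs.conf on a forwarder and inputs.conf on an indexer. The hostnames, ports, and certificate paths are placeholders, and the exact settings differ between Splunk versions, so treat the linked documentation as authoritative.

# outputs.conf on the forwarder (paths and host are examples)
[tcpout:primary_indexers]
server = idx1.example.com:9997
clientCert = $SPLUNK_HOME/etc/auth/mycerts/forwarder.pem
sslPassword = <certificate key password>
sslVerifyServerCert = true

# inputs.conf on the indexer
[splunktcp-ssl:9997]
disabled = 0

[SSL]
serverCert = $SPLUNK_HOME/etc/auth/mycerts/indexer.pem
sslPassword = <certificate key password>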
@kn450
Utilize heavy forwarders (HFs) to filter and route data based on event types, reducing unnecessary data ingestion. See the sketch below, plus:
Route and filter data - Splunk Documentation
Data collection architecture - Splunk Lantern
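For example, a common pattern for dropping noisy events at the heavy forwarder is to route them to nullQueue with an index-time transform. This is only a hedged sketch - the sourcetype name depends on your TA version and whether you render XML, and the event codes are illustrative, so adjust both to whatever is actually noisy in your environment.

# props.conf on the heavy forwarder
[WinEventLog:Security]
TRANSFORMS-drop_noise = drop_noisy_eventcodes

# transforms.conf on the heavy forwarder
[drop_noisy_eventcodes]
# send file-access audits (4663) and filtering-platform noise (5156/5157) to nullQueue
REGEX = (?m)^EventCode=(4663|5156|5157)
DEST_KEY = queue
FORMAT = nullQueue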
@kn450
Security event log: critical for detecting login attempts, privilege escalation, and account changes (e.g., Event IDs 4624, 4648, 4672, 4720). Filter out noisy events like 4663 (file access audits) unless specifically needed.
https://community.splunk.com/t5/All-Apps-and-Add-ons/How-do-I-collect-basic-Windows-OS-Event-Log-data-from-my-Windows/m-p/440187
https://community.splunk.com/t5/Splunk-Enterprise-Security/What-s-the-best-practice-to-configure-a-windows-system-to/m-p/467532
Refer to this event code encyclopedia: https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/
System event log: useful for system-level events like service changes or crashes (e.g., Event IDs 7036, 7045). Limit to high-value events to reduce volume.
Avoiding redundancy: firewall logs provide network traffic visibility (e.g., source/destination IPs, ports, protocols). Avoid collecting redundant network data from endpoints (e.g., excessive DNS or connection logs) unless it provides unique context, like process-level details from Sysmon. https://lantern.splunk.com/Data_Descriptors/Firewall_data
WinRegistry and Service: these are high-volume sources. Limit collection to specific keys (e.g., Run keys, AppInit_DLLs) and events (e.g., new service creation) to avoid collecting redundant or low-value changes. See the inputs.conf sketch below.
https://www.splunk.com/en_us/blog/security/threat-hunting-sysmon-event-codes.html
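A hedged inputs.conf sketch showing the kind of scoping described above - the stanza names, event codes, and registry paths are examples only, so tune them to your own detections:

# inputs.conf (Splunk_TA_windows on the endpoints)
[WinEventLog://Security]
disabled = 0
# drop high-volume file-access auditing unless a detection needs it
blacklist1 = EventCode="4663"

[WinEventLog://System]
disabled = 0
# keep only high-value events such as service state changes (7036) and new service installs (7045)
whitelist = 7036,7045

# limit registry monitoring to persistence-related keys instead of the whole hive
[WinRegMon://run_keys]
hive = .*\\Software\\Microsoft\\Windows\\CurrentVersion\\Run.*
proc = .*
type = set|create|delete|rename
disabled = 0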
@kn450
Clearly define the security use cases (e.g., threat detection, incident response, compliance) to determine which data sources are necessary. Avoid collecting all available data without a purpose, as this increases storage and processing overhead. For example, focus on data that supports MITRE ATT&CK tactics like Execution, Persistence, or Credential Access.
https://riversafe.co.uk/resources/tech-blog/mastering-data-onboarding-with-splunk-best-practices-and-approaches/
Check this: https://community.splunk.com/t5/Splunk-Dev/Splunk-registry-monitor-splunk-regmon-generating-too-much-data/m-p/371705
Dear Splunk Community,
I am currently working on a project focused on identifying the essential data that should be collected from endpoints into Splunk, with the goal of avoiding data duplication and ensuring efficiency in both performance and storage.
Here's what has been implemented so far:
- The Splunk_TA_windows add-on has been deployed.
- The inputs.conf file has been configured to include all available data.
- Sysmon has been installed on the endpoints.
- The Sysmon inputs.conf path has been added to be collected using the default configuration from the Splunk_TA_windows add-on.
In addition, we are currently collecting data from firewalls and network switches.
I've attached screenshots showing the volume of data collected from one endpoint over a 24-hour period. The data volume is quite large, especially in the following categories: WinRegistry and Service.
Upon reviewing the data, I noticed that some information gathered from endpoints may be redundant or unnecessary, especially since we are already collecting valuable data from firewalls and switches. This has led me to consider whether we can reduce the amount of endpoint data being collected without compromising visibility.
I would appreciate your input on the following:
- What are Splunk's best practices for collecting data from endpoints?
- What types of data are considered essential for security monitoring and analysis?
- Is relying solely on Sysmon generally sufficient in most security environments?
- Is there a recommended framework or guideline for collecting the minimum necessary data while maintaining effective monitoring?
I appreciate any suggestions, experiences, or insights you can share. Looking forward to learning from your expertise.
I second @richgalloway 's doubts - your description of the problem is confusing  
OK. And those "fields" are...? Values of a multivalued field in a single event? Or just multiple values returned from a "stats values" command? Something else? Do you have any other fields in your data? Do you want them preserved?
@k1green97
To find values of Field1 that appear in Field2, you can use makeresults (the makeresults command lets you quickly generate sample data sets for testing) to prototype the logic, then run it against your real data with eval and where to filter the matching values.

You can try this query, replacing the index, sourcetype, and field names with your own:

index=my_index sourcetype=my_sourcetype
| stats values(Field1) as Field1_values, values(Field2) as Field2_values
| mvexpand Field1_values
| where Field1_values IN (Field2_values)
| table Field1_values
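For completeness, here is a hedged, self-contained sketch that uses makeresults with the sample values from the question to test the matching logic before pointing it at real data (mvmap requires Splunk 8.0 or later; mvfind treats its second argument as a regex, hence the anchors):

| makeresults
| eval Field1=split("17,24,36", ",")
| eval Field2=split("27,33,17,22,24,31,29,08,36", ",")
| eval matched=mvmap(Field2, if(isnotnull(mvfind(Field1, "^".Field2."$")), Field2, null()))
| table Field1 Field2 matched

With the sample values this should return matched = 17, 24, 36.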
I am not sure where to start on this. I have 2 fields. Field1 only has a few values while Field2 has many. How can I return the values of Field2 that appear in Field1?
Field 1: 17, 24, 36
Field 2: 27, 33, 17, 22, 24, 31, 29, 08, 36
We need more information. Please say more about the problem you are trying to solve.  It would help to see sample data and desired output.
As I said earlier, this should be doable. Just try to keep the mixed-mode time as short as possible. Of course you must have wide and fast enough connections between your environments, but currently I don't think that this is an issue.
thanks @livehybrid . Upvoted. I almost figured it out, but in a slightly different manner. I've got an Ansible setup for URL interaction and automation. The 'contentctl build' will produce an artefact similar to a Splunk app with `savedsearches.conf` and other things like `analyticsstories.conf`:

contentctl build --path content --app.title MY_DETECT --app.appid DA-ESS-MY_DETECT --app.prefix MY --app.label MY

Then I'm using the Ansible automation, which interacts with saved/searches and other endpoints, to insert it back. Two things I'm still figuring out:
- It is slow once savedsearches.conf has 50+ searches, as it runs one by one. Any chance the automation can detect whether a saved search has changed, and only insert it then?
- contentctl new: this option doesn't accept ALL parameters (like search and name), which means user input is required.

Update: I was able to insert into the system after contentctl by using the REST API "saved/searches". Though the type is specified as 'ebd' (event-based detection), once it is inserted into Splunk it becomes a 'saved search' type!! Any solutions/recommendations for this?
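Regarding only pushing searches that changed: one hedged approach (a sketch, not a tested playbook) is to GET the existing saved search first, compare its SPL with what contentctl generated, and only POST when they differ. The host, app, credentials, search name, and SPL below are all placeholders, and this only covers updating an existing search (creating a new one needs a POST to the saved/searches collection with name= instead):

#!/bin/sh
# Sketch: skip the REST update when the deployed SPL already matches the generated one.
SPLUNK=https://splunk.example.com:8089
APP=DA-ESS-MY_DETECT
NAME=MY_Suspicious_Process_Detection
NEW_SPL='index=endpoint sourcetype=XmlWinEventLog EventCode=4688'

# Fetch the SPL currently deployed for this saved search (empty if it does not exist yet)
CURRENT=$(curl -sk -u admin:changeme \
  "$SPLUNK/servicesNS/nobody/$APP/saved/searches/$NAME?output_mode=json" \
  | jq -r '.entry[0].content.search // empty')

# Only hit the REST endpoint when the generated SPL differs from what is deployed
if [ "$CURRENT" != "$NEW_SPL" ]; then
  curl -sk -u admin:changeme \
    "$SPLUNK/servicesNS/nobody/$APP/saved/searches/$NAME" \
    --data-urlencode "search=$NEW_SPL"
fi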
On-prem Splunk Enterprise Security environment: I just recently upgraded to Enterprise Security 9.4.1 and the ES app to 8.0.3. I was watching a video on using Mission Control, where an investigation was created from a notable event. Within the investigation, a search was run and added to the investigation. I want to do this, but when I select the event action drop-down within the search results, I don't have much there, just the default Splunk event actions.
Honestly, that tells me completely nothing. If sending a json array is so much cheaper than sending separate items from that array... there's something strange here. BTW, you are aware that you can simply send your events in batches? And that it's how it's usually done with high-volume setups? So you don't have to use a separate HTTP request for each event?
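To illustrate the batching point: the HEC event endpoint accepts several event objects concatenated in a single POST body, so one request can carry many events. A hedged sketch - the host, port, token, and sourcetype are placeholders, and -k only skips certificate verification for testing:

curl -k "https://splunk.example.com:8088/services/collector/event" \
  -H "Authorization: Splunk 11111111-2222-3333-4444-555555555555" \
  -d '{"event": {"action": "login",  "user": "alice"}, "sourcetype": "myapp:json"}
{"event": {"action": "logout", "user": "bob"},   "sourcetype": "myapp:json"}
{"event": {"action": "login",  "user": "carol"}, "sourcetype": "myapp:json"}'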
For verifying regexes, https://regex101.com is usually sufficient.

Anyway, if you're not using the capture group names for field extraction, don't use capture groups. It makes the regexes easier to read and saves a bit of performance, because Splunk doesn't have to retain the capture group contents. It's a tiny difference, but it's there. So since you're trying to "recast" the data to a static sourcetype, it's enough to use REGEX = \bssh\b to match your events.

And you're misunderstanding the relation between fields and capture groups. If you do | rex "(?<ssh>\bssh\b)", Splunk will create a field named "ssh" because that's what the capture group is named. But it will be matching against the whole raw message, because if you don't specify the field to match against, that's the default. You can extract data from a specific field using the field= parameter, like | rex field=message "(?<ssh>\bssh\b)". This would create a field named "ssh" only if a field named "message", already existing at this point of your search pipeline (either from the default extractions defined for your data or extracted/created manually), contained the word "ssh".

But anyway, this has nothing to do with transforms. With transforms, it's the SOURCE_KEY option (which defaults to _raw) that decides which field the REGEX will be matched against. One big caveat though (beginners often fall into this trap) - during ingest-time processing (and that's what you're trying to do) Splunk has no idea about search-time extracted fields. You can only use indexed fields in index-time transforms (and custom indexed fields must already have been extracted at that point). And again, index-time transforms have nothing to do with searching. (And datamodels are something else entirely, so let's not mix it all up ;-))

Your config seems pretty OK at first glance, but:
1. Naming your sourcetype just "authentication" isn't a very good practice. It's usually better to name your sourcetypes in a more unique way - typically some convention using vendor name, maybe product, and the "kind" of data, like "apache:error" or "cisco:ios" and so on.
2. You restarted the HF after pushing this config, didn't you?
3. Is linux_audit the original sourcetype of your data, or is it also a rewritten sourcetype? (I don't remember that one, to be honest.) Splunk decides just once - at the beginning of the ingestion pipeline - which props and transforms options are relevant for an event. Even if you overwrite the event's metadata "in flight" to recast it to another sourcetype, host or source, it will still be processed to the end of the indexing phase according to the original sourcetype/host/source.
4. Oh, and did you apply this config in the right place in your infrastructure - on the first "heavy" component in your events' path?
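Putting that advice together, a hedged sketch of the props/transforms pair - assuming linux_audit really is the original sourcetype assigned on the first heavy component your events reach, and using a more specific target sourcetype name than plain "authentication":

# props.conf (on the first heavy forwarder / indexer in the events' path)
[linux_audit]
TRANSFORMS-changesourcetype = change_sourcetype_ssh_auth

# transforms.conf
[change_sourcetype_ssh_auth]
# no capture group needed - the regex only has to match; it runs against _raw by default
REGEX = \bssh\b
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::linux:ssh:authentication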
There are a lot of events and they're being sent in chunks to save on the Lambda processing cost.
Rick, Thanks for the reply! Seems like this is much more involved than I initially thought. It's not that I am trying to use the regex as a means of doing searches. I was only running the search to see if the regex I had was actually hitting the data I'm looking for, so rex is out because I'm really not trying to extract anything. Thanks for that clarification. I ran the search with regex instead of rex and it did come back with what I'm looking for.

Like I mentioned, I'm just trying to create a props/transforms set to catch data that matches a certain regex and change its sourcetype to authentication in an attempt to CIM the data. Something like:

props.conf
[linux_audit]
TRANSFORMS-changesourcetype = change_sourcetype_authentication

transforms.conf
[change_sourcetype_authentication]
REGEX = (?<ssh>\bssh\b)
FORMAT = sourcetype::authentication
DEST_KEY = MetaData:Sourcetype

Nothing was coming back when I pushed that to my HFs, so I was trying to search the regex to see if it was even hitting anything. If I understand correctly, the <ssh> field needs to already exist for this to work? With that in mind, to your 4th point, does that mean this approach would not be an ideal one? All my indexes are customer-based, so organizing datamodels by indexes isn't an option.

Do I just have a typo somewhere I'm missing or am I just going down the wrong lane?
Thank you @livehybrid @yuanliu and @bowesmana! This is my first real post here, so I appreciate you bearing with me as I may not have provided a complete picture. @yuanliu 's answer provided a clear example of how I can use mvfind and mvindex to extract the correct data. The only thing I had to add was a \b word boundary to the mvfind regex, so it wouldn't hit the earlier partial match. Here is the query:

index=okta "debugContext.debugData.privilegeGranted"="*"
| eval type_index = mvfind('target{}.type', "CUSTOM_ROLE\b")
| eval "Target Name" = mvindex('target{}.displayName', type_index)
| eval "Target ID" = mvindex('target{}.alternateId', type_index)
| rename actor.displayName as "Actor", description as "Action", debugContext.debugData.privilegeGranted as "Role(s)"
| table Time, Actor, Action, "Target Name", "Target ID", "Role(s)"
I'm trying to do a transaction using an array.  I need to define the transaction by a value in an array.  However, this value could be any value in the array and the value could be in a different array index number in another event.  Is there an easy command for this in Splunk?