About tretrigh

tretrigh · ‎01-08-2025

Thank you for the input, everyone. @isoutamo - you are correct that each data source I'm looking at has vastly different data available... Some sources come from endpoint agents, which have username, endpoint name, IP address (local/public), URL, URL IP, etc. Other sources come from network devices and might track users by local IP only, but might also record which FW the request goes through, etc. I have one source which lists only a single field to identify the user... the MAC address... really not helpful without an additional lookup. I ended up using a number of macros and lots of coalesces to make my field names consistent.
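For readers unfamiliar with the coalesce approach mentioned above, here is a minimal sketch of the idea in Python rather than SPL. The alias lists and field names are hypothetical (they are not taken from the poster's macros), and treating empty strings as missing is an assumption of this sketch.

```python
from typing import Any, Mapping, Optional

# Hypothetical per-source aliases; real names depend on each product's logs.
USER_ALIASES = ("user", "username", "src_user")
URL_ALIASES = ("url", "dest_url", "request_url")
SRC_IP_ALIASES = ("src_ip", "local_ip", "client_ip")


def coalesce(event: Mapping[str, Any], *names: str) -> Optional[Any]:
    """Return the first value among the named fields that is present and non-empty."""
    for name in names:
        value = event.get(name)
        # Assumption: an empty string counts as "no value" here.
        if value is not None and value != "":
            return value
    return None


def normalize(event: Mapping[str, Any]) -> dict:
    """Map one raw event onto a consistent set of field names."""
    return {
        "user": coalesce(event, *USER_ALIASES),
        "url": coalesce(event, *URL_ALIASES),
        "src_ip": coalesce(event, *SRC_IP_ALIASES),
    }
```

A source that only carries a MAC address would still come out with user set to None here, which matches the point above that such a source needs an additional lookup.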

tretrigh · ‎01-07-2025

You're right... the original question wasn't clear enough. Well it was to me... but that is always the case I suppose! I'll consider using the existing Web DM or potentially creating a new one that will allow a little more customization for what I'm after. Thank you for the input.

tretrigh · ‎01-07-2025

Hi @richgalloway - I considered this one. The description is: "The fields in the Web data model describe web server and/or proxy server data in a security or operational context." Looking at the fields in this data model, it seems to me to be geared more toward web servers, not the clients of those servers. Many recommended fields in this data model would not apply to web browsing logs from the client's perspective. Is squeezing client logs into this data model commonly done? And to answer your questions - we have other data (from web servers) which uses the Web data model. Furthermore, the data I want to group/find with this search is definitely NOT CIM compliant. With so many web browsing data sources in our environment (something like 10+), many of the sources do not have the same information available. I'm building a list of fields myself to standardize the names and would ideally map them to a data model.

tretrigh · ‎01-07-2025

I'm building a search which takes a URL and returns all events from separate indexes/products where a client (user endpoint, server, etc) attempted access. The goal is to answer "who tried to visit url X". I have reviewed the default CIM data models here: https://docs.splunk.com/Documentation/CIM/5.1.0/User/CIMfields However, none seem to fit this specific use case. Can anyone sanity check me to see if I've overlooked one? Thanks!

tretrigh · ‎06-23-2023

Storage is all SSD on NetApp using RAID-DP, connected over a Fibre Channel backend. I'm waiting to hear back from the guys in Infrastructure about matching up the times when we're seeing spikes. I'm unsure about the IOPS limits at this point. To note, I learned that the OS disk and the /splunkdata disk for each indexer are all on the same aggregate. As I am unfamiliar with NetApp, I don't know if this matters (I'm assuming it is okay)?

tretrigh · ‎06-23-2023

We are periodically seeing spikes of Storage I/O Saturation (Monitoring Console > Resource Usage: Deployment). When split by host, we can see that this is affecting all 6 indexers nearly simultaneously for the /opt/splunkdata mount points. As expected, this triggers the Health Status notification throughout the day (warning or alert). To note, Load Averages are regularly > 5% with CPU usage normally under 10% for each indexer (24 cores each). RAM usage is around 30% per indexer. We are wondering if our physical storage and/or network might be a bottleneck or if it's something on the Splunk side. I'm a beginner Splunk admin; could someone please offer some suggestions on where we could start troubleshooting these spikes, or explain in more detail the specifics around Storage I/O Saturation? We are on Enterprise 9.0.4 across the board and are considering applying the most recent update sooner rather than later. Thank you!

tretrigh · ‎04-25-2023

Answering my own question here: Several indexers were not automatically getting the new source type applied, for unknown reasons. I was specifically looking at one which was not. A reboot of each indexer missing the source type resolved the issue. A splunkd restart would probably have been sufficient. All indexers are now working as intended. I added the app which defines the new source type to each Splunk host (SH, deployment server, etc). A debug refresh populated the new source type correctly on each host. I had incorrectly assumed that the app's presence on the indexers would affect the data coming from each of the Splunk hosts.

tretrigh · ‎04-25-2023

Thanks for the reply, @gcusello. In this situation there is no add-on. The log file on each Splunk host is generated by a script we wrote. We have attempted, unsuccessfully, to manually define the source type for this specific log. Do you have any suggestions for how to correctly define the source type manually, other than what we've already done? Thank you for the assistance!

tretrigh · ‎04-25-2023

Thank you for the reply. Do you have any specific guidance on how to apply the correct source type to our data in our situation?

tretrigh · ‎04-25-2023

Thank you for the reply. I might be missing something obvious, but I'm unsure how any of these settings might help us reassign the source type to something else. Could you please elaborate? Thank you!

tretrigh · ‎04-25-2023

In our distributed enterprise Splunk environment we have a log file being generated on each Splunk host (indexers, search head, deployment server, etc) located at:

/opt/splunk/var/log/splunk/foo.log

By default this gets logged to _internal using the foo-too_small source type. We now want to change the source type to one we created (my:custom:sourcetype). I have created the following props.conf file on the deployment server as a custom app and deployed successfully via apply cluster-bundle. However, new log data is still being associated with the existing source type of foo-too_small. We also set the local.meta file (under metadata) for permissions. I have verified this file is making it to the indexers in peer-apps.

[my:custom:sourcetype]
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
MAX_TIMESTAMP_LOOKAHEAD = 25

[source::.../var/log/splunk/foo.log]
sourcetype = my:custom:sourcetype

Questions:

Why isn't this working?
What needs to be done instead to change to a custom source type?

Thank you in advance!
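As a quick sanity check on the source:: stanza quoted in this post, the wildcard can be tested against the log path with a small Python sketch. It assumes the documented props.conf wildcard meanings ("..." matches any characters including path separators, "*" matches anything except a path separator) and assumes the pattern is anchored to the whole source path. The function names are made up for illustration; this is not Splunk code, and it says nothing about configuration precedence or which host has to carry the setting.

```python
import re


def source_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a source:: wildcard pattern into an anchored regex.

    Only '...' and '*' are handled; other props.conf pattern features
    are intentionally left out of this sketch.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")       # any characters, including '/'
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")    # anything except a path separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")


def stanza_matches(stanza_pattern: str, source_path: str) -> bool:
    """Return True if the source path satisfies the stanza's wildcard pattern."""
    return source_pattern_to_regex(stanza_pattern).match(source_path) is not None


if __name__ == "__main__":
    print(stanza_matches(".../var/log/splunk/foo.log",
                         "/opt/splunk/var/log/splunk/foo.log"))
```

Under these assumptions the pattern does cover the path in question, so a check like this only rules out the wildcard itself as the cause.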

tretrigh · ‎01-03-2023

Thank you, @PickleRick, for such a detailed answer. You truly are an Ultra Champion, as your forum title suggests, and it's answers like yours that make a community great. Please correct me if I'm wrong. Other than the 2 rules of thumb, you are suggesting that only in advanced cases (which we may never encounter) should we be concerned about splitting up data into multiple indexes, in cases "with hugely different cardinalities or activity levels". Is this recommended for efficiency/performance reasons, or...? Performance is our primary concern when considering multiple indexes. For example, would firewall logs with >99% traffic and the remainder threat/operational/etc be a good use case for multiple indexes (assuming recommended architecture/resources)? I know we will want dashboards, alerts, reports, etc that ONLY cover the threat logs. We are planning on having CIM compliant data and using acceleration where possible. I'm sure the Splunk Data Administration classes will help... Thoughts or input welcome on efficiency considerations for firewall logs being split into multiple indexes as described above. Thank you!

tretrigh · ‎12-30-2022

New customer seeking guidance for creating indexes/sourcetypes and determining granularity. Primarily we're looking for deeper guidance on the why more so than the what. We have a large, complex environment.

Our naming scheme for indexes thus far is: organization_category_purpose (ex. acme_net_fw)

organization - unique to us, required, primarily used to segment data between organizations
category - broad, like network, application, endpoint, etc
purpose - more specific, largely unique per category

Does the following seem like best practice for firewalls?

2 or 3 indexes used by firewalls (traffic, operations, maybe threats?)
Multiple sourcetypes split into the various indexes

We are looking at SC4S as a guide (https://splunk.github.io/splunk-connect-for-syslog/main/sources/vendor/PaloaltoNetworks/panos/) although their examples are not always consistent. We are struggling to determine how granular to be with the purpose of the index and with the number of possible sourcetypes we can/will have. We do not have the need to specify sensitivity or retention time. Furthermore, we do not have the need to separate security/infrastructure teams.

This slide from a Splunk presentation suggests that many sourcetypes get their own index for efficiency:

Questions

With 4-5 separate firewall products in use in one organization (the most complex), we're looking at 20-25 unique sourcetypes distributed into around 3 firewall indexes, just for firewalls. Does this sound correct? We want to avoid unnecessary complexity for future searches, documentation, etc while not destroying our efficiency.
Can anyone speak to their experiences with creating too many/too few indexes? Specifically on long-term organization, search efficiency, overall experience?
Can anyone offer any additional real-world guidance on creating a data catalog?
We can't see any reason to split up Windows event logs for endpoints (security/application, etc) but could see security being separate from the others for DCs. Does that sound correct?

Any resources or guidance appreciated. Here's what we're using so far:

SC4S example structure: https://splunk.github.io/splunk-connect-for-syslog/main/sources/vendor/PaloaltoNetworks/panos/
https://lantern.splunk.com/Splunk_Success_Framework/Data_Management/Naming_conventions
https://subscription.packtpub.com/book/data/9781789531091/5/ch05lvl1sec32/best-practices-for-administering-splunk
https://kinneygroup.com/blog/the-proverbial-8-ball-splunk-implementation/
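To make the organization_category_purpose scheme from this post concrete, here is a small Python sketch that builds and parses names such as acme_net_fw. The helper names and the rule that each component is lowercase alphanumeric with no underscores are assumptions added for illustration; the post does not state them.

```python
import re

# Assumption: each component is lowercase letters/digits only, so that
# the underscore can serve unambiguously as the separator.
_COMPONENT = re.compile(r"^[a-z0-9]+$")
_FIELDS = ("organization", "category", "purpose")


def build_index_name(organization: str, category: str, purpose: str) -> str:
    """Join the three components into an index name like 'acme_net_fw'."""
    parts = (organization, category, purpose)
    for field, part in zip(_FIELDS, parts):
        if not _COMPONENT.match(part):
            raise ValueError(f"invalid {field} component: {part!r}")
    return "_".join(parts)


def parse_index_name(name: str) -> dict:
    """Split an index name back into its three named components."""
    parts = name.split("_")
    if len(parts) != len(_FIELDS):
        raise ValueError(f"expected 3 underscore-separated parts: {name!r}")
    return dict(zip(_FIELDS, parts))


if __name__ == "__main__":
    print(build_index_name("acme", "net", "fw"))
    print(parse_index_name("acme_net_fw"))
```

Keeping underscores out of the individual components is one way to keep the name parseable later, for example when building a data catalog from the list of index names.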

Posts	13
Solutions	1
Karma Given	4
Karma Received	4
Member Since	‎12-30-2022

Date Last Visited	‎01-23-2026 07:50 AM

Re: Sanity Check - No Web Browsing Data Model?

Re: Sanity Check - No Web Browsing Data Model?

Re: Sanity Check - No Web Browsing Data Model?

Sanity Check - No Web Browsing Data Model?

Re: High Storage I/O Saturation Spikes

Troubleshooting High Storage I/O Saturation Spikes...

Re: Why is source type stuck on too_small?

Re: Source Type Stuck on too_small

Re: Source Type Stuck on too_small

Re: Source Type Stuck on too_small

Why is source type stuck on too_small?

Re: Indexes/Source Types Best Practice - Data Onbo...

Indexes/Source Types Best Practice - Data Onboardi...
