Mastering Splunk CIM: Simplify Searches & Scale Data

marycordova
SplunkTrust

Unlock Universal Data Visibility: A Practitioner’s Guide to the Splunk Common Information Model (CIM)

 

Managing multiple data sources with inconsistent field names—like SourceAddress in firewalls versus IpAddress in Windows—creates massive SPL complexity and slows down incident response. The Common Information Model (CIM) solves this by normalizing your data, allowing you to build one dashboard that works across every data source simultaneously. 

Key Takeaways

  • Search Simplification: Use one field name (e.g., src_ip) to query multiple disparate indexes at once.

  • Automatic Scalability: Newly onboarded CIM-compliant data is automatically picked up by existing alerts and dashboards.

  • Reduced Maintenance: Leveraging CIM-compliant TAs from Splunkbase ensures your schema stays up to date even when vendors change their logging formats.

What Is CIM in Splunk?

The Common Information Model (CIM) in Splunk is a common way of naming fields from all your different data sources so that you can build content quickly and easily across all that different data at the same time.   

If your firewall sends data with SourceAddress and your Windows hosts send IpAddress, your Splunk SPL query would be:

(index=firewall SourceAddress=127.0.0.1) OR (index=windows IpAddress=127.0.0.1) 

and you would already have to know that the firewall uses SourceAddress and Windows uses IpAddress.

With CIM, you don’t have to know anything ahead of time and your queries are simpler:

 (index=firewall OR index=windows) src_ip=127.0.0.1 

But wait, how do you know IpAddress is the source and not the destination? Actually, you don’t have to know:

(index=firewall OR index=windows) (src_ip=127.0.0.1 OR dest_ip=127.0.0.1)
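To make the normalization concrete, here is a minimal search-time sketch of what a CIM mapping does for you, built by hand from the two vendor field names above. It assumes the data has not been mapped yet, that IpAddress really is the source address in the Windows events, and that the index names are the illustrative ones used in this post.

```spl
(index=firewall OR index=windows)
| eval src_ip=coalesce(SourceAddress, IpAddress)
| search src_ip="127.0.0.1"
| stats count by index, src_ip
```

Repeating that eval in every search is exactly the burden CIM removes. A CIM-compliant TA applies the mapping once in its configuration, and every search simply uses src_ip.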

 Why Should You Use CIM? 

  • Data access without prerequisite knowledge of the source system schema 
  • SPL queries that are easier to create mean faster return on investment and less burden on users and administrators 
  • Comprehensive (or correlated) search across different data sources means no missed events in dashboards, reports, and alerts 

Without CIM normalization, a search for suspicious login attempts would need to accommodate Active Directory's field names, Azure's naming conventions, and Okta's structure—all in a single query… and then the vendor changes their schema… 

With CIM 

  • Newly on-boarded data is automatically included in existing dashboards, reports, and alerts 
  • CIM-compliant Apps and TAs (Technology Add-Ons) from Splunkbase enable out-of-the-box data on-boarding with pre-built dashboards, reports, and alerts without any work from you (or very little...) 
  • Splunk supported Apps and TAs are updated by Splunk when the vendor changes their schema 
  • Vendor supported Apps and TAs are updated by the vendor...when the vendor changes their schema 
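As a sketch of the suspicious-login example, the search below looks for repeated failed logins across every authentication source at once. It assumes your Active Directory, Azure, and Okta data is on-boarded through CIM-compliant TAs that apply the authentication tag and populate the action, src, and user fields, and that those indexes are searched by default for your role. The threshold of 10 is an arbitrary placeholder.

```spl
tag=authentication action=failure
| stats count AS failures, dc(sourcetype) AS sourcetypes BY src, user
| where failures > 10
| sort 0 - failures
```

Nothing in the search names a vendor, so a newly on-boarded, CIM-compliant authentication source would show up in the results without any change to the query.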

 

CIM Data Models and Fields

CIM provides more than 20 domain-specific data models (Alerts, Email, Network Traffic, etc.). While there are ~500 fields available, most practitioners only need the "Vital 20."

You need the common CIM fields from across all the data models. Line them up in order of most to least important and in order of directionality.

Most Common CIM Fields 

Focus on these core fields for your dashboards and alerts:

_time, severity, action, signature, src_user, dest_user, src_ip, src_nt_host,
src_nt_domain, src_*, src, dest, dest_ip, dest_nt_host, dest_nt_domain, dest_*,
description  

Note: Anything else is probably an edge case or a deep dive that requires access to raw data, not something you need to surface quickly in a dashboard, report, or alert.   
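One way to apply that ordering is a simple triage table that lines the core fields up from most to least important, with source fields before destination fields. This is only a sketch: the index names are the illustrative ones from earlier in this post, and not every data source will populate every column.

```spl
(index=firewall OR index=windows)
| table _time, severity, action, signature, src_user, dest_user, src_ip, src, dest, dest_ip, description
| sort 0 - _time
```

Because the columns are CIM fields, the same table works unchanged when you add or swap indexes in the first line.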

Standardizing Field Values 

Some fields require prescribed values to maintain data integrity. The best example is severity, which should strictly follow these values:

  • critical, high, medium, low, informational, unknown
    • Anything else is unacceptable. 
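A quick way to audit this is to search for events whose severity falls outside the list. The sketch below assumes your alert data carries the alert tag used by the Alerts data model and that a missing severity should be flagged too. The label "(missing)" and the helper field severity_ok are made up for illustration.

```spl
tag=alert
| fillnull value="(missing)" severity
| eval severity_ok=if(in(severity, "critical", "high", "medium", "low", "informational", "unknown"), "yes", "no")
| where severity_ok="no"
| stats count by index, sourcetype, severity
```

The comparison inside eval is case-sensitive, so values such as High or CRITICAL are reported as well. Each row in the output points at a sourcetype whose mapping needs attention.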

Most fields do not have specific values defined in the documentation. There are a few fields you may want to internally define with acceptable values for your organization. 

You may want to require the field src_user to always be your long format user ID such as first.last@mydomain.com.  Alternatively, you may want to define that src_user is instead the short form flast user ID.  

The fields src_nt_host and dest_nt_host are another example where you may want to decide these should always be short NetBIOS hostnames and not fully.qualified.hostnames.mydomain.com.    

In addition to the kind of value acceptable for a field, you may also want to define the letter case of the value, such as lower() for users and upper() for hosts.   
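Here is a small, self-contained sketch of those value conventions that you can paste into any search bar. It uses makeresults to generate one sample event, so no real data is needed. The sample values are invented, and the choice of lowercase users and uppercase short hostnames is just one possible convention.

```spl
| makeresults
| eval src_user="First.Last@MyDomain.com", src_nt_host="host01.mydomain.com"
| eval src_user=lower(src_user)
| eval src_nt_host=upper(mvindex(split(src_nt_host, "."), 0))
| table src_user, src_nt_host
```

The result is a single row with src_user set to first.last@mydomain.com and src_nt_host set to HOST01. Once you have settled on the expressions, the same eval logic can be moved into your field mappings so every search benefits from it.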

Key CIM Considerations 

There is a bit of art to this.  For example...do you choose to align with category in addition to the above?  How is the field type in the Alerts data model different from the field category in the DLP data model? 

What about verdict or outcome? There isn’t really anything like this in the CIM but it’s important for security alerts to be able to classify detections as either “malicious” or “benign”.   

You will have to make some decisions for your organization. As long as you stay consistent in your use of a few custom fields, that’s OK.   

For example, to be able to trace a single event from the source system to Splunk and then to a ticketing system, you may want to use a combination of the CIM id field and your own custom internal_id field. Injecting this custom field at index time creates a link backward to the raw source data and forward to the ticketing system when added to the payload. 

However, do not over-engineer.  More than 20 or so fields raises the question...is this really necessary in most scenarios? 

You can extend CIM with custom fields, but don't modify existing CIM mappings, especially if they are part of a supported CIM-compliant TA or App. This breaks compatibility with the CIM ecosystem.  If you insist...a field alias (FIELDALIAS in props.conf) is your friend. 

Splunk Apps and TAs that Help with CIM 

There are two kinds of Apps and TAs that help with CIM: meta-Apps and TAs that assist in working with CIM and data models, and Apps and TAs that assist in on-boarding data to Splunk and applying CIM mappings to source system field naming schemas.   

Wherever possible, when on-boarding data, try to use a (supported) CIM-compliant app from Splunkbase.   

Use the Splunkbase search filters to find supported CIM-compliant TAs that provide data on-boarding and CIM mappings: 

[Screenshot: Splunkbase search filters]

To work directly with the CIM schema or data models in Splunk, for example when you are mapping a custom data set or fixing a broken CIM mapping in an unsupported TA or App, install the “CIM Validator” and the “Common Information Model” Apps.   

 

The dictionary lookup for fields is super helpful: 

[Screenshot: dictionary lookup for fields]

 

The Common Information Model App deploys the CIM data models and is where you configure them and their acceleration: 

[Screenshot: data model configuration and acceleration]
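Once the data models are deployed, you can query them directly with tstats. The sketch below repeats the earlier source-or-destination lookup against the Network Traffic data model. It assumes the Common Information Model App is installed and that your firewall data is tagged so it lands in the Network_Traffic model; the IP address is the placeholder used throughout this post.

```spl
| tstats count from datamodel=Network_Traffic where (All_Traffic.src_ip="127.0.0.1" OR All_Traffic.dest_ip="127.0.0.1") by All_Traffic.src_ip All_Traffic.dest_ip All_Traffic.action
| rename All_Traffic.* AS *
| sort 0 - count
```

If the data model is accelerated, adding summariesonly=true after tstats restricts the search to the accelerated summaries, which is usually much faster but skips any events that have not been summarized yet.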

 

How to Build Your Own CIM 

Building CIM means knowing your data.  You will need the admin guide for your data, or if it’s a custom data set, engage with the creator of the data.  You will also want to have the CIM data model references handy, or the CIM validator app deployed in your Splunk environment.   

Once you have your data reference material (if available) and the CIM data model guides, there are just a few steps to building your own CIM: 

  1. Remove field noise from your field listing so it’s not distracting you 
  2. Identify your 10-20ish important fields that cover most use cases 
  3. You could even just do your top 2 or 3 most important fields and work iteratively over time 
  4. Analyze and map the fields to the CIM fields 
  5. Create your props.conf, transforms.conf, etc. for the mappings and deploy them to your Splunk environment 
  6. Validate your work by running queries that search for the expected output of your mappings 
  7. Even more crucially, run queries searching for the opposite of your query to find things that fell through the cracks 

If your query says | where isnotnull(src_ip), also search for | where isnull(src_ip) and make sure none of the results should have had a src_ip field mapped.    
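Both halves of that check can be combined into one coverage report. This is a sketch that assumes you are validating a src_ip mapping on data in the illustrative firewall index; the helper field src_ip_status is made up for the example.

```spl
index=firewall
| eval src_ip_status=if(isnull(src_ip), "unmapped", "mapped")
| stats count by sourcetype, src_ip_status
```

Any sourcetype with a large unmapped count deserves a closer look. Replacing the last two lines with | where isnull(src_ip) | fieldsummary lists which fields those events do carry, which helps you spot a vendor field name that your mapping missed.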

Splunk .conf online and Splunk Lantern have a few resources that walk you through creating CIM step-by-step using a real-world ransomware example. 

Conclusion 

  • Using CIM means your users can get value from their data right away, without the upfront burden of learning every data set.   
  • Dashboards, reports, and alerts work out-of-the-box for even faster time-to-value.   
  • If you are using supported CIM-compliant Apps and TAs, everything is already done for you, and mapping changes for source data schemas are handled by the App or TA developer.   
  • Newly on-boarded CIM compliant data sets are automatically picked up by your dashboards, reports, and alerts.   
  • CIM can be implemented iteratively by data source or even by field allowing you quick wins by taking small manageable bites.   
  • Stay internally consistent if you need to extend CIM with (a few) custom fields or if you want to further refine CIM by standardizing CIM field values.   
  • Use the CIM meta-Apps and TAs to help you on your journey.   

Not using CIM means more complex searches that require greater SPL skill from your users, leading to frustration, underutilization, and attrition.   

Download the Splunk Common Information Model App
