How to map existing sourcetypes to CIM data models?

I have an environment with a large number of sourcetypes and would like to map them to the appropriate CIM data models. While I generally know about the Splunk commands pivot and datamodel, their use seems to depend on the fields already having the 'correct' names or the sourcetypes already having been tagged. Has anyone found a decent way to have Splunk detect that sourcetype X might map to data model Y?

Re: How to map existing sourcetypes to CIM data models?

This solution uses the Data Curator app. If you are downloading the app for the first time to try this solution, make sure you run the ‘build sourcetype_fields csv’ saved search after installation.

A more complete writeup of this methodology can be found here.

I ended up manually creating a list of the fields associated with most of the data models and objects and put that into Splunk as a lookup with wildcards. The following query compares the sourcetype_fields lookup from the Data Curator app against that data model field list.

| inputlookup sourcetype_fields.csv
| eval field = lower(field)
| lookup dm_fields field as field
| search model!=none
| stats dc(field) as fields values(field) as field_list by sourcetype model object
| where fields > 1
| sort -fields
| stats max(fields) as maxFieldMatch list(model) as Model list(object) as Object list(fields) as fieldMatch by sourcetype
| sort -maxFieldMatch
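
For anyone who wants to follow the logic outside Splunk, here is a rough Python sketch of what the search does: lowercase each field, match it against the wildcard patterns, count distinct matched fields per sourcetype/model/object, keep combinations with more than one match, and rank sourcetypes by their best match count. The sample rows, the sourcetype names, and the helper names (first_match, rank_candidates) are made up for illustration, and fnmatch only approximates Splunk's wildcard matching.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical sample standing in for data_models.csv: (pattern, model, object).
DM_FIELDS = [
    ("*bytes*", "Web", "Web"),
    ("*url*", "Web", "Web"),
    ("*dest_port*", "Network_Traffic", "All_Traffic"),
    ("*src_port*", "Network_Traffic", "All_Traffic"),
    ("*protocol*", "Network_Traffic", "All_Traffic"),
]

# Hypothetical sample standing in for sourcetype_fields.csv: (sourcetype, field).
SOURCETYPE_FIELDS = [
    ("access_combined", "bytes"),
    ("access_combined", "URL"),
    ("fw_log", "src_port"),
    ("fw_log", "dest_port"),
    ("fw_log", "protocol"),
    ("fw_log", "bytes"),
    ("app_log", "thread"),
]


def first_match(field):
    """Return (model, object) for the first pattern that matches, else None."""
    for pattern, model, obj in DM_FIELDS:
        if fnmatchcase(field, pattern):
            return model, obj
    return None


def rank_candidates(rows):
    # Equivalent of: eval lower(field) | lookup | stats dc(field) by sourcetype model object
    matched = defaultdict(set)
    for sourcetype, field in rows:
        field = field.lower()
        hit = first_match(field)
        if hit is not None:
            model, obj = hit
            matched[(sourcetype, model, obj)].add(field)

    # Equivalent of: where fields > 1
    per_sourcetype = defaultdict(list)
    for (sourcetype, model, obj), fields in matched.items():
        if len(fields) > 1:
            per_sourcetype[sourcetype].append((model, obj, len(fields)))

    # Equivalent of: stats max(fields) ... by sourcetype | sort -maxFieldMatch
    ranked = []
    for sourcetype, hits in per_sourcetype.items():
        hits.sort(key=lambda h: h[2], reverse=True)
        ranked.append((sourcetype, hits[0][2], hits))
    ranked.sort(key=lambda r: r[1], reverse=True)
    return ranked


if __name__ == "__main__":
    for sourcetype, max_match, hits in rank_candidates(SOURCETYPE_FIELDS):
        print(sourcetype, max_match, hits)
```

In this toy data the single bytes field on fw_log is not enough to suggest Web, because one matching field is filtered out by the "more than one" threshold.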

The results of the search in my environment are what you might call directionally correct more so than exact =). There are quite a number of good, actionable results, but much of that depends on which field names you have already defined. One thing I did years ago, pre-CIM, was settle on srcip and destip for IP address fields, so we already had that baked into many of our sourcetypes. If you've used sourceip or destinationip, you might want to adjust the lookup accordingly. I also saw a fair number of matches for both Web and Network Traffic for the same sourcetype, as there is a fair bit of field naming overlap. Not to be outdone, the Windows Security logs had 5 matches lol: Web, Email (Filtering and Email objects), Network Traffic, and Authentication.

If you have any feedback on the list below or on the methodology, I'm all ears!

Transforms

[dm_fields]
filename = data_models.csv
match_type = WILDCARD(field)
max_matches = 1
min_matches = 1
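
One thing worth keeping in mind with this stanza: as I read the transforms.conf documentation, a non-time-based lookup with max_matches = 1 returns only the first matching row in file order, so a field that matches patterns for several models is credited to whichever row comes first in the CSV. The Python sketch below illustrates that effect with a few rows taken from the lookup. The lookup helper is hypothetical, and fnmatch is only an approximation of Splunk's WILDCARD matching.

```python
from fnmatch import fnmatchcase

# A few rows from the lookup, kept in the same relative order as the file.
ROWS = [
    ("*user*", "Web", "Web"),
    ("*user*", "Network_Traffic", "All_Traffic"),
    ("*src_user*", "Authentication", "Authentication"),
    ("*user*", "Authentication", "Authentication"),
]


def lookup(field, max_matches=None):
    """Return matching (model, object) pairs in file order, capped at max_matches."""
    hits = [(model, obj) for pattern, model, obj in ROWS
            if fnmatchcase(field, pattern)]
    return hits if max_matches is None else hits[:max_matches]


if __name__ == "__main__":
    # With a cap of 1, src_user is credited to Web only.
    print(lookup("src_user", max_matches=1))
    # Without a cap, the same field counts toward every model that lists it.
    print(lookup("src_user"))
```

If you would rather have overlapping fields count toward every model that lists them, raising max_matches is the setting to experiment with, at the cost of more overlap in the results.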

Lookup

field,model,object
"*bytes*",Web,Web
"*bytes_in*",Web,Web
"*bytes_out*",Web,Web
"*cached*",Web,Web
"*cookie*",Web,Web
dest,Web,Web
"*duration*",Web,Web
"*http_content_type*",Web,Web
"*http_method*",Web,Web
"*http_referrer*",Web,Web
"*http_user_agent*",Web,Web
"*http_user_agent_length*",Web,Web
"*referer*",Web,Web
"*response_time*",Web,Web
"*site*",Web,Web
src,Web,Web
"*src_ip*",Web,Web
"*status*",Web,Web
"*uri_path*",Web,Web
"*uri_query*",Web,Web
"*url*",Web,Web
"*url_length*",Web,Web
"*user*",Web,Web
"*bytes*","Network_Traffic","All_Traffic"
"*bytes_in*","Network_Traffic","All_Traffic"
"*bytes_out*","Network_Traffic","All_Traffic"
"*channel*","Network_Traffic","All_Traffic"
dest,"Network_Traffic","All_Traffic"
"*dest_interface*","Network_Traffic","All_Traffic"
"*dest_mac*","Network_Traffic","All_Traffic"
"*dest_port*","Network_Traffic","All_Traffic"
"*dest_translated_ip*","Network_Traffic","All_Traffic"
"*dest_translated_port*","Network_Traffic","All_Traffic"
"*direction*","Network_Traffic","All_Traffic"
"*duration*","Network_Traffic","All_Traffic"
dvc,"Network_Traffic","All_Traffic"
"*flow_id*","Network_Traffic","All_Traffic"
"*icmp_code*","Network_Traffic","All_Traffic"
"*icmp_type*","Network_Traffic","All_Traffic"
"*mac*","Network_Traffic","All_Traffic"
"*packets*","Network_Traffic","All_Traffic"
"*packets_in*","Network_Traffic","All_Traffic"
"*packets_out*","Network_Traffic","All_Traffic"
"*protocol*","Network_Traffic","All_Traffic"
"*protocol_version*","Network_Traffic","All_Traffic"
"*response_time*","Network_Traffic","All_Traffic"
"*rule*","Network_Traffic","All_Traffic"
"*session_id*","Network_Traffic","All_Traffic"
src,"Network_Traffic","All_Traffic"
"*src_interface*","Network_Traffic","All_Traffic"
"*src_ip*","Network_Traffic","All_Traffic"
"*src_mac*","Network_Traffic","All_Traffic"
"*src_port*","Network_Traffic","All_Traffic"
"*src_translated_ip*","Network_Traffic","All_Traffic"
"*src_translated_port*","Network_Traffic","All_Traffic"
"*ssid*","Network_Traffic","All_Traffic"
"*tcp_flag*","Network_Traffic","All_Traffic"
"*transport*","Network_Traffic","All_Traffic"
"*tos*","Network_Traffic","All_Traffic"
"*ttl*","Network_Traffic","All_Traffic"
"*user*","Network_Traffic","All_Traffic"
"*vlan*","Network_Traffic","All_Traffic"
"*wifi*","Network_Traffic","All_Traffic"
dest,Authentication,Authentication
"*dest_nt_domain*",Authentication,Authentication
"*duration*",Authentication,Authentication
"*response_time*",Authentication,Authentication
src,Authentication,Authentication
"*src_nt_domain*",Authentication,Authentication
"*src_user*",Authentication,Authentication
"*user*",Authentication,Authentication
dest,Certificates,"All_Certificates"
"*dest_port*",Certificates,"All_Certificates"
"*duration*",Certificates,"All_Certificates"
"*response_time*",Certificates,"All_Certificates"
src,Certificates,"All_Certificates"
"*transport*",Certificates,"All_Certificates"
"*ssl_end_time*",Certificates,SSL
"*ssl_engine*",Certificates,SSL
"*ssl_hash*",Certificates,SSL
"*ssl_is_valid*",Certificates,SSL
"*ssl_issuer*",Certificates,SSL
"*ssl_issuer_common_name*",Certificates,SSL
"*ssl_issuer_email*",Certificates,SSL
"*ssl_issuer_locality*",Certificates,SSL
"*ssl_issuer_organization*",Certificates,SSL
"*ssl_issuer_state*",Certificates,SSL
"*ssl_issuer_street*",Certificates,SSL
"*ssl_issuer_unit*",Certificates,SSL
"*ssl_name*",Certificates,SSL
"*ssl_policies*",Certificates,SSL
"*ssl_publickey*",Certificates,SSL
"*ssl_publickey_algorithm*",Certificates,SSL
"*ssl_serial*",Certificates,SSL
"*ssl_session_id*",Certificates,SSL
"*ssl_signature_algorithm*",Certificates,SSL
"*ssl_start_time*",Certificates,SSL
"*ssl_subject*",Certificates,SSL
"*ssl_subject_common_name*",Certificates,SSL
"*ssl_subject_email*",Certificates,SSL
"*ssl_subject_locality*",Certificates,SSL
"*ssl_subject_state*",Certificates,SSL
"*ssl_subject_street*",Certificates,SSL
"*ssl_subject_unit*",Certificates,SSL
"*ssl_validity_window*",Certificates,SSL
"*ssl_version*",Certificates,SSL
"*delay*",Email,Email
dest,Email,Email
"*duration*",Email,Email
"*file_hash*",Email,Email
"*file_name*",Email,Email
"*file_size*",Email,Email
"*internal_message_id*",Email,Email
"*message_id*",Email,Email
"*message_info*",Email,Email
"*orig_dest*",Email,Email
"*orig_recipient*",Email,Email
"*orig_src*",Email,Email
"*process*",Email,Email
"*process_id*",Email,Email
"*protocol*",Email,Email
"*recipient*",Email,Email
"*recipient_count*",Email,Email
"*recipient_status*",Email,Email
"*response_time*",Email,Email
"*retries*",Email,Email
"*return_addr*",Email,Email
"*size*",Email,Email
src,Email,Email
"*src_user*",Email,Email
"*status_code*",Email,Email
"*subject*",Email,Email
"*url*",Email,Email
"*user*",Email,Email
"*xdelay*",Email,Email
"*xref*",Email,Email
"*filter_action*",Email,Filtering
"*filter_score*",Email,Filtering
"*signature*",Email,Filtering
"*signature_extra*",Email,Filtering
"*signature_id*",Email,Filtering
dest,"Intrusion_Detection","IDS_Attacks"
dvc,"Intrusion_Detection","IDS_Attacks"
"*ids_type*","Intrusion_Detection","IDS_Attacks"
"*severity*","Intrusion_Detection","IDS_Attacks"
"*signature*","Intrusion_Detection","IDS_Attacks"
src,"Intrusion_Detection","IDS_Attacks"
"*user*","Intrusion_Detection","IDS_Attacks"
"*date*",Malware,"Malware_Attacks"
dest,Malware,"Malware_Attacks"
"*dest_nt_domain*",Malware,"Malware_Attacks"
"*dest_requires_av*",Malware,"Malware_Attacks"
"*file_hash*",Malware,"Malware_Attacks"
"*file_name*",Malware,"Malware_Attacks"
"*file_path*",Malware,"Malware_Attacks"
"*signature*",Malware,"Malware_Attacks"
src,Malware,"Malware_Attacks"
"*user*",Malware,"Malware_Attacks"
"*vendor_product*",Malware,"Malware_Attacks"
dest,Malware,"Malware_Operations"
"*dest_nt_domain*",Malware,"Malware_Operations"
"*dest_requires_av*",Malware,"Malware_Operations"
"*product_version*",Malware,"Malware_Operations"
"*signature_version*",Malware,"Malware_Operations"
"*vendor_product*",Malware,"Malware_Operations"
dest,Performance,"All_Performance"
"*dest_should_timesync*",Performance,"All_Performance"
"*hypervisor_id*",Performance,"All_Performance"
"*resource_type*",Performance,"All_Performance"
"*cpu_load_mhz*",Performance,CPU
"*cpu_load_percent*",Performance,CPU
"*cpu_time*",Performance,CPU
"*cpu_user_percent*",Performance,CPU
"*fan_speed*",Performance,Facilities
"*power*",Performance,Facilities
"*temperature*",Performance,Facilities
"*mem*",Performance,Memory
"*mem_committed*",Performance,Memory
"*mem_free*",Performance,Memory
"*mem_used*",Performance,Memory
"*swap*",Performance,Memory
"*swap_free*",Performance,Memory
"*swap_used*",Performance,Memory
"*array*",Performance,Storage
"*blocksize*",Performance,Storage
"*cluster*",Performance,Storage
"*fd_max*",Performance,Storage
"*fd_used*",Performance,Storage
"*latency*",Performance,Storage
"*mount*",Performance,Storage
"*parent*",Performance,Storage
"*read_blocks*",Performance,Storage
"*read_latency*",Performance,Storage
"*read_ops*",Performance,Storage
"*storage*",Performance,Storage
"*storage_free*",Performance,Storage
"*storage_free_percent*",Performance,Storage
"*storage_used*",Performance,Storage
"*storage_used_percent*",Performance,Storage
"*write_blocks*",Performance,Storage
"*write_latency*",Performance,Storage
"*write_ops*",Performance,Storage
"*thruput*",Performance,Network
"*thruput_max*",Performance,Network
"*signature*",Performance,OS
"*uptime*",Performance,Uptime

Re: How to map existing sourcetypes to CIM data models?

Extremely helpful.
Why can't all answers be this helpful and direct?

Re: How to map existing sourcetypes to CIM data models?

lol I posted both the question and the answer, which helps. Glad you've found it helpful. Sadly, I look at this a year later and realize how little I've been able to do to act on this work /sigh.
