Splunk Enterprise Security

Are there best practices for CIM datamodel mapping for PaloAlto firewalls?

MonkeyK
Builder

Are there best practices when mapping PaloAlto firewall logs to CIM datamodels?
One thing that I noticed is that Network_Traffic maps anything with tag="network" and tag="communicate". This means all logs of type "start" and "end" are included, because the log type is not a filter term for the Network_Traffic datamodel. It seems to me that the datamodel should only include "end" events to prevent double counting traffic. Is that right? Are there other considerations for how PaloAlto firewall logs should get mapped into Network_Traffic?
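
To see how the two log types split in the raw events, a search along these lines could be used. This is only a sketch: it assumes the add-on's traffic sourcetype is pan:traffic and that the start/end type is exposed in a field named log_subtype. Neither name comes from this thread, so adjust both to whatever your events actually show.

```spl
sourcetype=pan:traffic tag=network tag=communicate
| stats count by log_subtype
```

If both "start" and "end" show up with substantial counts, then both are feeding the datamodel through the same tags.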

How about how PaloAlto firewall logs get mapped into other datamodels?
-Network_Sessions
-Web

Are there best practice docs for other log sources getting properly mapped to CIM datamodels? If not, such docs could prove invaluable to a person trying to get their datamodels working properly.

This has been bugging me since we implemented Splunk ES. Our professional services consultant thought I should have had an answer for Network_Traffic (we didn't even address others), but without more knowledge of how the datamodels were used, I could not know.

rpille_splunk
Splunk Employee

The Palo Alto Networks add-on (https://splunkbase.splunk.com/app/2757/) already has CIM mapping configurations included, so you can use that or use it as a model for additional CIM mapping if your data sources are not covered by that add-on.

You may also find the following documentation pages helpful:

Use the CIM to normalize data at search time (http://docs.splunk.com/Documentation/CIM/4.9.0/User/UsetheCIMtonormalizedataatsearchtime) is a step-by-step guide to normalizing data to the CIM.

Use the CIM to normalize OSSEC data (http://docs.splunk.com/Documentation/CIM/4.9.0/User/UsetheCIMtonormalizeOSSECdata) is a worked example, using OSSEC data.


MonkeyK
Builder

I get that there is a current definition. But what do I do if the results seem wrong?

For example:
Palo Alto traffic logs get mapped to the Network Traffic datamodel, but the start and end records for the same session are mapped as separate Network Traffic entries. This means that I have at least two Network Traffic entries per actual session. The problem gets worse because many sessions produce multiple start records.

To see this, try the following query on the Network_Traffic datamodel:

|tstats summariesonly=t count as sameCount from datamodel=Network_Traffic.All_Traffic where All_Traffic.session_id>100000 by All_Traffic.session_id All_Traffic.src_ip All_Traffic.src_port | eventstats count as total | stats count values(total) as total by sameCount

I ran this against 9/9/17 8:40 - 9/9/17 8:41 (one minute) in my environment and got the following:
sameCount   count   total
1           22826   35598
2           12202   35598
3           526     35598
4           18      35598
5           18      35598
6           3       35598
7           5       35598

So that is 35598 distinct sessions (session_id, src_ip, src_port combinations). Of those:
5 sessions included 7 Network Traffic events
3 sessions included 6 Network Traffic events
18 sessions included 5 Network Traffic events
...etc
These numbers are low, because they miss events from sessions that did not end within the minute queried.
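
To put a single number on the inflation, the per-session counts can be rolled up into an events-per-session ratio. A minimal sketch, reusing only the datamodel fields from the query above; the output field names (eventsPerSession, sessions, events, avgEventsPerSession) are made up for illustration:

```spl
| tstats summariesonly=t count as eventsPerSession from datamodel=Network_Traffic.All_Traffic where All_Traffic.session_id>100000 by All_Traffic.session_id All_Traffic.src_ip All_Traffic.src_port
| stats count as sessions sum(eventsPerSession) as events
| eval avgEventsPerSession=round(events/sessions, 2)
```

Weighting each count in the table above by its sameCount gives 49023 events across 35598 session rows, or roughly 1.38 events per session for that one-minute window.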

What kinds of problems does this cause?
1) I am misrepresenting any summary statistics since some stats will repeat for each session
2) I have no way of comparing start or end times (without massive processing). So searching for things like beaconing cannot account for intervals

Based on these problems, it seems that there must be a better way to map the logs to Network_Traffic. That is why I am asking.
