In my environment, Palo Alto (proxy) logs are stored in Splunk.
I want to find out which operations on a server cause high-risk communication to the internet, using the Palo Alto logs together with Windows event logs, Linux audit logs, or something similar.
Is this possible with a Splunk Correlation Search?
Thank you for your reply.
I extracted the data from Palo Alto using the Splunk Add-on for Palo Alto Networks. Here is an example.
Oct 28 13:46:12 192.168.248.2 1 2024-10-28T13:46:12+09:00 PA-VM - - - - 1,2024/10/28 13:46:09,007254000360102,TRAFFIC,start,2818,2024/10/28 13:46:09,192.168.252.100,13.107.5.93,192.168.252.2,13.107.5.93,dmz-to-internet,,,web-browsing,vsys1,DMZ,INTERNET,ethernet1/2,ethernet1/1,SecurityCheck,2024/10/28 13:46:12,497655,1,54084,443,35405,443,0x1400000,tcp,allow,5636,1220,4416,11,2024/10/28 13:46:10,0,computer-and-internet-info,,7423264892787200760,0x0,192.168.0.0-192.168.255.255,United States,,6,5,n/a,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0,c2a50b1f-ea25-41ce-9c7c-709bde6deec4,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2024-10-28T13:46:12.041+09:00,,,internet-utility,general-internet,browser-based,4,"used-by-malware,able-to-transfer-file,has-known-vulnerability,tunnel-other-application,pervasive-use",,web-browsing,no,no,0,NonProxyTraffic,,0,0,0
Regarding the second comment: the risk value is shown in the log. In the example above, the risk value is 4 (it can range from 1 to 5). It seems to be determined by Palo Alto (or the Palo Alto Add-on).
However, I wonder whether truly high-risk communication can be extracted from the logs, and whether a correlation search can show which action caused the risky communication.
For now, I want to build a correlation search from the Palo Alto log and the Windows event log.
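To make the idea of this correlation concrete, here is a minimal Python sketch of the join logic, not a Splunk search. It assumes the standard PAN-OS TRAFFIC field order, under which the sample log reads as source 192.168.252.100:54084 to destination 13.107.5.93:443 (pre-NAT). The host-side records are entirely hypothetical: they stand in for whatever Windows event carries the local connection 4-tuple and a process name, and `example.exe`, `other.exe`, the timestamps, and the 5-second window are made up for illustration.

```python
from datetime import datetime, timedelta

# Firewall side: values read from the sample TRAFFIC log above
# (assumes the standard PAN-OS field order; pre-NAT source address and port).
firewall_events = [
    {"time": datetime(2024, 10, 28, 13, 46, 9),
     "src_ip": "192.168.252.100", "src_port": 54084,
     "dst_ip": "13.107.5.93", "dst_port": 443,
     "app": "web-browsing", "risk": 4},
]

# Host side: HYPOTHETICAL records standing in for Windows events that
# log the connection 4-tuple together with the originating process.
host_events = [
    {"time": datetime(2024, 10, 28, 13, 46, 8),
     "src_ip": "192.168.252.100", "src_port": 54084,
     "dst_ip": "13.107.5.93", "dst_port": 443,
     "process": "example.exe"},
    {"time": datetime(2024, 10, 28, 13, 40, 0),
     "src_ip": "192.168.252.100", "src_port": 50000,
     "dst_ip": "13.107.5.93", "dst_port": 443,
     "process": "other.exe"},
]


def connection_key(event):
    """4-tuple that identifies the same connection in both log sources."""
    return (event["src_ip"], event["src_port"],
            event["dst_ip"], event["dst_port"])


def correlate(fw_events, hosts, window=timedelta(seconds=5), min_risk=4):
    """Pair high-risk firewall events with host events on the same
    4-tuple whose timestamps differ by at most `window`."""
    by_key = {}
    for h in hosts:
        by_key.setdefault(connection_key(h), []).append(h)

    pairs = []
    for f in fw_events:
        if f["risk"] < min_risk:
            continue
        for h in by_key.get(connection_key(f), []):
            if abs(f["time"] - h["time"]) <= window:
                pairs.append((f, h))
    return pairs


matches = correlate(firewall_events, host_events)
for fw, host in matches:
    print(fw["app"], fw["risk"], host["process"])
```

The join key has to use the pre-NAT source address and port, because that is what the server itself sees. Whether the Windows logs in a given environment actually record that 4-tuple is something to verify first.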
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000ClmQCAS
Weights
Characteristic | Factor |
Evasive | 3 |
Excessive Bandwidth Use | 1 |
Used by Malware | 4 |
Capable of File Transfer | 3 |
Known Vulnerabilities | 3 |
Tunnels Other Apps | 2 |
Prone to Misuse | 2 |
Pervasive | 1 |
Total | 19 |
Risk Assignment
Risk | Range |
1 | 0–3 |
2 | 4–6 |
3 | 7–9 |
4 | 10–13 |
5 | 14+ |
Your example log actually shows which of the risk factors were part of the calculation.
internet-utility,general-internet,browser-based,4,"used-by-malware,able-to-transfer-file,has-known-vulnerability,tunnel-other-application,pervasive-use",,web-browsing
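As a rough check, the weights and ranges from the tables above can be applied to that log fragment. The Python sketch below is an illustration, not Palo Alto's implementation. The mapping from log identifiers to table rows is an assumption based on the names, and the three identifiers that do not appear in the sample (`evasive-behavior`, `consume-big-bandwidth`, `prone-to-misuse`) are guesses at how those characteristics would be spelled in a log.

```python
import csv

# Weights from the table above, keyed by the identifiers seen in the log.
# The identifier-to-row mapping is an ASSUMPTION; the three names marked
# below do not appear in the sample log and are guessed.
WEIGHTS = {
    "evasive-behavior": 3,          # assumed identifier
    "consume-big-bandwidth": 1,     # assumed identifier
    "used-by-malware": 4,
    "able-to-transfer-file": 3,
    "has-known-vulnerability": 3,
    "tunnel-other-application": 2,
    "prone-to-misuse": 2,           # assumed identifier
    "pervasive-use": 1,
}

# (upper bound of score range, risk level) from the Risk Assignment table.
RISK_RANGES = [(3, 1), (6, 2), (9, 3), (13, 4)]


def risk_from_score(score):
    """Map a summed weight to a risk level of 1 to 5."""
    for upper, risk in RISK_RANGES:
        if score <= upper:
            return risk
    return 5  # 14 and above


def score_characteristics(characteristics):
    """Sum the weights of a comma-separated characteristics string."""
    names = [name for name in characteristics.split(",") if name]
    return sum(WEIGHTS[name] for name in names)


# The fragment quoted above; the csv module keeps the quoted list intact.
snippet = ('internet-utility,general-internet,browser-based,4,'
           '"used-by-malware,able-to-transfer-file,has-known-vulnerability,'
           'tunnel-other-application,pervasive-use",,web-browsing')

fields = next(csv.reader([snippet]))
logged_risk = int(fields[3])
score = score_characteristics(fields[4])
print(score, risk_from_score(score), logged_risk)  # prints: 13 4 4
```

The five characteristics in the sample sum to 4 + 3 + 3 + 2 + 1 = 13, which falls in the 10–13 range and matches the logged risk of 4.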
I believe you would be better served by correlating with DNS records sourced from the original machine, and/or investigating how to have Palo Alto resolve the URL inside the session log. You might actually have that already in the "threat" log entries.
Amazing
The first thing you need is an understanding of your data. It is your data. We do not have access to it and do not know what data you have, so it is difficult for us to determine what information you might be able to extract from it.
Secondly, risk is subjective. What do you deem to be high risk? What evidence do you have in your logs (that are now in Splunk) that might help you determine if something is "risky"?
Open questions such as the ones you have posed only lead to more questions.