SEDCMD not doing what I expect: fields that end with \" having issues?

jwhughes58
Contributor

We have an issue with pan:threat in our dev environment: some fields end with \", (backslash, quote, comma). The backslash escapes the quote, so the field isn't closed and it grabs extra data. For example,

An example event

Jun 20 09:45:17 pan_firewall 1,2023/06/20 09:45:17,016201006029,THREAT,url,2561,2023/06/20 09:45:17,10.10.10.10,11.11.11.11,12.12.12.12,13.13.13.13,Internal-Gateway-Client-Connect,,,web-browsing,vsys1,inside,inside,ethernet1/2,ethernet1/2,Shared_Log_Fwd,2023/06/20 09:45:17,633045,1,55384,443,55384,20077,0x140b000,tcp,alert,"pan_firewall/default.asp\",(9999),PAN-Allowed-Sites,informational,client-to-server,7237130175635929631,0x8000000000000000,10.0.0.0-10.255.255.255,United States,,,0,,,1,,,,,,,,0,29,50,52,0,vsys1,pan_firewall,,,,get,0,,0,,N/A,unknown,AppThreat-0-0,0x0,0,4294967295,," PAN-Allowed-Sites,health-and-medicine,low-risk",27b923e5-b821-4544-8790-5eb413f7ed4a,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,2023-06-20T09:45:17.691+00:00,,,,internet-utility,general-internet,browser-based,4,"used-by-malware,able-to-transfer-file,has-known-vulnerability,tunnel-other-application,pervasive-use",,web-browsing,no,no

url = ,"pan_firewall/default.asp\",(9999),PAN-Allowed-Sites,informational,client-to-server,7237130175635929631,0x8000000000000000,10.0.0.0-10.255.255.255,United States,,,0,,,1,,,,,,,,0,29,50,52,0,vsys1,pan_firewall,,,,get,0,,0,,N/A,unknown,AppThreat-0-0,0x0,0,4294967295,," PAN-Allowed-Sites

category = low-risk",27b923e5-b821-4544-8790-5eb413f7ed4a,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,2023-06-20T09:45:17.691+00:00,,,,internet-utility,general-internet,browser-based,4,"used-by-malware
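
The effect can be reproduced outside Splunk. The sketch below is only an analogy: it uses Python's csv module as a stand-in for a quote-aware, comma-delimited field split (how the TA actually extracts its fields is not shown in this thread), and the shortened event string and the split_fields helper are made up for illustration. Once the backslash is treated as an escape character, \" no longer closes the field, so the value runs on until the next quote.

```python
import csv

# Shortened, made-up event with the same shape as the problem:
# the second field ends in \" and a later field is quoted and contains a comma.
EVENT = r'alert,"pan_firewall/default.asp\",(9999),"PAN-Allowed-Sites,low-risk",0'


def split_fields(line, escapechar=None):
    """Split one comma-delimited line, optionally honoring an escape character."""
    return next(csv.reader([line], escapechar=escapechar))


# No escape character: the quote after the backslash closes the field.
plain = split_fields(EVENT)

# Backslash as escape character: \" becomes a literal quote, so the field
# stays open and absorbs the fields that follow it.
escaped = split_fields(EVENT, escapechar="\\")

print(plain)
print(escaped)
```

In the second result the URL value swallows the following fields and the next value starts with low-risk", which matches the shape of the broken url and category values above.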

I've tried multiple SEDCMD settings to change the \", into something else, but even though the setting shows up in btool, the events still contain the \",

/data/splunk/hot/apps/splunk/etc/apps/Splunk_TA_paloalto/default/props.conf                  SEDCMD-palo_alto_remove_backslah = s/\\\",/\\ \",/g

I did see the recommendation to send to an HF, but this data arrives via syslog and then goes to the indexers.  The regex works in the various tools I've tried.  Data is somewhat anonymized.  Any suggestions?

TIA,

Joe

PickleRick
SplunkTrust

Unfortunately, various tools using regexes can have their own ideas about whether various things need escaping. The most notorious is of course vim, with its "counterintuitive" use of backslashes on grouping parentheses and pluses. Anyway, some tools don't mind an extra backslash even if it's not needed; some do.

That's why I'd try to be as precise as possible and go with

s/\\"/"/g
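
To see what the two substitutions in this thread do to the data, here is a minimal sketch. It assumes Python's re module as a stand-in for Splunk's sed-style processing, so it only checks the patterns themselves; escaping rules can differ between engines, and it says nothing about whether Splunk applies the setting. The fragment is cut from the sample event in the original post.

```python
import re

# Fragment of the sample event from the original post.
FRAGMENT = r'tcp,alert,"pan_firewall/default.asp\",(9999),PAN-Allowed-Sites'

# Equivalent of s/\\"/"/g: drop the backslash in front of every quote.
suggested = re.sub(r'\\"', '"', FRAGMENT)

# Equivalent of the original s/\\\",/\\ \",/g: put a space between the
# backslash and the quote, but only when a comma follows.
original = re.sub(r'\\\",', r'\\ ",', FRAGMENT)

print(suggested)
print(original)
```

In this engine both patterns match the fragment. The shorter one removes the backslash entirely, while the original keeps it and separates it from the quote with a space.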

gcusello
SplunkTrust

Hi @jwhughes58 ,

It's always better to use one or (better) two HFs to receive syslogs, but if you receive syslogs on your Indexers, you have to install the PAN add-on on the Indexers.

Could you share the props.conf you're using?

Anyway, you shouldn't need to change the logs to remove backslashes; you should configure the TIME_PREFIX and TIME_FORMAT options for your sourcetype:

[pan:threat]
TIME_PREFIX = ^
TIME_FORMAT = %b %d %H:%M:%S

This way you can be sure that your event starts at the correct point.
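
As a rough illustration of what these two settings select, the sketch below uses Python's datetime.strptime as a stand-in for Splunk's timestamp parser. The event string is shortened from the sample in the original post, and ASSUMED_YEAR is a hypothetical value supplied only for the demo, because the syslog header itself carries no year.

```python
from datetime import datetime

EVENT = "Jun 20 09:45:17 pan_firewall 1,2023/06/20 09:45:17,016201006029,THREAT,url"

# TIME_PREFIX = ^ means the timestamp starts at the first character.
# "Jun 20 09:45:17" is 15 characters long.
header = EVENT[:15]

# The syslog header has no year, so one is assumed here for the demo.
ASSUMED_YEAR = 2023
parsed = datetime.strptime(f"{ASSUMED_YEAR} {header}", "%Y %b %d %H:%M:%S")

print(header)
print(parsed)
```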

Ciao.

Giuseppe

jwhughes58
Contributor

Hi @gcusello ,

We are using Splunk_TA_paloalto 8.0.2, so the default/props.conf has this in it:

[pan:threat]
SHOULD_LINEMERGE = false
EVENT_BREAKER_ENABLE = true
KV_MODE = none
TIME_PREFIX = ^(?:[^,]*,){6}
MAX_TIMESTAMP_LOOKAHEAD = 32
TIME_FORMAT = %Y/%m/%d %H:%M:%S
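
A minimal sketch of how these three settings would line up against the sample event, assuming Python's re and datetime.strptime as stand-ins for Splunk's timestamp processor (the real implementation may behave differently). The event string is shortened from the sample in the original post.

```python
import re
from datetime import datetime

EVENT = (
    "Jun 20 09:45:17 pan_firewall 1,2023/06/20 09:45:17,016201006029,"
    "THREAT,url,2561,2023/06/20 09:45:17,10.10.10.10,11.11.11.11"
)

TIME_PREFIX = r"^(?:[^,]*,){6}"
MAX_TIMESTAMP_LOOKAHEAD = 32
TIME_FORMAT = "%Y/%m/%d %H:%M:%S"

# Skip six comma-terminated fields from the start of the event.
prefix = re.match(TIME_PREFIX, EVENT)

# Text the timestamp parser would be allowed to look at.
window = EVENT[prefix.end():prefix.end() + MAX_TIMESTAMP_LOOKAHEAD]

# strptime needs an exact match, so cut the window at the next comma.
stamp = window.split(",", 1)[0]
parsed = datetime.strptime(stamp, TIME_FORMAT)

print(repr(prefix.group(0)))
print(parsed)
```

On this sample the syslog header contains no commas, so it is absorbed into the first skipped field and the six skipped fields end right before the second 2023/06/20 09:45:17.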

In the local/props.conf I have this

[pan:threat]
SEDCMD-palo_alto_remove_backslah = s/\\\",/\\ \",/g

We have a syslog farm with each of the syslog servers running a UF.  The data gets cooked and stored on the indexers.  When I push the TA it goes to the SHs and the indexers.

TIA,

Joe

gcusello
SplunkTrust

Hi @jwhughes58,

The TIME_PREFIX and TIME_FORMAT for that sourcetype don't look correct for the logs you shared. Are you sure these are the correct logs and the correct sourcetype?

Ciao.

Giuseppe

jwhughes58
Contributor

Hi @gcusello,

syslog-ng prepends the leading timestamp and hostname (Jun 20 09:45:17 icacdcgp3.fw.ntwk.kp.org), which were shown in bold in the original post:

Jun 20 09:45:17 icacdcgp3.fw.ntwk.kp.org 1,2023/06/20 09:45:17,016201006029,THREAT,url,2561,2023/06/20 09:45:17

The remainder of the event is what Palo Alto sends. There have been no other issues with the Palo Alto TA beyond fields whose values end like this: \", The TA doesn't process those correctly. That is why I was using the SEDCMD to try to change it.

TIA,

Joe

gcusello
SplunkTrust

Hi @jwhughes58,

OK, that's clearer.

If all your PAN logs arrive through syslog-ng, try replacing the TIME_PREFIX and TIME_FORMAT with the ones I shared and see what results you get.

ciao.

Giuseppe

jwhughes58
Contributor

Hi @gcusello,

I will give it a try when I get back from running my wife to the airport and let you know.

TIA,

Joe

jwhughes58
Contributor

Hi @gcusello ,

It didn't work for me.  Another member of the team is looking at it since it is possible I have looked at it for too long.  Hopefully, they will find something.

TIA,

Joe

jwhughes58
Contributor

And since I forgot to add it, Splunk 9.0.4 (build de405f4a7979) and RHEL 4.18.0-348.12.2.el8_5.x86_64.
