How do I get the Palo Alto Networks App for Splunk to parse URLs with commas?

Builder

We've started using the Palo Alto Networks App for Splunk, and I noticed that some of the later fields in the raw log record are parsed incorrectly. If the URL or referrer contains a comma (a permissible character in URLs), the respective field is truncated and the rest of its value winds up in other fields.
Does anyone else have experience dealing with this? Any recommendations?

1 Solution

Builder

Palo Alto Networks has confirmed that this is a bug in PAN-OS.

They note that the bug was resolved in 7.1.4, 7.0.10, and 6.1.14.
The fix is tracked as Bug ID 92621 and is listed under the addressed issues in the release notes.
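
For anyone checking a fleet of firewalls against those three releases, here is a minimal Python sketch. The function name and the returned values are my own choices, and it only knows about the release trains listed above. It makes no claim about other trains (for example 6.0.x or anything after 7.1.x), so check the release notes for those.

```python
import re

# Fixed maintenance releases per release train, as quoted in the post above.
FIXED_IN = {(6, 1): 14, (7, 0): 10, (7, 1): 4}


def has_quote_fix(version):
    """Return True/False for a listed train, None when the train is not listed.

    Hypothetical helper: accepts strings such as "7.0.10" or "7.1.4-h2"
    and ignores anything after the third numeric component.
    """
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if not m:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(g) for g in m.groups())
    fixed_patch = FIXED_IN.get((major, minor))
    if fixed_patch is None:
        return None  # train not mentioned in the post; consult the release notes
    return patch >= fixed_patch
```

A result of None means "unknown", not "unaffected".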

Splunk Employee

Great, but hard work. Glad you got a fix for this!


Splunk Employee

The add-on for that app uses DELIMS with commas, so to make this work you would have to write a proper REGEX to extract the fields. I loaded your example and saw the same issue. I thought that since the referrer field is quoted it should work, but alas, it did not. You could create a new field extraction on _raw for that specific field if you like. Something like this:

(?<referrer>http.+?)"

Or you could go through the laborious process of rewriting all the field extractions in transforms.conf under the extract_threat stanza.
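
To try the single-field idea outside Splunk first, here is a minimal Python sketch: a pattern that starts at "http" and stops at the next double quote. The sample record is a shortened, made-up line in the same shape as the one posted in this thread, and the group name referrer is just an illustrative choice. Python spells named groups (?P<name>...), whereas Splunk uses (?<name>...).

```python
import re

# Hypothetical, shortened record: the URL field is wrapped in single double
# quotes, the referrer in doubled double quotes, as in the log from this thread.
SAMPLE = (
    'block-url,"example.com/a?x=1,2",(9999),Mozilla/5.0,,"",'
    '""https://www.example.org/p?list=a,b,c&t=PAGEVIEW&"",,,,0'
)

# Lazy match from the first "http" up to (not including) the next double quote.
REFERRER_RE = re.compile(r'(?P<referrer>http.+?)"')


def extract_referrer(raw):
    """Return the referrer captured from a raw record, or None if absent."""
    m = REFERRER_RE.search(raw)
    return m.group("referrer") if m else None


if __name__ == "__main__":
    print(extract_referrer(SAMPLE))
```

Note that this relies on the referrer being the first place "http" appears in the record. That holds for the sample here, where the URL field carries no scheme, but it is an assumption about the data rather than a guarantee.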

Builder

It appears that the add-on can handle commas within quoted strings, though I am not sure how. For example, if the URL field has a comma in it, it is parsed correctly. But if the referrer field has a comma in it, the field gets split across multiple fields.

btorresgil noted that the referrer field unexpectedly starts with two double quotes, and I noticed that it actually both starts and ends with two double quotes (,""referrer,with,commas"",). My guess is that at some point Palo Alto Networks wrote code to wrap a comma-containing referrer in double quotes, later decided to wrap all URL-type fields in double quotes, and forgot to remove the older code.
I have contacted Palo Alto Networks and am in the process of trying to prove to them that there is a problem. Other options that I am pursuing include:
-Ask the firewall admin to change the syslog format to delimit fields differently (this is problematic for other consumers of the firewall logs).
-Use a stream editor to alter the syslog so that something like (,"")(?:[^,]) is transformed to ," and something like (?:[^,])("",) is transformed to ",
-Rewrite the field extractions as suggested above.
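
The stream-editor option can be sketched in Python to check the idea before touching the syslog pipeline. This is only an illustration under assumptions: the sample line is a shortened, made-up record, the function names are mine, and I use lookarounds instead of the capture groups written above so that the neighbouring character is left in place. An empty quoted field (,"",) is deliberately not touched.

```python
import csv
import re

# Hypothetical, shortened record with a doubled-quote referrer field.
SAMPLE = (
    'block-url,"example.com/a?x=1,2",(9999),Mozilla/5.0,,"",'
    '""https://www.example.org/p?list=a,b,c&t=PAGEVIEW&"",,,,0'
)

# ,"" followed by a non-comma  ->  ,"
OPENING = re.compile(r',""(?=[^,])')
# "", preceded by a non-comma  ->  ",
CLOSING = re.compile(r'(?<=[^,])"",')


def collapse_doubled_quotes(line):
    """Turn ,""value"", into ,"value", while leaving empty ,"", fields alone."""
    line = OPENING.sub(',"', line)
    return CLOSING.sub('",', line)


def split_fields(line):
    """Split a cleaned record on commas, honouring double-quoted fields."""
    return next(csv.reader([collapse_doubled_quotes(line)]))


if __name__ == "__main__":
    for index, value in enumerate(split_fields(SAMPLE)):
        print(index, value)
```

After the rewrite, an ordinary quote-aware comma splitter keeps the URL and the referrer intact as single fields. A referrer that itself contains a double quote would still need separate handling.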


Builder

Do you have an example of a log that exhibits this problem? What version of Splunk are you using? What version of the Palo Alto Networks App and Add-on are you using?


Builder

Thank you!

Splunk v6.4.3
Palo Alto App v5.2.0
add-on v3.6.1
It could be tricky to share a URL log publicly. Is it OK if I modify some of it? If so, and keeping it as close as possible to the original, the raw record looks like this:

Oct  4 16:56:18 panfw-p1  : 1,2016/10/04 16:56:17,001701003672,THREAT,url,0,2016/10/04 16:56:17,10.141.44.75,199.59.149.200,0.0.0.0,0.0.0.0,web_browsing_rule,domain\userX,,twitter-base,vsys1,internal,external,ethernet1/1,ethernet1/2,Default_Logging,2016/10/04 16:56:17,214619,1,54356,443,0,0,0x1000000,tcp,block-url,"analytics.twitter.com/i/adsct?txn_id=gv05&oct_p_id=555&p_id=Twitter,43379enUS&tw_country_code=US",(9999),Social_Network_URLs,informational,client-to-server,3489840686,0x0,10.0.0.0-10.255.255.255,US,0,,0,,,1,Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko,,"",""https://www.tripadvisor.com/PageMoniker?pixelList=aol_pixel,choicestream_clickout_pixel,choicestream_id_pixel,choicestream_pixel_2_us,choicestream_seg_pixel,clicktripz_clickout_pixel,criteo_pixel,facebook_wca_pixel,google_clickout_audience_pixel,google_clickout_conversion_pixel,google_pixel,intent_media_conversion_pixel,intent_media_xsell_pixel,mediaalpha_conv_pixel,resilion_clickout_pixel,smarter_travel_meta_conversion_pixel,sojern_clickout_pixel,sojern_pixel,sundaysky_pixel,twitter_clickout_pixel,twitter_clickout_pixel_2,twitter_dpa_pixel_1,twitter_dpa_pixel_2,twitter_pixel,yahoo_clickout_pixel,yahoo_pixel,yahoo_retargeting_pixel,yahoo_sizmek_dynamic_pixel&locIds=&blTabIdx=&servlet=HotelHighlight&pixelType=PAGEVIEW&"",,,,0,11,0,0,0,,panfw-p1,

with this,

user_agent field shows: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
referrer field shows: ""https://www.tripadvisor.com/PageMoniker?pixelList=aol_pixel
sender field shows: choicestream_clickout_pixel
subject field shows: choicestream_id_pixel
recipient field shows: choicestream_pixel_2_us

Let me know if I can provide better info.


Builder

Thanks, I see the problem. I have a few questions: What version of PAN-OS are you using? Does this issue affect all URL logs or just some of them? Are you using the default syslog format on the firewall, or have you configured a custom syslog format? If you prefer, feel free to email me: btorres-gil@paloaltonetworks.com
