Field extraction failing on ampersand

wilcomply13
Explorer

I have an issue with a URL field being extracted improperly: the extraction fails whenever an ampersand is present in the URL. The transform specifies the following delimiters:

DELIMS = "\t", "="

Running btool on the search head (SH) member shows that no other extractions or delimiters are in effect. All fields are extracted properly except URL fields that contain an ampersand; for those, the field value is cut off at the ampersand and everything after it is dropped.

 


isoutamo
SplunkTrust
Hi,
Can you post your current props.conf and transforms.conf, along with some sample data? Is this a search-phase or an ingest-phase issue?
r. Ismo

wilcomply13
Explorer

This is a search-time field extraction:


props.conf

[proxy-web]
FIELDALIAS-ClientIP_as_src=ClientIP AS src
FIELDALIAS-ClientIP_as_src_ip = ClientIP AS src_ip
FIELDALIAS-aob_gen_proxy_web_alias_1 = protocol AS transport
FIELDALIAS-aob_gen_proxy_web_alias_2 = user AS src_user
FIELDALIAS-aob_gen_proxy_web_alias_3 = dlpengine AS severity
FIELDALIAS-aob_gen_proxy_web_alias_4 = threatname AS signature
FIELDALIAS-aob_gen_proxy_web_alias_5 = contenttype AS http_content_type
FIELDALIAS-aob_gen_proxy_web_alias_6 = hostname AS dest
FIELDALIAS-aob_gen_proxy_web_alias_8 = responsesize AS bytes_in
FIELDALIAS-aob_gen_proxy_web_alias_9 = requestsize AS bytes_out
FIELDALIAS-clientpublicIP_as_src_translated_ip = clientpublicIP AS src_translated_ip
FIELDALIAS-clienttranstime_as_response_time = clienttranstime AS response_time
FIELDALIAS-department_as_src_user_bunit = department AS src_user_bunit
FIELDALIAS-dlpdictionaries_as_signature = dlpdictionaries AS signature
FIELDALIAS-filename_as_file_name = filename AS file_name
FIELDALIAS-md5_as_file_hash = md5 AS file_hash
FIELDALIAS-refererURL_as_http_referrer = refererURL AS http_referrer
FIELDALIAS-requestmethod_as_http_method = requestmethod AS http_method
FIELDALIAS-serverip_as_dest_ip = serverip AS dest_ip
FIELDALIAS-threatcategory_as_category = threatcategory AS category
FIELDALIAS-transactionsize_as_bytes = transactionsize AS bytes
FIELDALIAS-urlcategory_as_category = urlcategory AS category
FIELDALIAS-useragent_as_http_user_agent = useragent AS http_user_agent
REPORT-ta_builder_internal_use_kv_format_results_for_proxy_web = ta_builder_internal_use_kv_format_results_for_proxy_web
category = Network & Security
description = Web/Proxy Logs


transforms.conf

[ta_builder_internal_use_kv_format_results_for_proxy_web]
DELIMS = "\t", "="

 

Sample output showing the discrepancy between a custom regex that extracts the full URL and the URL extracted by default with the props/transforms above:
[Screenshot: sample data]


isoutamo
SplunkTrust
Can you also post the _raw event?

wilcomply13
Explorer
Jan 10 15:08:21 host.domain.com 2022-01-10 15:07:38 reason=Allowed event_id=000000000000 protocol=HTTPS action=Allowed transactionsize=1111 responsesize=111 requestsize=1111 urlcategory=Custom URL Category serverip=8.8.8.8 clienttranstime=111 requestmethod=POST refererURL=www.google.com/ ClientIP=9.9.9.9 status=204 user=user@domain.com url=www.google.com/gen_204?atyp=csi&ei=rUvcYZ-iM5GV0PEPt9uH8As&s=web&st=13120&fid=2&t=fi&zx=1641827258879
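
For anyone who wants to reproduce the "custom regex" comparison outside Splunk, here is a minimal Python sketch of a pattern that keeps the whole URL, ampersands included. It is only an illustration, not the regex actually used in the thread: the names `RAW_EVENT`, `URL_PATTERN` and `extract_url` are made up for this example, and it assumes the key is the lowercase literal `url` and that the value never contains whitespace.

```python
import re
from typing import Optional

# Tail of the sample _raw event posted above (shortened to the last few fields).
RAW_EVENT = (
    "requestmethod=POST refererURL=www.google.com/ ClientIP=9.9.9.9 "
    "status=204 user=user@domain.com "
    "url=www.google.com/gen_204?atyp=csi&ei=rUvcYZ-iM5GV0PEPt9uH8As"
    "&s=web&st=13120&fid=2&t=fi&zx=1641827258879"
)

# Assumption: the key is the lowercase literal "url" and the value runs to the
# next whitespace character (or end of event), so "&" and "=" stay in the value.
URL_PATTERN = re.compile(r"\burl=(\S+)")


def extract_url(raw: str) -> Optional[str]:
    """Return the full url value from a raw event, or None if there is none."""
    match = URL_PATTERN.search(raw)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_url(RAW_EVENT))
```

Because the pattern is case-sensitive and anchored on a word boundary, it does not pick up `refererURL=` by mistake. Whether an equivalent pattern behaves the same way as a search-time extraction would still need to be checked in Splunk itself.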