Field extraction failing on ampersand

wilcomply13
Explorer

I have a URL field that is extracted incorrectly whenever the URL contains an ampersand. transforms.conf specifies the following delimiters:

DELIMS = "\t", "="

Running btool on the search head (SH) member shows that no other extractions or delimiters are defined. All fields are extracted properly except URL fields that contain an ampersand: the extracted value stops at the ampersand, and everything after it is dropped.

 


isoutamo
SplunkTrust
Hi,
Can you post your current props.conf, transforms.conf, and some sample data? Is this a search-time or an ingest-time issue?
r. Ismo

wilcomply13
Explorer

This is a search-time field extraction:


props.conf

[proxy-web]
FIELDALIAS-ClientIP_as_src=ClientIP AS src
FIELDALIAS-ClientIP_as_src_ip = ClientIP AS src_ip
FIELDALIAS-aob_gen_proxy_web_alias_1 = protocol AS transport
FIELDALIAS-aob_gen_proxy_web_alias_2 = user AS src_user
FIELDALIAS-aob_gen_proxy_web_alias_3 = dlpengine AS severity
FIELDALIAS-aob_gen_proxy_web_alias_4 = threatname AS signature
FIELDALIAS-aob_gen_proxy_web_alias_5 = contenttype AS http_content_type
FIELDALIAS-aob_gen_proxy_web_alias_6 = hostname AS dest
FIELDALIAS-aob_gen_proxy_web_alias_8 = responsesize AS bytes_in
FIELDALIAS-aob_gen_proxy_web_alias_9 = requestsize AS bytes_out
FIELDALIAS-clientpublicIP_as_src_translated_ip = clientpublicIP AS src_translated_ip
FIELDALIAS-clienttranstime_as_response_time = clienttranstime AS response_time
FIELDALIAS-department_as_src_user_bunit = department AS src_user_bunit
FIELDALIAS-dlpdictionaries_as_signature = dlpdictionaries AS signature
FIELDALIAS-filename_as_file_name = filename AS file_name
FIELDALIAS-md5_as_file_hash = md5 AS file_hash
FIELDALIAS-refererURL_as_http_referrer = refererURL AS http_referrer
FIELDALIAS-requestmethod_as_http_method = requestmethod AS http_method
FIELDALIAS-serverip_as_dest_ip = serverip AS dest_ip
FIELDALIAS-threatcategory_as_category = threatcategory AS category
FIELDALIAS-transactionsize_as_bytes = transactionsize AS bytes
FIELDALIAS-urlcategory_as_category = urlcategory AS category
FIELDALIAS-useragent_as_http_user_agent = useragent AS http_user_agent
REPORT-ta_builder_internal_use_kv_format_results_for_proxy_web = ta_builder_internal_use_kv_format_results_for_proxy_web
category = Network & Security
description = Web/Proxy Logs


transforms.conf

[ta_builder_internal_use_kv_format_results_for_proxy_web]
DELIMS = "\t", "="

 

Sample output showing the discrepancy between a custom regex that extracts the full URL and the URL extracted by default with the props/transforms above:
[Image: Sample data]


isoutamo
SplunkTrust
Can you also post the _raw event?

wilcomply13
Explorer
Jan 10 15:08:21 host.domain.com 2022-01-10 15:07:38 reason=Allowed event_id=000000000000 protocol=HTTPS action=Allowed transactionsize=1111 responsesize=111 requestsize=1111 urlcategory=Custom URL Category serverip=8.8.8.8 clienttranstime=111 requestmethod=POST refererURL=www.google.com/ ClientIP=9.9.9.9 status=204 user=user@domain.com url=www.google.com/gen_204?atyp=csi&ei=rUvcYZ-iM5GV0PEPt9uH8As&s=web&st=13120&fid=2&t=fi&zx=1641827258879
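
For anyone comparing results, here is a minimal Python sketch (not Splunk configuration) of the kind of custom regex mentioned earlier in the thread, run against the _raw event above. The pattern `\burl=(\S+)` and the names `URL_PATTERN`, `RAW_EVENT`, and `extract_url` are assumptions made for illustration, not the poster's actual regex. It assumes the URL value contains no whitespace, and it uses spaces between fields as the event appears in the post, although the original log may be tab-separated.

```python
import re

# Hypothetical pattern: capture everything after "url=" up to the next
# whitespace character. The \b stops it from matching in the middle of a
# longer field name, and matching is case-sensitive, so "refererURL=" and
# "urlcategory=" are not picked up.
URL_PATTERN = re.compile(r"\burl=(\S+)")

# The _raw event as posted in the thread (field separators shown as spaces).
RAW_EVENT = (
    "Jan 10 15:08:21 host.domain.com 2022-01-10 15:07:38 reason=Allowed "
    "event_id=000000000000 protocol=HTTPS action=Allowed transactionsize=1111 "
    "responsesize=111 requestsize=1111 urlcategory=Custom URL Category "
    "serverip=8.8.8.8 clienttranstime=111 requestmethod=POST "
    "refererURL=www.google.com/ ClientIP=9.9.9.9 status=204 user=user@domain.com "
    "url=www.google.com/gen_204?atyp=csi&ei=rUvcYZ-iM5GV0PEPt9uH8As&s=web"
    "&st=13120&fid=2&t=fi&zx=1641827258879"
)


def extract_url(raw):
    """Return the full value of the url field, or None if it is absent."""
    match = URL_PATTERN.search(raw)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_url(RAW_EVENT))
```

On this event the function returns the whole value, including the `=` and `&` characters in the query string. The same pattern could presumably be tried in a search-time regex extraction, but that is untested here.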