Getting Data In

Log formatting not as expected

splunklearner
Communicator
Jun 26 13:46:12 128.23.84.166 [local0.err] <131>Jun 26 13:46:12 GBSDFA1AD011HMA.systems.uk.fed ASM:f5_asm=PROD

vs_name="/f5-tenant-01/XXXXXXXX"
violations="HTTP protocol compliance failed"
sub_violations="HTTP protocol compliance failed:Header name with no header value"
attack_type="HTTP Parser Attack"
violation_rating="3/5"
severity="Error"

support_id="XXXXXXXXX"
policy_name="/Common/waf-fed-transparent"
enforcement_action="none"

dest_ip_port="128.155.6.2:443"
ip_client="128.163.192.44"
x_forwarded_for_header_value="N/A"

method="POST"
uri="/auth-service/api/v2/token/refreshAccessToken"
microservice="N/A"
query_string="N/A"
response_code="500"

sig_cves="N/A"
sig_ids="N/A"
sig_names={N/A}
sig_set_names="N/A"
staged_sig_cves="N/A"
staged_sig_ids="N/A"
staged_sig_names="N/A"
staged_sig_set_names="N/A"

<?xml version='1.0' encoding='UTF-8'?>
<BAD_MSG>
<violation_masks>
<block>0-0-0-0</block>
<alarm>2400500004500-106200000003e-0-0</alarm>
<learn>0-0-0-0</learn>
<staging>0-0-0-0</staging>
</violation_masks>
<request-violations>
<violation>
<viol_index>14</viol_index>
<viol_name>VIOL_HTTP_PROTOCOL</viol_name>
<http_sanity_checks_status>2</http_sanity_checks_status>
<http_sub_violation_status>2</http_sub_violation_status>
<http_sub_violation>SGVhZGVyICdBdXRob3JpemF0aW9uJyBoYXMgbm8gdmFsdWU=</http_sub_violation>
</violation>
</request-violations>
</BAD_MSG>
Jul  3 11:12:48 128.168.189.4 [local0.err] <131>2025-07-03T11:12:48+00:00 nginxplus-nginx-ingress-controller-6947cb4744-hxwf5 ASM:Log_details\x0a\x0avs_name="14-cyberwasp-sv-busybox.ikp3001ynp.cloud.uk.fed:10-/"\x0aviolations="Attack signature detected"\x0asub_violations="N/A"\x0aattack_type="Cross Site Scripting (XSS)"\x0aviolation_rating="5/5"\x0aseverity="N/A"\x0a\x0asupport_id="14096019979554169061"\x0apolicy_name="waf-fed-enforced"\x0aenforcement_action="block"\x0a\x0adest_ip_port="0.0.0.0:443"\x0aip_client="128.175.220.223"\x0ax_forwarded_for_header_value="N/A"\x0a\x0amethod="GET"\x0auri="/"\x0amicroservice="N/A"\x0aquery_string="svanga=%3Cscript%3Ealert(1)%3C/script%3E%22"\x0aresponse_code="0"\x0a\x0asig_cves="N/A,N/A,N/A,N/A"\x0asig_ids="200001475,200000098,200001088,200101609"\x0asig_names={XSS script tag end (Parameter) (2),XSS script tag (Parameter),alert() (Parameter)...}\x0asig_set_names="{High Accuracy Signatures;Cross Site Scripting Signatures;Generic Detection Signatures (High Accuracy)},{High Accuracy Signatures;Cross Site Scripting Signatures;Generic Detection Signatures (High Accuracy)},{Cross Site Scripting Signatures}..."\x0astaged_sig_cves="N/A,N/A,N/A,N/A"\x0astaged_sig_ids="N/A"\x0astaged_sig_names="N/A"\x0astaged_sig_set_names="N/A"\x0a\x0a<?xml version='1.0' encoding='UTF-8'?><BAD_MSG><violation_masks><block>400500200500-1a01030000000032-0-0</block><alarm>20400500200500-1ef903400000003e-7400000000000000-0</alarm><learn>0-0-0-0</learn><staging>0-0-0-0</staging></violation_masks><request-violations><violation><viol_index>42</viol_index><viol_name>VIOL_ATTACK_SIGNATURE</viol_name><context>parameter</context><parameter_data><value_error/><enforcement_level>global</enforcement_level><name>c3Zhbmdh</name><value>PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0PiI=</value><location>query</location><expected_location></expected_location><is_base64_decoded>false</is_base64_decoded><param_name_pattern>*</param_name_pattern><staging>0</staging></parameter_data><staging>0</staging><sig_data><sig_id>200001475</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>8</offset><length>7</length></kw_data></sig_data><sig_data><sig_id>200000098</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>7</offset><length>7</length></kw_data></sig_data><sig_data><sig_id>200001088</sig_id><blocking_mask>2</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>15</offset><length>6</length></kw_data></sig_data><sig_data><sig_id>200101609</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>7</offset><length>25</length></kw_data></sig_data></violation></request-violations></BAD_MSG>

We have already onboarded some platform logs into Splunk, and the 1st sample above shows the format we receive for them.

 

This is the props.conf we have written for it on the indexer:

[abcd]
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %b %d %H:%M:%S
SEDCMD-newline_remove = s/\\r\\n/\n/g
SEDCMD-formatxml =s/></>\n</g
LINE_BREAKER = ([\r\n]+)[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s
SHOULD_LINEMERGE = False
TRUNCATE = 10000

# Leaving PUNCT enabled can impact indexing performance. Customers can
# comment this line if they need to use PUNCT (e.g. security use cases)
ANNOTATE_PUNCT = false
 
props.conf on the search head:
 
[abcd]
REPORT-xml_kv_extract = bad_msg_xml, bad_msg_xml_kv
 
transforms.conf
 
[bad_msg_xml]
REGEX = (?ms)<BAD_MSG>(.*?)<\/BAD_MSG>
FORMAT = Bad_Msg_Xml::$1

[bad_msg_xml_kv]
SOURCE_KEY = Bad_Msg_Xml
REGEX = (?ms)<(\w*)>([^<]*)<\/\1>
FORMAT = $1::$2
MV_ADD = true
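
With the first format this works as expected - for example, a quick check along the lines below (the index name is just a placeholder, and abcd is assumed to be the sourcetype, matching the stanza above) shows the XML fields being extracted:

index=your_index sourcetype=abcd
| table _time viol_index viol_name http_sub_violation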
 
Now we are applying the same logic to the raw data in the 2nd sample above, and it is not coming out in a readable format at all.

Sometimes a single event is being split into multiple events - for example, the response code comes in as one event and the method as another, which is not supposed to happen. Please help me with the props and transforms modifications; we need the data in the format I showed initially.

PickleRick
SplunkTrust

This is... bad.

Firstly, it looks like this is data that was already received by something else, embedded in another format, and then sent on to Splunk.

Secondly, these are two completely different sourcetypes. So if you absolutely cannot separate them earlier, you should overwrite the sourcetype at ingestion time so that each of the two formats is parsed differently.
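
As a very rough sketch (the matching regex and the new sourcetype name below are only examples - key on whatever reliably distinguishes the two streams in your data), an index-time sourcetype override on the parsing tier would look something like this:

props.conf

[abcd]
TRANSFORMS-set_nginx_sourcetype = set_nginx_asm_sourcetype

transforms.conf

[set_nginx_asm_sourcetype]
# match something unique to the second stream, e.g. the nginx ingress controller hostname
REGEX = nginxplus-nginx-ingress-controller
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::abcd_nginx

Then the [abcd_nginx] stanza gets its own line breaking, timestamp and extraction settings.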


livehybrid
SplunkTrust

Hi @splunklearner 

Could this be an issue with the LINE_BREAKER? Try the following, which adds a lookahead for the date:

LINE_BREAKER=([\r\n]+)(?=[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)

[screenshot attachment: livehybrid_0-1750971348256.png]
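
One more thought - in your second sample the field separators arrive as the literal text \x0a rather than real newlines, which is a big part of why that event is unreadable compared to your first format. Purely as a sketch (it follows the same pattern as your existing SEDCMD-newline_remove, and SEDCMDs run after line breaking so this won't split the event), you could try unescaping them on the parsing tier:

# props.conf on the indexers (or on a HF if one sits in the path)
[abcd]
# rewrite the literal "\x0a" escape sequences into real newlines inside the event
SEDCMD-unescape_x0a = s/\\x0a/\n/g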

 

Can I just check: you said you have the props/transforms on the indexer, but is this data sent from a UF or an HF? If it's an HF then you'll need to deploy them there too.


splunklearner
Communicator

@livehybrid I already checked the same via ChatGPT and applied it, but no luck.


livehybrid
SplunkTrust

Can you also confirm whether the data is coming from a UF? I saw you put the conf on the indexers, but if it's being sent from a Heavy Forwarder it will need to be there too.

Is this a regular monitor:// input?
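
For reference, I'm imagining something like the following on the UF - the path, sourcetype and index below are placeholders rather than your actual values:

# inputs.conf on the UF (placeholders only)
[monitor:///var/log/f5/asm.log]
sourcetype = abcd
index = your_index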


splunklearner
Communicator

UF it is. Not HF.


livehybrid
SplunkTrust

@splunklearner 

Hmm okay, it matches via https://regex101.com/r/ZZw8Lv/1, so it must be something else; I'll keep digging.

Did ChatGPT have any other suggestions?
