Log formatting not as expected

splunklearner
Communicator
Jun 26 13:46:12 128.23.84.166 [local0.err] <131>Jun 26 13:46:12 GBSDFA1AD011HMA.systems.uk.fed ASM:f5_asm=PROD

vs_name="/f5-tenant-01/XXXXXXXX"
violations="HTTP protocol compliance failed"
sub_violations="HTTP protocol compliance failed:Header name with no header value"
attack_type="HTTP Parser Attack"
violation_rating="3/5"
severity="Error"

support_id="XXXXXXXXX"
policy_name="/Common/waf-fed-transparent"
enforcement_action="none"

dest_ip_port="128.155.6.2:443"
ip_client="128.163.192.44"
x_forwarded_for_header_value="N/A"

method="POST"
uri="/auth-service/api/v2/token/refreshAccessToken"
microservice="N/A"
query_string="N/A"
response_code="500"

sig_cves="N/A"
sig_ids="N/A"
sig_names={N/A}
sig_set_names="N/A"
staged_sig_cves="N/A"
staged_sig_ids="N/A"
staged_sig_names="N/A"
staged_sig_set_names="N/A"

<?xml version='1.0' encoding='UTF-8'?>
<BAD_MSG>
<violation_masks>
<block>0-0-0-0</block>
<alarm>2400500004500-106200000003e-0-0</alarm>
<learn>0-0-0-0</learn>
<staging>0-0-0-0</staging>
</violation_masks>
<request-violations>
<violation>
<viol_index>14</viol_index>
<viol_name>VIOL_HTTP_PROTOCOL</viol_name>
<http_sanity_checks_status>2</http_sanity_checks_status>
<http_sub_violation_status>2</http_sub_violation_status>
<http_sub_violation>SGVhZGVyICdBdXRob3JpemF0aW9uJyBoYXMgbm8gdmFsdWU=</http_sub_violation>
</violation>
</request-violations>
</BAD_MSG>
Jul  3 11:12:48 128.168.189.4 [local0.err] <131>2025-07-03T11:12:48+00:00 nginxplus-nginx-ingress-controller-6947cb4744-hxwf5 ASM:Log_details\x0a\x0avs_name="14-cyberwasp-sv-busybox.ikp3001ynp.cloud.uk.fed:10-/"\x0aviolations="Attack signature detected"\x0asub_violations="N/A"\x0aattack_type="Cross Site Scripting (XSS)"\x0aviolation_rating="5/5"\x0aseverity="N/A"\x0a\x0asupport_id="14096019979554169061"\x0apolicy_name="waf-fed-enforced"\x0aenforcement_action="block"\x0a\x0adest_ip_port="0.0.0.0:443"\x0aip_client="128.175.220.223"\x0ax_forwarded_for_header_value="N/A"\x0a\x0amethod="GET"\x0auri="/"\x0amicroservice="N/A"\x0aquery_string="svanga=%3Cscript%3Ealert(1)%3C/script%3E%22"\x0aresponse_code="0"\x0a\x0asig_cves="N/A,N/A,N/A,N/A"\x0asig_ids="200001475,200000098,200001088,200101609"\x0asig_names={XSS script tag end (Parameter) (2),XSS script tag (Parameter),alert() (Parameter)...}\x0asig_set_names="{High Accuracy Signatures;Cross Site Scripting Signatures;Generic Detection Signatures (High Accuracy)},{High Accuracy Signatures;Cross Site Scripting Signatures;Generic Detection Signatures (High Accuracy)},{Cross Site Scripting Signatures}..."\x0astaged_sig_cves="N/A,N/A,N/A,N/A"\x0astaged_sig_ids="N/A"\x0astaged_sig_names="N/A"\x0astaged_sig_set_names="N/A"\x0a\x0a<?xml version='1.0' 
encoding='UTF-8'?><BAD_MSG><violation_masks><block>400500200500-1a01030000000032-0-0</block><alarm>20400500200500-1ef903400000003e-7400000000000000-0</alarm><learn>0-0-0-0</learn><staging>0-0-0-0</staging></violation_masks><request-violations><violation><viol_index>42</viol_index><viol_name>VIOL_ATTACK_SIGNATURE</viol_name><context>parameter</context><parameter_data><value_error/><enforcement_level>global</enforcement_level><name>c3Zhbmdh</name><value>PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0PiI=</value><location>query</location><expected_location></expected_location><is_base64_decoded>false</is_base64_decoded><param_name_pattern>*</param_name_pattern><staging>0</staging></parameter_data><staging>0</staging><sig_data><sig_id>200001475</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>8</offset><length>7</length></kw_data></sig_data><sig_data><sig_id>200000098</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>7</offset><length>7</length></kw_data></sig_data><sig_data><sig_id>200001088</sig_id><blocking_mask>2</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>15</offset><length>6</length></kw_data></sig_data><sig_data><sig_id>200101609</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>c3ZhbmdhPTxzY3JpcHQ+YWxlcnQoMSk8L3NjcmlwdD4i</buffer><offset>7</offset><length>25</length></kw_data></sig_data></violation></request-violations></BAD_MSG>

We have already onboarded some platform logs into Splunk, and the first sample above shows the format we get for them.

This is the props.conf we have written for them on the indexer:

[abcd]
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %b %d %H:%M:%S
SEDCMD-newline_remove = s/\\r\\n/\n/g
SEDCMD-formatxml =s/></>\n</g
LINE_BREAKER = ([\r\n]+)[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s
SHOULD_LINEMERGE = False
TRUNCATE = 10000

# Leaving PUNCT enabled can impact indexing performance. Customers can
# comment this line if they need to use PUNCT (e.g. security use cases)
ANNOTATE_PUNCT = false
 
props.conf on the search head:
 
[abcd]
REPORT-xml_kv_extract = bad_msg_xml, bad_msg_xml_kv
 
transforms.conf
 
[bad_msg_xml]
REGEX = (?ms)<BAD_MSG>(.*?)<\/BAD_MSG>
FORMAT = Bad_Msg_Xml::$1

[bad_msg_xml_kv]
SOURCE_KEY = Bad_Msg_Xml
REGEX = (?ms)<(\w*)>([^<]*)<\/\1>
FORMAT = $1::$2
MV_ADD = true
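
As a rough illustration of what the two transforms above do, here is a minimal Python sketch that applies the same two regexes in sequence. It only mimics the regex logic, not Splunk's search-time extraction itself; the function name `extract_bad_msg_fields` and the shortened sample event are made up for this example.

```python
import re

# Same patterns as the [bad_msg_xml] and [bad_msg_xml_kv] stanzas.
BAD_MSG_RE = re.compile(r"(?ms)<BAD_MSG>(.*?)<\/BAD_MSG>")
LEAF_KV_RE = re.compile(r"(?ms)<(\w*)>([^<]*)<\/\1>")


def extract_bad_msg_fields(raw_event):
    """Return {tag: [values]} for leaf elements inside <BAD_MSG>, or {} if absent."""
    outer = BAD_MSG_RE.search(raw_event)
    if outer is None:
        return {}
    fields = {}
    for tag, value in LEAF_KV_RE.findall(outer.group(1)):
        # Collect repeated tags into a list (a rough analogue of MV_ADD = true).
        fields.setdefault(tag, []).append(value)
    return fields


# Shortened, hypothetical excerpt of the first sample event.
sample = (
    'staged_sig_set_names="N/A"\n'
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<BAD_MSG>\n<violation_masks>\n<block>0-0-0-0</block>\n"
    "<staging>0-0-0-0</staging>\n</violation_masks>\n"
    "<request-violations>\n<violation>\n<viol_index>14</viol_index>\n"
    "<viol_name>VIOL_HTTP_PROTOCOL</viol_name>\n</violation>\n"
    "</request-violations>\n</BAD_MSG>"
)
fields = extract_bad_msg_fields(sample)
print(fields)
```

Only leaf elements are captured: a container such as `<violation_masks>` is skipped because `[^<]*` cannot cross a child tag, and `<request-violations>` never matches because `\w` excludes the hyphen.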
 
We are now applying the same logic to the raw data in the second sample above, and the result is not readable at all.

Sometimes a single event is split into multiple events. For example, response_code arrives as one event and method as another, which should not happen. Please help me with the props and transforms modifications. We need the data to be in the same format as the first sample.

PickleRick
SplunkTrust

This is... bad.

Firstly, it seems that this data has already been received by something else, embedded in another format, and then sent to Splunk.

Then secondly, these are completely different sourcetypes. So if you absolutely cannot separate them earlier, you should overwrite sourcetype on ingestion so that each of those types is parsed differently.
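
One visible difference between the two samples in the question: the first has real line breaks, while the second carries its line feeds as the literal text `\x0a` and keeps the XML on one line. The existing `SEDCMD-newline_remove` expression targets literal `\r\n` text, so it would not touch `\x0a`. Below is a minimal Python sketch of the kind of per-format normalization the second sample might need. It assumes `\x0a` really arrives as four literal characters, as it appears in the post. The function name `normalize_nginx_event` and the shortened event are hypothetical, and the sed-style equivalents in the comments are untested assumptions, not a verified Splunk config.

```python
import re


def normalize_nginx_event(raw):
    """Turn literal backslash-x0a text into real newlines, then put each XML tag on its own line."""
    with_newlines = raw.replace("\\x0a", "\n")   # roughly s/\\x0a/\n/g
    return re.sub(r"><", ">\n<", with_newlines)  # roughly s/></>\n</g


# Shortened, hypothetical excerpt of the second sample event.
raw = (
    'ASM:Log_details\\x0a\\x0amethod="GET"\\x0auri="/"\\x0aresponse_code="0"\\x0a\\x0a'
    "<BAD_MSG><violation_masks><learn>0-0-0-0</learn></violation_masks></BAD_MSG>"
)
lines = normalize_nginx_event(raw).split("\n")
for line in lines:
    print(line)
```

After this step the key="value" pairs and the XML tags sit on separate lines, which is the layout of the first sample.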


livehybrid
Super Champion

Hi @splunklearner 

Could this be an issue with the LINE_BREAKER? Try the following, which puts the date pattern inside a positive lookahead:

LINE_BREAKER=([\r\n]+)(?=[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)
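
To compare this with the original setting, here is a small Python sketch that emulates the documented LINE_BREAKER behaviour: the event boundary is the first capture group, and only that group is discarded. It is an approximation for experimenting with the regexes, not Splunk's actual parser. The helper `break_events` and the shortened two-event stream (with `host` standing in for the real hostnames) are made up for the example.

```python
import re

ORIGINAL = r"([\r\n]+)[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s"
LOOKAHEAD = r"([\r\n]+)(?=[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)"


def break_events(stream, line_breaker):
    """Split stream at each match, dropping only the text of capture group 1."""
    events, start = [], 0
    for m in re.finditer(line_breaker, stream):
        events.append(stream[start:m.start(1)])  # text before group 1 ends the event
        start = m.end(1)                         # text after group 1 starts the next
    events.append(stream[start:])
    return [e for e in events if e]


# Shortened stand-in for the two samples: real newlines in the first event,
# literal \x0a text in the second.
stream = (
    "Jun 26 13:46:12 128.23.84.166 [local0.err] <131>Jun 26 13:46:12 host ASM:f5_asm=PROD\n"
    'method="POST"\n'
    'response_code="500"\n'
    "Jul  3 11:12:48 128.168.189.4 [local0.err] <131>2025-07-03T11:12:48+00:00 host ASM:Log_details"
    '\\x0amethod="GET"\\x0aresponse_code="0"'
)

original_events = break_events(stream, ORIGINAL)
lookahead_events = break_events(stream, LOOKAHEAD)
print(len(original_events), original_events == lookahead_events)
```

Under this emulation both patterns break the stream at the same place, because the timestamp after the capture group stays in the next event either way. If that holds in Splunk too, the lookahead alone would not be expected to change the result.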

Can I just check: you said you have the props/transforms on the indexer. Is this data sent from a UF or an HF? If it's an HF, then you'll need to deploy them there too.

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing


splunklearner
Communicator

@livehybrid I already got the same suggestion from ChatGPT and applied it, but no luck.


livehybrid
Super Champion

Can you also confirm whether the data is coming from a UF? I saw you said the conf was on the indexers, but if it's being sent from a Heavy Forwarder it will need to be there too.

Is this a regular monitor:// input?


splunklearner
Communicator

UF it is. Not HF.


livehybrid
Super Champion

@splunklearner 

Hmm okay, it matches via https://regex101.com/r/ZZw8Lv/1 - must be something else, I'll keep digging.

Did ChatGPT have any other suggestions?
