Getting Data In

How to Splunk the SAP Security Audit Log

afx
Contributor

The original question already contained the answer, but a question cannot be marked as its own answer, so I moved the content into a second post that can be marked as the answer.

1 Solution

afx
Contributor

This is a write-up of my experiences trying to integrate the SAP Security Audit Log into Splunk without spending money and time getting third party adapters into SAP. I want to run this without touching SAP at all.

The information about the SAP audit log was taken from here:
https://blogs.sap.com/2014/12/11/analysis-and-recommended-settings-of-the-security-audit-log-sm19-sm...

First, you need to set up a splunk user ID on the SAP servers that can read the log files; typically it should be in the group sapsys.
You also need to know where the log files are written.

The SAP Security Audit Log is a weird beast: it is written in UTF-16 even though it only contains simple ASCII. Maybe SAP has a deal with disk manufacturers.

In the following, I assume a universal forwarder on the SAP server that is managed by a deployment server, indexers that are managed as cluster nodes, and that the basic infrastructure is already set up.

Forwarder Setup for SAP SAL:
props.conf:

[sap:sal]
CHARSET=UTF-16LE
NO_BINARY_CHECK=false
detect_trailing_nulls = false

inputs.conf:

[monitor:///sapmnt/AMP/audit/SAL/*/audit_a01amp_AMP_*_000001]
index = amp_sal
sourcetype = sap:sal

The forwarder already needs to know about the UTF-16LE encoding, otherwise you might get rather strange results.
The SAL has 200-character records and no proper line ends.
Therefore, we need to trick Splunk into seeing lines.
This works by using the report type indicator at the beginning (always 2 or 3 on our systems) and the two dummy 0 characters after date and time. The line breaker consumes the initial digit of the record, but this is usually not relevant for the analysis.
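To see what the line breaking does to a record, here is a minimal Python sketch that emulates the LINE_BREAKER pattern from the indexer props.conf on already decoded text. It assumes that the text matched by the first capturing group is discarded and the rest of the match starts the next event. The make_record helper and the space padding are made up for the demo, and text before the first match is ignored here.

```python
import re

# Same pattern as the LINE_BREAKER in the indexer props.conf.
LINE_BREAKER = re.compile(r"([23])[A-Z][A-Z][A-Z0-9]\d{14}00")

def split_records(stream):
    """Split decoded SAL text into events, dropping the captured digit."""
    matches = list(LINE_BREAKER.finditer(stream))
    events = []
    for i, m in enumerate(matches):
        start = m.end(1)  # the leading 2 or 3 is consumed by the breaker
        end = matches[i + 1].start(1) if i + 1 < len(matches) else len(stream)
        events.append(stream[start:end])
    return events

def make_record(msg_id, stamp):
    """Hypothetical helper: a space-padded 200-character record."""
    return ("2" + msg_id + stamp + "00").ljust(200, " ")

stream = make_record("AU1", "20200121000026") + make_record("AU3", "20200128092758")
events = split_records(stream)
print(len(events), [len(e) for e in events], events[0][:3])
```

Each event comes out 199 characters long and starts with the message ID, which is what the field splitting further down relies on.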

So now that we can send the records to the indexer, we need to help Splunk identify the fields.
The initial idea of using plain fixed-width fields never really worked; the results were unusable for analysis.
Thanks to this post: https://answers.splunk.com/answers/78772/fixed-width-data-streamsfield-extractions.html
I got the idea to split the records into delimited fields.

This needs to be done on the indexer.
Therefore, the props.conf file that gets pushed to the indexers looks similar to the one on the forwarder, with one crucial entry added: TRANSFORMS=add_separators.
The character set information is removed:

[sap:sal]
category = Custom
BREAK_ONLY_BEFORE_DATE =
LINE_BREAKER = ([23])[A-Z][A-Z][A-Z0-9]\d{14}00
TIME_PREFIX=\w{3}
TIME_FORMAT=%Y%m%d%H%M%S
MAX_TIMESTAMP_LOOKAHEAD = 14
SHOULD_LINEMERGE = false
TRANSFORMS=add_separators

The TRANSFORMS entry of course also requires a transforms.conf on the indexer. It looks like this:

[add_separators]
DEST_KEY=_raw
SOURCE_KEY=_raw
REGEX = ^(.{3})(.{8})(.{6})(\w\w)(.{5})(.{5})(.{2})(.{8})(.{12})(.{20})(.{40})(.{3})(.)(.{64})(.{20})
FORMAT=$1|$2|$3|$4|$5|$6|$7|$8|$9|$10|$11|$12|$13|$14|$15

This splits the incoming records into fields separated by pipe symbols, which gives us reliable field definitions.
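The REGEX can be checked outside Splunk. The following Python sketch applies the same expression to a synthetic 199-character record (200 minus the digit eaten by the line breaker) and joins the groups with pipes. The sample values come from an SM19 record posted further down in this thread; padding with spaces is an assumption made for the demo.

```python
import re

# REGEX copied from the add_separators stanza, split over two lines for readability.
REGEX = re.compile(
    r"^(.{3})(.{8})(.{6})(\w\w)(.{5})(.{5})(.{2})(.{8})(.{12})"
    r"(.{20})(.{40})(.{3})(.)(.{64})(.{20})"
)
FIELDS = ["message_id", "date", "time", "dummy", "process_id", "task",
          "proctype", "term", "user", "transaction", "app", "client",
          "sglmode", "message", "src"]
WIDTHS = [3, 8, 6, 2, 5, 5, 2, 8, 12, 20, 40, 3, 1, 64, 20]  # sums to 199

def add_separators(raw):
    """Return the record with pipes between the fixed-width fields."""
    m = REGEX.match(raw)
    return "|".join(m.groups()) if m else raw

values = ["AU3", "20200128", "092758", "00", "19661", "00010", "Da",
          "CSTS3PS2", "FF_ACCBASIS1", "DBACOCKPIT", "SAPMSYST", "800", "1",
          "DBACOCKPIT", "CSTS3PS20"]
raw = "".join(v.ljust(w) for v, w in zip(values, WIDTHS))  # space padding is assumed
piped = add_separators(raw)
parsed = dict(zip(FIELDS, (part.strip() for part in piped.split("|"))))
print(parsed["message_id"], parsed["user"], parsed["client"])
```

If a record is shorter than 199 characters, the expression does not match and the sketch returns the record unchanged.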

Now that we have the data in a parseable format, we can put together an app for the Search Head:
props.conf:

[sap:sal]
category = Custom
REPORT-SAP-Delim = REPORT-SAP-Delim

EXTRACT-SAL-A = ^.{128}(?<FA>.*?[^&\|]*)&*
EXTRACT-SAL-B = ^.{128}.*?&(?<FB>.*?[^&\|]*)&*
EXTRACT-SAL-C = ^.{128}.*?&.*?&(?<FC>.*?[^&\|]*)&*
EXTRACT-SAL-D = ^.{128}.*?&.*?&.*?&(?<FD>.*?[^\|]*)

LOOKUP-auto_sap_sm20 = sap_sm20 message_id AS message_id OUTPUTNEW audit_class AS sap_audit_class event_class AS sap_event_class message AS sap_message new_in_release AS sap_new_in_release

The fixed fields are defined through a transformation via REPORT-SAP-Delim.
In addition, the EXTRACT statements pull the dynamic subfields out of the message field.
Finally, the LOOKUP uses the message_id to provide some additional explanatory fields.
The table in the SAP blog post referenced above was used to create that lookup.
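The four EXTRACT expressions skip the first 128 characters (115 data characters plus 13 pipes) and then pick the parts of the message field between the & separators. Here is a Python sketch of the same expressions, with only the named-group syntax changed to Python's (?P<name>...). The sample event is synthetic; its values are partly taken from a record posted later in this thread and partly made up, and the final strip() is an addition for readability that Splunk does not perform.

```python
import re

# EXTRACT-SAL-A..D with Python named-group syntax.
EXTRACTS = {
    "FA": r"^.{128}(?P<FA>.*?[^&\|]*)&*",
    "FB": r"^.{128}.*?&(?P<FB>.*?[^&\|]*)&*",
    "FC": r"^.{128}.*?&.*?&(?P<FC>.*?[^&\|]*)&*",
    "FD": r"^.{128}.*?&.*?&.*?&(?P<FD>.*?[^\|]*)",
}
WIDTHS = [3, 8, 6, 2, 5, 5, 2, 8, 12, 20, 40, 3, 1, 64, 20]

def extract_subfields(event):
    """Return the dynamic message subfields found in a piped event."""
    found = {}
    for name, pattern in EXTRACTS.items():
        m = re.search(pattern, event)
        if m:
            found[name] = m.group(name).strip()  # strip() is not done by Splunk
    return found

# Synthetic piped event; client and sglmode are made-up values.
values = ["AU1", "20200121", "000026", "00", "04841", "00087", "B5", "",
          "CONTROLM_BTC", "RSBTCRTE", "", "100", "1", "B&0&A", ""]
event = "|".join(v.ljust(w) for v, w in zip(values, WIDTHS))
subfields = extract_subfields(event)
print(subfields)
```

With only two & separators in the message, FD does not match and is simply absent, which is why the report below guards each replacement.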

The fields are defined in transforms.conf:

[REPORT-SAP-Delim]
DELIMS = "|"
FIELDS = "message_id","date","time","dummy","process_id","task","proctype","term","user","transaction","app","client","sglmode","message","src"

[sap_sm20]
batch_index_query = 0
case_sensitive_match = 1
filename = SAP_SM20.csv

Now we finally have a splunkable SAP Security Audit Log.

I tried to use the four dynamic fields to fill in the placeholders in the messages via an EVAL statement in props.conf, but that will not work with the message coming out of a lookup.
On the other hand, I will probably only need this in a report, so it can be done there.
This is an example report to mimic a regular SAP report for client 100:

index=amp_sal client=100  
| eval Message=sap_message
| eval Message=replace(Message,"&A",if(FA!="",FA," ")) 
| eval Message=replace(Message,"&B",if(FB!="",FB," ")) 
| eval Message=replace(Message,"&C",if(FC!="",FC," ")) 
| eval Message=replace(Message,"&D",if(FD!="",FD," ")) 
| lookup dnslookup clientip as src OUTPUT clienthost as src_resolved
| fillnull src_resolved value="N/A" 
| convert timeformat="%Y-%m-%d %H:%M:%S" ctime(_time) as Timestamp 
| table Timestamp user client sap_event_class src_resolved transaction app message_id Message 
| rename client as Mandant src_resolved as Source user as User transaction as Transaction app as Program sap_event_class as "SAP Event Class"

This report can then easily be adapted to find failed logins (message_id IN (AU2,AU6,AUO,AUM,BUD)) or other relevant events.
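The chain of eval replace() calls in the report is plain placeholder substitution. A small Python sketch of the same logic follows; the message template is only an illustrative wording and is not copied from the lookup file.

```python
def fill_message(template, fa="", fb="", fc="", fd=""):
    """Replace &A..&D in order, using a blank when a subfield is empty."""
    message = template
    for placeholder, value in (("&A", fa), ("&B", fb), ("&C", fc), ("&D", fd)):
        message = message.replace(placeholder, value if value else " ")
    return message

template = "Logon successful (type=&A, method=&C)"  # illustrative template text
print(fill_message(template, fa="B", fc="A"))
# prints: Logon successful (type=B, method=A)
```

As in the SPL version, the replacements run one after another, so a subfield value that itself contains "&B" would be touched by a later step.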

Happy Splunking
afx


Dare2SplunkSAP
Explorer

Rather than developing this yourself, you can also look into PowerConnect. Not only are you able to pull in audit logs, but you'll also gain access to amazing performance-related dashboards out of the box!

0 Karma

truyennt07
Observer

I got this error in the splunkd log. Has anyone faced the same issue?

05-12-2022 19:16:56.499 +0200 INFO WatchedFile - Checksum for seekptr didn't match, will re-read entire file=

0 Karma

jbrocks
Communicator

Additional information I found out while parsing SAL version 2 logs:

The number at the beginning of an event (2 or 3) seems to indicate whether the log message continues in the next event (when the file path is too long for the field limit of 64 characters). This seems to work for a maximum of three events that need to be combined. If a path exceeds the limit of three events, the rest of it seems to be lost.

A "3" seems to indicate that the file path continues in the next message; a "2" indicates that the event ends after the record starting with "2", so this is where you want to have the line breaker. This leads to the following:

LINE_BREAKER = 2[A-Z][A-Z][A-Z0-9]\d{14}00.{180}()

After line breaking, events that span two or three records have to be put together using SEDCMD.
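To illustrate the grouping rule described above, here is a minimal Python sketch. It assumes 200-character records and relies only on the observed behaviour that a record starting with "3" continues and a record starting with "2" closes the event; none of this is documented by SAP. It only groups the records and does not merge the message text, since the layout of the continuation records is not described here. The rec helper is made up for the demo.

```python
RECORD_LEN = 200  # record length stated in the original post

def group_records(stream):
    """Group fixed-length records: '3' records continue, a '2' record ends the event."""
    events, current = [], []
    for pos in range(0, len(stream) - RECORD_LEN + 1, RECORD_LEN):
        record = stream[pos:pos + RECORD_LEN]
        current.append(record)
        if record[0] == "2":
            events.append(current)
            current = []
    if current:  # trailing '3' records without a closing '2' record
        events.append(current)
    return events

def rec(kind, label):
    """Hypothetical helper: a padded record starting with the indicator digit."""
    return (kind + label).ljust(RECORD_LEN, " ")

stream = rec("3", "part-a") + rec("3", "part-b") + rec("2", "part-c") + rec("2", "single")
groups = group_records(stream)
print([len(g) for g in groups])
```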

0 Karma

afx
Contributor

Cool, thanks a lot!

0 Karma

marceloalejandr
Path Finder

As a former Basis Admin, this is awesome.   Thanks for sharing this. 

0 Karma

amartin6
Path Finder

We have a NW 7.4 system with the following string from a raw file. I'm not sure yet which character set it is; I see some ASCII and some hex.

Is there a published SAP document that serves as the secret decoder from fields to characters? I'm wondering if I can figure out the "unknowns".

AU120200121000026000484100087B5010100000012CONTROLM_BTC00000008RSBTCRTE00000005B&0&A0088066D29467B7092265A9E7C2F59D2130412E4E9FD15ACB97A5B64B89FCC8372DC0035

Which from above can be parsed out as such:

"message id" AU1
"date" 20200121
"time" 000026
"dummy" 00
"process id" 04841
"task" 00087
"proctype" B5
"unknown 1" 010100000012
"user name" CONTROLM_BTC
"unknown 2" 00000008
"transaction" RSBTCRTE
"unknown 3" 00000005
"unknown 4" B&0&A0088066D29467B7092265A9E7C2F59D2130412E4E9FD15ACB97A5B64B89FCC8372DC0035

I'm missing the message field, which seems to be a duplicate of the transaction field. When looking at the message column through the SAP GUI, it is more granular than the transaction code itself.

0 Karma

afx
Contributor

Hi,
are you sure that this is the exact raw string of the file? It seems to be missing a digit before the AU1 message identifier (that digit gets eaten by Splunk when using my definition of the line breaker).
I suggest opening an unmodified SAP audit log file in an editor where you can toggle the encoding (like Notepad++ or Vim) and checking it there to see if my definition will work for it.
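If no such editor is at hand, the byte order can also be guessed from the raw bytes. A minimal Python sketch, assuming the file contains mostly ASCII text stored as UTF-16, so that every second byte is a NUL:

```python
def guess_utf16_endianness(sample: bytes) -> str:
    """Guess the byte order of mostly-ASCII UTF-16 data from where the NUL bytes sit."""
    if sample.startswith(b"\xff\xfe"):
        return "utf-16-le"  # byte order mark, little endian
    if sample.startswith(b"\xfe\xff"):
        return "utf-16-be"  # byte order mark, big endian
    even_nuls = sample[0::2].count(0)
    odd_nuls = sample[1::2].count(0)
    if odd_nuls > even_nuls:
        return "utf-16-le"  # ASCII byte first, NUL second
    if even_nuls > odd_nuls:
        return "utf-16-be"  # NUL first, ASCII byte second
    return "unknown"

print(guess_utf16_endianness("2AU1".encode("utf-16-le")))
print(guess_utf16_endianness("2AU1".encode("utf-16-be")))
```

For a real check, pass the first few hundred bytes of an audit file read in binary mode, for example open(path, "rb").read(400), where path is whatever location your SAP system writes to.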

The beginning of your "unknown 4" definitely looks like a SAP message, but it seems to have garbage at the end, as only the "B&0&A" part is used by AU1.
To me it looks like your SAP system is not nulling out buffers, but that should be extremely unlikely...

Unfortunately, there is no documentation apart from the referenced SAP Blog post about the format of the file.

cheers
afx

0 Karma

becksyboy
Contributor
0 Karma

becksyboy
Contributor

Thanks afx, I've attached a sample.

0 Karma

afx
Contributor

Strange, this seems to have a header. Not something I have seen in my logs.
As this is already a UTF-8 file and the records are not of fixed length, I wonder how it was generated. I assume that this is not an original, unmodified SAP audit file.
cheers
afx

0 Karma

becksyboy
Contributor

Thanks afx, I have gone back to the SAP admins for further info. Our inputs are picking this file up from an AIX host, and the files are named audit_YYYYMMDD_000001.

0 Karma

afx
Contributor

The filename sounds OK.
So I wonder, did they convert this for you for convenience, or why is it different?
After all, the SAL should be no different on AIX than on the Linux systems I get mine from.
cheers
afx

0 Karma

vanvan
Path Finder

Awesome post, thank you!

0 Karma

becksyboy
Contributor

Thanks @afx, this worked out great for us when it came to onboarding SAP NetWeaver versions 7.31/7.41, which had the SM19 security configuration. We had to use CHARSET=UTF-16BE for one source and CHARSET=UTF-16LE for the other to see the logs in Splunk, plus a slightly different LINE_BREAKER and TIME_PREFIX.

As you did, we used the table from the blog post to create the lookup.

We also need to onboard SAP NetWeaver version 750, which has the RSAU security configuration. Have you come across this version before, or do you know of a similar table to use, like the one from the blog post?

thanks

0 Karma

afx
Contributor

Hi becksyboy,
glad this was helpful!
We do run 750 here and the blog post from the SAP guy references the added SAL events, so that should work fine.
I am curious, do you know why you have UTF-16BE in one log?
And what did you have to use for LINE_BREAKER and TIME_PREFIX?

have a nice weekend
afx

0 Karma

becksyboy
Contributor

Hi afx, it was strange. It was specifically for SAP NetWeaver version 731: when we tried to view the raw log in Notepad++ it was unreadable, and when it came into Splunk it was gibberish.

I played around with a few decoding options and UTF-16BE seemed to work.

Apologies, my mistake! We did use your suggested LINE_BREAKER and TIME_PREFIX, which worked perfectly! (I'm working on many onboarding tasks, so I mixed up my config :))

Below is an example of the 750/RSAU logs, whose formatting we find differs from the 7.31/7.41 ones.
5SAL_SAP_19720607_000000_FFFFFFFFFFFF18F43420D2543289875A6507ACD3B9DDA6DC7F5C58D7327CE17A6747698BBED60035AUW20200128000002009175300111B6000100000006SAPSYS00000008RSBTCRTE000000000009RSBTCRTE&009019A8A9C1C68EAE2D5AEFC55F369A4A89B74F648AAAA8DEE58BC0A6F9FAC4ED2D0035AUW20200128000003008061100109B6000100000006SAPSYS00000008RSBTCRTE000000000009RSBTCRTE&0090746CBE5A98425B4DEAD04A50436321120C49F2826F3B755C40B5A5AEDE61300D0035AUW20200128000.................

I initially tried to line break with the following, but no luck yet:
LINE_BREAKER = [A-Z]{3}\d{4}\d{2}\d{2}\d{6}

thanks

0 Karma

afx
Contributor

Interesting, so no clear reason why BE...

cheers
afx

0 Karma

becksyboy
Contributor

Hi afx, I'm trying to rewrite your props/transforms for the RSAU logs we get, as this format is different from the SM19 one. As you can see below, there are some fields which we set to unknown. Have you managed to do something for RSAU?

Also, regarding your EXTRACT statements that get the dynamic subfields of the message field: do you know which positions they are at in the RSAU logs?

Could it be that our RSAU logs are littered with too much garbage?

SM19

AU3|20200128|092758|00|19661|00010|Da|CSTS3PS2|FF_ACCBASIS1|DBACOCKPIT|SAPMSYST |800|1|DBACOCKPIT|CSTS3PS20

"message_id", AU3
"date", 20200128
"time", 092758
"dummy", 00
"process_id", 19661
"task", 00010
"proctype", Da
"term", CSTS3PS2
"user", FF_ACCBASIS1
"transaction", DBACOCKPIT
"app", SAPMSYST
"client", 800
"sglmode", 1
"message", DBACOCKPIT
"src", CSTS3PS20

RSAU

AUW|20200205|000002|00|91753|00111|B6|000100000006|SAPSYS|00000008|RSBTCRTE|000000000009|RSBTCRTE|&009019A8A9C1C68EAE2D5AEFC55F369A4A89B74F648AAAA8DEE58BC0A6F9FAC4ED2D0035

"message_id", AUW
"date", 20200205
"time", 000002
"dummy", 00
"process_id", 91753
"task", 00111
"proctype", B6
"unknown1", 000100000006
"app", SAPSYS
"unknown2", 00000008
"transaction", RSBTCRTE
"unknown3", 000000000009
"message", RSBTCRTE
"unknown4", &009019A8A9C1C68EAE2D5AEFC55F369A4A89B74F648AAAA8DEE58BC0A6F9FAC4ED2D0035

thanks

0 Karma

afx
Contributor

OK, this is different...
When our systems were migrated, the audit log format stayed the same.
Is this a fresh system with 750 or a migrated one?
As I only have a few boxes, I do not have many systems to compare with.

Do you still have 200-character records? Check with Notepad++ or similar.
Judging from your example, the spare characters for the line break still seem to be there, so if only the record length is new, the configuration could easily be adjusted for it.
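The record length can also be measured with a few lines of Python instead of counting in an editor. The sketch below assumes the text is already decoded and that every record contains a message ID followed by a 14-digit timestamp and "00". The leading 2 or 3 is deliberately left out of the pattern so that it also fits samples without it. The sample data is synthetic.

```python
import re
from collections import Counter

# Message ID, 14-digit date and time, two dummy zeros.
RECORD_START = re.compile(r"[A-Z][A-Z][A-Z0-9]\d{14}00")

def record_gaps(text):
    """Count the distances between consecutive record starts."""
    offsets = [m.start() for m in RECORD_START.finditer(text)]
    return Counter(b - a for a, b in zip(offsets, offsets[1:]))

# Three synthetic 200-character records padded with spaces.
record = ("2AU1" + "20200121000026" + "00").ljust(200, " ")
gaps = record_gaps(record * 3)
print(gaps.most_common(1))
```

The most common gap is the likely record length. Occasional other values would point to false matches inside the data or to records of varying length.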

cheers
afx

0 Karma