Dashboards & Visualizations

Windows command line parsing (Aspera ascp.exe)

mitag
Contributor

I need to extract the source and target path fields from logged command lines for Aspera SCP (ascp), part of the IBM Aspera file transfer service. The command lines are logged in events such as these:

 

C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 50000 -m 25000 -k 2 -i C:\Users\asperaadmin\.ssh\asperaweb_id_dsa.openssh -O 33001 -P 33001 --ignore-host-key --mode=send --user=xferuser --host=ats-aws-us-whatever.com Z:\Content\metadata\somefile.xml /we-shall-anonimyze-this-one-too-b2c09898392f
C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 300000 -m 10000 -k 2 -O 33001 -P 22 --ignore-host-key Z:\Source Files\Content\image.jpg username@target.host.net:/target/path/directory 

 

Note that some flags take a filename argument, and those filenames may contain spaces; the source and target filenames may contain spaces as well.
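
One possible way to cope with a flag argument that contains spaces is a lazy match that stops at the next flag or the next drive-letter path. The sketch below prototypes that idea in Python rather than SPL, so it can be tested outside Splunk. `KEY_RE` and `find_key_file` are hypothetical names, and the pattern assumes the `-i` argument is a Windows path starting with a drive letter. It uses the same `(?P<name>...)` group syntax as the rex statements in this thread, but it has not been run in Splunk.

```python
import re

# Hypothetical pattern: capture the argument of -i even if it contains spaces.
# Assumption: the argument is a Windows path that starts with a drive letter.
# The lazy .+? stops before the next flag (-x / --xyz), the next drive-letter
# path, or the end of the line.
KEY_RE = re.compile(
    r"(?:^|\s)-i\s+(?P<private_key_file>[A-Za-z]:\\.+?)"
    r"(?=\s+(?:-{1,2}[A-Za-z]|[A-Za-z]:\\)|\s*$)"
)

def find_key_file(command_line):
    """Return the -i argument, or None when the flag is absent."""
    match = KEY_RE.search(command_line)
    return match.group("private_key_file") if match else None
```

A filename that itself contains " -x" or " C:\" would be cut short by this pattern, so treat it as a starting point only.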

The commands follow Aspera Command Reference:

 

ascp options [[user@]srcHost:]source_file1[,source_file2,...] [[user@]destHost:]target_path
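
The `[[user@]host:]path` part of that syntax can be split with one pattern in which the user and host groups are optional. Below is a minimal Python sketch of that idea, meant for prototyping the regex outside Splunk. `OPERAND_RE` and `split_operand` are hypothetical names, and the rule that a host has at least two characters (so that `Z:` is read as a drive, not a host) is an assumption, not something taken from the Aspera reference.

```python
import re

# Hypothetical pattern for one operand of the form [[user@]host:]path.
# Assumption: a host is at least two characters long and contains no slash
# or backslash, so a drive letter such as "Z:" is never taken for a host.
OPERAND_RE = re.compile(
    r"^(?:(?:(?P<user>[^@:\s]+)@)?(?P<host>[^:@\s/\\]{2,}):)?(?P<path>.*)$"
)

def split_operand(operand):
    """Return a dict with user, host and path; missing parts are None."""
    return OPERAND_RE.match(operand).groupdict()
```

For `username@target.host.net:/target/path/directory` this yields all three parts, while a plain path such as `Z:\Source Files\Content\image.jpg` comes back with `user` and `host` unset.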

 

Questions:

  • What is the best mechanism to extract fields such as username, hostnames and filenames for both the source and target parts of the above events / commands, and optionally the command flags, at search time? A single complex rex statement with optional groups, given that some of the fields are optional? Multiple simpler rex statements?
  • I would appreciate SPL that extracts these fields and works for the two events above.

P.S. I tried writing several rex statements to extract the ascp filename, skip most flags, then extract one source filename and finally the target user, host and path for the two events above, and got stuck: my rex statements step on each other, so this does not seem to be the best mechanism. The code below isn't working properly.

 

| rex field=event_message "^(?P<program_path_win>\w\:\\\.*?\\\(?P<program_module_win>[^\\\]+\.\S+))\s+(?P<event_msg_tail>.+)$"

| rex field=event_msg_tail ".+\s+(?P<file_path_win>\w\:\\\.*?\\\(?P<file_name_win>[^\\\]+\.\S+))\s+(?P<Destination>((?P<peer_userID>.+)\@)(?P<peer_host>\S+))\:|s+(?P<peer_dir>.*)?$"

| rex field=event_msg_tail "--host=(?P<peer_host>\S+)\s+"
| rex field=event_msg_tail "--user=(?P<peer_user>\S+)\s+"

| rex field=event_msg_tail ".+\s+(?P<file_path_win>\w\:\\\.*?\\\(?P<file_name_win>[^\\\]+\.\S+))\s+(?P<Destination>((?P<peer_userID>.+)\@)(?P<peer_host>\S+))\:|s+(?P<peer_dir>.*)?$"

| rex field=event_msg_tail "-i\s+(?P<private_key_file>\w\:\\\.+?)\s+(?:-\w+\s+|--\w+=|$)"
| rex field=event_msg_tail ".+\s+(?P<file_path_win>\w\:\\\.*?\\\(?P<file_name_win>[^\\\]+\.\S+))\s+(?P<peer_dir>.*)?$"


| eval peer_dir = coalesce(peer_dir, "")
| eval peer_host = coalesce(peer_host, "")
| eval peer_user = coalesce(peer_user, peer_userID, "")
| eval agent = coalesce(program_module_win, "")

| table _time host log_level component agent peer_host peer_user peer_userID peer_dir

 

Thanks!

P.P.S. I inadvertently posted this to the wrong section ("Dashboards & Visualizations" rather than "Search") and don't see an option to move the post. Would someone kindly move it, or tell me how?

1 Solution

to4kawa
Ultra Champion
index=_internal | head 1 | fields _raw | eval _raw="C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 50000 -m 25000 -k 2 -i C:\Users\asperaadmin\.ssh\asperaweb_id_dsa.openssh -O 33001 -P 33001 --ignore-host-key --mode=send --user=xferuser --host=ats-aws-us-whatever.com Z:\Content\metadata\somefile.xml /we-shall-anonimyze-this-one-too-b2c09898392f"
| appendpipe [eval _raw="C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 300000 -m 10000 -k 2 -O 33001 -P 22 --ignore-host-key Z:\Source Files\Content\image.jpg username@target.host.net:/target/path/directory"]
| rex "ascp.exe.*\s--\S+\s(?P<source_files>[A-Z]\:.*)\s(?P<target_path>.*)$"
| rex field=source_files "((?<user>[^@]+(?=@))(?:@))?(?<source_host>[^:]+(?=\:\/))?.*"
| rex field=target_path "((?<user>[^@]+(?=@))(?:@))?(?<target_host>[^:]+(?=\:))?.*"

Regular expressions can't work well without accurate samples, right?


mitag
Contributor

Well played 🙂 Removing "--ignore-host-key" trips it though: "-flag" and "--options" aren't required parts of the command. E.g. the SPL should work for these two, as well:

 

ascp.exe -T -Q -l 300000 Z:\Source Files\Content\image2.jpg /target/path/directory2

 

... and something like:

 

ascp.exe -T -Q -l 300000 -i C:\Users\asperaadmin\.ssh\asperaweb_id_dsa.openssh Z:\Source Files\Content\image2.jpg /target/path/directory2

 

 I.e. the only certainty is this:

ascp options [[user@]srcHost:]source_file1[,source_file2,...] [[user@]destHost:]target_path
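
Since only the trailing `source target` part is certain, one option is to anchor on the end of the line instead of on any flag. The Python sketch below prototypes such a pattern outside Splunk. `CMD_RE` and `split_source_target` are hypothetical names, and it assumes the source is a local Windows path starting with a drive letter and the target contains no spaces. Both assumptions hold for the four sample lines in this thread but are not guaranteed by the ascp syntax.

```python
import re

# Hypothetical pattern: take the last drive-letter path on the line as the
# source and the final whitespace-free token as the target.
# Assumptions: the source is a local Windows path, the target has no spaces.
CMD_RE = re.compile(
    r"^.*\s(?P<source_files>[A-Za-z]:\\.*?)\s+(?P<target_path>\S+)\s*$"
)

def split_source_target(command_line):
    """Return (source_files, target_path), or None if the line does not fit."""
    match = CMD_RE.match(command_line)
    if match is None:
        return None
    return match.group("source_files"), match.group("target_path")
```

The greedy `^.*\s` prefix is what skips earlier drive-letter paths such as the `-i` key file, so no flag has to be present for the match to succeed. A source given as `user@host:path` would need a different pattern.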