Dashboards & Visualizations

Windows command line parsing (Aspera ascp.exe)

mitag
Contributor

I need to extract the source and target path fields from logged command lines for an application called Aspera SCP, part of the IBM Aspera file transfer service. The command lines are logged in events such as these:

 

C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 50000 -m 25000 -k 2 -i C:\Users\asperaadmin\.ssh\asperaweb_id_dsa.openssh -O 33001 -P 33001 --ignore-host-key --mode=send --user=xferuser --host=ats-aws-us-whatever.com Z:\Content\metadata\somefile.xml /we-shall-anonimyze-this-one-too-b2c09898392f
C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 300000 -m 10000 -k 2 -O 33001 -P 22 --ignore-host-key Z:\Source Files\Content\image.jpg username@target.host.net:/target/path/directory 

 

Note that some flags are followed by filenames, which may contain spaces; the source and target filenames may contain spaces as well.

The commands follow Aspera Command Reference:

 

ascp options [[user@]srcHost:]source_file1[,source_file2,...] [[user@]destHost:]target_path
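
To make the endpoint part of that syntax concrete, here is a minimal sketch in Python (used only because it is easy to run outside Splunk) that splits a single `[[user@]host:]path` argument into its pieces. The function name `parse_endpoint` is hypothetical, and the rule that a single letter followed by `:\` or `:/` is a Windows drive rather than a host name is an assumption based on the sample events, not something the Aspera reference states.

```python
import re

# Endpoint grammar taken from the reference line: [[user@]host:]path
# Assumption: "X:\" or "X:/" (one letter) is a Windows drive, never a host.
ENDPOINT = re.compile(
    r"^(?:(?:(?P<user>[^@\s:]+)@)?"               # optional "user@"
    r"(?P<host>(?![A-Za-z]:[\\/])[^:@\s]+):)?"    # optional "host:", not a drive letter
    r"(?P<path>.+)$"                              # the rest is the path
)


def parse_endpoint(text):
    """Return a dict with user, host and path (user/host are None when absent)."""
    match = ENDPOINT.match(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_endpoint("username@target.host.net:/target/path/directory"))
    print(parse_endpoint(r"Z:\Source Files\Content\image.jpg"))
```

The pattern uses only named groups and a negative lookahead, which PCRE also supports, so it could probably be carried into a `rex` statement; the backslash escaping inside a quoted SPL string would need to be checked separately.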

 

Questions:

  • What is the best mechanism to extract fields such as username, hostnames and filenames for both the source and target parts of the above events / commands, and optionally the command flags, at search time? A single complex rex statement with optional groups, given that some of the fields are optional? Or multiple simpler rex statements?
  • I would appreciate SPL that extracts these fields and works for the two events above.

P.S. I tried writing several rex statements to extract the ascp filename, ignore most flags, then extract one source filename, and finally the target user, host and path for the above two events - and got stuck. My multiple rex statements are stepping on each other and thus do not seem to be the best mechanism. The code below isn't working properly.

 

| rex field=event_message "^(?P<program_path_win>\w\:\\\.*?\\\(?P<program_module_win>[^\\\]+\.\S+))\s+(?P<event_msg_tail>.+)$"

| rex field=event_msg_tail ".+\s+(?P<file_path_win>\w\:\\\.*?\\\(?P<file_name_win>[^\\\]+\.\S+))\s+(?P<Destination>((?P<peer_userID>.+)\@)(?P<peer_host>\S+))\:|s+(?P<peer_dir>.*)?$"

| rex field=event_msg_tail "--host=(?P<peer_host>\S+)\s+"
| rex field=event_msg_tail "--user=(?P<peer_user>\S+)\s+"

| rex field=event_msg_tail ".+\s+(?P<file_path_win>\w\:\\\.*?\\\(?P<file_name_win>[^\\\]+\.\S+))\s+(?P<Destination>((?P<peer_userID>.+)\@)(?P<peer_host>\S+))\:|s+(?P<peer_dir>.*)?$"

| rex field=event_msg_tail "-i\s+(?P<private_key_file>\w\:\\\.+?)\s+(?:-\w+\s+|--\w+=|$)"
| rex field=event_msg_tail ".+\s+(?P<file_path_win>\w\:\\\.*?\\\(?P<file_name_win>[^\\\]+\.\S+))\s+(?P<peer_dir>.*)?$"


| eval peer_dir = coalesce(peer_dir, "")
| eval peer_host = coalesce(peer_host, "")
| eval peer_user = coalesce(peer_user, peer_userID, "")
| eval agent = coalesce(program_module_win, "")

| table _time host log_level component agent peer_host peer_user peer_userID peer_dir

 

Thanks!

P.S. I inadvertently posted this to the wrong section - to "Dashboards & Visualizations" rather than "search" - but don't see an option to move the post. Would someone please kindly move it - or tell me how?

1 Solution

to4kawa
Ultra Champion
index=_internal | head 1 | fields _raw | eval _raw="C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 50000 -m 25000 -k 2 -i C:\Users\asperaadmin\.ssh\asperaweb_id_dsa.openssh -O 33001 -P 33001 --ignore-host-key --mode=send --user=xferuser --host=ats-aws-us-whatever.com Z:\Content\metadata\somefile.xml /we-shall-anonimyze-this-one-too-b2c09898392f"
| appendpipe [eval _raw="C:\Program Files\Aspera\Enterprise Server\bin\ascp.exe -T -Q -d -l 300000 -m 10000 -k 2 -O 33001 -P 22 --ignore-host-key Z:\Source Files\Content\image.jpg username@target.host.net:/target/path/directory"]
| rex "ascp.exe.*\s--\S+\s(?P<source_files>[A-Z]\:.*)\s(?P<target_path>.*)$"
| rex field=source_files "((?<user>[^@]+(?=@))(?:@))?(?<source_host>[^:]+(?=\:\/))?.*"
| rex field=target_path "((?<user>[^@]+(?=@))(?:@))?(?<target_host>[^:]+(?=\:))?.*"

Regular expressions can't work well without accurate samples, right?


mitag
Contributor

Well played 🙂 Removing "--ignore-host-key" trips it though: "-flag" and "--options" aren't required parts of the command. E.g. the SPL should work for these two, as well:

 

ascp.exe -T -Q -l 300000 Z:\Source Files\Content\image2.jpg /target/path/directory2

 

... and something like:

 

ascp.exe -T -Q -l 300000 -i C:\Users\asperaadmin\.ssh\asperaweb_id_dsa.openssh Z:\Source Files\Content\image2.jpg /target/path/directory2

 

 I.e. the only certainty is this:

ascp options [[user@]srcHost:]source_file1[,source_file2,...] [[user@]destHost:]target_path
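
Since only that syntax is guaranteed, one possible approach is to stop anchoring on a `--option` and instead work from the right-hand end of the line. The following is a minimal sketch in Python, not a tested Splunk search. It assumes (based only on the four sample commands in this thread) that the source is a local Windows path starting with a drive letter and that the target contains no whitespace. The function name `split_ascp_command` and the result keys are hypothetical.

```python
import re

# Assumptions (from the samples, not from the Aspera reference):
#   * the source starts at the LAST " X:\" drive-letter token on the line
#   * the target is the last whitespace-delimited token
COMMAND = re.compile(
    r"^(?P<head>.*)\s(?P<source>[A-Za-z]:\\.*\S)\s+(?P<target>\S+)\s*$"
)
# Everything up to and including ascp.exe is the program; the rest is options.
HEAD = re.compile(r"^(?P<program>.*?ascp\.exe)\s*(?P<options>.*)$", re.IGNORECASE)
FLAG_USER = re.compile(r"--user=(?P<value>\S+)")
FLAG_HOST = re.compile(r"--host=(?P<value>\S+)")


def split_ascp_command(line):
    """Split a logged ascp command line into program, options, source, target."""
    command = COMMAND.match(line)
    if not command:
        return None
    head = HEAD.match(command.group("head"))
    if not head:
        return None
    options = head.group("options")
    user = FLAG_USER.search(options)
    host = FLAG_HOST.search(options)
    return {
        "program": head.group("program"),
        "options": options,
        "source": command.group("source"),
        "target": command.group("target"),
        "flag_user": user.group("value") if user else None,
        "flag_host": host.group("value") if host else None,
    }


if __name__ == "__main__":
    sample = (r"ascp.exe -T -Q -l 300000 -i C:\Users\asperaadmin\.ssh"
              r"\asperaweb_id_dsa.openssh Z:\Source Files\Content\image2.jpg"
              r" /target/path/directory2")
    print(split_ascp_command(sample))
```

The greedy `.*` in `head` is what pushes the source to the last drive-letter path, so a key file given with `-i C:\...` stays in the options. The sketch would break on a remote or non-Windows source, on several space-separated sources, or on a target containing spaces. The patterns avoid Python-only syntax, so they may port to `rex`, with the caveat that backslash escaping in quoted SPL strings behaves differently.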