Splunk Search

extracting fields from another field

martinnepolean
Explorer

Hi,

We are receiving the event in json format and given the _raw event below. I am trying to extract the fields in search time through props and transforms from a particular field but it is not working

_raw event

[{"command":"gitly-upload-pack tcp://prod-gitly-primary.domain.com:80 {\"repository\":{\"storage_name\":\"default\",\"relative_path\":\"infrastructure/app-config-iam-lint-rules.git\",\"git_object_directory\":\"\",\"git_alternate_object_directories\":[],\"gl_repository\":\"project-139\",\"gl_project_path\":\"infrastructure/app-config-iam-lint-rules\"},\"gl_repository\":\"project-139\",\"gl_project_path\":\"infrastructure/app-config-iam-lint-rules\",\"gl_id\":\"key-id\",\"gl_username\":\"uname\",\"git_config_options\":[],\"git_protocol\":null}","user":"user with id key-7260","pid":6055,"level":"info","msg":"executing git command","time":"2020-02-14T11:23:34+00:00","instance_id":"instanceid","instance_type":"m5.4xlarge","az":"us-east-1b","private_ip":"x.x.x.x","vpc_id":"vpc-id","ami_id":"ami-id","account_id":"12345","vpc":"infra-vpc","log_env":"prod","fluent_added_timestamp":"2020-02-14T11:23:36.397+0000","@timestamp":"2020-02-14T11:23:36.397+0000","SOURCE_REALTIME_TIMESTAMP":"1581679416397075","MESSAGE":"executing git command"}

Below is the value assigned to command field and i am trying to split into multiple fields,

gitly-upload-pack tcp://prod-gitly-primary.domain.com:80 {"repository":{"storage_name":"default","relative_path":"infrastructure/app-config-iam-lint-rules.git","git_object_directory":"","git_alternate_object_directories":[],"gl_repository":"project-139","gl_project_path":"infrastructure/app-config-iam-lint-rules"},"gl_repository":"project-139","gl_project_path":"infrastructure/app-config-iam-lint-rules","gl_id":"key-id","gl_username":"uname","git_config_options":[],"git_protocol":null}

It is extracted as expected through rex search cmd. **searchquery | rex field=command "^(?<git_command>[^\s]+)\s(?<git_url>[^\s]+)\s(?<git_json>.*)" | spath input=git_json**

i am trying to put it through props and transforms but not working

[sourcetype]
REPORT-command = morefields_from_command

[morefields_from_command]
kv_mode = json
SOURCE_KEY = command
REGEX = (?<git_command>\S+)\s(?<git_url>\S+)\s(?<git_json>.*)

my requirement is

git_command = gitly-upload-pack 
git-url = tcp://prod-gitly-primary.domain.com:80
git_json = {"repository":{"storage_name":"default","relative_path":"infrastructure/app-config-iam-lint-rules.git","git_object_directory":"","git_alternate_object_directories":[],"gl_repository":"project-139","gl_project_path":"infrastructure/app-config-iam-lint-rules"},"gl_repository":"project-139","gl_project_path":"infrastructure/app-config-iam-lint-rules","gl_id":"key-id","gl_username":"uname","git_config_options":[],"git_protocol":null}

once this done, then i have again split it from git_json as below

storage_name = default
relative_path=infrastructure/app-config-iam-lint-rules.git
..
..
..
git_protocol= null

0 Karma
1 Solution

to4kawa
Ultra Champion

props.conf(sourcetype=git_json)

[git_json]
EXTRACT-git_command = (?:command\":\")(?P<git_command>\S+)
EXTRACT-git_json = (?:\s)(?P<git_json>{.*})
EXTRACT-git_url = (?P<git_url>tcp:\S+)
REPORT-git_json = git_json
SHOULD_LINEMERGE = 0
TIME_FORMAT = %FT%T.%3Q%:z
TIME_PREFIX = @timestamp\":\"
TZ = UTC
pulldown_type = 1

transforms.conf

[git_json]
CLEAN_KEYS = 0
FORMAT = $1::$2
MV_ADD = 1
REGEX = \"([^\"]+?)\":(?:\"|\{|\[)?([^\"]*)
SOURCE_KEY = git_json

I try this props.conf. if _raw is your sample, it is OK.

View solution in original post

0 Karma

to4kawa
Ultra Champion

props.conf(sourcetype=git_json)

[git_json]
EXTRACT-git_command = (?:command\":\")(?P<git_command>\S+)
EXTRACT-git_json = (?:\s)(?P<git_json>{.*})
EXTRACT-git_url = (?P<git_url>tcp:\S+)
REPORT-git_json = git_json
SHOULD_LINEMERGE = 0
TIME_FORMAT = %FT%T.%3Q%:z
TIME_PREFIX = @timestamp\":\"
TZ = UTC
pulldown_type = 1

transforms.conf

[git_json]
CLEAN_KEYS = 0
FORMAT = $1::$2
MV_ADD = 1
REGEX = \"([^\"]+?)\":(?:\"|\{|\[)?([^\"]*)
SOURCE_KEY = git_json

I try this props.conf. if _raw is your sample, it is OK.

0 Karma

oscar84x
Contributor

Hi @martinnepolean,

Try adding this to your props.conf. Include everything, up to "in command". That's part of the code.

[sourcetype]
EXTRACT-morefields = (?<git_command>\S+)\s(?<git_url>\S+)\s(?<git_json>.*) in command
0 Karma

martinnepolean
Explorer

Hi @oscar84x ,

added till command,It is not working

0 Karma

oscar84x
Contributor

OK. Back to your original method, the documentation for transforms.conf specifies the following format: SOURCE_KEY = field:command. Could you try modifying that?

0 Karma

martinnepolean
Explorer

Tried it already, no luck

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...