Good morning All,
I have several logs with fields that contain subfields. I would like to extract each subfield and append its name to the parent's. An example should clarify my question. I have a log of user modifications. The log looks something like this:
Before we delve into SPL, I want to ask if you have any influence on the developers of this application. They put so much energy into crafting a seemingly rigid log format. With that energy, why don't they just give you compliant JSON? That would easily satisfy your structural desire to have a parent node and child nodes. For example,
{"Changed Attributes":
{
"SAM Account Name": "-",
"Display Name": "-",
"User Principal Name": "-",
"Home Directory": "-",
"Home Drive": "-",
"Script Path": "-",
"Profile Path": "-",
"User Workstations": "-",
"Password Last Set": "9/12/2023 7:30:15 AM",
"Account Expires": "-",
"Primary Group ID": "-",
"AllowedToDelegateTo": "-",
"Old UAC Value": "-",
"New UAC Value": "-",
"User Account Control": "-",
"User Parameters": "-",
"SID History": "-",
"Logon Hours": "-"
}
}
And this sample will give the following fields:
field name | field value |
Changed Attributes.Account Expires | - |
Changed Attributes.AllowedToDelegateTo | - |
Changed Attributes.Display Name | - |
Changed Attributes.Home Directory | - |
Changed Attributes.Home Drive | - |
Changed Attributes.Logon Hours | - |
Changed Attributes.New UAC Value | - |
Changed Attributes.Old UAC Value | - |
Changed Attributes.Password Last Set | 9/12/2023 7:30:15 AM |
Changed Attributes.Primary Group ID | - |
Changed Attributes.Profile Path | - |
Changed Attributes.SAM Account Name | - |
Changed Attributes.SID History | - |
Changed Attributes.Script Path | - |
Changed Attributes.User Account Control | - |
Changed Attributes.User Parameters | - |
Changed Attributes.User Principal Name | - |
Changed Attributes.User Workstations | - |
I believe this satisfies your structural requirement.
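To make the parent.child naming concrete, here is a minimal Python sketch of that kind of flattening. It is only an illustration of the idea, not how Splunk implements spath; the helper name `flatten` is made up for this example, and the event is cut down to two attributes.

```python
import json

def flatten(node, prefix=""):
    """Flatten nested dicts into {"parent.child": value} pairs (hypothetical helper)."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into child nodes, carrying the parent name along
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

# Two attributes taken from the JSON sample above
event = json.loads(
    '{"Changed Attributes": {"Display Name": "-", '
    '"Password Last Set": "9/12/2023 7:30:15 AM"}}'
)
fields = flatten(event)
for name in sorted(fields):
    print(f"{name} | {fields[name]}")
```

Each leaf ends up under a name built from its parent and its own key, which is the structure you asked for.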
If you have absolutely no influence, AND the developers are so disciplined that they will never make even tiny changes to the log format, I have to ask: how do you expect Splunk to identify "Changed Attributes:" as a parent node? Is it merely by leading space in the line? Such a criterion is extremely fragile. Additionally, how many different parent nodes can there be? Is the illustration the entirety of the log or just a portion of it? Unless you can clearly describe the data characteristics, there is no way to give a meaningful solution.
Now, if the illustration is the entirety of the log, and your developers are extremely religious about spaces and swear on their souls never to change them, you can use these characteristics to derive the information you need. One such method is to convert the free-hand string to compliant JSON, then use spath to extract and flatten the structure.
| rex max_match=0 mode=sed "s/^/{\"/ s/ / \"/g s/: /\"&\"/g s/
/\",
/g s/:\",/\":
{/ s/$/\"
}
}/"
| spath
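If you want to sanity-check the text-to-JSON idea outside Splunk, the same logic can be sketched in Python. This is a rough sketch under one loud assumption: a line with nothing after its colon opens a parent node, and every following "key: value" line belongs to it. The function name `to_nested` and the shortened sample are mine, not part of your log.

```python
import json

def to_nested(text):
    """Parse 'Parent:' / 'Child: value' lines into {parent: {child: value}}."""
    result = {}
    parent = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so a time like 7:30:15 stays intact
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not value:
            # Assumption: an empty value marks the start of a new parent node
            parent = key
            result[parent] = {}
        elif parent is None:
            result[key] = value
        else:
            result[parent][key] = value
    return result

# Shortened version of the illustrated log
sample = """Changed Attributes:
  Display Name: -
  Password Last Set: 9/12/2023 7:30:15 AM
  Account Expires: -"""

nested = to_nested(sample)
print(json.dumps(nested, indent=2))
```

Like the sed expression, this breaks as soon as an attribute name itself contains a colon or a parent line carries trailing text, which is exactly why the format is fragile.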
As for grouping the fields into changed and unchanged sets, that can also be achieved. If your developers are flexible enough to make the log compliant JSON, they can simply set unchanged fields to JSON null. Otherwise, you can handle them as strings, provided that the free-hand text is as rigid as described above.
| foreach *
[eval changed = mvappend(changed, if('<<FIELD>>' == "-", null(), "<<FIELD>>" . " => " . '<<FIELD>>')),
unchanged = mvappend(unchanged, if('<<FIELD>>' == "-", "<<FIELD>>", null()))]
| table changed unchanged
This way, you get
changed | unchanged |
Changed Attributes.Password Last Set => 9/12/2023 7:30:15 AM | Changed Attributes.Account Expires Changed Attributes.AllowedToDelegateTo Changed Attributes.Display Name Changed Attributes.Home Directory Changed Attributes.Home Drive Changed Attributes.Logon Hours Changed Attributes.New UAC Value Changed Attributes.Old UAC Value Changed Attributes.Primary Group ID Changed Attributes.Profile Path Changed Attributes.SAM Account Name Changed Attributes.SID History Changed Attributes.Script Path Changed Attributes.User Account Control Changed Attributes.User Parameters Changed Attributes.User Principal Name Changed Attributes.User Workstations |
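The grouping rule itself is simple enough to check by hand. Here is a small Python sketch of it, assuming "-" is the only marker for "unchanged"; `split_changed` is a name I made up, and the dictionary holds just three of the flattened fields.

```python
def split_changed(fields, placeholder="-"):
    """Split flattened fields into 'name => value' for changed ones and bare names for unchanged ones."""
    changed = [f"{name} => {value}" for name, value in fields.items()
               if value != placeholder]
    unchanged = [name for name, value in fields.items()
                 if value == placeholder]
    return changed, unchanged

# Three of the flattened fields from the example
fields = {
    "Changed Attributes.Account Expires": "-",
    "Changed Attributes.Display Name": "-",
    "Changed Attributes.Password Last Set": "9/12/2023 7:30:15 AM",
}
changed, unchanged = split_changed(fields)
print(changed)
print(unchanged)
```

If a real attribute can legitimately have "-" as its new value, this rule (and the foreach above) will misfile it as unchanged.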
If your illustrated data is the entirety of the log, here is an emulation you can play with and compare with real data:
| makeresults
| fields - _time
| eval data = "Changed Attributes:
SAM Account Name: -
Display Name: -
User Principal Name: -
Home Directory: -
Home Drive: -
Script Path: -
Profile Path: -
User Workstations: -
Password Last Set: 9/12/2023 7:30:15 AM
Account Expires: -
Primary Group ID: -
AllowedToDelegateTo: -
Old UAC Value: -
New UAC Value: -
User Account Control: -
User Parameters: -
SID History: -
Logon Hours: -"
``` data emulation above ```
Good morning yuanliu,
Thank you very much for such a detailed response. I will go through the proposed solutions and let you know how they work for us.
As to the format of the log, this is a standard Windows Active Directory log. There is no way we can change the format. Many other Windows / AD logs have a similar structure. There is not much we can do about this.
Again, thank you for your answer.
Kind Regards,
Mike.