Solved: automatic nested field extraction

mpatnode · ‎08-17-2010

I'm struggling to extract multiple fields from a multivalue Active Directory attribute. For instance, given the following object:

dcName=w2k3r2.demo.dev
admonEventType=Update
Names:
    objectCategory=CN=Service-Connection-Point,CN=Schema,CN=Configuration,DC=demo,DC=dev
    ...
    distinguishedName=CN=bsmith,CN=Users,CN=default,CN=Zones,CN=Centrify,CN=Program Data,DC=demo,DC=dev
    objectGUID=cffb0829-0642-134c-2ef1-f03cc696e10b
          ...
    keywords=addr:253|animal:rabbit|color:blue
    showInAdvancedViewOnly=TRUE

I still want objectGUID and the other single-value attributes parsed, but in this example I would also like addr, animal and color parsed out as their own key-value pairs (and I don't want to have to know the key names a priori). Is there a preprocessing step where I can break the multi-value attributes into separate lines, or do I need to replace the ad-kv "(?<_KEY_1>[\w-]+)=(?<_VAL_1>[^\r\n]*)" transform with some incredibly gnarly regex?

gkanapathy · ‎08-18-2010

Put in props.conf:

[ActiveDirectory]
REPORT-MESSAGE = ad-kv,keywords-kv

This overrides the default extraction (which is just "ad-kv"). Listing "keywords-kv" after "ad-kv" on the same line ensures that it runs only after "ad-kv" has had a chance to extract the keywords field.
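
Why the order matters can be illustrated outside Splunk. The following is a minimal Python sketch, not Splunk's implementation: the function names ad_kv, keywords_kv and run_report are hypothetical stand-ins for the two transforms and the REPORT line, and the regex is the ad-kv pattern from the question rewritten with Python named groups (key/val instead of _KEY_1/_VAL_1). The sample event is a shortened copy of the one in the question.

```python
import re

# The ad-kv pattern quoted in the question, in Python named-group syntax.
AD_KV = re.compile(r"(?P<key>[\w-]+)=(?P<val>[^\r\n]*)")


def ad_kv(fields):
    """First transform: read _raw and emit one field per key=value line."""
    out = dict(fields)
    for match in AD_KV.finditer(fields["_raw"]):
        out[match.group("key")] = match.group("val")
    return out


def keywords_kv(fields):
    """Second transform: read the already-extracted 'keywords' field."""
    out = dict(fields)
    for pair in fields.get("keywords", "").split("|"):
        key, sep, val = pair.partition(":")
        if sep:  # skip fragments that contain no ':'
            out[key] = val
    return out


def run_report(raw, transforms):
    """Apply the transforms left to right, like a comma-separated REPORT list."""
    fields = {"_raw": raw}
    for transform in transforms:
        fields = transform(fields)
    return fields


RAW = """dcName=w2k3r2.demo.dev
admonEventType=Update
Names:
    objectGUID=cffb0829-0642-134c-2ef1-f03cc696e10b
    keywords=addr:253|animal:rabbit|color:blue
    showInAdvancedViewOnly=TRUE
"""

in_order = run_report(RAW, [ad_kv, keywords_kv])
reversed_order = run_report(RAW, [keywords_kv, ad_kv])

print(in_order["animal"])          # rabbit
print("animal" in reversed_order)  # False: 'keywords' did not exist yet
```

In the reversed run the second-stage function finds no keywords field to read, so addr, animal and color never appear, which is the situation the ordering in the REPORT line avoids.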

gkanapathy · ‎08-18-2010

It doesn't matter whether the field is separately indexed or not. Please note that the ad-kv fields are also not extracted at index time, and are not any more "first-class" than the keyword fields. It simply appears that way because "diff" operates line-by-line against the full raw text, and the non-keyword fields happen to be on their own lines. What you really need is a field-by-field diff, which sadly Splunk does not come with.
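
For readers who export the extracted fields and compare them elsewhere, a field-by-field diff is short to write. The sketch below is plain Python and not a Splunk feature; field_diff is a hypothetical helper, and the "after" snapshot (color changed to red) is invented for illustration.

```python
def field_diff(old, new):
    """Return {field: (old_value, new_value)} for every field that differs.

    A field that is missing on one side is reported with None on that side.
    """
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Hypothetical before/after snapshots of the same AD object, with the
# nested keyword fields already extracted alongside the raw attribute.
old_event = {
    "objectGUID": "cffb0829-0642-134c-2ef1-f03cc696e10b",
    "keywords": "addr:253|animal:rabbit|color:blue",
    "addr": "253",
    "animal": "rabbit",
    "color": "blue",
}
new_event = dict(
    old_event,
    keywords="addr:253|animal:rabbit|color:red",
    color="red",
)

for name, (before, after) in field_diff(old_event, new_event).items():
    print(f"{name}: {before!r} -> {after!r}")
```

Because the comparison is keyed on field names rather than on lines of raw text, the change to color is reported on its own, in addition to the change in the enclosing keywords attribute.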

mpatnode · ‎08-18-2010

Much better, but can I now do this before the event is stored, so that the fields look like first-class fields in the event? In particular, I want to detect a change to one of the keyword fields; currently diff only shows me that the keywords attribute changed.

mpatnode · ‎08-18-2010

This was trivial once I found the right doc.

In transforms.conf:

[keywords-kv]
SOURCE_KEY = keywords
DELIMS = "|", ":"

Then in my search:

sourcetype="ActiveDirectory" keywords | extract keywords-kv

So now I'd like to do this for all ActiveDirectory objects, and handle it in both keywords and description. It would be nice if I didn't need to add the "extract" pipes.
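
As a rough illustration of what a two-set delimiter rule like "|", ":" does, and of applying it to more than one source field, here is a minimal Python sketch. It is a simplification and not Splunk's parser: delims_extract and extract_nested are hypothetical names, the description value is invented, and keeping an existing field rather than overwriting it is an assumption of the sketch.

```python
import re


def delims_extract(value, pair_delims="|", kv_delims=":"):
    """Split value into pairs on any character in pair_delims, then split
    each pair into key and value on the first character from kv_delims."""
    pair_splitter = re.compile("[" + re.escape(pair_delims) + "]")
    kv_splitter = re.compile("[" + re.escape(kv_delims) + "]")
    fields = {}
    for pair in pair_splitter.split(value):
        parts = kv_splitter.split(pair, maxsplit=1)
        if len(parts) == 2 and parts[0]:
            fields[parts[0].strip()] = parts[1].strip()
    return fields


def extract_nested(event, source_keys=("keywords", "description")):
    """Run the delimiter extraction over each listed field that is present."""
    out = dict(event)
    for source in source_keys:
        if source in event:
            for key, val in delims_extract(event[source]).items():
                # Assumption: a field that already exists is not overwritten.
                out.setdefault(key, val)
    return out


event = {
    "keywords": "addr:253|animal:rabbit|color:blue",
    "description": "owner:bsmith|tier:gold",  # made-up value for illustration
}
print(extract_nested(event))
```

No key names are hard-coded: whatever appears to the left of ":" in either source field becomes a field name.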

gkanapathy · ‎08-18-2010

You don't. See my other answer.

Stephen_Sorkin · ‎08-17-2010

Do you only want this in the "keywords" field, or could the pipe-delimited key:value pairs occur as values of other fields as well?
