What is the best way to solve this key pair field value issue?

Communicator

I have a data set in which each event contains multiple key=value pairs that share the same key name.


The data source is Websense proxy logs, with users authenticated through Active Directory.

=1 category=227 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDdomain,DC=local/Splunk Nerd src_host=192.168.100.200
=1 category=227 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDdomain,DC=local/Splunk Nerd src_host=192.168.100.200
=7 category=1526 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDdomain,DC=local/Splunk Nerd src_host=192.168.100.200


By default Splunk extracts only the first OU= pair and the first DC= pair; it does not extract the remaining OU and DC pairs. I tried | eval NextOU=mvindex(OU,1), but that does not seem to work. Is that because the OU= pairs are all on the same line?

I have a working regex that parses out the username: | rex field=_raw "^.local\/(?P.?)src_host.*$". I could write a regex for each OU and DC pair, but a particular user may be nested under more or fewer OUs.

Not sure what I am missing here. I looked into the extract command, but I think Splunk is working as expected.
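
To make the symptom concrete: mvindex(OU,1) asks for the second value of a multivalue field, so it has nothing to return if OU only ever holds the first match. The Python sketch below contrasts "keep the first value per key" with "keep every value per key" on a shortened copy of the sample event. The pattern is an illustrative assumption, not Splunk's internal extraction rule.

```python
import re

# Shortened copy of the sample event from the post.
event = (
    "category=227 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,"
    "OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDdomain,DC=local/Splunk Nerd "
    "src_host=192.168.100.200"
)

# Illustrative pattern (an assumption): an OU or DC key followed by a value
# that runs up to the next comma or whitespace character.
pair = re.compile(r"\b(OU|DC)=([^,\s]+)")

# Keep only the first value seen for each key (the behaviour described above).
first_only = {}
for key, value in pair.findall(event):
    first_only.setdefault(key, value)

# Keep every value seen for each key (the multivalue behaviour being asked for).
all_values = {}
for key, value in pair.findall(event):
    all_values.setdefault(key, []).append(value)

print(first_only)   # {'OU': 'Sub', 'DC': 'subdomain'}
print(all_values)
```

With only the first value kept, there is no element at index 1, which matches the empty result from mvindex.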


Legend

Try this and see... carefully copy the exact spacing, etc.

In props.conf

[yoursourcetypehere]
REPORT-websense_ext=websense_extraction

In transforms.conf

[websense_extraction]
DELIMS = ", ", "="
MV_ADD = true


Communicator

Hi lquinn,

The configuration you provided works, thank you. However, it also creates additional event fields that contain garbage data. I think this may be because the same event contains other key-value pairs that are separated by spaces. I am reviewing the props.conf and transforms.conf documentation to better understand what is occurring.

What do you think? Can Splunk parse key-value pairs separated by spaces and key-value pairs separated by commas in the same event?

Here is an entire event.

Dec 3 16:48:31 101.1.1.42 vendor=Websense product=Security product_version=3.2.1 action=permitted severity=1 category=17 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDDomain,DC=local/Splunk Nerd src_host=1.2.3.4 src_port=60608 dst_host=context.bestbuy.com dst_ip=172.226.16.62 dst_port=80 bytes_out=2102 bytes_in=768 http_response=200 http_method=GET http_content_type=image/gif http_user_agent=Mozilla/5.0_(compatible;_MSIE_9.0;_Windows_NT_6.1;_WOW64;_Trident/5.0) http_proxy_status_code=200 reason=- disposition=1048 policy=Web Surfer role=8 duration=3 url=http://context.bestbuy.com/


Legend

Other things to try:

  1. Leave out the DELIMS attribute, but keep the MV_ADD. Don't change anything else in the answer above and see what happens.

  2. Replace the transforms.conf stanza with

    [websense_extraction_ou]
    REGEX=(OU)=(\S+?)(?:\s|,)
    FORMAT = $1::$2
    MV_ADD = true

    [websense_extraction_dc]
    REGEX=(DC)=(\S+?)(?:\s|,)
    FORMAT = $1::$2
    MV_ADD = true

and props.conf becomes

[yoursourcetypehere]
REPORT-websense_ext=websense_extraction_ou,websense_extraction_dc
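
As a way to check what these two patterns capture, the Python sketch below runs them over a slice of the full event and collects every match per key, which is roughly what MV_ADD = true is for. The trailing delimiter is written as a non-capturing group, (?:\s|,). Python's re module stands in for Splunk's regex engine here, which is an assumption; the function name is hypothetical.

```python
import re

# Relevant slice of the full event shown earlier in the thread.
event = (
    "severity=1 category=17 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,"
    "OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDDomain,DC=local/Splunk Nerd "
    "src_host=1.2.3.4 src_port=60608"
)

# One pattern per transforms.conf stanza: group 1 is the key, group 2 the value.
patterns = [
    re.compile(r"(OU)=(\S+?)(?:\s|,)"),
    re.compile(r"(DC)=(\S+?)(?:\s|,)"),
]

def extract_multivalue(text):
    """Collect every (key, value) match into a list per key."""
    fields = {}
    for pattern in patterns:
        for key, value in pattern.findall(text):
            fields.setdefault(key, []).append(value)
    return fields

fields = extract_multivalue(event)
print(fields["OU"])     # ['Sub', 'IS', 'HQ', 'Employee', 'WidgetCo']
print(fields["DC"])     # ['subdomain', 'TLDDomain', 'local/Splunk']
print(fields["OU"][1])  # IS, the counterpart of mvindex(OU,1)
```

In this sketch a value stops at the first space or comma, so "Sub Department" is captured as "Sub" and the last DC value also picks up the first word of the user name.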

Communicator

This works perfectly. I may have been wrong that the additional key-value pairs were caused by the props.conf and transforms.conf changes. What appears to be happening is that Splunk is extracting additional fields from the really long URL strings in each event, which contain sometext=sometext. Depending on the results of my search, I sometimes get more or fewer of these goofy event fields.

Thank you again for the help.


Legend

Yes, I have seen that problem with URL strings, too. There isn't much you can do about it, except just ignore the weird fields.
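
A small Python sketch of how this can happen, assuming a generic key=value pattern is applied to the whole event. The URL is hypothetical (the thread does not show the long URLs involved) and the pattern is an illustration, not Splunk's actual rule.

```python
import re

# Hypothetical url field with a query string; not taken from the thread.
url_field = "url=http://www.example.com/track?sessionid=abc123&ref=home&utm_source=mail"

# Assumed generic pattern: a word-like key, "=", then a value that stops at
# "&", "?", "=" or whitespace.
generic_kv = re.compile(r"(\w+)=([^&?=\s]*)")

found = dict(generic_kv.findall(url_field))
print(list(found))  # ['url', 'sessionid', 'ref', 'utm_source']

# Only url is a field of the log format; the rest come from the query string.
expected = {"url"}
stray = [name for name in found if name not in expected]
print(stray)        # ['sessionid', 'ref', 'utm_source']
```

Which stray names appear depends entirely on the URLs in the result set, which would explain why the number of odd fields changes from search to search.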
