I have input data that looks like:
time=2017-05-29 calendar:num_1day_active_users=10437 gplus:num_1day_active_users=1 docs:num_1day_active_users=0 gmail:num_1day_active_users=24594 drive:num_1day_active_users=15787
I have done minimal work to props.conf, mostly to set up timestamp parsing.
The problem is that all the field names are being extracted as num_1day_active_users, and I am only getting the first value in the event (I get num_1day_active_users=10437).
If the colon were a period, then Splunk would auto-convert it to an underscore, and the fields would extract with names calendar_num_1day_active_users, gplus_num_1day_active_users, docs_num_1day_active_users, gmail_num_1day_active_users, and drive_num_1day_active_users.
How can I get Splunk to do the same for field names that contain colons?
Check out CLEAN_KEYS in transforms.conf:
CLEAN_KEYS = [true|false]
* NOTE: This attribute is only valid for search-time field extractions.
* Optional. Controls whether Splunk "cleans" the keys (field names) it extracts at search time. "Key cleaning" is the practice of replacing any non-alphanumeric characters (characters other than those falling between the a-z, A-Z, or 0-9 ranges) in field names with underscores, as well as the stripping of leading underscores and 0-9 characters from field names.
* Add CLEAN_KEYS = false to your transform if you need to extract field names that include non-alphanumeric characters, or which begin with underscores or 0-9 characters.
* Defaults to true.
But you will have to use transforms.conf to define your extraction, and use a REPORT- line in props.conf to make use of that functionality.
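To make the quoted description concrete, here is a minimal Python sketch of key cleaning as that text describes it. It is only an approximation written from the documentation excerpt, not Splunk's actual implementation, and the function name `clean_key` is hypothetical.

```python
import re

def clean_key(name: str) -> str:
    """Approximate the key cleaning described in the CLEAN_KEYS text (assumption, not Splunk code)."""
    # Replace every character outside a-z, A-Z, 0-9 with an underscore.
    cleaned = re.sub(r"[^a-zA-Z0-9]", "_", name)
    # Strip leading underscores and digits from the field name.
    return cleaned.lstrip("_0123456789")

print(clean_key("calendar:num_1day_active_users"))  # calendar_num_1day_active_users
print(clean_key("1st.field"))                       # st_field
```

Under this reading, a colon in a key is treated like any other non-alphanumeric character once the whole key has been captured by the extraction.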
Edit: it appears I misread the question. Please disregard this answer; I'm leaving the content in place in case it helps anyone else.
Add this on your search heads for search-time field extraction.
props.conf:
[yoursourcetype]
REPORT-extractfields = extract_colon_fields
transforms.conf:
[extract_colon_fields]
REGEX = (\S+)\=(\S+)
FORMAT = $1::$2
A restart of Splunk would be required. It should give you fields like calendar_num_1day_active_users, gplus_num_1day_active_users.
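As a rough way to check what that regex captures, the Python sketch below applies the same pattern to the sample event from the question and then cleans each key. It assumes the pattern is applied repeatedly across the event and that key cleaning behaves as the CLEAN_KEYS text describes; `extract_fields` is a hypothetical helper, not a Splunk API.

```python
import re

# Sample event taken from the question.
EVENT = (
    "time=2017-05-29 calendar:num_1day_active_users=10437 "
    "gplus:num_1day_active_users=1 docs:num_1day_active_users=0 "
    "gmail:num_1day_active_users=24594 drive:num_1day_active_users=15787"
)

# Same pattern as the REGEX line in the transform.
PAIR = re.compile(r"(\S+)\=(\S+)")

def extract_fields(event: str) -> dict:
    """Collect key/value pairs, cleaning keys the way CLEAN_KEYS = true is documented to."""
    fields = {}
    for key, value in PAIR.findall(event):
        # Assumed cleaning rule: non-alphanumerics become underscores,
        # then leading underscores and digits are stripped.
        key = re.sub(r"[^a-zA-Z0-9]", "_", key).lstrip("_0123456789")
        fields[key] = value
    return fields

fields = extract_fields(EVENT)
for name, value in fields.items():
    print(name, "=", value)
```

Because `\S+` captures the whole token before the equals sign, the service prefix (calendar, gplus, and so on) stays part of each field name instead of being dropped.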
This works for this use case. I have a similar one that may have embedded spaces in the field values, but that's a problem for another day (I'll probably just move to JSON as a file format).
You can build your own KVP extractor in transforms.conf like this:
[get_kvps_and_keep_colons]
FORMAT = $1::$2
MV_ADD = 1
REGEX = (?:^|[\r\n\s]+)(\S+)=(\S+)
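The Python sketch below imitates what this stanza appears to aim for: each pair must start at the beginning of the event or after whitespace, and repeated keys accumulate values, which is how MV_ADD is understood here. Keys are left uncleaned to match the stanza name; per the CLEAN_KEYS documentation quoted earlier, that would presumably require CLEAN_KEYS = false in Splunk itself. The sample string and `extract_multivalue` are made up for illustration.

```python
import re
from collections import defaultdict

# Same pattern as the REGEX line in the stanza.
PAIR = re.compile(r"(?:^|[\r\n\s]+)(\S+)=(\S+)")

def extract_multivalue(event: str) -> dict:
    """Collect every value per key, keeping the raw key text (colons included)."""
    fields = defaultdict(list)
    for key, value in PAIR.findall(event):
        # Appending instead of overwriting mimics multivalue accumulation.
        fields[key].append(value)
    return dict(fields)

# Hypothetical event with a repeated key to show the multivalue behavior.
sample = "calendar:num_1day_active_users=10437 tag=a tag=b"
result = extract_multivalue(sample)
print(result)
```

Since values are matched with `\S+`, a value containing embedded spaces would be cut off at the first space, so this approach fits only space-free values.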