Getting Data In

CSV with custom quoting

anthonysomerset
Path Finder

Hi there,

I have a CSV/UDR file without headers. Example rows:

session_start,0    ,0    ,2017-03-07 20:00:50 +0200     ,                              ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0
usage_start  ,0    ,0    ,2017-03-07 20:25:37 +0200     ,2017-03-07 20:25:47 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [27770],     20549,     18187,      2362,         0
usage_int    ,0    ,3    ,2017-03-07 20:33:05 +0200     ,2017-03-07 20:41:54 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [15457],     54450,     36051,     18399,         0
usage_stop   ,0    ,5    ,2017-03-07 20:46:23 +0200     ,2017-03-07 20:46:23 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,    [6322],         0,         0,         0,         0
session_stop ,0    ,59   ,2017-03-07 20:00:50 +0200     ,2017-03-07 20:59:32 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0

Currently my props.conf stanza looks like this:

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
INDEXED_EXTRACTIONS = CSV
FIELD_NAMES = RecordType,RecordStatus,RecordNumber,StartTime,EndTime,AcctSessionId,SubscriberId,FramedIp,[ServiceId],TotalBytes,RxBytes,TxBytes,Time
TZ = Africa/Harare

Unfortunately, the ServiceId field is wrapped in [] brackets and I need just the ID without them. How can I change my props to extract the field at index time without the brackets?
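
For reference, the timestamp settings in the stanza above can be illustrated with a small Python sketch. This is not Splunk code: Python's re and datetime.strptime only approximate Splunk's own regex and timestamp handling, and the variable names are made up for the example. It skips the first three comma-terminated fields (TIME_PREFIX), takes the next 26 characters (MAX_TIMESTAMP_LOOKAHEAD), and parses them with TIME_FORMAT.

```python
import re
from datetime import datetime

# One sample row from the post, padding included.
row = ("usage_start  ,0    ,0    ,2017-03-07 20:25:37 +0200     ,"
       "2017-03-07 20:25:47 +0200     ,172.99.99.0~86056588,labtest             ,"
       "172.99.99.0       ,   [27770],     20549,     18187,      2362,         0")

TIME_PREFIX = r"(?:.*?,){3}"          # same pattern as in the stanza
LOOKAHEAD = 26                        # same value as MAX_TIMESTAMP_LOOKAHEAD
TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # same format string as in the stanza

# Skip the first three fields, then look at the next 26 characters only.
prefix = re.match(TIME_PREFIX, row)
window = row[prefix.end():prefix.end() + LOOKAHEAD]

# The window holds the 25-character timestamp plus one padding space.
event_time = datetime.strptime(window.strip(), TIME_FORMAT)

print(repr(window))
print(event_time.isoformat())
```

The point of the sketch is only that the prefix lands on StartTime (the fourth column), so events are stamped with StartTime rather than EndTime.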

1 Solution

anthonysomerset
Path Finder

So I tried @gcusello's SEDCMD approach, but it didn't do what I wanted 😞

I ended up going down the regex extraction route, mainly because of all the extra spaces used for padding. For the benefit of others, my final props config was:

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
EXTRACT-sde_rg_udr = ^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)
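
Two things are worth checking before reusing the regex above. EXTRACT- settings are applied at search time, so this replaces the index-time extraction asked about in the question. Also, when the pattern is run against the sample rows from the question, the padded values keep their trailing spaces and the session_start row (blank EndTime) does not match at all. The Python sketch below shows this and tries a hypothetical variant that trims padding and tolerates an empty EndTime. Python's re stands in for Splunk's PCRE here, and the names POSTED, VARIANT and column are invented for the example; the variant has not been tested in Splunk.

```python
import re

# Two sample rows from the question: blank EndTime, then a fully populated row.
ROWS = [
    "session_start,0    ,0    ,2017-03-07 20:00:50 +0200     ,"
    "                              ,172.99.99.0~86056588,labtest             ,"
    "172.99.99.0       ,       [0],         0,         0,         0,         0",
    "usage_start  ,0    ,0    ,2017-03-07 20:25:37 +0200     ,"
    "2017-03-07 20:25:47 +0200     ,172.99.99.0~86056588,labtest             ,"
    "172.99.99.0       ,   [27770],     20549,     18187,      2362,         0",
]

# The regex from the post, split across lines for readability only.
POSTED = re.compile(
    r"^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,"
    r"(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,"
    r"(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,"
    r"(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,"
    r"\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),"
    r"\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)"
)

NAMES = ["RecordType", "RecordStatus", "RecordNumber", "StartTime", "EndTime",
         "AcctSessionId", "SubscriberId", "FramedIp", "ServiceId",
         "TotalBytes", "RxBytes", "TxBytes", "Time"]


def column(name):
    """Pattern for one column: optional padding, lazy capture, optional padding."""
    if name == "ServiceId":
        # Brackets sit outside the capture group, so only the ID is kept.
        return r"\s*\[(?P<ServiceId>[^\],]*)\]\s*"
    return rf"\s*(?P<{name}>[^,]*?)\s*"


# Hypothetical variant: every column is optional text with padding on both sides.
VARIANT = re.compile("^" + ",".join(column(n) for n in NAMES) + "$")

posted_hits = [POSTED.match(r) for r in ROWS]
variant_hits = [VARIANT.match(r).groupdict() for r in ROWS]

print(posted_hits[0])                              # no match: EndTime is blank
print(repr(posted_hits[1].group("RecordType")))    # padding kept in the value
print(variant_hits[0]["ServiceId"], repr(variant_hits[0]["EndTime"]))
print(variant_hits[1])
```

Printing VARIANT.pattern gives the assembled expression if you want to try it in an EXTRACT- line. The lazy captures trade the strict date check on StartTime and EndTime for tolerance of blank columns, which may or may not be what you want.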

gcusello
SplunkTrust

Hi anthonysomerset,
Try adding the following lines to your props.conf to replace ",[" and "]," with plain commas:

SEDCMD-drop1 = s/,\[/,/g
SEDCMD-drop2 = s/\],/,/g

Bye.
Giuseppe

anthonysomerset
Path Finder

I tried this approach and it did the job in the raw event data; however, the field still had the brackets in it 😞
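
The two SEDCMD expressions suggested above can be tried against a sample row with a short Python sketch. This uses Python's re.sub as a stand-in for Splunk's sed-style substitution, so treat it as an approximation. On the row exactly as pasted in the question, the first expression finds nothing to replace, because padding spaces sit between the comma and the opening bracket; the real feed may differ from the pasted sample. The padding-tolerant pattern below is a hypothetical variant and has not been tested in Splunk.

```python
import re

# One sample row from the question, padding included.
row = ("usage_start  ,0    ,0    ,2017-03-07 20:25:37 +0200     ,"
       "2017-03-07 20:25:47 +0200     ,172.99.99.0~86056588,labtest             ,"
       "172.99.99.0       ,   [27770],     20549,     18187,      2362,         0")

# Direct translation of s/,\[/,/g followed by s/\],/,/g.
as_posted = re.sub(r"\],", ",", re.sub(r",\[", ",", row))

# Hypothetical variant: allow padding between the comma and "[" and keep it.
tolerant = re.sub(r"\],", ",", re.sub(r",(\s*)\[", r",\1", row))

print(as_posted)   # the "[" survives because ",[" never occurs in the padded row
print(tolerant)    # both brackets removed, padding preserved
```

Even when the raw text is cleaned this way, the stanza in the question names the column [ServiceId] in FIELD_NAMES, so that name would need changing separately.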
