Getting Data In

CSV with custom quoting

Path Finder

Hi There

i have a CSV/UDR without headers with following example rows

session_start,0    ,0    ,2017-03-07 20:00:50 +0200     ,                              ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0
usage_start  ,0    ,0    ,2017-03-07 20:25:37 +0200     ,2017-03-07 20:25:47 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [27770],     20549,     18187,      2362,         0
usage_int    ,0    ,3    ,2017-03-07 20:33:05 +0200     ,2017-03-07 20:41:54 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [15457],     54450,     36051,     18399,         0
usage_stop   ,0    ,5    ,2017-03-07 20:46:23 +0200     ,2017-03-07 20:46:23 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,    [6322],         0,         0,         0,         0
session_stop ,0    ,59   ,2017-03-07 20:00:50 +0200     ,2017-03-07 20:59:32 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0

currently i have a props that looks like this:

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
INDEXED_EXTRACTIONS = CSV
FIELD_NAMES = RecordType,RecordStatus,RecordNumber,StartTime,EndTime,AcctSessionId,SubscriberId,FramedIp,[ServiceId],TotalBytes,RxBytes,TxBytes,Time
TZ = Africa/Harare

Unfortunately the ServiceID field is encapsulated in [] braces and i just need the id without the braces - how can i change my props to extract the field at index time without the braces?

0 Karma
1 Solution

Path Finder

So i tried @cusello approach of the SEDCMD but it didn't do what i wanted it to 😞

i Ended up down the regex extraction route mainly because of all the additional spaces used for padding - for the benefit of others my final props config was

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
EXTRACT-sde_rg_udr = ^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)

View solution in original post

0 Karma

Path Finder

So i tried @cusello approach of the SEDCMD but it didn't do what i wanted it to 😞

i Ended up down the regex extraction route mainly because of all the additional spaces used for padding - for the benefit of others my final props config was

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
EXTRACT-sde_rg_udr = ^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)

View solution in original post

0 Karma

SplunkTrust
SplunkTrust

Hi anthonysomerset,
try to insert in your props.conf file the following line to change ",[" and "]," with commas:

SEDCMD-drop1 = s/,\[/,/g
SEDCMD-drop2 = s/\],/,/g

Bye.
Giuseppe

0 Karma

Path Finder

So i tried this approach and it did the job in the raw event data however the field still had the braces in 😞

0 Karma
Don’t Miss Global Splunk
User Groups Week!

Free LIVE events worldwide 2/8-2/12
Connect, learn, and collect rad prizes
and swag!