I am pulling syslog data from an API through a modular input. The data returned is syslog-formatted, but one of its fields contains a JSON-formatted object. What would be the best approach for enabling default search-time extraction of the data held within this field? A sample event:
<38>1 2017-01-13T17:31:46Z - VENDORHERE - EVENTYPEHERE [USERHERE@PIDHERE ... threatsInfoMap="[{\"threatID\":\"...\", \"threatType\":\"...\", \"classification\":\"...\", \"threatUrl\":\"...\", \"threatTime\":\"...\", \"threat\":\"...\", \"campaignID\":\"...\"},{\"threatID\":\"...\", \"threatType\":\"...\", \"classification\":\"...\", \"threatUrl\":\"...\", \"threatTime\":\"...\", \"threat\":\"...\", \"campaignID\":\"...\"}]" ...]
The ...s are redactions of the data. The 'threatsInfoMap' field is the one containing the JSON-formatted data. It is an array that can contain zero, one, or many individual threats. Expanded out, with the escaped quote marks removed:
[
  {
    "threatID":"...",
    "threatType":"...",
    "classification":"...",
    "threatUrl":"...",
    "threatTime":"...",
    "threat":"...",
    "campaignID":"..."
  },
  {
    "threatID":"...",
    "threatType":"...",
    "classification":"...",
    "threatUrl":"...",
    "threatTime":"...",
    "threat":"...",
    "campaignID":"..."
  }
]
I would like to make these fields searchable by default in the TA (add-on / app) that I am developing, but I am finding it difficult to extract multiple values for the same field name using regexes. Can someone please point me in the right direction, or suggest where I should begin exploring this type of search-time extraction and ingestion of these values?
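To make the goal concrete, here is a minimal Python sketch of the result I am after. It only illustrates the target output and is not Splunk configuration. The function name `extract_threats` and the sample values (`a1`, `url`, and so on) are hypothetical, since the real data is redacted, and it assumes the only escaping inside the field is the backslash-escaped quote mark shown above.

```python
import json
import re
from collections import defaultdict

# Matches threatsInfoMap="..." where the value may contain
# backslash-escaped characters such as \" (assumed escaping style).
FIELD_RE = re.compile(r'threatsInfoMap="((?:[^"\\]|\\.)*)"')


def extract_threats(event: str) -> dict:
    """Return {field_name: [value, ...]} across every threat in the event."""
    match = FIELD_RE.search(event)
    if match is None:
        return {}
    # Undo the escaped quote marks so the value is plain JSON.
    payload = match.group(1).replace('\\"', '"')
    values = defaultdict(list)
    for threat in json.loads(payload):
        for key, value in threat.items():
            values[key].append(value)
    return dict(values)


# Hypothetical sample event with made-up values in place of the redactions.
sample = (
    '<38>1 2017-01-13T17:31:46Z - VENDORHERE - EVENTYPEHERE [USERHERE@PIDHERE '
    'threatsInfoMap="[{\\"threatID\\":\\"a1\\", \\"threatType\\":\\"url\\"},'
    '{\\"threatID\\":\\"b2\\", \\"threatType\\":\\"attachment\\"}]"]'
)
fields = extract_threats(sample)
print(fields)
```

Each key ends up with one value per threat in the array, which is the multi-value behaviour I would like each extracted field to have at search time.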