This is more of a bug report than a question.
I am working with the Billing data coming from our new GCP environment and have found that CSV data where a string contains a comma is not being parsed properly.
I discovered this from data where the Credit1 field says "External IPs will not be charged until April 1, 2020." The parser splits that field into two columns, pushing every following value out by one column, so the Description field value is lost from that event.
The data looks like this in Splunk:
Account ID: 111111-222222-333333
Credit1: "External IPs will not be charged until April 1
Credit1 Amount: 2020."
Credit1 Currency: -0.156831
End Time: 2020-02-20T00:00:00-08:00
Line Item: com.google.cloud/services/compute-engine/ExternalIp
Measurement1 Total Consumption: 183661
Measurement1 Units: seconds
Project ID: 12344566778
Project Labels: project
Project Name: project
Project Number: GBP
Start Time: 2020-02-19T00:00:00-08:00
This data parsing isn't done in the props of the sourcetype but in the Python code: line 260 of Splunk_TA_google-cloudplatform/bin/splunk_ta_gcp/modinputs/billing.py
I'm not an expert in GCP data, so I don't know what the best course of action would be to fix this bug without breaking other things.
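My assumption is that no data values will contain "word,word", only "word, word" with a space. Would the fix here be to change the split from every comma to only a comma that is not followed by whitespace? i.e. in Python, line = re.split(r',(?!\s)', line). Note that line.split(',[^\s]') would not work, because str.split treats its argument as a literal separator rather than a regex.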
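To make the options concrete, here is a minimal sketch comparing three ways of splitting such a row. I have not confirmed what line 260 of billing.py actually does, so naive_split is only my assumption about the current behaviour, and the sample row is a made-up, shortened version of the event above. The function names are hypothetical and not part of the add-on.

```python
import csv
import re

# Hypothetical, shortened row modelled on the event above (the real export has more columns).
sample = '111111-222222-333333,"External IPs will not be charged until April 1, 2020.",-0.156831,GBP'


def naive_split(line):
    # Assumed current behaviour: split on every comma, even inside quotes.
    return line.split(',')


def regex_split(line):
    # Split only on commas that are NOT followed by whitespace.
    # The lookahead (?!\s) checks the next character without consuming it.
    # The surrounding double quotes stay in the field value.
    return re.split(r',(?!\s)', line)


def csv_split(line):
    # Let the standard csv module handle the quoting rules.
    # csv.reader accepts any iterable of lines, so a one-item list is enough.
    return next(csv.reader([line]))


if __name__ == '__main__':
    print(len(naive_split(sample)))  # 5 fields: the quoted value is broken in two
    print(len(regex_split(sample)))  # 4 fields, quotes still attached
    print(csv_split(sample))         # 4 fields, quotes removed
```

The regex version depends on my assumption that no value ever contains a comma without a following space. The csv module does not need that assumption, as long as the export quotes any field that contains a comma, which is the case in the sample above.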