Solved: Extraction of a field containing a comma

softwareimprove · ‎07-30-2012

Hi

I'm trying to process a static csv in splunk. It looks like:

"Name","Owner","Organizations","Type","Actual","Total" "Project name","Seller","Customer","Type","2010-06-07 13:59:15 +0200",1,851,000.00

I can't do much about how the csv is generated. Splunk extracts the first 4 fields correctly, timestamps correctly using the 5th field but then extract the number (should be currency) as three different fields. How can I get it merged into a single field (to be charted)? Thanks

lguinn2 · ‎07-30-2012

This is pretty simple, and should work

yoursearchhere
| eval currency=tonumber(field1+field2+field3)

If that doesn't work because of the commas, do this

yoursearchhere
| eval currency=replace(field1+field2+field3, ",", "")
| eval currency=tonumber(currency)

If Splunk doesn't give you the 3 fields automatically, then do this

yoursearchhere
| rex field=_raw "\",(?<currency>.*?)$"
| eval currency=tonumber(currency)

Or

yoursearchhere
| rex field=_raw "\",(?<currency>.*?)$"
| eval currency=replace(currency, ",", "")
| eval currency=tonumber(currency)

The last 2 options assume that everything after the last ", is the currency.
HTH

View solution in original post

lguinn2 · ‎07-30-2012

This is pretty simple, and should work

yoursearchhere
| eval currency=tonumber(field1+field2+field3)

If that doesn't work because of the commas, do this

yoursearchhere
| eval currency=replace(field1+field2+field3, ",", "")
| eval currency=tonumber(currency)

If Splunk doesn't give you the 3 fields automatically, then do this

yoursearchhere
| rex field=_raw "\",(?<currency>.*?)$"
| eval currency=tonumber(currency)

Or

yoursearchhere
| rex field=_raw "\",(?<currency>.*?)$"
| eval currency=replace(currency, ",", "")
| eval currency=tonumber(currency)

The last 2 options assume that everything after the last ", is the currency.
HTH

softwareimprove · ‎07-31-2012

It worked, thanks!
eval curr=tonumber(tostring(cur1)+tostring(cur2))
was the exact way.

Kate_Lawrence-G · ‎07-30-2012

You can still do one of 2 things in this case:
You can create a new field out of the 3 fields already created using eval. Something like:

 eval currency=field1+field2+field3.

The advantage of this is you can add formatting in if needed.

You could also just do the field extraction again from the _raw data and ignore the fields it pulls out altogether and only use your new one. The believe the regex for that currency section would look something like this: (d{1,2}.d{3}.d{3}.d{2})

Kate_Lawrence-G · ‎07-30-2012

I think the best way to do it would be to use:

rex field=Total "$(?\d+,\d{3}.\d{2})"

That should get you the number cleanly, however it will still have the comma in it. Splunk may or may not be OK with that....

If it's not happy with that you can always use the eval tonumber(x,y) function and force the behavior

softwareimprove · ‎07-30-2012

Thanks, I managed to preprocess the file and now Splunk extract the timestamp and the resto of the fields correctly. The "total", is stored as "$24,800.00", which is not a number and thus can't be graph'd versus time. Can you suggest a way (regex?) to parse it into a numeric value? With rex field=Total "(?[^.]+)" I can strip the decimal part, but I can't find a way to strip the $ and the comma.

thanks again!

cburr2012 · ‎07-30-2012

Hi,

I just tested it. Put the entire number in quotes like you did the rest of the CSV.

i.e.

Name,Money
John,"1,851,000.00"

cburr2012 · ‎07-30-2012

Oh, I read that part, just forgot. Sorry about that!

softwareimprove · ‎07-30-2012

Thank you but I can't manipulate the source. I might try some awk tricks but I'll rather inject the csv directly in splunk.

Extraction of a field containing a comma

[Puzzles] Solve, Learn, Repeat: Dynamic formatting from XML events

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

Stronger Security with Federated Search for S3, GCP SQL & Australian Threat ...

Join the Conversation

Extraction of a field containing a comma

[Puzzles] Solve, Learn, Repeat: Dynamic formatting from XML events

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

Stronger Security with Federated Search for S3, GCP SQL & Australian Threat ...