i have logs coming in as CSV files, but sometimes the system generating them prepends junk data to the front of each line, sometimes not. they are otherwise identical. i have no control over this system.
the upside is the junk is always of a certain pattern:
blah:blah,gooddata,gooddata,gooddata,gooddata,gooddata
versus the clean ones which are just:
gooddata,gooddata,gooddata,gooddata,gooddata
how do i get a transform to drop that first column before indexing ONLY IF it has X:X as a value?
been beating my head against this for 2 days... regex is not my strong point.
Well, a regex that matches everything from the beginning of the line up to and including the first comma, as long as there is a : character in it, would look like this:
^[^,]+:[^,]+,
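If you want to sanity-check that pattern outside of the indexer first, here is a minimal Python sketch. The helper name `strip_junk` and the sample lines are just for illustration (the samples are the two line shapes from the question), and Python's `re` is only standing in for whatever regex engine your transform uses.

```python
import re

# Pattern from the answer: everything up to the first comma,
# but only when that first field contains a colon.
JUNK_PREFIX = re.compile(r"^[^,]+:[^,]+,")

def strip_junk(line: str) -> str:
    """Hypothetical helper: drop a leading X:X, column if present."""
    return JUNK_PREFIX.sub("", line, count=1)

dirty = "blah:blah,gooddata,gooddata,gooddata,gooddata,gooddata"
clean = "gooddata,gooddata,gooddata,gooddata,gooddata"

print(strip_junk(dirty))
print(strip_junk(clean))
```

Both calls should print the same five-column line, since the clean line has no colon before its first comma and is left alone.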
an update, installed and working perfectly.
you sir/ma'am, are the man/woman. the end result:
^([^,]+:[^,]+,)?(.*)
works great. i should be able to just feed $2 back into the raw for either type and always have the same result.
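For anyone following along, a rough Python sketch of why $2 comes out the same for both line types. The function name `normalized` is made up for the example; `group(2)` plays the role of $2 here, and this assumes the transform's regex engine treats the optional group the same way Python's `re` does.

```python
import re

# Optional junk prefix in group 1, the rest of the line in group 2.
NORMALIZE = re.compile(r"^([^,]+:[^,]+,)?(.*)")

def normalized(line: str) -> str:
    """Hypothetical helper: return what $2 would hold for this line."""
    match = NORMALIZE.match(line)
    # (.*) can match the empty string, so a single line always matches.
    return match.group(2)

with_junk = "blah:blah,gooddata,gooddata,gooddata,gooddata,gooddata"
without_junk = "gooddata,gooddata,gooddata,gooddata,gooddata"

print(normalized(with_junk) == normalized(without_junk))
```

Because group 1 is optional, a clean line simply skips it and lands entirely in group 2, so there is no need for two separate transforms.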
Sorry, forgot two + signs in my regex. Editing my answer with a correct regex.
though when i put what you provided into http://regexlib.com/RETester.aspx
as: ^[^,]:[^,],(.*)
with data as: something:anything,stuff1,stuff2,stuff3:stuff4,stuff5
nothing comes back.
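A small Python sketch of what is probably going on with that test: without the + quantifiers, each [^,] matches exactly one character, so the pattern only fits a prefix like a:b, and not something:anything,. The variable names are just for illustration, and the second pattern is the same one with the two + signs added.

```python
import re

sample = "something:anything,stuff1,stuff2,stuff3:stuff4,stuff5"

# As pasted into the tester: each character class matches ONE character.
without_plus = re.compile(r"^[^,]:[^,],(.*)")
# Same pattern with the + quantifiers: one OR MORE characters per class.
with_plus = re.compile(r"^[^,]+:[^,]+,(.*)")

print(without_plus.match(sample))          # no match object for this sample
print(with_plus.match(sample).group(1))    # everything after the junk column
print(without_plus.match("a:b,rest").group(1))
```

The last line shows the version without + does work, but only when the junk on each side of the colon is a single character.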
and then i would use DEST_KEY = _raw in the transform stanza to replace the raw log with my newly cleaned one, i presume?