Date column has some bad data. I just want to remove the row if the date is doubled up in a row. How do I discard a row based on character count or other logic?

Communicator

Hello Splunkers,

Question about discarding rows: I want to discard any row in my query results whose date value is longer than 19 characters.
As shown below, we have some junk data, and I want to remove the whole row whenever I see an occurrence like this.

Is there a way to say something like row != varchar(19), throw out the row? Similar to SQL. Not perfect syntax on my part :)

Throw out the entire row when the value looks like this ugly one:
2016-02-12 09:32:592016-02-12 10:14:38

Keep rows like this one:
2016-02-12 09:32:59

Thank You,
Daniel MacGillivray

Champion

Please check this one.
To list the events that are more than 19 characters long:

your search... | eval length=len(_raw) | where length > 19 | table _raw _time

To discard the events that are more than 19 characters long, use the search below. Caution: this deletes the indexed data, and removing data is irreversible.

your search... | eval length=len(_raw) | where length > 19 | delete

Communicator

This was just done on site. I think we solved it anyway. It would be good to see what other folks find.

| eval length=len(ReportGenerationStart_Time) | where length<20
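
For the "other logic" part of the question, one option is to check the shape of the value rather than only its length. This is a minimal sketch, assuming the field is named ReportGenerationStart_Time as in the search above and that good values always follow the YYYY-MM-DD HH:MM:SS layout of the sample rows. The makeresults, streamstats, and first eval lines only generate the two sample values so the search can be run on its own; n is a throwaway counter field.

```
| makeresults count=2
| streamstats count as n
| eval ReportGenerationStart_Time=if(n=1, "2016-02-12 09:32:59", "2016-02-12 09:32:592016-02-12 10:14:38")
| where match(ReportGenerationStart_Time, "^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$")
| table ReportGenerationStart_Time
```

Only the single 19-character timestamp should survive the where clause. In a real search, the first three lines would be replaced by your own base search.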

Champion

Sure, this will list the rows where the field is less than 20 characters long.
When you said "I want to discard a row", I thought you wanted to use delete.

Communicator

Ahh, cool. Thanks. Yeah, I should have clarified: I want to drop the rows from the search results only. I would rather keep the data around so we can talk to the data owners about it. Much appreciated!
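
If the goal is to show the suspect rows to the data owners rather than hide them, the same length test can be inverted. This is a rough sketch, again assuming the field name ReportGenerationStart_Time and using makeresults only to generate the two sample values. first_stamp and second_stamp are hypothetical field names that split a doubled value into its two halves with substr.

```
| makeresults count=2
| streamstats count as n
| eval ReportGenerationStart_Time=if(n=1, "2016-02-12 09:32:59", "2016-02-12 09:32:592016-02-12 10:14:38")
| where len(ReportGenerationStart_Time) > 19
| eval first_stamp=substr(ReportGenerationStart_Time, 1, 19), second_stamp=substr(ReportGenerationStart_Time, 20)
| table ReportGenerationStart_Time first_stamp second_stamp
```

Nothing is removed from the index here. The search only returns the rows whose value is too long, which makes it easy to export them for review.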

Communicator

Hello inventsekar,

Thank you so much. Quick question, though: does this need admin rights to run? I ask because I see the delete command. In this case, does delete remove the events from the SPL output only? Either way, thanks!

Champion

Yes, this needs admin privileges, or the user must have the "can_delete" role.
https://docs.splunk.com/Documentation/Splunk/6.4.3/Admin/Aboutusersandroles

Please be aware: the delete command deletes the indexed data.

Communicator

Understood. Glad you said that so other folks are aware of it.

Splunk Employee

To clarify two things:

1) Even admin doesn't have the can_delete role (and the delete_by_keyword capability it grants) by default. I consider it best practice to create a separate user with that capability, so bad things don't happen as easily... 😉

2) | delete only marks events so that they are no longer returned by searches. They are not removed from disk until they age out. Just saying.

Communicator

Thanks, Ssievert. I have my own delete user and follow that logic. I never intended to use the delete command so freely, which is why I asked about admin. I had forgotten that it is not set up by default, and that adding it to admin would not be a good idea.

I fully concur that without a clean command you are not actually deleting that data anyway. I have no intention of deleting anything; I just want to skip past the bad data on its way to the DB via the upsert with the new DB Connect.

Really good advice out here, as usual. Best forum of any product on the net!