Copy Field and remove duplicate

wessam · ‎06-07-2017

Hello All,

I have a column list of records as below

recordA
recordB
recordA
RecordB
RecordC
RecordD

and I would like to remove duplicate and copy them in another column to have finally a column for all records (records) and second record for records after dedup (record_after_dedup)
I tried to use eval with dedup but it doesn't work

eval record_after_dedup=dedup(records)

somesoni2 · ‎06-08-2017

You could you mvdedup command like this (assuming you already have multivalued field

your current search giving multivalued field say 'mvfield'
| eval uniqmvfield=mvdedup(mvfield)

More of mvdedup here
http://docs.splunk.com/Documentation/SplunkCloud/6.6.0/SearchReference/MultivalueEvalFunctions#mvded...

If you don't have a multivalued field already, you can do like this

your current search giving field with single value say myfield
| stats list(myfield) as all_values values(myfield) as uniq_values

szabados · ‎06-08-2017

... | eval record_after_dedup=records | dedup record_after_dedup

szabados · ‎06-08-2017

... | eval record_after_dedup=records | dedup record_after_dedup

cmerriman · ‎06-07-2017

can you just do something like: ...|eval records_after_dedup=records|stats values(records_after_dedup) as records_after_dedup list(records) as records

wessam · ‎06-08-2017

when i write |eval records_after_dedup=records
it won't copy all records , it will only add new column called "records_after_dedup" where the name records is written as below

records_after_dedup
records
records
records
records
...

cmerriman · ‎06-08-2017

that's bizarre. try adding a single quote around records.
|eval records_after_dedup='records'

wessam · ‎06-08-2017

Thanks it works 🙂

eval records_dedup='records'|stats values(records) as records list(records_dedup) as records_dedup| dedup records_dedup | table records_dedup,records

Now it displayed two columns first one records_Dedup where records exist after removing duplicates and second column "records" where all records exist
i am trying now to display both of them using chart for specific month
| chart count over "date_month" by "records,records_dedup"
but now data is found 😞

cmerriman · ‎06-08-2017

if i'm understanding, the syntax should look more like :

eval records_dedup='records'|stats values(records_dedup) as records_dedup list(records) as records by date_month| table date_month records_dedup records

values lists the unique field values and list lists all field values.

i'm trying to understand what you're chart is supposed to look like. do you have an example?

wessam · ‎06-08-2017

i would like to display a graph that represent number of records_dedup and number of records over month

cmerriman · ‎06-08-2017

try adding |eval record_dedup_count=mvcount(records_dedup), records_count=mvcount(records)

wessam · ‎06-08-2017

it works ,much appreciated Thanks but it displayed also records_dedup and records in chart

cmerriman · ‎06-08-2017

if you'd like those fields removed, use |fields - records_dedup records

wessam · ‎06-08-2017

it works 🙂 Thanks for your help

dineshraj9 · ‎06-07-2017

Do you want ignore the case and remove duplicates?

wessam · ‎06-07-2017

yes, i just need to have two columns the first one includes all the records and second one include the records after removing duplicates

dineshraj9 · ‎06-07-2017

The number of rows would differ once you have removed the duplicates, instead use a multi-valued field -

<your search> | eval record_new=lower(records) | stats values(records) as records,values(records_new) as records_after_dedup

wessam · ‎06-07-2017

Sorry i think there is misunderstanding , i am not asking to remove duplicates for lower and upper case or to ignore them , I am only asking to remove duplicates :
for Ex :
100
50
20
50
100
the above column contains all the record , i would like to have two columns first one "Records" for all the records and the second one "Records after dedup" for records after removing duplicate
to be as below
Records :
100
50
20
50
100
Records after dedup
100
50
20

Hope you get my point 🙂

dineshraj9 · ‎06-07-2017

Since you want it in two separate columns do it this way -

<your search > | table records | appendcols [ search <yoursearch> | dedup records | rename records as records_after_dedup ]

wessam · ‎06-07-2017

unfortunately it didn't work properly
index=xxx source="yyy.csv" |table records | appendcols [ index=xxx source="yyy.csv" | dedup records|rename records as records_after_dedup ]

it displayed all the records !

Copy Field and remove duplicate

Pro Tips for First-Time .conf Attendees: Advice from SplunkTrust

Raise Your Skills at the .conf25 Builder Bar: Your Splunk Developer Destination

Hunt Smarter, Not Harder: Discover New SPL “Recipes” in Our Threat Hunting Webinar

Are you a member of the Splunk Community?

Copy Field and remove duplicate

Pro Tips for First-Time .conf Attendees: Advice from SplunkTrust

Raise Your Skills at the .conf25 Builder Bar: Your Splunk Developer Destination

Hunt Smarter, Not Harder: Discover New SPL “Recipes” in Our Threat Hunting Webinar