Getting Data In

oneshot and delete re-index duplicate data help

5plunked
Explorer

Hi,

I have a file path source in the main index, and I want to re-index everything collected from it into a new index that I have created.

Problem:

  1. I switched the indexing of new data from the main index to the new index, but noticed that the old data doesn't get indexed, because Splunk doesn't index the same file content twice.

  2. I tried running btprobe to force re-indexing, which didn't work, so I proceeded to use the | delete command to remove all data from the file path source in the new index. Great, it's all cleaned up. (I should have made sure the btprobe had worked 😞 )

  3. NO, it was not. I foolishly went on to use the oneshot command in an attempt to re-index all data from the old index into the new index.

  4. Now when I search the new index, I see different results and a different event count than in the old index. The same query run against both indexes returns different results.

  5. I attempted to run the | delete command again in the new index, but it reports zero events deleted.

  6. Now the new index has all the data, but the event count and search results are still different from the old index.

I have some other event log source in the new index, so I am unable to just delete the whole index.

Could I get some help on how to force delete all data from only the file path source in the new index? And from there, can I re-index exactly the same data as in the old index? I do not mind losing the indexing of new data for the time being.

Thank you in advance!


jgbricker
Contributor

The search that returns the results you wish to delete can then be piped to delete. Are you using the source as part of your search filter, and getting results, before piping to delete? Also, is your ingest sourcetype set correctly? When you oneshot you can specify ‘-sourcetype mysourcetype’
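
To make that concrete, here is a rough sketch of the preview search I would run first. The index name "newindex" and the "(filepath)" value are placeholders for your own index and source, and I am assuming the time range picker is set to All time so that older events are included.

```spl
index="newindex" source="(filepath)"
| stats count by source, sourcetype
```

If the count and the source/sourcetype values look right, replace the stats line with | delete and run the search over the same time range. As far as I know, delete needs the can_delete capability, which is not granted to the admin role by default, and it only makes events unsearchable rather than freeing disk space.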

Also, if you want to ingest exactly the same data, you can export the old index data as raw. You may want to do a monthly dump (this may vary based on the number of events and their size). Basically, you search for what you want in the old index, and after the results render you click the export arrow in the upper right of the results and choose raw as the output type.
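
Before exporting anything, it may help to see where the two indexes actually diverge. A minimal sketch, assuming the old index is "main", the new one is "newindex", "(filepath)" stands for your source, and the search runs over All time; the one-day span is an arbitrary choice:

```spl
(index="main" OR index="newindex") source="(filepath)"
| timechart span=1d count by index
```

Days where the two columns differ are the ones with missing or extra events, so you would only need to export and reload those ranges. If you suspect duplicates in the new index, something like index="newindex" source="(filepath)" | stats count by _raw | where count > 1 should list raw events that appear more than once.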


niketn
Legend

@5plunked, are you using the collect command to move already indexed data to the new index?

https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Collect#Moving_events_to_a_diffe...
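
For reference, a minimal sketch of the kind of search that section describes. The values are placeholders: "newindex" and "(filepath)" stand for your own index and source, and earliest=0 is my assumption so that the search covers all time instead of the time picker's default.

```spl
index="main" source="(filepath)" earliest=0
| collect index="newindex"
```

As far as I know, collect writes events with sourcetype stash unless you pass a sourcetype argument, so a follow-up search in the new index that filters on the original sourcetype may come back empty even when the copy worked. Searching index="newindex" | stats count by source, sourcetype over All time should show what actually landed there.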

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

5plunked
Explorer

Hi, I have read up on the collect command documentation.

From what I understand, this command moves the data into a summary index and summarizes the existing events, which is not what I want. I have also tried running the command like this:

index="main" host="host" source="(filepath)" sourcetype="tsv" action="(some_action)" | collect index="newindex"

and the data does not appear in the summary index or in the new index. Is there something wrong with my search query?


5plunked
Explorer

Thank you for the response. This process might be a bit risky, as I do not know exactly what data is missing. Is there a quick way I can accurately check the differences?
