Getting Data In

Problem with csv file import: how to prevent event doubling?

BastianSchlaak
New Member

Hello,

I am importing a csv file (database dump) with the following format:

Header:

FirstName; LastName; EntryDate; ExitDate; InternalName; Remarks; Description; Phone; PhoneMobile; City; Building; Floor; Room; CentralAccount; DefaultEmailAddress; IsInActive; IsTerminalServerAllowed; IsExternal; PersonalTitle; PersonnelNumber; XDateInserted; XDateUpdated

Example Event:

XXXX;XXXX;XX.XX.XXXX XX:XX:XX;XX.XX.XXXX XX:XX:XX;XXXX, XXXX;;TER, 2012-02-27;;;;;;;XXXXX;XXXX@XXXX.de;True;True;False;;00430160;XX.XX.XXXX;XX.XX.XXXX

Splunkd imports and indexes the data, but it only appends it; it does not compare the new data with the old dump data or delete the old data. So after n import/index cycles, I have every event n times in Splunk.

I configured the import through the GUI but found no way to have my data updated instead of appended. What do I have to do?

0 Karma

Lucas_K
Motivator

You need to craft your database query so that you're only exporting newer events.

You could do an initial dump to get historical data populated but after that use a more refined query.
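
If the export query itself can't be changed, a similar effect could be approximated by filtering the dump before Splunk picks it up. The following is only a sketch, not something from the original setup: the function name rows_updated_since is made up, and it assumes that XDateUpdated holds dates in DD.MM.YYYY form (the example event only shows XX.XX.XXXX, so this is a guess).

```python
import csv
import io
from datetime import datetime


def rows_updated_since(dump_text, since,
                       date_field="XDateUpdated",
                       date_format="%d.%m.%Y"):
    """Return only the rows whose date_field is later than `since`.

    Assumptions (not confirmed by the thread): the dump is
    semicolon-delimited with a header row, and the date column
    uses DD.MM.YYYY.
    """
    # skipinitialspace drops the blank after each ';' in the header,
    # so "; LastName" becomes the field name "LastName".
    reader = csv.DictReader(io.StringIO(dump_text),
                            delimiter=";", skipinitialspace=True)
    fresh = []
    for row in reader:
        updated = datetime.strptime(row[date_field].strip(), date_format)
        if updated > since:
            fresh.append(row)
    return fresh


# Shortened, made-up sample with only three of the columns.
sample = (
    "FirstName; LastName; XDateUpdated\n"
    "A;B;01.02.2012\n"
    "C;D;27.02.2012\n"
)
new_rows = rows_updated_since(sample, datetime(2012, 2, 15))
```

Only the rows returned here would then be written to the file that Splunk monitors, with the cutoff set to the time of the previous export.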

Looking at your example data, it doesn't seem to be a time series, so it is probably better to follow Ayn's suggestion and just use it as a lookup.

0 Karma

BastianSchlaak
New Member

The dump file has about 120k events in it. I do not want to edit the data with Splunk. But I think it must be possible for Splunk either to compare already imported events with new events and import only the new ones, or to delete old events before importing new ones.

0 Karma

Ayn
Legend

You're free to think whatever you want, but there are no mechanisms within Splunk to compare new data with existing data during indexing.

0 Karma

Ayn
Legend

You can't modify existing data in the index. Splunk isn't a general-purpose database where you can do something like that.

If you're just working with CSV data and it's not large volumes of data, you could use the CSVs as lookups and work with them directly that way using inputlookup, without going via a Splunk index.
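
One practical detail if you go the lookup route: as far as I know, CSV lookup files are expected to be comma-separated, while the dump in the question uses semicolons and has a space after each header separator. A minimal sketch of a conversion step follows; the function name to_lookup_csv is made up, and it assumes the dump fits in memory.

```python
import csv
import io


def to_lookup_csv(dump_text):
    """Convert a semicolon-delimited dump to comma-separated CSV text.

    Fields that contain a comma (for example an InternalName such as
    "Doe, Jane") are quoted by the csv writer so they stay in one column.
    """
    reader = csv.reader(io.StringIO(dump_text),
                        delimiter=";", skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Shortened, made-up sample with only three of the columns.
sample_dump = (
    "FirstName; LastName; InternalName\n"
    "A;B;Doe, Jane\n"
)
converted = to_lookup_csv(sample_dump)
```

Overwriting the lookup file with each fresh dump then replaces the old contents, which gives the "updated instead of appended" behaviour the question asks for, because nothing is written to an index.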

0 Karma