cannot index a directory anymore

wsw70
Communicator

Hello,

I use Splunk to index various sources, including files dropped into a directory and sent to a given index.
All of a sudden my files are no longer being indexed.

-- UPDATE --

The troubleshooting test described below (as INITIAL TROUBLESHOOTING) finally worked. I do not know why indexing took so long (about an hour; it usually takes minutes).

This does not solve the initial problem though: I wanted to reindex data over a certain period. I ran

index=myindex | delete

over the period I wanted to reindex (90 days ago to now). This got rid of the data (at least as far as search is concerned).

I reloaded the files into the tracked directory but the data did not reappear. I thought the cause might be that the source filenames were the same, so I renamed them (prefixing with a 0.). Same thing: the new data does not reappear.

So the problem now is not that indexing files in a directory does not work (good thing), but that I do not know how to force reindexing of these new files (new = different filename; the contents still match data that was indexed previously but deleted as described above).

Thanks for the help

-- INITIAL TROUBLESHOOTING --

(this part now works, please see above)

In order to investigate I created a brand new index and a brand new directory to host the files I want to drop. I took a few files which used to be indexed correctly. They are full of lines like

Wed Aug 28 07:25:18 2013 N_hostip="10.103.43.253" N_netbios="UNKNOWN" N_dnsname="UNKNOWN" N_os="Linux Kernel 2.6.18-92cpx86_64 (x86_64)" N_pluginName="SSL Self-Signed Certificate" N_group="SSL" N_pluginID="57582" N_severity="2" N_risk="Medium" N_cvss="6.4" N_patch="UNKNOWN" N_dnt="0" N_subnetname="MHX" N_scanname="RECURRENT-Scheduled-003" N_vendor="ssl" N_product="UNKNOWN"

and dropped them into that directory.

They are not visible in Splunk:

  • The index in Manager is seen as empty (no events, 1 MB size)
  • The directory I use, as seen in Manager (Data Inputs) shows 49 files, which is correct
  • There is ample disk space on the machine
  • Splunk otherwise works as it should (I can search, etc.)
  • I searched for "All time" -- even though the events are max 90 days old
  • I even restarted Splunk for good measure
  • The license is OK (50 MB out of 1 GB today, the files are a few MB)

I would appreciate any help on what to test now to get this data in before I open a ticket (I hope I missed something obvious, but I have no idea where).

Thanks!

MuS
Legend

Hi wsw70,

so what changed all of a sudden?

I mean like:

  • Permission changes?
  • any Software update?
  • are you searching the right index?
  • did you check index=_internal for any information about your drop directory or any file inside this directory?

hope this helps to get you started with your troubleshooting.

cheers, MuS
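
The last check in the list above can also be done against splunkd.log directly, which is one of the files that feeds index=_internal. The following is a minimal sketch, not something from the thread: the function name `monitor_messages`, the default `/opt/splunk` install path and the example directory are placeholders, and the component names it filters on (TailingProcessor, WatchedFile, TailReader) may differ between Splunk versions.

```shell
#!/usr/bin/env bash
# Sketch: look for file-monitor messages about a watched directory in splunkd.log.
# SPLUNK_HOME and the example directory are placeholders, not values from the thread.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"

# Print log lines written by the file-monitoring components that mention a path.
#   $1 = path fragment to look for (for example the drop directory)
#   $2 = log file to scan (defaults to splunkd.log under SPLUNK_HOME)
monitor_messages() {
  local needle="$1"
  local log="${2:-$SPLUNK_HOME/var/log/splunk/splunkd.log}"
  # First keep only lines from the monitor components, then match the path literally.
  grep -E 'TailingProcessor|WatchedFile|TailReader' "$log" | grep -F -- "$needle"
}

# Example call (hypothetical drop directory):
#   monitor_messages /data/dropdir
```

No output at all suggests the monitor has not logged anything about that directory in the current log file, which is itself a useful data point.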

MuS
Legend

| delete does not actually remove events: they are no longer searchable but are still in the index, and Splunk still has the files recorded as already read. That is why your files do not get reindexed. You have to clean the fishbucket to reindex the files.
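
As a hedged sketch of what cleaning the fishbucket can look like for individual files: the monitor input recognises a file by a checksum of the beginning of its content rather than by its name, which would explain why renaming the files changed nothing. The script below resets that record per file with `splunk cmd btprobe`. The helper names `run` and `reset_files`, the `/opt/splunk` default and the example file names are assumptions for illustration, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: reset the fishbucket record of individual files so the monitor re-reads them.
# Paths are placeholders. With DRY_RUN=1 (the default) commands are printed, not run.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
FISHBUCKET_DB="$SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Stop Splunk, reset the record of each file given as argument, start Splunk again.
reset_files() {
  local f
  run "$SPLUNK_HOME/bin/splunk" stop
  for f in "$@"; do
    run "$SPLUNK_HOME/bin/splunk" cmd btprobe -d "$FISHBUCKET_DB" --file "$f" --reset
  done
  run "$SPLUNK_HOME/bin/splunk" start
}

# Example (hypothetical file names):
#   reset_files /data/dropdir/scan1.log /data/dropdir/scan2.log
# To make Splunk forget every monitored file instead, with Splunk stopped:
#   splunk clean eventdata -index _thefishbucket
```

Two caveats apply. Re-read files count against the daily license volume again, and cleaning the whole fishbucket causes every monitored file on the instance to be re-read, not just the ones in this directory. Events hidden with | delete also keep occupying disk space until the buckets age out or the index is cleaned.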

MuS
Legend

glad I could help and thanks for accepting the answer 🙂

wsw70
Communicator

The fishbucket comment looks like the true solution (I did not know about the real effects of the "delete" function).

wsw70
Communicator

Thanks for the note -- please see my update, as the problem has shifted a bit. To answer your questions: no changes in permissions / software, I am checking the right index (triple-checked that :)) and index=_internal does not show anything in particular related to this index / these files.
