Ingesting Word document
Hello all,
My latest challenge is to ingest a Word doc into our environment. According to everything I have read so far, this should be straightforward, as Splunk can ingest 'any' file. I should point out that I am not concerned about the contents of the file (all of that needs to be obfuscated anyway). I only need to ingest the file to get its name, so it does not matter whether Splunk can read the Word formatting.
The file is created daily with a name in the format "My Word Doc ddmmyyyy hh mm.doc".
I am only interested in the "ddmmyyyy hh mm" part, to confirm that the file was created today.
I cannot get the .doc file to ingest at all, not even in an unformatted state. If I save the file as a ".txt" file, it is ingested. Unfortunately, using 'Save As' is not possible in production.
I have tried the 'whitelist=' option without any success.
Can anyone suggest a solution? Is there something in my installation that is stopping Word docs from being ingested? Has anyone else had a similar experience?
Thanks


Ingesting an entire Word package, possibly several MB, just to find out if a file exists seems wasteful to me.
As I suggested earlier, consider a script to test for the presence of the file and report to Splunk.
If this reply helps you, Karma would be appreciated.
The file is tiny. I will look at what other options are available. Thanks


Consider writing a Python script that tests for the presence of the file, and run it as a scripted input.
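As a rough illustration of that idea, here is a minimal sketch. It assumes the file name pattern described in the question; the directory `WATCH_DIR`, the function names, and the output field names are made up for the example and would need adjusting. A scripted input indexes whatever the script writes to stdout, so the script simply prints one key=value event per matching file.

```python
import re
from datetime import datetime
from pathlib import Path

# Hypothetical directory; replace with wherever the document is written.
WATCH_DIR = Path("/data/reports")

# Matches "My Word Doc ddmmyyyy hh mm.doc" and captures the timestamp part.
NAME_RE = re.compile(r"^My Word Doc (\d{8} \d{2} \d{2})\.doc$")


def parse_stamp(filename):
    """Return the datetime embedded in the file name, or None if it does not fit."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%d%m%Y %H %M")
    except ValueError:
        # Digits were present but do not form a valid date or time.
        return None


def build_events(filenames, now):
    """Build one key=value event line per file whose name matches the pattern."""
    events = []
    for name in sorted(filenames):
        stamp = parse_stamp(name)
        if stamp is None:
            continue
        created_today = str(stamp.date() == now.date()).lower()
        events.append(
            f'{now.isoformat(timespec="seconds")} file="{name}" '
            f'file_stamp="{stamp.isoformat()}" created_today={created_today}'
        )
    return events


def main():
    now = datetime.now()
    prefix = now.isoformat(timespec="seconds")
    if not WATCH_DIR.is_dir():
        print(f'{prefix} status="directory_not_found" path="{WATCH_DIR}"')
        return
    # Only file names are read; the document contents are never opened.
    names = [p.name for p in WATCH_DIR.iterdir() if p.is_file()]
    events = build_events(names, now)
    if not events:
        print(f'{prefix} status="no_matching_file" path="{WATCH_DIR}"')
    for event in events:
        print(event)


if __name__ == "__main__":
    main()
```

The script would be placed in an app's `bin` directory and referenced from a `[script://...]` stanza in inputs.conf with a suitable `interval`. A search can then alert when no event with `created_today=true` has arrived for the day.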
If this reply helps you, Karma would be appreciated.
Thanks for the reply. I am, however, still confused. There are a number of questions about how to ingest a .doc file with the correct format, e.g. https://community.splunk.com/t5/Archive/How-to-ingest-doc-format-file-into-splunk-with-correct-forma...
As I have stated, I am not concerned with the formatting inside the doc; only the filename matters.
