Are you trying to use Splunk to search within your docx / PDF files, or simply to store them?
Either way, Splunk doesn't provide a default way to handle this.
You could use a script in combination with some kind of docx / PDF to text utility to load the textual content of your docx / PDF files into Splunk.
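As a rough illustration of the script approach, here is a minimal sketch for the docx half only. It relies on the fact that a .docx file is a zip archive whose main body lives in word/document.xml, so the Python standard library is enough. The function names and the output directory are hypothetical, and it assumes you have separately set up Splunk to monitor that directory. PDFs would need an external converter (pdftotext is one possibility) and are not covered here.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the text of a .docx file, one line per paragraph.

    Simplistic on purpose: headers, footers, and footnotes are ignored,
    and text nested inside text boxes may appear more than once.
    """
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return "\n".join(paragraphs)


def export_for_splunk(docx_path, out_dir):
    """Write the extracted text to out_dir/<name>.txt and return that path.

    out_dir is assumed to be a directory that Splunk is already monitoring.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / (Path(docx_path).stem + ".txt")
    out_path.write_text(docx_to_text(docx_path), encoding="utf-8")
    return out_path
```

You would call something like export_for_splunk("report.docx", "/some/monitored/dir") from a cron job or similar; both of those paths are placeholders.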
If you want to try indexing the files as-is, add something like this to the relevant stanza in your props.conf file:
NO_BINARY_CHECK = true
That should force your PDFs to be indexed even though they are binary. I suspect you will not like the results, but you can give it a try.