Monitor log files with spaces in the file name

nmohammed
Contributor

Hi,

I am having issues with the Splunk universal forwarder monitoring log files that have spaces in their names. The file is a regular text file, not binary, but the forwarder treats it as binary.

Example log file name:

"NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt"

12-20-2017 22:55:47.122 -0800 WARN FileClassifierManager - The file '/esc_logs/OLD/NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt' is invalid. Reason: binary
12-20-2017 22:55:47.123 -0800 INFO TailReader - Ignoring file '/esc_logs/NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt' due to: binary

inputs.conf
[monitor:///eds_logs/]
disabled = false
whitelist = .txt$
index = esd_prod
sourcetype = esd:trace
host_regex = (EP\d\w+)
crcSalt =

Appreciate any guidance on this problem.
Thanks

0 Karma
1 Solution

nmohammed
Contributor

Thanks, harsmarvania57! That fixed it. I am able to see the data correctly now.

And thank you, everyone, for contributing so quickly to the solution.

As suggested, I created props.conf on the forwarder with the following and restarted the forwarder, which resolved the issue.

[esc:trace]
CHARSET = UTF-16LE

0 Karma

mayurr98
Super Champion

Hey, can you give us a sample name of the log files you want to monitor, and also of the log files you do not want to monitor?
You can also make use of:

blacklist = <regular expression>
* If set, files from this input are NOT monitored if their path matches the
  specified regex.
* Takes precedence over the deprecated _blacklist setting, which functions
  the same way.
* If a file matches the regexes in both the blacklist and whitelist settings,
  the file is NOT monitored. Blacklists take precedence over whitelists.
0 Karma

nmohammed
Contributor

NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt is an example of a log file that I want to monitor. All the files in the directory are of the same type; only the number after BSDE in the log file name changes.

The date stamp in the log file name changes too, of course.

0 Karma

mayurr98
Super Champion

Try this!

whitelist = NMDox_PRD\.EP6XWBSDE\d{5}\sStarted\s\d{4}-\d{2}-\d{2}\.txt$
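
A pattern like this can be sanity-checked against a sample path before restarting the forwarder. The following is a minimal sketch that uses Python's re module as a stand-in for Splunk's regex engine (an assumption; the two should agree on the simple constructs used here). The helper name is_whitelisted is hypothetical, and the sample path is the one from the log message in the question. Note that every digit class needs its backslash (\d{2}, not d{2}) and that the dot before txt is escaped.

```python
import re

# Whitelist-style pattern for the file names described in this thread.
# Python's re module stands in for Splunk's regex engine (assumption).
WHITELIST = re.compile(
    r"NMDox_PRD\.EP6XWBSDE\d{5}\sStarted\s\d{4}-\d{2}-\d{2}\.txt$"
)


def is_whitelisted(path: str) -> bool:
    """Return True if the pattern matches anywhere in the full path."""
    return WHITELIST.search(path) is not None


sample = "/esc_logs/NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt"
print(is_whitelisted(sample))                           # True
print(is_whitelisted(sample.replace(".txt", ".log")))   # False
```

The space in the file name is matched by \s, so the space itself does not stop the file from being picked up.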
0 Karma

nikita_p
Contributor

Hi @nmohammed,
Please check that you are monitoring the correct file path.
If your file is not a rolling file, provide crcSalt=, and if it is rolling, which I assume is your case, provide crcSalt = abcd (any random letters).
Also try disabling and re-enabling the input after the changes and check whether it works.

0 Karma

deepashri_123
Motivator

Hi nmohammed,

Can you confirm the path in the monitor stanza?
The inputs.conf shows eds_logs, whereas the error shows esc_logs.

0 Karma

nmohammed
Contributor

Thanks @deepashri_123

file NMDox_PRD.EP6XWBSDE26931\ Started\ 2017-09-14.txt

NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt: Little-endian UTF-16 Unicode English text, with very long lines, with CRLF line terminators

And the path is "esc_logs"; I just confirmed it in inputs.conf.

0 Karma

harsmarvania57
SplunkTrust

Hi @nmohammed,

This problem can occur when there are garbage characters in your txt file. Can you please check the file type with the command file "NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt" (quoted, because the name contains spaces) and let us know the output?

0 Karma

nmohammed
Contributor

Hi @harsmarvania57,

file NMDox_PRD.EP6XWBSDE26931\ Started\ 2017-09-14.txt

NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt: Little-endian UTF-16 Unicode English text, with very long lines, with CRLF line terminators
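
When the file command is not available, the encoding can often be guessed from the first few bytes of the file. The following is a minimal sketch that checks for a byte order mark (BOM); the function names sniff_bom and sniff_file are hypothetical, and the returned labels are illustrative. Only UTF-16LE is confirmed in this thread as a working CHARSET value.

```python
import codecs
from typing import Optional

# UTF-32 is checked first because the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_bom(head: bytes) -> Optional[str]:
    """Return an encoding label if the bytes start with a known BOM."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read the first four bytes of a file and check them for a BOM."""
    with open(path, "rb") as fh:
        return sniff_bom(fh.read(4))


print(sniff_bom(b"\xff\xfeD\x00a\x00s\x00"))  # UTF-16LE
print(sniff_bom(b"plain ASCII text"))         # None
```

A file without a BOM returns None here, so this check can only confirm an encoding, not rule one out.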

0 Karma

Elsurion
Communicator

The file starts with binary data. I made a UTF-16 file myself, and it starts with 2 binary bytes (the byte order mark)...

me@myserver ✓  08:59 $ file bla-utf16.log
bla-utf16.log: Little-endian UTF-16 Unicode text, with no line terminators
[~]
me@myserver ✓  08:59 $ cat bla-utf16.log
▒▒Das ist ein Test

Have you tried adding CHARSET to your props.conf?
http://docs.splunk.com/Documentation/Splunk/7.0.1/Data/Configurecharactersetencoding
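
To see why a UTF-16 text file can look binary to a byte-level check, it helps to inspect the raw bytes. The following is a minimal sketch; the helper name utf16le_with_bom is hypothetical, and Splunk's actual binary-detection heuristic is not described in this thread, so this only illustrates the kind of bytes such a check would encounter.

```python
import codecs


def utf16le_with_bom(text: str) -> bytes:
    """Encode text as little-endian UTF-16 with a leading byte order mark."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


data = utf16le_with_bom("Das ist ein Test")

print(data[:2].hex())                # fffe (the 2-byte BOM)
print(data.count(b"\x00"))           # 16 (one NUL byte per ASCII character)
print(data[2:].decode("utf-16-le"))  # Das ist ein Test
```

Every ASCII character is followed by a NUL byte in this encoding, so the content only reads as text once it is decoded with the right character set.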

0 Karma

nmohammed
Contributor

Thanks.

I have created props.conf on the universal forwarder:

[esd:trace]
CHARSET = AUTO

But when I search the data, it is shown in binary format.

0 Karma

harsmarvania57
SplunkTrust

Can you please try with CHARSET = UTF-16LE?

0 Karma

harsmarvania57
SplunkTrust

It looks like binary. Can you please try to read the log file with less, i.e. less "NMDox_PRD.EP6XWBSDE26931 Started 2017-09-14.txt"? If the file is binary, less will warn you that it may be a binary file and ask whether to view it anyway. Answer yes, then try to find the special/garbage characters in the file.

0 Karma

nmohammed
Contributor

Those log files are written to a CIFS share by a .NET application running on Windows. I have mounted the CIFS share on a Linux server, because I had issues with extreme slowness and lag when monitoring the logs directly from the CIFS share with a universal forwarder running on a Windows server.

There are now hundreds of such log files, which are written continuously and need to be monitored.

0 Karma

nickhills
Ultra Champion

Why not run the UF on the Windows server that runs the app? Maybe that would avoid mounting the share in the first place (but maybe not).

If my comment helps, please give it a thumbs up!
0 Karma

harsmarvania57
SplunkTrust

So you are not facing slowness issues on the Linux server? As mentioned by @Elsurion, you can try setting CHARSET, or, if you want to read the binary file anyway, you can set NO_BINARY_CHECK = true in props.conf:

NO_BINARY_CHECK = [true|false]
* When set to true, Splunk processes binary files.
* Can only be used on the basis of [<sourcetype>], or [source::<source>],
  not [host::<host>].
* Defaults to false (binary files are ignored).
* This setting applies at input time, when data is first read by Splunk.
  The setting is used on a Splunk system that has configured inputs
  acquiring the data.
