Getting Data In

follow tail returning jumbled mess

carmackd
Communicator

I’m currently getting a new log source ready for production, and I almost have it working except for one issue. I’m forwarding email logs, to which the email application appends each entry. I’m using the followTail directive, which works, but the appended data arrives at the indexer jumbled and garbled, while the original data in the file is fine. Below is the inputs.conf file from the indexer and a screenshot of what I’m seeing. Please help. Thanks!

[default]
host = TEST_SERVER

[monitor:///home/dcarmack/myLogs2]
disabled = false
host = TEST
sourcetype = TEST
index = default
followTail = 1

[Screenshot of the jumbled events as seen on the indexer]


Vishal_Patel
Splunk Employee

"the source type equals the file path" ... did you mean sourcetype or source equals file path?


carmackd
Communicator

Yes, sorry for the confusion: source=path.


Stephen_Sorkin
Splunk Employee

Output like this means that the data isn't valid UTF-8 when it arrives at the indexer.

I find it very odd that "source" is not properly set for this data. When forwarding and receiving, we typically expect source, sourcetype and host to be properly set by the forwarder. What does this directory structure look like?

As a side note, followTail is rarely a desired setting. Splunk will automatically start reading where it left off in a file. This setting is used to tell Splunk to reset this point to the end of the file, not where we last read up to.

This setting could possibly be related if there's a bad interaction with archived files (that don't look like text) or files with a character set that requires some long history to decode (this doesn't seem to be the case here).
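As a rough illustration of the side note above, a monitor stanza that relies on Splunk's default resume behavior would simply leave followTail out. The path, host, sourcetype, and index below are copied from the original post; this is a sketch, not a tested configuration.

```ini
# inputs.conf (sketch; values copied from the original post)
[monitor:///home/dcarmack/myLogs2]
disabled = false
host = TEST
sourcetype = TEST
index = default
# followTail is omitted, so Splunk resumes from where it last stopped reading
```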


carmackd
Communicator

The data is XML and comes from an email security appliance. Each entry has a common header. Yes, the files are archived using gzip.


Stephen_Sorkin
Splunk Employee

It will show as tcp:5000 either if it's raw TCP in or if the forwarder isn't properly applying the source at input time. I don't suspect that it's raw TCP. I'm more curious about the file reading code. Are these archive files?


gkanapathy
Splunk Employee

Because your data is showing tcp:5000, it looks to me like it's being sent to and received on a plain TCP port, number 5000. I'm not sure where that would come from. Perhaps you have some rogue conf files around.

The followTail behavior may be an artifact of how your files are being written. Perhaps they are being modified near the top of the file when they are appended to?


Stephen_Sorkin
Splunk Employee

I'm more curious about the files within the directory /home/dcarmack/myLogs2. I'm also wondering why we're reindexing the files, as that should not happen. Do they share a common header? What does the data inside the files look like?


carmackd
Communicator

When I don't use followTail, the entire file gets re-indexed. One other thing I should mention: when the original file is indexed, the source type equals the file path; when the data that's appended to the file gets indexed, the source equals tcp:5000. As for the directory structure, the forwarder is sitting in /home/dcarmack and the log files are located at /home/dcarmack/myLogs2.


gkanapathy
Splunk Employee

followTail has little to do with this. It seems to me that you are sending data to a standard TCP port, not a Splunk TCP port. Is that your intention? If you're using a Splunk forwarder, you should not do that. Standard TCP ports are for raw TCP log streams.
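To illustrate the distinction, here is a hedged sketch of the two kinds of listener as they might appear in inputs.conf on the receiving side. Port 5000 is taken from the tcp:5000 source seen in this thread; whether either stanza matches the poster's actual configuration is an assumption.

```ini
# Raw TCP input: accepts plain TCP log streams.
# Events received this way show up with source = tcp:5000.
[tcp://5000]

# Splunk-to-Splunk input: accepts data sent by a Splunk forwarder,
# which carries its own source, sourcetype, and host.
[splunktcp://5000]
```

Only one of these would be configured for a given port; they are shown together purely for comparison.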


carmackd
Communicator

No, I'm using the [splunktcp:] stanza on my indexer.
