Ingest entire XML file

nicofantinato
Path Finder

Hello everybody,

We use Universal Forwarders to monitor several directories, each containing a large XML file (around 1000 lines). These files change every few seconds, and each change also updates the timestamp, which is written in the first 256 bytes of the file.

I need to ingest each file in full at every change. Instead, Splunk ingests these files only once every few hours or even days. Do you have any suggestions on how I can fix this?

Here's the props.conf on my heavy forwarders (we have a distributed environment):
[xml_atm]
TRANSFORMS-routing=xmlatm-route
SHOULD_LINEMERGE=true
LINE_BREAKER=(?:restart)([\r\n]+)
CHARSET=ISO-8859-1
CHECK_METHOD = modtime
MAX_EVENTS=4000
TRUNCATE=0
disabled=false
TIME_PREFIX=restart-flag="
REPORT-xmlext=xml-extr

And this is the inputs.conf on the UF:
[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
index = my_index
sourcetype = xml_atm

Thanks in advance.


nicofantinato
Path Finder

Hi, it turned out we also needed to add the setting crcSalt = <SOURCE> to inputs.conf on the UFs. After adding it, everything worked as expected.

inputs.conf simply became:

[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
crcSalt = <SOURCE>
index = my_index
sourcetype = xml_atm
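
A side note on the props.conf from the question, offered as a hedged sketch and not as part of the poster's confirmed fix: according to the props.conf specification, CHECK_METHOD is only valid in [source::<source>] stanzas and applies at input time, so setting it under [xml_atm] on a heavy forwarder probably has no effect. A variant that should take effect would sit in props.conf on the UF, roughly as follows. The stanza pattern is a placeholder, not a tested value, and must be replaced with a pattern that matches the monitored files.

```ini
# props.conf on the Universal Forwarder (assumption: this file is not shown in the thread)
# Replace the placeholder with a source pattern matching the monitored XML files.
[source::<pattern matching the monitored XML files>]
# Re-read the whole file whenever its modification time changes.
CHECK_METHOD = modtime
```

With this in place, any value other than the default endpoint_md5 causes the entire file to be indexed again on each detected change, which is the behavior the question asks for.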

ff9231
Loves-to-Learn

I have the exact same config as you. The only difference is that I also want to set "source". If I define a source value, the host name falls back to the default server name; none of host, host_regex, or host_segment has any effect once "source" is defined.

Do you have any suggestions for what I can try? I am also configuring this on a UF.

Sample:

[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
crcSalt = <SOURCE>
index = my_index
sourcetype = xml_atm

source = B_wks


nicofantinato
Path Finder

Hi, do you have any suggestion? I'm still unable to ingest entire XML files every time their modification time changes.


richgalloway
SplunkTrust

Please share the relevant inputs.conf stanza from the UFs.

---
If this reply helps you, Karma would be appreciated.

nicofantinato
Path Finder

Here is the inputs.conf on the UF:
[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
index = my_index
sourcetype = xml_atm


richgalloway
SplunkTrust

Splunk tries its best to avoid re-indexing entire files that are ingested via a monitor stanza. I'm not aware of any setting to override that behavior.

Consider using a batch input instead. Splunk will read the entire file, but will delete it afterward. That means your application must be prepared to re-create the file. It also runs the risk of a race condition between Splunk and your app. Can the application be configured to write a new file instead of overwriting existing files?
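
As a rough sketch of the batch alternative, assuming the same path, index, and sourcetype as the monitor stanza posted in this thread: a batch input must set move_policy = sinkhole, and that setting is what makes Splunk delete each file once it has been read.

```ini
# inputs.conf on the Universal Forwarder (illustrative; values copied from the monitor stanza)
[batch://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
# Required for batch inputs: the file is deleted after it has been indexed.
move_policy = sinkhole
host_segment = 5
index = my_index
sourcetype = xml_atm
```

Because the files are deleted after ingestion, this should replace the monitor stanza for that path, not run alongside it, and only makes sense if the application can safely re-create the files.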
