Getting Data In

Ingest entire XML file

nicofantinato
Path Finder

Hello everybody,

We are monitoring, via Universal Forwarder, several directories that each contain a large XML file (around 1000 lines). These files change every few seconds, and each change also updates the timestamp, which is written in the first 256 bytes of the file.

I need to ingest these files in their entirety at every change, but instead Splunk ingests them only once every few hours or even days. Do you have any suggestions on how I can fix this?

Here's the props.conf on my heavy forwarders (we have a distributed environment):
[xml_atm]
TRANSFORMS-routing=xmlatm-route
SHOULD_LINEMERGE=true
LINE_BREAKER=(?:restart)([\r\n]+)
CHARSET=ISO-8859-1
CHECK_METHOD = modtime
MAX_EVENTS=4000
TRUNCATE=0
disabled=false
TIME_PREFIX=restart-flag="
REPORT-xmlext=xml-extr

And this is the inputs.conf on the UF:
[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
index = my_index
sourcetype = xml_atm

Thanks in advance.

1 Solution

nicofantinato
Path Finder

Hi, it turned out we also needed to add the directive crcSalt = <SOURCE> to inputs.conf on the UFs. After adding this, everything worked as expected.

inputs.conf became simply:

[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
crcSalt = <SOURCE>
index = my_index
sourcetype = xml_atm

ff9231
Loves-to-Learn

I have the exact same config as you; the only difference is that I also want to set "source". If I define a source value, the host name goes back to the default server name. In this case none of host/host_regex/host_segment works when "source" is defined.

Do you have any suggestions for what I can try? I am also configuring this on a UF.

Sample:

[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
crcSalt = <SOURCE>
index = my_index
sourcetype = xml_atm
source = B_wks

nicofantinato
Path Finder

Hi, do you have any suggestions? I'm still unable to ingest the entire XML files every time their modification time changes.

richgalloway
SplunkTrust

Please share the relevant inputs.conf stanza from the UFs.

---
If this reply helps you, Karma would be appreciated.

nicofantinato
Path Finder

Here is the inputs.conf on the UF:
[monitor://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
disabled = 0
host_segment = 5
index = my_index
sourcetype = xml_atm

richgalloway
SplunkTrust

Splunk tries its best to avoid re-indexing entire files that are ingested via a monitor stanza. I'm not aware of any setting to override that behavior.

Consider using a batch input instead. Splunk will read the entire file but will delete it afterward, so your application must be prepared to re-create the file. This also runs the risk of a race condition between Splunk and your app. Can the application be configured to write a new file instead of overwriting existing files?
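
For reference, a batch stanza on the UF might look roughly like the sketch below. It simply reuses the path, host_segment, index, and sourcetype from the monitor stanza posted in this thread, which is an assumption about your setup. The move_policy = sinkhole line is what tells Splunk to delete each file once it has been read, so try it against copies of the files first.

```
# inputs.conf on the UF: a sketch only, values copied from the monitor stanza in this thread
[batch://D:\ABC\Monitor\Monitor\Inputs\*\*.xml]
# sinkhole = read the whole file, then delete it (required for batch inputs)
move_policy = sinkhole
disabled = 0
host_segment = 5
index = my_index
sourcetype = xml_atm
```

Do not keep a monitor stanza pointing at the same files alongside this one.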

---
If this reply helps you, Karma would be appreciated.