Splunk Enterprise

How to index an entire txt file each time?

mah
Builder

Hi,

My issue is that I have a txt file that needs to be indexed in its entirety each time it is modified (lines added or removed).

At the moment, only the new lines are being indexed.

An example of my txt file:

id,name,app,env,start,end   
1234,test,splunk_app,dev,29-12-2020 15:00,29-12-2020 16:00
5678,test2,splunk_app2,dev,29-12-2020 15:00,29-12-2020 16:00

My inputs.conf:

[monitor:///opt/splunk/etc/apps/<app>/bin/file.txt]
index = test
sourcetype = st
disabled = 0
crcSalt = <SOURCE>
initCrcLength = 2000

My props.conf:

[st]
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
FIELD_DELIMITER = ,
HEADER_FIELD_DELIMITER = ,
FIELD_QUOTE = "

Can you tell me how to do that?

Thanks!


to4kawa
Ultra Champion

https://github.com/splunk/corona_virus
This dashboard drops its CSV data into the lookups folder.
Your data seems light, and I don't think it needs to be indexed.

| inputlookup your_csv
| eval start=strptime(start,"%d-%m-%Y %H:%M"),end=strptime(end,"%d-%m-%Y %H:%M")
| eval duration=tostring(end-start,"duration")

You can get what you need at search time, like the example above.

mah
Builder

Hi,

Following the solution you suggest and the example in the link, here is what I can do:

1- Instead of putting my scripted input on the Heavy Forwarder, I can put it on the Search Head:

My inputs.conf on the SH would look like:

[script://./bin/my_script.py]
interval = *****
index = test
sourcetype = st
disabled = 0

[monitor:///opt/splunk/etc/apps/<app>/lookups/file.csv]
index = test
sourcetype = st
disabled = 0

My props.conf on the SH would look like:

[st]
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true

2- I will find my file.csv in the app's lookups folder

3- In my dashboard, I just run | inputlookup to retrieve my data

Can you confirm this for me, please?
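
For reference, here is a minimal, hypothetical sketch of what my_script.py could look like under this plan; the get_rows() helper, the field names, and the path handling are illustrative placeholders rather than my real script:

#!/usr/bin/env python
# Hypothetical sketch: rewrite file.csv in this app's lookups folder on every
# run so that | inputlookup always reflects the current state of the data.
import csv
import os

# .../etc/apps/<app>/bin/my_script.py -> .../etc/apps/<app>
APP_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
LOOKUP_PATH = os.path.join(APP_DIR, "lookups", "file.csv")

def get_rows():
    # Placeholder for wherever the data really comes from.
    return [
        {"id": "1234", "name": "test", "app": "splunk_app", "env": "dev",
         "start": "29-12-2020 15:00", "end": "29-12-2020 16:00"},
    ]

def main():
    os.makedirs(os.path.dirname(LOOKUP_PATH), exist_ok=True)
    with open(LOOKUP_PATH, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name", "app", "env", "start", "end"])
        writer.writeheader()
        # The whole file is rewritten on each run, so rows deleted upstream
        # disappear from the lookup as well.
        writer.writerows(get_rows())

if __name__ == "__main__":
    main()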

 


to4kawa
Ultra Champion
[monitor:///opt/splunk/etc/apps/<app>/lookups/file.csv]
index = test
sourcetype = st
disabled = 0

This is not necessary.

You don't have to put it in the index to search for it.

mah
Builder

I have just set it up and it works perfectly.

Indeed, the CSV data is not critical, so it does not need to be indexed; since I always find it up to date in the CSV, that is the main thing.

So in the end my inputs.conf on the SH looks like:

[script://./bin/my_script.py]
interval = *****
disabled = 0

Thanks a lot ! 


somesoni2
Revered Legend

When Splunk monitors a file, it creates a CRC checkpoint value from the first few characters of the file. This checkpoint is used to uniquely identify the file and avoid duplicate ingestion. If you want to ingest the whole file every time something changes, either the first few characters of the file must change (so that the CRC checkpoint changes), or you add the file name to the CRC (which you're doing here with `crcSalt = <SOURCE>`) and make sure the file name changes every time an update is made.
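
As a toy illustration of that idea (this is not Splunk's code, just a sketch of why appending rows leaves a checksum over the start of the file unchanged):

# Toy illustration only: a checksum computed over the first bytes of a file
# does not change when new lines are appended at the end.
import zlib

def head_crc(path, length=64):
    # Checksum only the first `length` bytes, similar in spirit to the
    # fixed-length check Splunk performs on the head of a monitored file.
    with open(path, "rb") as f:
        return zlib.crc32(f.read(length))

with open("file.txt", "w") as f:
    f.write("id,name,app,env,start,end\n")
    f.write("1234,test,splunk_app,dev,29-12-2020 15:00,29-12-2020 16:00\n")
before = head_crc("file.txt")

with open("file.txt", "a") as f:
    # Append a new row at the end, as the generating script does.
    f.write("5678,test2,splunk_app2,dev,29-12-2020 15:00,29-12-2020 16:00\n")
after = head_crc("file.txt")

# True: the head of the file is unchanged, so the file looks "already seen"
# and only the newly appended data gets read.
print(before == after)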

 

So you have two options:

1. Change the file name every time an update is made (and keep the current settings in inputs.conf).

2. Instead of file monitoring, set up a scripted input: a simple script that reads your CSV and prints its content to the console, and the scripted input sends that output to Splunk (see the sketch below). The only drawback is that the scripted input runs on a schedule and re-ingests the full file content on every run, so this option is most useful if you update the file at a regular interval.
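
For option 2, a rough, hypothetical sketch of such a script (the path simply mirrors the placeholder from your monitor stanza); Splunk indexes whatever a scripted input writes to stdout:

#!/usr/bin/env python
# Rough sketch of a scripted input: dump the whole file to stdout on every
# scheduled run, and Splunk ingests whatever the script prints.
import sys

# Same path as in the original monitor stanza; <app> is a placeholder.
FILE_PATH = "/opt/splunk/etc/apps/<app>/bin/file.txt"

with open(FILE_PATH) as f:
    sys.stdout.write(f.read())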


somesoni2
Revered Legend

How is the file generated? Do you have control over the name of the file being written?


mah
Builder

The file is generated by a script and always has the same name.
