Splunk Enterprise

How to index an entire txt file each time?

mah
Builder

Hi,

My issue is that I have a txt file that needs to be indexed in its entirety each time it is modified (lines added or removed).

At the moment, only the new lines are being indexed.

An example of my txt file:

id,name,app,env,start,end   
1234,test,splunk_app,dev,29-12-2020 15:00,29-12-2020 16:00
5678,test2,splunk_app2,dev,29-12-2020 15:00,29-12-2020 16:00

My inputs.conf:

[monitor:///opt/splunk/etc/apps/<app>/bin/file.txt]
index = test
sourcetype = st
disabled = 0
crcSalt = <SOURCE>
initCrcLength = 2000

My props.conf:

[st]
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
FIELD_DELIMITER = ,
HEADER_FIELD_DELIMITER = ,
FIELD_QUOTE = "

Can you tell me how to do that?

Thanks!


to4kawa
Ultra Champion

https://github.com/splunk/corona_virus
This dashboard drops its CSV data into the lookups folder.
Your data seems light, and I don't think it needs to be indexed.

| inputlookup your_csv
| eval start=strptime(start,"%d-%m-%Y %H:%M"),end=strptime(end,"%d-%m-%Y %H:%M")
| eval duration=tostring(end-start,"duration")

You can get what you need at search time, like the example above.

mah
Builder

Hi,

Following the solution you suggest and the example in the link, here is what I can do:

1- Instead of putting my scripted input on the Heavy Forwarder, I can put it on the Search Head:

My inputs.conf on the SH would look like:

[script://./bin/my_script.py]
interval = *****
index = test
sourcetype = st
disabled = 0

[monitor:///opt/splunk/etc/apps/<app>/lookups/file.csv]
index = test
sourcetype = st
disabled = 0

My props.conf on the SH would look like:

[st]
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true

2- I will find my file.csv in the app's lookups folder

3- In my dashboard, I just run | inputlookup to retrieve my data

Can you confirm this for me, please?
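
For reference, here is a minimal, hypothetical sketch of what my_script.py could look like under this plan; the get_rows() helper, the field names, and the path handling are illustrative placeholders rather than my real script:

#!/usr/bin/env python
# Hypothetical sketch: rewrite file.csv in this app's lookups folder on every
# run so that | inputlookup always reflects the current state of the data.
import csv
import os

# .../etc/apps/<app>/bin/my_script.py -> .../etc/apps/<app>
APP_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
LOOKUP_PATH = os.path.join(APP_DIR, "lookups", "file.csv")

def get_rows():
    # Placeholder for wherever the data really comes from.
    return [
        {"id": "1234", "name": "test", "app": "splunk_app", "env": "dev",
         "start": "29-12-2020 15:00", "end": "29-12-2020 16:00"},
    ]

def main():
    os.makedirs(os.path.dirname(LOOKUP_PATH), exist_ok=True)
    with open(LOOKUP_PATH, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name", "app", "env", "start", "end"])
        writer.writeheader()
        # The whole file is rewritten on each run, so rows deleted upstream
        # disappear from the lookup as well.
        writer.writerows(get_rows())

if __name__ == "__main__":
    main()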

 


to4kawa
Ultra Champion
[monitor:///opt/splunk/etc/apps/<app>/lookups/file.csv]
index = test
sourcetype = st
disabled = 0

This is not necessary.

You don't have to put it in the index to search for it.

mah
Builder

I have just set it up and it works perfectly.

Indeed, the CSV data is not critical, so it does not need to be indexed; since I always find it up to date in the CSV, that is the main thing.

So in the end my inputs.conf on the SH looks like:

[script://./bin/my_script.py]
interval = *****
disabled = 0

Thanks a lot ! 


somesoni2
Revered Legend

When Splunk monitors a file, it creates a CRC checkpoint value from the first few characters of the file. This checkpoint is used to uniquely identify the file and avoid duplicate ingestion. If you want to ingest the whole file every time something changes, either the first few characters of the file must change (so that the CRC checkpoint changes), or you add the file name to the CRC (which you're doing here with `crcSalt = <SOURCE>`) and make sure the file name changes every time an update is made.
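
As a toy illustration of that idea (this is not Splunk's code, just a sketch of why appending rows leaves a checksum over the start of the file unchanged):

# Toy illustration only: a checksum computed over the first bytes of a file
# does not change when new lines are appended at the end.
import zlib

def head_crc(path, length=64):
    # Checksum only the first `length` bytes, similar in spirit to the
    # fixed-length check Splunk performs on the head of a monitored file.
    with open(path, "rb") as f:
        return zlib.crc32(f.read(length))

with open("file.txt", "w") as f:
    f.write("id,name,app,env,start,end\n")
    f.write("1234,test,splunk_app,dev,29-12-2020 15:00,29-12-2020 16:00\n")
before = head_crc("file.txt")

with open("file.txt", "a") as f:
    # Append a new row at the end, as the generating script does.
    f.write("5678,test2,splunk_app2,dev,29-12-2020 15:00,29-12-2020 16:00\n")
after = head_crc("file.txt")

# True: the head of the file is unchanged, so the file looks "already seen"
# and only the newly appended data gets read.
print(before == after)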

 

So you have two options:

1. Change the file name every time an update is made (and keep the current settings in inputs.conf).

2. Instead of file monitoring, set up a scripted input: a simple script that reads your CSV and prints its content to the console, and the scripted input sends that output to Splunk (see the sketch below). The only drawback is that the scripted input runs on a schedule and re-ingests the full file content on every run, so this option is most useful if you update the file at a regular interval.
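
For option 2, a rough, hypothetical sketch of such a script (the path simply mirrors the placeholder from your monitor stanza); Splunk indexes whatever a scripted input writes to stdout:

#!/usr/bin/env python
# Rough sketch of a scripted input: dump the whole file to stdout on every
# scheduled run, and Splunk ingests whatever the script prints.
import sys

# Same path as in the original monitor stanza; <app> is a placeholder.
FILE_PATH = "/opt/splunk/etc/apps/<app>/bin/file.txt"

with open(FILE_PATH) as f:
    sys.stdout.write(f.read())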


somesoni2
Revered Legend

How is the file generated? Do you have control over the name of the file being written?


mah
Builder

The file is generated by a script and always has the same name.
