Splunk Enterprise

Field Extraction Issue After Hot to Warm Bucket Roll

jscraig2006
Communicator

Greetings, Splunkers. I have an unusual issue with a group of CSV files. When a file is ingested into Splunk, the fields are correct. After the bucket rolls from hot to warm, usually after about 7 days, some of the file content becomes fields. I have the sourcetype defined on the indexers:

[mysourcetype]
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
HEADER_FIELD_LINE_NUMBER = 1
HEADER_FIELD_DELIMITER =,
FIELD_NAMES = FileDate, Feild_1, Feild_2, Feild_3
PREAMBLE_REGEX = FileDate, Feild_1, Feild_2, Feild_3
INDEXED_EXTRACTIONS = csv
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = FileDate
TIME_FORMAT = %Y-%m-%d %H:%M:%S
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
pulldown_type = true

I know that rolling should only change the state of the bucket, but I am seeing a pattern during the roll process. We are on v9, on-prem.
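For reference, this is roughly the kind of check I am using to line the affected events up against bucket state (just a sketch; index and field names are as in the sourcetype above):

| dbinspect index=my_index
| table bucketId, state, startEpoch, endEpoch, modTime, eventCount
| sort startEpoch

and then seeing which buckets the bad events live in:

index=my_index sourcetype=mysourcetype
| eval bucket=_bkt
| stats count by bucket, FileDate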

Thanks in Advance

 

 


PrewinThomas
Motivator

@jscraig2006 

That's strange. As you mentioned, rolling a bucket from hot to warm should not change field extraction.

Can you keep your props simple, for example by removing PREAMBLE_REGEX? The settings below should be fine to start with:

[mysourcetype]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
FIELD_NAMES = FileDate,Field_1,Field_2,Field_3
TIMESTAMP_FIELDS = FileDate
TIME_FORMAT = %Y-%m-%d %H:%M:%S
SHOULD_LINEMERGE = false

Then validate and confirm the active sourcetype settings with btool:

$SPLUNK_HOME/bin/splunk btool props list mysourcetype --debug


Then test with a sample CSV, roll the bucket manually, and confirm the fields remain correct.
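If you want to force the roll rather than wait for it, something like this should work (sketch only, substitute your own index name and credentials):

$SPLUNK_HOME/bin/splunk _internal call /data/indexes/my_index/roll-hot-buckets -method POST -auth admin:changeme

Afterwards | dbinspect index=my_index should show the bucket as warm, and you can re-check the fields on the same events.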

Regards,
Prewin
🌟 If this answer helped you, please consider marking it as the solution or giving karma. Thanks!


livehybrid
SplunkTrust

Hi @jscraig2006 

This is very unusual and not something I have seen myself. I am wondering whether you have any custom tsidx settings in your indexes.conf definitions which could perhaps be causing an issue. Is everything 'standard' in your index definition?
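For example, something like this (assuming a Linux indexer; my_index is your index name) would show whether anything non-default is applied, in particular settings like enableTsidxReduction / timePeriodInSecBeforeTsidxReduction or tsidxWritingLevel:

$SPLUNK_HOME/bin/splunk btool indexes list my_index --debug | grep -iE "tsidx|journal"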

Probably unrelated, but I don't think you should need PREAMBLE_REGEX, since that is meant for content that appears before the header fields.
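For what it's worth, PREAMBLE_REGEX is meant for cases like this hypothetical file that has a banner line above the real header row:

# hypothetical stanza, only needed if the CSV has a "Report generated ..." banner line above its header
[csv_with_banner]
INDEXED_EXTRACTIONS = csv
PREAMBLE_REGEX = ^Report generated.*
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = FileDate

Since your header is already the first line of the file, you shouldn't need it at all.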

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing


jscraig2006
Communicator

Thanks! It is an unusual issue. When the file is ingested, the data is fine. Around day 7, some of the data values become headers and the header values become data in the fields. No custom tsidx settings:

[my_index]
# 180 Days
repFactor=auto
homePath = volume:primary/my_index/db
coldPath = volume:secondary/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
frozenTimePeriodInSecs = 15552000
maxDataSize = auto
maxHotBuckets = 5
maxWarmDBCount = 652
quarantinePastSecs = 15552000
quarantineFutureSecs = 2592000
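For what it's worth, this is the kind of check I can run to see whether the indexed copies of the fields are still intact after the roll (just a sketch; field name as in the sourcetype):

| tstats count where index=my_index AND sourcetype=mysourcetype by Feild_1

compared with the search-time view of the same events:

index=my_index sourcetype=mysourcetype | stats count by Feild_1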
