Splunk Enterprise

Field Extraction Issue After Hot to Warm Bucket Roll

jscraig2006
Communicator

Greetings, Splunkers. I have an unusual issue with a group of CSV files. When a file is ingested into Splunk, the fields are correct. After the bucket rolls from hot to warm, usually after about 7 days, some of the file content becomes fields. I have the sourcetype defined on the indexers:

[mysourcetype]
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
HEADER_FIELD_LINE_NUMBER = 1
HEADER_FIELD_DELIMITER =,
FIELD_NAMES = FileDate, Feild_1, Feild_2, Feild_3
PREAMBLE_REGEX = FileDate, Feild_1, Feild_2, Feild_3
INDEXED_EXTRACTIONS = csv
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = FileDate
TIME_FORMAT = %Y-%m-%d %H:%M:%S
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
pulldown_type = true

I know that rolling should only change the state of the bucket, but I am seeing a pattern during the roll process. We are on v9, on-prem.
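For reference, this is roughly the kind of check I am using to line the affected events up against bucket state (just a sketch; index and field names are as in the sourcetype above):

| dbinspect index=my_index
| table bucketId, state, startEpoch, endEpoch, modTime, eventCount
| sort startEpoch

and then seeing which buckets the bad events live in:

index=my_index sourcetype=mysourcetype
| eval bucket=_bkt
| stats count by bucket, FileDate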

Thanks in Advance

 

 


PrewinThomas
Motivator

@jscraig2006 

That's strange. As you mentioned, rolling a bucket from hot to warm should not change field extraction.

Can you keep your props simple, for example by removing PREAMBLE_REGEX? The settings below should be fine to start with:

[mysourcetype]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
FIELD_NAMES = FileDate,Field_1,Field_2,Field_3
TIMESTAMP_FIELDS = FileDate
TIME_FORMAT = %Y-%m-%d %H:%M:%S
SHOULD_LINEMERGE = false

Then validate and confirm the active sourcetype settings with btool:

$SPLUNK_HOME/bin/splunk btool props list mysourcetype --debug


Then test with a sample CSV, roll the bucket manually, and confirm the fields remain correct.
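If you want to force the roll rather than wait for it, something like this should work (sketch only, substitute your own index name and credentials):

$SPLUNK_HOME/bin/splunk _internal call /data/indexes/my_index/roll-hot-buckets -method POST -auth admin:changeme

Afterwards | dbinspect index=my_index should show the bucket as warm, and you can re-check the fields on the same events.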

Regards,
Prewin
🌟 If this answer helped you, please consider marking it as the solution or giving karma. Thanks!


livehybrid
SplunkTrust

Hi @jscraig2006 

This is very unusual and not something I have seen myself. I am wondering whether you have any custom tsidx settings in your indexes.conf definitions which could perhaps be causing an issue. Is everything 'standard' in your index definition?
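For example, something like this (assuming a Linux indexer; my_index is your index name) would show whether anything non-default is applied, in particular settings like enableTsidxReduction / timePeriodInSecBeforeTsidxReduction or tsidxWritingLevel:

$SPLUNK_HOME/bin/splunk btool indexes list my_index --debug | grep -iE "tsidx|journal"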

Probably unrelated, but I don't think you should need PREAMBLE_REGEX, since that is meant for content that appears before the header fields.
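For what it's worth, PREAMBLE_REGEX is meant for cases like this hypothetical file that has a banner line above the real header row:

# hypothetical stanza, only needed if the CSV has a "Report generated ..." banner line above its header
[csv_with_banner]
INDEXED_EXTRACTIONS = csv
PREAMBLE_REGEX = ^Report generated.*
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = FileDate

Since your header is already the first line of the file, you shouldn't need it at all.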

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing


jscraig2006
Communicator

Thanks! It is an unusual issue. When the file is ingested, the data is fine. Around day 7, some of the data values become headers and the header values become data in the fields. No custom tsidx settings:

[my_index]
# 180 Days
repFactor=auto
homePath = volume:primary/my_index/db
coldPath = volume:secondary/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
frozenTimePeriodInSecs = 15552000
maxDataSize = auto
maxHotBuckets = 5
maxWarmDBCount = 652
quarantinePastSecs = 15552000
quarantineFutureSecs = 2592000
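For what it's worth, this is the kind of check I can run to see whether the indexed copies of the fields are still intact after the roll (just a sketch; field name as in the sourcetype):

| tstats count where index=my_index AND sourcetype=mysourcetype by Feild_1

compared with the search-time view of the same events:

index=my_index sourcetype=mysourcetype | stats count by Feild_1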
