Getting Data In

Why is the same data missing in some of our monitored events? How can I find the cause of this problem?

New Member
  1. Only about 1 in 1,000 similar events has missing data. The data is loaded via file monitoring.
  2. All the events that have missing data are missing the same data.
  3. When the file is loaded again (by manual upload or monitoring), the same "problematic" events load correctly.
  4. The "problematic" events are not at the end of the file.
  5. Event breaking is set to the default (multiline, break by timestamp).
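
For reference, the "default (by timestamp) multiline" behavior in item 5 is governed by a few props.conf settings. A minimal sketch of those settings with their default values; the stanza name is a placeholder, not the real sourcetype:

```ini
# props.conf (sketch; "your_sourcetype" is a hypothetical stanza name)
[your_sourcetype]
# Merge consecutive lines into multiline events.
SHOULD_LINEMERGE = true
# Start a new event only when a line contains a timestamp.
BREAK_ONLY_BEFORE_DATE = true
# Maximum number of lines merged into a single event.
MAX_EVENTS = 256
```

With these settings, any line without a recognizable timestamp is appended to the event before it.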
0 Karma

New Member

I have found that the data is not missing but is split into two sequential events.
The two events have the same timestamp, although the time appears only in the first event. Why would Splunk cut the event?
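
One possible explanation (an assumption, not something confirmed in this thread): the forwarder reaches the end of the file while the application is still in the middle of writing a multiline event, sends what it has read so far, and later picks up the remaining lines as a separate event. Since those lines contain no timestamp, the second event is given the timestamp of the previous one. That would also fit the fact that reloading the complete file works. A minimal inputs.conf sketch for the forwarder under that assumption; the path is a placeholder and the value 10 is only illustrative:

```ini
# inputs.conf on the forwarder (sketch; path and values are illustrative)
[monitor:///path/to/your.log]
# Seconds to wait after reaching end-of-file before closing the file.
# The default is 3; a larger value gives the writer time to finish an event.
time_before_close = 10
# Wait a little longer before sending an event that looks incomplete.
multiline_event_extra_waittime = true
```

If the split events stop appearing after a change like this, a mid-write read was the likely cause; if not, the line-breaking settings are the next thing to check.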

0 Karma

Esteemed Legend

Show us these files for those events: inputs.conf, props.conf, and transforms.conf. Also check index=_* for any log messages related to this input/sourcetype.

0 Karma

Esteemed Legend

The only time that I have ever seen this happen is when two systems are writing the same file to the same directory on the same server at the same time. So the first system writes the file and then Splunk starts forwarding it. In the middle of this process, the other system overwrites the file before Splunk is done with it. This will generally cause a truncated event followed by everything else in the file being missing entirely.

0 Karma

New Member

Thanks, but this doesn't seem to be the case.
There are no two systems and there are no missing events.

0 Karma

Contributor

Take a look at this answer:

http://answers.splunk.com/answers/41648/linebreakingprocessor-truncating-line-because-limit-of-10000...

It may address your occasional truncation issue. It looks like you have TRUNCATE set to Splunk's default value.
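
As a rough sketch of what changing that would look like in props.conf (the stanza name is a placeholder, and 50000 is an arbitrary example value, not a recommendation):

```ini
# props.conf (sketch; "your_sourcetype" is a hypothetical stanza name)
[your_sourcetype]
# Maximum line length in bytes before Splunk truncates it.
# The default is 10000; 0 disables truncation entirely.
TRUNCATE = 50000
```

This only helps if the affected lines are actually longer than the current limit.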

0 Karma

New Member

This is not the case, since:
A - the events are not 10k in size
B - most of the similar events (99.9%) are the same size, and they are OK

0 Karma

Champion

You may need to turn on indexer acknowledgement to prevent events from being dropped. Also, you may need to inspect the events to check whether they are being merged incorrectly due to some weird line breaking.
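
A minimal sketch of turning on indexer acknowledgement in outputs.conf on the forwarder; the group name and the server address are placeholders:

```ini
# outputs.conf on the forwarder (sketch; group name and server are hypothetical)
[tcpout:your_indexers]
server = indexer.example.com:9997
# The forwarder keeps data until the indexer confirms it was received
# and resends it if no acknowledgement arrives.
useACK = true
```

Note that this protects against data lost in transit; it does not change how events are broken.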

0 Karma

Esteemed Legend

Do you mean that "events are missing data" (_raw is truncated for some events), or do you mean that "source files are missing events", or do you mean that "events are missing field extractions"? There is a HUGE difference...

0 Karma

New Member

I mean that _raw is truncated for some events.
When the same event is loaded again, it is OK.

0 Karma

SplunkTrust

Could you post the following three things?

The monitor stanza you are using.
A few of the lines that work fine.
A line or two that do NOT work right.

With those, I think solving this will be a lot easier.

New Member

(I replaced sensitive data with asterisks)

SourceType Definition in SearchHead:

[BnhpDFSWrapper]
ANNOTATE_PUNCT=True
AUTO_KV_JSON=true
BREAK_ONLY_BEFORE_DATE=True
CHARSET=UTF-8
DATETIME_CONFIG=/etc/datetime.xml
EXTRACT-Thread Number=(?i)\[\S*\s:\s(?P\d+)(?=\])
EXTRACT-elapsed_time=elapsed time:\s+(?P\d+)
EXTRACT-elapsed_time==elapsed time=\s+(?P\d+)
LEARN_SOURCETYPE=true
LINE_BREAKER_LOOKBEHIND=100
MAX_DAYS_AGO=2000
MAX_DAYS_HENCE=2
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_EVENTS=256
MAX_TIMESTAMP_LOOKAHEAD=128
SEGMENTATION=indexing
SEGMENTATION-all=full
SEGMENTATION-inner=inner
SEGMENTATION-outer=outer
SEGMENTATION-raw=none
SEGMENTATION-standard=standard
SHOULD_LINEMERGE=True
TRUNCATE=10000
detect_trailing_nulls=false
maxDist=100

Input Definition in Forwarder:

[monitor:///usr/PDC/logs/BnhpDFSWrapperErr.log]
_rcvbuf=1572864
host=***
index=documentum
sourcetype=BnhpDFSWrapperErr
0 Karma

New Member

(I replaced sensitive data with asterisks)
Example of a problematic case:

2015-08-20 16:48:03.641 DEBUG [WMQJCAResourceAdapter : 6] operationState=entered  dctmRequestId=b23eb5d9-69c1-4d74-9d62-ee055945d468 operationName=createDocuments  docData4CreateList: [[ docDataForCreate: [ typeName=null, docCustomerData: 
    [ custom: [customText=directiveSourceCode=0;postponedAccountInd=false]
    , group: null
    , pensionFund: [ pensionFundNbr=null, planholderNumber=null ]
    , docDetails: [ channelId=0, businessProcessId=null, currency_code=*, documentFormId=**********, documentEditionNbr=*, templateDataExistsInd=false, businessSubAreaCode=**, systemCode=**, legacyDocumentId=**********, legacyDocumentEntryDttm=Thu Aug 20 16:49:23 GMT+02:00 2015, concatenatedEventIds=null, documentGroupIds=null, scanStatusCode=*, docCompletenessCode=null, ongoingOrHistoryCode=*, projectId=*, businessAreaCode=**, objectName=***, transaction_amt=****.*, signatureStatusCode=null, docDeliveryNum=null ]
    , executorDetails: [ executingBankId=**, executingBranchId=***, empIdDocumentTypeCode=*, executingEmpIdCode=*********, ipAddress=***.***.***.***, terminalChannelId=**, bankolId=**, executingEmpFullName=*****, instructionReceiveTypeCode=null ]

The same event without missing data:

2015-08-20 16:48:03.641 DEBUG [WMQJCAResourceAdapter : 6] operationState=entered  dctmRequestId=b23eb5d9-69c1-4d74-9d62-ee055945d468 operationName=createDocuments  docData4CreateList: [[ docDataForCreate: [ typeName=null, docCustomerData: 
    [ custom: [customText=directiveSourceCode=0;postponedAccountInd=false]
    , group: null
    , pensionFund: [ pensionFundNbr=null, planholderNumber=null ]
    , docDetails: [ channelId=0, businessProcessId=null, currency_code=*, documentFormId=**********, documentEditionNbr=*, templateDataExistsInd=false, businessSubAreaCode=**, systemCode=**, legacyDocumentId=**********, legacyDocumentEntryDttm=Thu Aug 20 16:49:23 GMT+02:00 2015, concatenatedEventIds=null, documentGroupIds=null, scanStatusCode=*, docCompletenessCode=null, ongoingOrHistoryCode=*, projectId=*, businessAreaCode=**, objectName=***, transaction_amt=****.*, signatureStatusCode=null, docDeliveryNum=null ]
    , executorDetails: [ executingBankId=**, executingBranchId=***, empIdDocumentTypeCode=*, executingEmpIdCode=********, ipAddress=***.***.***.***, terminalChannelId=**, bankolId=**, executingEmpFullName=*****, instructionReceiveTypeCode=null ]
    , customerKeys: [[customerId=*, customerIdDocumentTypeCode=*, completeCustomerIdCode=*********, customerSerialNbr=*, customerFullName=****, occasionalCustomerInd=false]]
    , bankAccounts: [[ accountBankId=** branchId=*** accountNbr=****** divisionId=* specialHandlingCode=false ]] ]
    , docFile: [ docFormat=pdf, docSize=, docStream=null, docURL=****.pdf, checkSum=null]
    , secondaryDocFile: null
    , docPropertyExtensions: [[ extensionName=****, propertyKeyValues:[[ key=document_page_cnt, value=*], [ key=archive_date, value=2015-08-20T16:49:23], [ key=doc_location_code, value=*]]]]], docEventData: [[ autoEventInd=false bankolId=** channelId=* concatenatedEventId= empIdDocumentTypeCode=* eventCategoryCode=*** eventCategoryErrorCode=* eventCategoryStatusCode=* eventCategoryTypeCode=* eventDescText=**** eventGroupId=null executingBranchId=*** executingEmpFullName=*** executingEmpIdCode=*** ipAddress=***.***.***.*** legacyEventEntryDttm=Thu Aug 20 16:49:23 GMT+02:00 2015 objectName= *** ]] ] ] securityContext: [ repositoryName=***, userName=***, password=*** ] versionLabel= null requestId= null switchRelatedDocuments: null
0 Karma