When I search on a unique record ID, I get 1 raw event back, but some of the fields report 2 values. I believe this is skewing my results further down the line.
For a particular search:
index=aws-bill  RecordId=39613589688296092585051622
I get exactly 1 event, but when I hover over the BlendedCost field, it shows 2 values. The value is "0.00000170" but the count is 2. Why is this?
Also, when I do this search and show as a chart:
index=aws-bill  RecordId=39613589688296092585051622 | timechart sum(BlendedCost) as $ by showback
I get a bar chart with the value "0.00000340", which is double the BlendedCost.
Where is this coming from? What are my options for getting better results?
OK, that explains it: you are telling Splunk to extract the JSON fields twice, once at index time (INDEXED_EXTRACTIONS=json) and once at search time (KV_MODE=json). Get rid of the KV_MODE setting.
See this Q&A for a more complete discussion:
http://answers.splunk.com/answers/174939/why-are-my-json-fields-extracted-twice.html
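For reference, here is a minimal sketch of what the props.conf stanza from this thread might look like once the duplicate extraction is removed. The stanza name and time settings are copied from the configuration posted in this thread. Setting KV_MODE = none explicitly, rather than just deleting the line, is my assumption about how to make sure no search-time JSON extraction runs on top of the index-time one. It has not been tested against this app.

```
[source::SplunkAppforAWSBilling_Import]
# Index-time JSON extraction stays on
INDEXED_EXTRACTIONS = json
# Assumption: explicitly disable search-time extraction so each field is extracted only once
KV_MODE = none
TIME_PREFIX = \"UsageStartDate\":
TIME_FORMAT = %Y-%m-%d %H:%M:%S
```

KV_MODE is a search-time setting, so already-indexed events should stop showing the doubled values once the search head picks up the change.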
The latest version 2.0.9 uses just KV_MODE=json, so it does not cause any duplicates. Thanks to woodcock for the heads up.
Your picture is unambiguously clear: it is because your 1 matching event has a multivalued field called BlendedCost with 2 values, both of which are the same: 0.00000170. How is the BlendedCost field created? What is in the raw data (_raw field)?
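One way to confirm this from the search bar is to count the values in the field. A rough sketch using the index and RecordId from the question; the field name n is made up for illustration:

```
index=aws-bill RecordId=39613589688296092585051622
| eval n=mvcount(BlendedCost)
| table RecordId BlendedCost n
```

If n comes back as 2 for a single event, the field is multivalued. As a stopgap until the extraction itself is fixed, keeping only the first value before summing should avoid the doubling:

```
index=aws-bill RecordId=39613589688296092585051622
| eval BlendedCost=mvindex(BlendedCost, 0)
| timechart sum(BlendedCost) as $ by showback
```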
Hi, no, it seems to be a single entry in the raw data: "BlendedCost": "0.00000170"
It is pulled in from a spreadsheet which comes from AWS billing. BlendedCost is one of the columns in the spreadsheet, and that also has only the single entry.
Raw data is:
{"user:hostname": "awswarsp01", "PricingPlanId": "505699", "user:showback": "IT:Aris", "ProductName": "Amazon Elastic Compute Cloud", "ResourceId": "i-9fdb5ea1", "PayerAccountId": "311971337317", "UsageStartDate": "2015-08-01 00:00:00", "BlendedCost": "0.00000170", "InvoiceID": "Estimated", "ReservedInstance": "N", "RecordType": "LineItem", "RecordId": "39613589688296092585051622", "Operation": "InterZone-Out", "user:Name": "inst-aris-app-01", "SubscriptionId": "28816468", "user:project": "aris design", "ItemDescription": "$0.010 per GB - regional data transfer - in/out/between EC2 AZs or using IPs or ELB", "UnBlendedCost": "0.00000170", "UnBlendedRate": "0.0100000000", "UsageType": "APS2-DataTransfer-Regional-Bytes", "LinkedAccountId": "311971337317", "BlendedRate": "0.0100000000", "user:environment": "production", "UsageQuantity": "0.00016988", "UsageEndDate": "2015-08-01 01:00:00", "RateId": "3510837"}
Mark
I didn't say it was in the raw data twice (although that is one way to have a multivalued field created with the same value twice).  So now we have half of the pieces of the puzzle; what are your Splunk configurations (particularly inputs.conf, props.conf and transforms.conf)?
inputs.conf:
[script:///opt/splunk/etc/apps/SplunkAppforAWSBilling/bin/ProcessDetailedReport.py]
 disabled = 0
 index = aws-bill
 interval = 10800
 passAuth = splunk-system-user
 source = SplunkAppforAWSBilling_Import
 sourcetype = SplunkAppforAWSBilling_Processor
props.conf:
[source::SplunkAppforAWSBilling_Import]
 INDEXED_EXTRACTIONS=json
 KV_MODE=json
 TIME_PREFIX=\"UsageStartDate\":
 TIME_FORMAT=%Y-%m-%d %H:%M:%S
transforms.conf:
#######################
 #  Lookups
 #######################
[payer_account_id]
 filename = payer_account_id.csv
[linked_account_id]
 filename = linked_account_id.csv
I am trying to highlight that 2nd search, but I get this error:
You are only allowed to submit 2 posts per day until you reach 40 points of reputation level.