Splunk App for AWS Billing: Why is a single entry of raw data showing 2 results (count=2 and 200%)?

mjm295
Path Finder

When I do a particular search on a unique record ID, I get 1 piece of raw data back, but some of the fields are reporting 2 entries. I believe this is skewing my results further down the line.

For a particular search:

index=aws-bill  RecordId=39613589688296092585051622

I get exactly 1 Event, but hovering over the field for Blended cost, I see 2 lots of data. Value is "0.00000170" but the count is 2. Why is this?

Also, when I do this search and show as a chart:

index=aws-bill  RecordId=39613589688296092585051622 | timechart sum(BlendedCost) as $ by showback

I get a bar chart with the value "0.00000340", which is double the blended cost.

Where is this coming from? What are my options for getting better results?

1 Solution

woodcock
Esteemed Legend

OK, that explains it: you are telling Splunk to extract the JSON fields twice, once at index time (INDEXED_EXTRACTIONS=json) and once at search time (KV_MODE=json). Get rid of the KV_MODE setting.

See this Q&A for a more complete discussion:

http://answers.splunk.com/answers/174939/why-are-my-json-fields-extracted-twice.html
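
As a rough sketch of that fix, the props.conf stanza posted in this thread could be changed as shown here. Explicitly setting KV_MODE = none, rather than just deleting the line, is an assumption based on the usual recommendation for sourcetypes that use INDEXED_EXTRACTIONS; everything else is copied from the original stanza. KV_MODE is a search-time setting, so it belongs wherever the searches run.

```
[source::SplunkAppforAWSBilling_Import]
# Keep the index-time JSON extraction.
INDEXED_EXTRACTIONS = json
# Turn off the search-time JSON extraction so each field is extracted only once.
# (Assumption: "none" is set explicitly instead of removing the line.)
KV_MODE = none
TIME_PREFIX = \"UsageStartDate\":
TIME_FORMAT = %Y-%m-%d %H:%M:%S
```

Because only the search-time half changes, events that are already indexed should stop showing the doubled values without being re-indexed.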


monkee
Path Finder

The latest version 2.0.9 uses just KV_MODE=json, so it does not cause any duplicates. Thanks to woodcock for the heads up.

woodcock
Esteemed Legend

Your picture is unambiguously clear: it is because your 1 matching event has a multivalued field called BlendedCost with 2 values, both of which are the same: 0.00000170. How is the BlendedCost field created? What is in the raw data (_raw field)?
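
One way to check for a multivalued field from the search bar is to count its values. This is only a sketch: it uses the standard eval functions mvcount and mvdedup, and the output field names BlendedCostValues and BlendedCostDeduped are made up for illustration.

```
index=aws-bill RecordId=39613589688296092585051622
| eval BlendedCostValues=mvcount(BlendedCost)
| eval BlendedCostDeduped=mvdedup(BlendedCost)
| table RecordId BlendedCost BlendedCostValues BlendedCostDeduped
```

If the field really holds the same value twice, BlendedCostValues should come back as 2 and BlendedCostDeduped should hold a single value. Summing the deduplicated field would work as a stopgap in reports, but it hides the cause rather than fixing how the field is extracted.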

mjm295
Path Finder

inputs.conf:

 [script:///opt/splunk/etc/apps/SplunkAppforAWSBilling/bin/ProcessDetailedReport.py]
 disabled = 0
 index = aws-bill
 interval = 10800
 passAuth = splunk-system-user
 source = SplunkAppforAWSBilling_Import
 sourcetype = SplunkAppforAWSBilling_Processor

props.conf:

 [source::SplunkAppforAWSBilling_Import]
 INDEXED_EXTRACTIONS=json
 KV_MODE=json
 TIME_PREFIX=\"UsageStartDate\"\:
 TIME_FORMAT=%Y-%m-%d %H:%M:%S

transforms.conf:

 #######################
 #  Lookups
 #######################
 [payer_account_id]
 filename = payer_account_id.csv

 [linked_account_id]
 filename = linked_account_id.csv

mjm295
Path Finder

Hi, no, it seems to be a single entry in the raw data: "BlendedCost": "0.00000170"

It is sucked in from a spreadsheet which comes from AWS billing. BlendedCost is one of the columns in the spreadsheet and that also only has the single entry.

Raw data is:

{"user:hostname": "awswarsp01", "PricingPlanId": "505699", "user:showback": "IT:Aris", "ProductName": "Amazon Elastic Compute Cloud", "ResourceId": "i-9fdb5ea1", "PayerAccountId": "311971337317", "UsageStartDate": "2015-08-01 00:00:00", "BlendedCost": "0.00000170", "InvoiceID": "Estimated", "ReservedInstance": "N", "RecordType": "LineItem", "RecordId": "39613589688296092585051622", "Operation": "InterZone-Out", "user:Name": "inst-aris-app-01", "SubscriptionId": "28816468", "user:project": "aris design", "ItemDescription": "$0.010 per GB - regional data transfer - in/out/between EC2 AZs or using IPs or ELB", "UnBlendedCost": "0.00000170", "UnBlendedRate": "0.0100000000", "UsageType": "APS2-DataTransfer-Regional-Bytes", "LinkedAccountId": "311971337317", "BlendedRate": "0.0100000000", "user:environment": "production", "UsageQuantity": "0.00016988", "UsageEndDate": "2015-08-01 01:00:00", "RateId": "3510837"}

Mark

woodcock
Esteemed Legend

I didn't say it was in the raw data twice (although that is one way to have a multivalued field created with the same value twice). So now we have half of the pieces of the puzzle; what are your Splunk configurations (particularly inputs.conf, props.conf and transforms.conf)?

mjm295
Path Finder

I am trying to highlight that 2nd search, but I get this error:
You are only allowed to submit 2 posts per day until you reach 40 points of reputation level.
