I have data from 2 different data sources. I am trying to figure out how to distribute values into a cost until the cost is "used up", in other words, until the sum of the VALUES equals the COST. It then moves on to the next COST and does the same thing with the remaining values until all the VALUES are exhausted.
Some sample data:
COST    VALUES
20000   30000
20000   5000
20000   2000
8000    15000
Given this, I need to be able to identify which VALUES are associated with which COST. As seen below.
COST VALUES 20000 5000 15000 20000 20000 20000 8000 2000 10000
The nested values add up to the COST. Also notice 30000 from the sample data was split into 20000 and 10000, due to only needing 20000 to satisfy one of the COSTs.
I have been banging my head against this wall for a week, and I am leaning towards just scripting it in Python and passing it back into Splunk, unless of course some of you Splunk geniuses know of a more "splunkish" way to accomplish this.
Thanks for your help!
If you fix your example, which I am pretty sure is broken (the VALUES values are not the same between the 2 sections), I will make an attempt to answer.
The values are correct. One of the initial values was 30000. Because only 20000 of that was needed to fully account for a cost, it needs to be split into 20000 and the remainder, 10000.
OK, I did not read/understand the comment at the bottom which clarifies it. I see what you need now but it is a doozy.
To your point about scripting in Python: to make this a bit more "splunkish", you could create a custom search command (http://dev.splunk.com/view/python-sdk/SP-CAAAEU2). Your Python code could then be called as a command from within the query (SPL).
The required algorithm does appear to be a very poor fit for the SPL command set. Writing a program to do just this and then sending the data to it (i.e. a custom command) is probably the best way.
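For what it's worth, here is a minimal Python sketch of the allocation logic on its own, before any wiring into a custom search command. The names allocate, costs and values are made up for illustration. It also assumes the values are consumed in the order they arrive; the question does not say which order to use, so the grouping may differ from the one shown in the question even though every value is still split the same way (only as much as the current cost still needs).

```python
from collections import deque


def allocate(costs, values):
    """Distribute values across costs in order (hypothetical helper).

    Returns (result, leftover): result is a list of (cost, parts) pairs,
    where parts are the values or value fragments assigned to that cost;
    leftover holds any values that remain once every cost is satisfied.
    """
    # Assumption: values are consumed in input order; non-positive values are ignored.
    pending = deque(v for v in values if v > 0)
    result = []
    for cost in costs:
        needed = cost
        parts = []
        while needed > 0 and pending:
            value = pending.popleft()
            used = min(value, needed)  # take only what this cost still needs
            parts.append(used)
            needed -= used
            if value > used:
                # Put the unused remainder back at the front for the next cost.
                pending.appendleft(value - used)
        result.append((cost, parts))
    return result, list(pending)


result, leftover = allocate([20000, 20000, 20000, 8000],
                            [30000, 5000, 2000, 15000])
for cost, parts in result:
    print(cost, parts)
# 20000 [20000]
# 20000 [10000, 5000, 2000, 3000]
# 20000 [12000]
# 8000 []
```

With the sample data the values only add up to 52000, so the third cost is only partly covered and the last one gets nothing. If the values need to be consumed in some other order, sort or reorder the list before calling the function.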