This is an extension of the question http://answers.splunk.com/answers/171571/using-splunk-to-create-and-view-table-metadata.html
I am specifically talking about the stats command or the transaction command, unless there is a command which would do this job better! I am receiving input from a database and grouping the rows into insert-delete pairs based on their key and update time, due to technical debt which prevented us from doing proper SQL updates.
Some rows, however, are not joining: even though the actions are supposed to be atomic, I found that some rows get updated in the database 1 ms apart, which breaks the update_time field. I therefore want to add a special column in Splunk and then use a simple equality evaluation to indicate that these rows should be joined. However, I am having trouble telling Splunk's stats command to combine rows based on these 2 columns being equal. Can anyone help?
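A sketch of the kind of search I have in mind, with placeholder field names (group_id is the special column I want to add; key and action are my existing fields):

```
... | stats values(action) as actions earliest(_time) as first_seen latest(_time) as last_seen by key group_id
```

Rows sharing the same key and group_id would then be combined into one result row, regardless of the 1 ms skew in update_time.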
@aweitzman This is an extension to my old question, but I figured you'd be bored of helping me by now! I found that stats combined the fields better than transaction did, because I make database updates in batches, and while the operations are atomic, it still created situations a bit like this:
DELETE (key: fdgd)
INSERT (key: abcd)
INSERT (key: fdgd)
i.e. actions performed on another row got in between the updates, which I think is confusing the transaction command and keeping it from working properly. The stats command combines everything beautifully when I use it with the _time field; however, there are cases where the operations were performed 1 millisecond apart, which causes the grouping to break in those few cases.
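A minimal sketch of that stats approach, assuming a hypothetical action field alongside the real key field:

```
... | stats values(action) as actions count by key _time
```

Since _time is a group-by field here, two rows written 1 ms apart land in different groups, which is exactly the failure mode described.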
Wait, are you saying it was supposed to be able to handle that? How does transaction work in the background, anyway? I thought it was breaking in those few cases because rows for other records were being executed in between the insert-delete pair, thus confusing the transaction command.
transaction is supposed to be able to handle it even when there are other records in between, since the command is qualified with the fields that you're working with, and it should be able to separate them out based on that. But sometimes reality doesn't match documentation, and you have to hack your way around it. And it sounds like you've done that.
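For example, something along these lines ought to keep interleaved keys separate (the maxspan value is illustrative):

```
... | transaction key maxspan=5s
```

Because key is given as a transaction field, the INSERT for abcd should not break the fdgd pair apart.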
Would you perchance have any idea why? Or how I could make them match up? I don't get why stats works the way I'm expecting transaction to work, and I'm afraid that this little bit of technical debt will bite me back big time in the future.
Can you give an example of what you're doing with
stats that works for you? It does surprise me at some level that it solves your problem, but I'm not seeing how you're using it, so I can't really tell.
Never mind, I just wasn't sorting the transaction results properly. Transactions are now grouping them correctly after changing from EXECUTED_TIME (which, I think, is the timestamp of when the command was actually sent from the client) to _time, which I found on another table and did a join on. Yay!
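One possible reading of that fix, with placeholder names (other_index stands in for the second table, and the sort direction is an assumption):

```
... | join key [ search index=other_index | fields key _time ]
    | sort 0 -_time
    | transaction key
```

Sorting the full result set by _time before transaction runs gives the command events in a consistent time order to group on, which was the missing step.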
I have no idea why. Maybe it's a bug? I don't use transaction very much, but in order to help you I cobbled together some examples, most using standard Splunk data, and some even using dbquery to pull the data in, and it seemed to work fine for me.
You haven't told us what the contents of your special new column are, so it's going to be hard to understand what you're really trying to do. You wouldn't use transaction to test whether the values of two separate columns are equal anyway, and while you could do it with stats, that's only if you want to aggregate that information across your entire result set, which it doesn't sound like you want to do.
It seems like you want that information for each event, in which case you can simply create a new field using
eval newfield=if(field1==field2,1,0) and then work with that.
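That is, something like the following (field names as in the suggestion above):

```
... | eval newfield=if(field1==field2,1,0)
    | where newfield=1
```

The where clause then keeps only the events whose two columns match.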
(That said, if the times are really just a millisecond off, maybe expanding the maxspan of your transaction beyond 2s would solve your problem, I would guess.)
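i.e. something like the following, with the span widened enough to swallow the 1 ms skew (the value is illustrative):

```
... | transaction key maxspan=10s
```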