I am completely new to Splunk!
I have a database where updates are performed in a very ad-hoc way: I delete a record and insert a new record with the new values. I have hooked this up to another 'logging' database which records each insert/delete operation and the details of the affected record (e.g. the timestamp of the operation, the ID of the record, etc.). I'd like to use Splunk to group these update operations and view them (what field was changed, and when), bearing in mind that I also do normal inserts and deletes.
I am aware that you can input SQL and use Splunk to present the data, but how do I go about effectively creating a new field like this?
For example:
Table Customers:
Customer ID | Customer Name | Customer Address
001 | John F | 213 Privet Drive
002 | Kyle A | 16 Gammon Road
Table Customers-History:
TIMESTAMP | OPERATION | Customer ID | Customer Name | Customer Address
1-Dec-2010 09:52:1232 | INSERT | 002 | Kyle A | 10 Gammon Road
2-Dec-2010 09:54:9500 | DELETE | 002 | Kyle A | 10 Gammon Road
2-Dec-2010 09:54:9500 | INSERT | 002 | Kyle A | 16 Gammon Road
2-Dec-2010 09:55:9921 | DELETE | 003 | Josh C | 21 Drury Lane
In the above example, the 2nd and 3rd rows of the Customers-History table together represent an edit operation. I want Splunk to go through and record all the changes to the Customers table, automatically grouping up the edit operations as well. How should I do this?
What you want to do is quite doable, but if you're completely new to Splunk, then you should start with the tutorial. Your first order of business will be getting your logs into Splunk and ensuring that the fields that you want to appear are being extracted properly. Then your question about a search that will group update operations will be easier to answer.
I won't say it's impossible, but if it is possible, I can't think of any intelligent way to do it. It would take a major hack to make it show up that way within Splunk; it's not really meant for that.
If sorting them by your key-chain fields and then by time gets you the events for each key-chain in proximity to each other, with DELETE/INSERT pairs identified as UPDATES, I think you've solved your problem. The amount of effort it would take to cram them into a single event at this point just isn't worth it.
Sorry to bug you again! In the image above (http://i.stack.imgur.com/1hVDt.jpg) you see a row where the FIELD_VALUE is updated for that OBJECT_KEY. However, if another FIELD_VALUE for the same OBJECT_KEY is updated, this causes all the FIELD_VALUEs on that OBJECT_KEY to be deleted and re-inserted (in effect, the OBJECT_KEY is what gets deleted and inserted on an update).
Therefore, the problem is that Splunk considers all the deletions and re-insertions of the unchanged FIELD_VALUEs as interesting events even though nothing actually changed; in our log the operation shows as an UPDATE but with the same FIELD_VALUE, i.e. in the image above, the FIELD_VALUE would just be FALSE instead of FALSE TRUE.
I want to hide results where the FIELD_VALUE only has one value, as those are part of a row that was merely re-inserted (or better yet, ignore them to improve search time). How can I do this?
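One way to approach this (a sketch, untested against your data: it assumes the DELETE/INSERT pairs are grouped with the transaction command, and that OBJECT_KEY and FIELD_VALUE are the names of your extracted fields): transaction collapses identical values of a field into a single multivalue entry, so a pair where nothing changed leaves FIELD_VALUE with only one value, which you can then filter out:
main search goes here
| transaction OBJECT_KEY maxspan=1s maxevents=2 startswith="DELETE" endswith="INSERT"
| where mvcount(FIELD_VALUE) > 1
Note that this filters after the grouping step, so it won't speed up the search itself, but it will hide the no-op re-insertions from your results.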
By "main search goes here" I mean the search to pull your table into the Splunk search context. Like:
| dbquery search-to-get-the-Customers-History-table
The transaction command isn't part of your SQL query. It's a Splunk command designed to manipulate the data in Splunk. So what I meant was something like this:
| dbquery search-to-get-the-Customers-History-table | transaction "Customer ID" maxspan=1s maxevents=2 startswith="DELETE" endswith="INSERT"
Note the pipe between the dbquery clause and the transaction clause. It takes the results of your first search (the dbquery part, where you get data out of your SQL database) and uses them as the basis for manipulation by the second part (the transaction part, which extracts the DELETE-INSERT pairs).
You really should read the tutorial.
No, you can only use dbquery to get data out of your database, not into Splunk. To get your database data into Splunk, you need to configure a database input. See this page for details: http://docs.splunk.com/Documentation/DBX/latest/DeployDBX/Configuredatabasemonitoring
As for your grouping, what you want is to be able to take the DELETE-INSERT combination and interpret it as a single EDIT action, correct? If so, that can be done in Splunk using the transaction command. You would define a transaction as something that starts with a DELETE, ends with an INSERT, has the same Customer ID value, and happens within a very short timespan. So something like:
main search goes here
| transaction "Customer ID" maxspan=1s maxevents=2 startswith="DELETE" endswith="INSERT"
See here for more details: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Transaction
@aweitzman: sorry, I should mention: I've already got the data into Splunk; I'm just unsure how to create the query.
OK, now that you've supplemented your post with sample records, I noticed the "dbconnect" tag in your initial post. Since you're representing your data as tables rather than Splunk search results, it would appear that you are accessing this data entirely through dbquery instead of importing your "logging" database data into Splunk and having it indexed there. Is that the case?
I'm also unclear on what you mean by grouping. Do you just want the data from your logging database sorted by customer ID and time? Or something more than that?
If the data is already in Splunk, then can you please edit your post to include some sample records and field values? That will make it far easier to figure out how to create the query you want.
Hi, I have a further constraint to add to transaction. I want a condition that makes Splunk group operations only when an integer column I'll call version differs by exactly one between the two rows, as well as the rows having the same key. Is this possible to do?
I don't think you can do this in the transaction command itself, but you can do it after the fact by adding something like this on the end:
| where mvcount(version)=2 AND tonumber(mvindex(version,1))=tonumber(mvindex(version,0))+1
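For completeness, here is a sketch of the whole pipeline put together (the dbquery search is a placeholder, and the field names come from the examples above, so substitute your own):
| dbquery search-to-get-the-Customers-History-table
| transaction "Customer ID" maxspan=1s maxevents=2 startswith="DELETE" endswith="INSERT"
| where mvcount(version)=2 AND tonumber(mvindex(version,1))=tonumber(mvindex(version,0))+1
The where clause works because transaction merges the version values of the grouped DELETE and INSERT events into a multivalue field, so you can check that there are exactly two values and that they are consecutive.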