
Using Splunk to create and view table metadata

sjanwity
Communicator

I am completely new to Splunk!

I have a database where updates are performed in a very ad hoc way: I delete the record and insert a new record with the new values. I have hooked this up to a separate 'logging' database which records each insert/delete operation and the details of the affected record (e.g. the timestamp of the operation, the ID of the record, etc.). I'd like to use Splunk to group these update operations and view them (what field was changed, and when), bearing in mind that I also perform normal inserts and deletes.

I am aware that you can input SQL and use Splunk to present the data, but how do I go about effectively creating a new field of my own?

For example:

Table Customers:

Customer ID | Customer Name | Customer Address
001         | John F        | 213 Privet Drive
002         | Kyle A        | 16 Gammon Road


Table Customers-History:

TIMESTAMP             | OPERATION | Customer ID | Customer Name | Customer Address
1-Dec-2010 09:52:1232 | INSERT    | 002         | Kyle A        | 10 Gammon Road
2-Dec-2010 09:54:9500 | DELETE    | 002         | Kyle A        | 10 Gammon Road
2-Dec-2010 09:54:9500 | INSERT    | 002         | Kyle A        | 16 Gammon Road
2-Dec-2010 09:55:9921 | DELETE    | 003         | Josh C        | 21 Drury Lane

In the above example, the 2nd and 3rd rows of the Customers-History table together represent an edit operation. I want Splunk to go through and record all the changes to the Customers table, automatically grouping such edit operations as well. How should I do this?

1 Solution

aweitzman
Motivator

What you want to do is quite doable, but if you're completely new to Splunk, then you should start with the tutorial. Your first order of business will be getting your logs into Splunk and ensuring that the fields that you want to appear are being extracted properly. Then your question about a search that will group update operations will be easier to answer.


aweitzman
Motivator

I won't say it's impossible, but if it is possible, I can't think of any intelligent way to do it. It would be a seriously hacky exercise to make it show up that way within Splunk, which isn't really meant for that.

If sorting them by your key-chain fields and then by time gets you the events for each key-chain in proximity to each other, with DELETE/INSERT pairs identified as UPDATES, I think you've solved your problem. The amount of effort it would take to cram them into a single event at this point just isn't worth it.
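As a sketch, that sort-then-group approach might look like the following, using the transaction command discussed elsewhere in this thread (the field names OBJECT_KEY and FIELD_NAME are assumptions; substitute your own key-chain fields, and tune maxspan to however close together your DELETE/INSERT pairs actually land):

| dbquery search-to-get-your-history-table | sort OBJECT_KEY, FIELD_NAME, _time | transaction OBJECT_KEY, FIELD_NAME maxspan=1s maxevents=2 startswith="DELETE" endswith="INSERT"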


sjanwity
Communicator

Sorry to bug you again! In the image above (http://i.stack.imgur.com/1hVDt.jpg) you can see a row where the FIELD_VALUE is updated for that OBJECT_KEY. However, if another FIELD_VALUE for the same OBJECT_KEY is updated, this causes all the FIELD_VALUEs on that OBJECT_KEY to be deleted and re-inserted (in effect, the OBJECT_KEY is what gets deleted and inserted on an update).

Therefore, the problem is that Splunk considers all the deletions and re-insertions of the unchanged FIELD_VALUEs to be interesting events even though nothing actually changed; in our log the operation shows up as an UPDATE but with the same FIELD_VALUE, i.e. in the image above, the FIELD_VALUE would just be FALSE instead of FALSE TRUE.

I want to hide results where the FIELD_VALUE has only one value, as those are part of a row which was simply re-inserted (or better yet, skip them entirely to reduce search time). How can I do this?
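(One way to express this filter, assuming the DELETE/INSERT pairs have already been grouped with the transaction command so that FIELD_VALUE becomes a multivalue field on each grouped event, would be something like:

| where mvcount(FIELD_VALUE) > 1

i.e. keep only transactions where the value actually changed. This is a sketch, not something I've tried.)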


aweitzman
Motivator

By "main search goes here" I mean the search to pull your table into the Splunk search context. Like:

| dbquery search-to-get-the-Customers-History-table

The transaction command isn't part of your SQL query. It's a Splunk command designed to manipulate the data in Splunk. So what I meant was something like this:

| dbquery search-to-get-the-Customers-History-table | transaction "Customer ID" maxspan=1s maxevents=2 startswith="DELETE" endswith="INSERT"

Note the pipe between the dbquery clause and the transaction clause. It takes the results of your first search (the dbquery part, where you get data out of your SQL database) and uses them as the input to the second part (the transaction part, which extracts the DELETE-INSERT pairs).

You really should read the tutorial.

aweitzman
Motivator

No, dbquery only lets you query your database from a Splunk search; it doesn't get the data indexed into Splunk. To get your database data into Splunk, you need to configure a database input. See this page for details: http://docs.splunk.com/Documentation/DBX/latest/DeployDBX/Configuredatabasemonitoring

As for your grouping, what you want is to be able to take the DELETE-INSERT combination and interpret it as a single EDIT action, correct? If so, that can be done in Splunk using the transaction command. You would define a transaction as something that starts with a DELETE, ends with INSERT, has the same Customer ID value, and happens within a very short timespan. So something like:

main search goes here 
| transaction "Customer ID"  maxspan=1s maxevents=2 startswith="DELETE" endswith="INSERT"

See here for more details: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Transaction


sjanwity
Communicator

@aweitzman: sorry, I should mention: I've already got the data into Splunk; I'm just unsure how to create the query.


aweitzman
Motivator

OK, now that you've supplemented your post with sample records, I noticed the "dbconnect" tag in your initial post. Since you're representing your data as tables rather than Splunk search results, it would appear that you are accessing this data entirely through dbquery instead of importing your "logging" database data into Splunk and having it indexed there. Is that the case?

I'm also unclear on what you mean by grouping. Do you just want the data from your logging database sorted by customer ID and time? Or something more than that?

aweitzman
Motivator

If the data is already in Splunk, then can you please edit your post to include some sample records and field values? That will make it far easier to figure out how to create the query you want.

sjanwity
Communicator

Hi, I have a further constraint for the transaction. I want to add a condition so that Splunk groups operations only when an integer column (I'll call it version) differs by exactly one between the two rows, in addition to the rows having the same key. Is this possible to do?


aweitzman
Motivator

I don't think you can do this in the transaction command itself, but you can do it after the fact by adding something like this on the end:

| where mvcount(version)=2 AND tonumber(mvindex(version,1))=tonumber(mvindex(version,0))+1