All Apps and Add-ons

Can Splunk index gzipped XMLs from a SQL database?

nick405060
Motivator

We have a SQL table where the binary_log_content field holds gzipped XML for the logs we'd like to ingest.

How can I ingest these logs, short of using Python to query and save to a disk location that Splunk then monitors and auto-extracts?

1 Solution

nick405060
Motivator

How can I ingest these logs, short of using Python to query and save to a disk location that Splunk then monitors and auto-extracts?

Okay, ignoring the "short of" part... this is how you do it. Despite living in the db_connect directory, this does not use DB Connect. You can move things to system/local or another directory you think is more appropriate, if desired.

crontab -e

* * * * * python3 /opt/splunk/etc/apps/splunk_app_db_connect/HOSTNAME_query.py
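
Note that * * * * * runs the script every minute. Since the query looks back a full day and the filenames are keyed by date, a daily schedule also works, for example:

0 2 * * * python3 /opt/splunk/etc/apps/splunk_app_db_connect/HOSTNAME_query.py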

/opt/splunk/etc/apps/splunk_app_db_connect/HOSTNAME_query.py

#!/usr/bin/env python3
import os

import pymssql

OUT_DIR = '/opt/splunk/var/run/splunk/xmlballs'
os.makedirs(OUT_DIR, exist_ok=True)

conn = pymssql.connect(server='HOSTNAME:PORT', user='USERNAME',
                       password='PASSWORD', database='DATABASE')
cursor = conn.cursor()
cursor.execute('SELECT MYFIELD, log_date FROM DATABASE.dbo.log_history '
               'WHERE log_date > DATEADD(day, -1, GETDATE())')

row = cursor.fetchone()
while row:
    # Keep only the date portion of the datetime for the filename.
    date = str(row[1]).split(' ')[0]
    with open(os.path.join(OUT_DIR, date + '.gz'), 'wb') as f:
        f.write(row[0])  # the column value is already gzip-compressed bytes
    row = cursor.fetchone()

conn.close()
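
Before pointing Splunk at the directory, it is worth confirming that a written file gunzips back to XML. A quick check (the filename here is hypothetical; substitute one the script actually wrote):

import gzip

with gzip.open('/opt/splunk/var/run/splunk/xmlballs/2019-01-01.gz', 'rt') as f:
    print(f.read(200))  # should start with something like '<?xml'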

/opt/splunk/etc/apps/splunk_app_db_connect/local/inputs.conf

[monitor:///opt/splunk/var/run/splunk/xmlballs]
disabled = 0
sourcetype = compressed_xml
index = MYINDEX


nick405060
Motivator

You definitely cannot do it using DB Connect. If you try, you get the non-displayable column type "image" message:

https://answers.splunk.com/answers/587501/splunk-db-connect-checkpoint-set-to-null.html


woodcock
Esteemed Legend

Splunk understands how to eat gzipped files, so you do not have to do anything special. Because you already have this data in a database, there is no reason to archive the extracts, so I suggest something like this so that Splunk deletes the files after it forwards them:

[batch:///path/to/your/files/*.xml.gz]
move_policy = sinkhole
sourcetype = foo
index = bar
host_segment = bat
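
(One caveat on the stanza above: host_segment takes an integer path-segment position rather than a literal name, e.g. host_segment = 4 would set host from the fourth directory in the monitored path.)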

nick405060
Motivator

I don't think this actually answers my question @woodcock


woodcock
Esteemed Legend

You are correct; it was not immediately clear that it is a field in the database table that is gzipped. #cribl can do this.
