Splunk Dev

formatting data for syslog to splunk

jalfrey
Communicator

I'm writing a Python library to take some CPU/process data and send it via syslog to Splunk. The data is naturally tabular and contains quite a few rows. I will be polling this information frequently (every second or every 10 seconds). Is it best to send each process name with its own two CPU stats (% of total and total time) as a separate message, or to send all of them in a single log message?

If I choose to send it all as one log message, what is the best way to format it? I currently have it stored as a nested Python dictionary. Here are a couple of values:

the local_dict for process tSchedObjTimer looks like:
{'Current CPU Percentage': '0.00', 'Total CPU Seconds': '0.00'}
the local_dict for process tIkeMsgTask looks like:
{'Current CPU Percentage': '0.00', 'Total CPU Seconds': '0.00'}

I'm thinking something like:

process:tSchedObjTimer ['Current CPU Percentage': '0.00', 'Total CPU Seconds': '0.00'] process: tIkeMsgTask ['Current CPU Percentage': '0.00', 'Total CPU Seconds': '0.00']

or maybe

tSchedObjTimer=['Current CPU Percentage': '0.00', 'Total CPU Seconds': '0.00'] tIkeMsgTask=['Current CPU Percentage': '0.00', 'Total CPU Seconds': '0.00']

I was thinking about this last night. I'm going to have a lot of data and it would probably be best to aggregate the data and pack it.

tSchedObjTimer=['Current CPU Percentage': '0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00', 'Total CPU Seconds': '0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00'] tIkeMsgTask=['Current CPU Percentage': '0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00', 'Total CPU Seconds': '0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00','0.00']

_d_
Splunk Employee

An easier way is to write each process's tabular data to its own file, then have Splunk monitor that directory. The source, an indexed field that does not count against the license, can carry your process name. For example:

cat /var/log/metrics/tSchedObjTimer
2013-09-26T08:43Z,0.00,2.00,3.00...
2013-09-26T08:44Z,0.00,2.00,3.00...
2013-09-26T08:45Z,0.00,2.00,3.00...

Use DELIMS and FIELDS in transforms.conf to extract your values. You can also use a TRANSFORMS-xx that extracts the precise process name based on path.

Note that this method, while offering good packing, makes the logs pretty difficult to read.
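
On the Python side, the file-per-process layout could be sketched roughly as follows. This is only an illustration: `append_samples` and its `metrics_dir` argument are made-up names, the two columns are assumed to be the percentage and total seconds from the question, and process names are assumed to be safe to use as file names. The timestamp includes seconds, unlike the sample rows above, because the poller may run every second.

```python
import csv
import os
from datetime import datetime, timezone


def append_samples(metrics_dir, samples, now=None):
    """Append one CSV row per process to <metrics_dir>/<process name>.

    `samples` is the nested dictionary from the question:
    {process_name: {'Current CPU Percentage': ..., 'Total CPU Seconds': ...}}
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    os.makedirs(metrics_dir, exist_ok=True)
    for name, stats in samples.items():
        # Assumption: the process name is a valid file name on this system.
        path = os.path.join(metrics_dir, name)
        with open(path, "a", newline="") as handle:
            writer = csv.writer(handle, lineterminator="\n")
            writer.writerow([
                stamp,
                stats["Current CPU Percentage"],
                stats["Total CPU Seconds"],
            ])
```

Calling `append_samples("/var/log/metrics", samples)` once per poll would grow one file per process, which is the directory a monitor input would then watch.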

_d_
Splunk Employee

Simply install a Universal Forwarder that monitors the directory where files are written and point it to the Splunk indexer.

jalfrey
Communicator

I like this solution. The host generating these log events is not the Splunk system. How do I write to that file over the network?

sdaniels
Splunk Employee

There are trade-offs, of course. More events means more to process and more license usage. Packing the data as you have above means less data, but a lot more time spent configuring your field extractions to get the data in properly.

jalfrey
Communicator

I seem to have nested data. I have Date/Time, Process Name, CPU Total, CPU Percent.
I could split the output into multiple syslog messages if need be.

01/01/2013 1:00 process_name=tSchedObjTimer CurrentCpuPercentage=0.00 TotalCpuSeconds=0.00
01/01/2013 1:00 process_name=tIkeMsgTask CurrentCpuPercentage=0.00 TotalCpuSeconds=0.00

I have 120 processes and I would like to poll them frequently (every second or every 10 seconds). So that makes for a lot of syslog. Is it better to opt for readability or instead pack the data?
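
A rough way to size that load before choosing a format is to multiply it out. The sketch below uses the 120 processes and the two polling intervals mentioned above, and takes the length of one readable key=value line as the per-event size. It ignores syslog headers and any indexing overhead, so treat the byte figure as an order-of-magnitude guess only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(process_count, poll_interval_seconds):
    """Number of per-process events generated in one day."""
    return process_count * SECONDS_PER_DAY // poll_interval_seconds


# One readable line in the format shown above, used as the per-event size.
sample_line = (
    "01/01/2013 1:00 process_name=tSchedObjTimer "
    "CurrentCpuPercentage=0.00 TotalCpuSeconds=0.00"
)

for interval in (1, 10):
    count = events_per_day(120, interval)
    approx_mb = count * len(sample_line) / 1_000_000
    print(f"every {interval:>2}s: {count:,} events/day, roughly {approx_mb:.0f} MB/day")
```

At one event per process, that is 10,368,000 events per day when polling every second and 1,036,800 when polling every 10 seconds.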

sdaniels
Splunk Employee

Using key-value pairs would be ideal.

http://dev.splunk.com/view/logging-best-practices/SP-CAAADP6
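
As a minimal sketch of the key-value approach from Python, the standard library's `logging.handlers.SysLogHandler` can send one message per process. The helper names (`format_event`, `build_syslog_logger`, `send_samples`), the logger name, and the host in the usage note are placeholders, not anything Splunk requires. The message carries no timestamp of its own, on the assumption that the receiving side assigns one.

```python
import logging
import logging.handlers


def format_event(process_name, stats):
    """Render one process sample as space-separated key=value pairs."""
    return (
        f"process_name={process_name} "
        f"CurrentCpuPercentage={stats['Current CPU Percentage']} "
        f"TotalCpuSeconds={stats['Total CPU Seconds']}"
    )


def build_syslog_logger(host, port=514):
    """Create a logger that forwards messages to a syslog receiver over UDP."""
    logger = logging.getLogger("cpu_metrics")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.handlers.SysLogHandler(address=(host, port))
        logger.addHandler(handler)
    return logger


def send_samples(logger, samples):
    """Emit one key=value syslog message per process in the nested dict."""
    for name, stats in samples.items():
        logger.info(format_event(name, stats))
```

Usage would look like `send_samples(build_syslog_logger("splunk.example.com"), samples)`, where the host name is hypothetical and must point at whatever is actually listening for syslog.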
