transformation of log files

Jananee_iNautix
Path Finder

I have different FTP log files coming in. Each log file is in a different format, but some data is common across the files. I want to extract that common information and transform it into one source file AT INDEX TIME (not at search time).
I'm able to transform the data if it's from one source.
Please advise on how to combine the data from different files into one source file AT INDEX TIME.


UPDATE:

This is what we are trying to do AT INDEX TIME. The requirement is to use Splunk only to transform the data from different logs into a specified format (as mentioned below). We don't want to search any information. Additionally, please confirm whether Splunk would be used as a Forwarder in the scenario below.

Below are sample events from several different log files, where each log file contains a different piece of information. For example, Log1 may contain the "IP address", Log2 may contain the "No of bytes transferred", and so on. I want to combine the logs so that I can change their format to the output format below.

Log1:

2013-11-22 00:03:06,124 [29682968] INFO: Secure storage keystore is disabled. [.utils.security.KeystoreReader]
2013-11-22 00:03:59,585 [68ea68ea] LOGON: There have been 1 logon failures for unrecognized userids in 0 seconds for these userids, IP addresses, and times: [null 167.113.182.41 11/22/13 0:3:59]. [.config.clientitf.FuncUser]

Log2:

2013-11-22 06:32:41,057 [29682968] FINE: Processing socket READ event [com.maverick.nio.SocketConnection]
2013-11-22 06:32:41,057 [29682968] FINE: 127845 bytes transferred

Log3:

13-11-22 05:26:54 [29682968] FINE: Received file c:/demo.csv
13-11-22 05:26:54 [5f7e5f7e] FINE: Processing socket READ event

Log4:

13-11-22 00:18:03 [69b269b2] INFO: Secure storage keystore is disabled.
13-11-22 00:33:03 [29682968] INFO: file c:/demo.csv in ascii mode.

Log5:

2013-11-22 00:03:19,224 [29682968] SUCCESS: Compressed the file demo.csv: systemid=27, machinename=rsbe.bnymellon.net, module=Transformation Service, description=Transformation Service [.alarms.heartbeatgenerator.GeneratorThread]
2013-11-22 00:10:19,523 [6d886d88] SUCCESS: Success, Alarm Heartbeat updated: systemid=27, machinename=rsbe.bnymellon.net, module=Transformation Service, description=Transformation Service [.alarms.heartbeatgenerator.GeneratorThread]

Log6:

2013-11-22 00:00:29,024 [51d451d4] SUCCESS: Scan successful: There are no documents for throttler Throttling Integration with owner ENCRYPT_DECRYPT to output. [.io.agents.throttle.ThrottleAgent]
2013-11-22 00:00:36,018 [516c516c] FINE: Scanning directory [/ftxprd1/BNYM_NONPROD_01/biz_outbound] [.io.agents.dirmon.DirMonInbound]
2013-11-22 00:00:36,018 [516c516c] FINE: Scanning dir [/ftxprd1/BNYM_NONPROD_01/biz_outbound] [.io.agents.dirmon.DirMonInbound]

Log7:

2013-11-22 00:03:07,734 [19581958] SUCCESS: Success, Alarm Heartbeat updated: systemid=25, machinename=rsbe.bnymellon.net, module=Inbound Listeners, description=Inbound Listeners [.alarms.heartbeatgenerator.GeneratorThread]
2013-11-22 00:10:08,023 [29682968] SUCCESS: Success, Alarm Heartbeat updated: systemid=25, machinename=rsbe.bnymellon.net, module=Inbound Listeners, description=Inbound Listeners [.alarms.heartbeatgenerator.GeneratorThread]

kristian_kolb
Ultra Champion

You should be able to install a full Splunk instance and configure it to:

a) read some files
b) transform event data on a per event basis.
c) not index any events locally, and
d) forward them as syslog traffic

On your syslog server you can write the incoming data into a file.
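
To make step d) more concrete, here is a minimal Python sketch of what a forwarded event could look like as a BSD-style syslog line. It is not Splunk configuration and does not reflect how Splunk formats its own syslog output. The function name `to_syslog_line`, the host `ftp-host` and the tag `ftplog` are hypothetical, and the facility/severity defaults (local0, informational) are assumptions.

```python
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def to_syslog_line(ts, host, tag, message, facility=16, severity=6):
    """Build a BSD-syslog-style line: <PRI>Mmm dd hh:mm:ss host tag: message.

    Hypothetical helper for illustration only. facility=16 is local0 and
    severity=6 is informational; both defaults are assumptions.
    """
    pri = facility * 8 + severity
    # The day of month is space-padded to two characters, e.g. "Nov  5".
    stamp = f"{MONTHS[ts.month - 1]} {ts.day:2d} {ts:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {tag}: {message}"

if __name__ == "__main__":
    print(to_syslog_line(datetime(2013, 11, 22, 0, 3, 6), "ftp-host", "ftplog",
                         "INFO: Secure storage keystore is disabled."))
```

A syslog server receiving lines in this shape can write them to a single file, which is the "one file" outcome described above.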

I have had mixed results with sending syslog out of Splunk, but that was a long time ago.

Jananee_iNautix
Path Finder

Thanks for the response. If Splunk can extract specific data from different logs at index time, we thought we could continue with Splunk. We are still learning the product.

Also, can you please confirm whether the same can be achieved if Splunk were used as a Forwarder? In data routing, is it possible to select the data based on a pattern and send it to a receiver? (http://docs.splunk.com/Splexicon:Datarouting)

Can the receiver be a database or a file, i.e. can the routed data be exported?
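
To illustrate what "select the data based on a pattern and send it to a receiver" means, here is a minimal Python sketch of pattern-based routing. It only shows the concept and is not how Splunk implements data routing. The function `route_events` and the rule layout are hypothetical, and the receivers are assumed to be file-like objects (an open file would work the same way as the in-memory buffers used here).

```python
import io
import re

def route_events(lines, rules, default=None):
    """Write each line to the sink of the first rule whose pattern matches.

    rules is a list of (regex_string, file_like_sink) pairs. Lines that match
    no rule go to `default` if one is given, otherwise they are dropped.
    """
    compiled = [(re.compile(pattern), sink) for pattern, sink in rules]
    for line in lines:
        text = line.rstrip("\n")
        for pattern, sink in compiled:
            if pattern.search(text):
                sink.write(text + "\n")
                break
        else:
            if default is not None:
                default.write(text + "\n")

if __name__ == "__main__":
    logons, transfers = io.StringIO(), io.StringIO()
    events = [
        "2013-11-22 06:32:41,057 [29682968] FINE: 127845 bytes transferred",
        "13-11-22 05:26:54 [5f7e5f7e] FINE: Processing socket READ event",
    ]
    route_events(events, [(r"LOGON:", logons), (r"bytes transferred", transfers)])
    print(transfers.getvalue(), end="")
```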

kristian_kolb
Ultra Champion

One thing I wonder about: if you do not want to search the data, are you sure that you picked the right tool? While Splunk certainly can transform data on a per-event basis, its primary benefit is the ability to collect vast amounts of logs and search through them.

kristian_kolb
Ultra Champion

Please update your existing question rather than starting a new one, as this allows the people who are trying to help you to better follow your progress.

lguinn2
Legend

What Ayn said. Splunk does not combine the files. What you can do in Splunk:

1 - place all the data in the same index (probably a good idea)

2 - give all the inputs the same sourcetype (possibly a good idea)

3 - give all the inputs the same source (probably a bad idea)

4 - transform the incoming data based on source, sourcetype, host or regex pattern matching - at index time

5 - extract fields at search time or index time - based on regex pattern matching plus source, sourcetype or host
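
As a rough illustration of the regex pattern matching in points 4 and 5, here is a minimal Python sketch that pulls the fields shared by the sample events into one common shape. It is not Splunk configuration. The regex is inferred only from the sample lines in the question, the field name `id` for the bracketed value is hypothetical (the thread does not say what that value represents), and reading two-digit dates such as `13-11-22` as year-month-day is an assumption.

```python
import re
from datetime import datetime

# Header shared by the sample events:
#   <date> <time>[,<ms>] [<8 hex chars>] <LEVEL>: <message>
EVENT_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}|\d{2}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2})(?:,(?P<ms>\d{3}))? "
    r"\[(?P<id>[0-9a-f]{8})\] "
    r"(?P<level>[A-Z]+): "
    r"(?P<message>.*)$"
)

def parse_event(line):
    """Return the common fields of one event as a dict, or None if no match."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    date = m.group("date")
    # Assumption: a two-digit leading field is a two-digit year (yy-mm-dd).
    date_fmt = "%Y-%m-%d" if len(date) == 10 else "%y-%m-%d"
    ts = datetime.strptime(f"{date} {m.group('time')}", f"{date_fmt} %H:%M:%S")
    ts = ts.replace(microsecond=int(m.group("ms") or 0) * 1000)
    return {
        "timestamp": ts.isoformat(timespec="milliseconds"),
        "id": m.group("id"),
        "level": m.group("level"),
        "message": m.group("message"),
    }

if __name__ == "__main__":
    for sample in (
        "2013-11-22 06:32:41,057 [29682968] FINE: 127845 bytes transferred",
        "13-11-22 05:26:54 [29682968] FINE: Received file c:/demo.csv",
    ):
        print(parse_event(sample))
```

Both timestamp styles end up in the same normalized form, so events from different files can be compared or grouped on the shared fields regardless of which file they came from.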

I don't understand your statement that you can "transform the data if it's from one source". Having all the data be from "one source" is not a requirement of Splunk.

I am not sure what you are extracting, so I cannot give specific advice about index time parsing. However, if you want to extract fields at index time, I will tell you now that it is a bad idea to do it that way 99.9% of the time. Search time field extraction is almost always better. Everyone else in the community will say the same thing. But with more information about the data, we might see it differently and be able to give clearer advice.

So why do you need one source file? And why do you need to do this at index time?

Jananee_iNautix
Path Finder

By "one source file" I mean a single log file that combines the different FTP log files.

Ayn
Legend

What do you mean by "into one source file"? Splunk doesn't keep files that way. Splunk will however add some metadata about where it got events from (which is recorded in the "source" field). Is this what you mean? That you want to write the same source metadata for several different sources?
