How to configure Splunk to parse Perforce structured logging data in CSV files?

JScordo
Path Finder

I am trying to ingest the structured logs from our main Perforce server. I have the structured logs split out into multiple files (Commands.csv, Errors.csv, Audit.csv, Track.csv, User.csv, Events.csv, Integrity.csv, and Auth.csv). The issue is that these files do not share a single format. Perforce writes an "event type" number (1-16) as the first field of every single event, and each number implies a different set of fields for that event. The Commands and Track CSVs contain multiple event types. Has anyone encountered this yet, or does anyone know how to split out the field values based on that first field? Below are some sample events from Track.csv; as you can see, event types 7, 8, 9, and 14 each carry a different set of fields.

9,1464803105,104007315,2016/06/01 18:45:05 104007315,144447,1,user,server,user-sync,IP,p4,2015.2/NTX64/1311674,file,db,db.counters,3,0,2,0,0,0,0,1,0,0,0,0
8,1464803105,104007315,2016/06/01 18:45:05 104007315,144447,1,user,server,user-sync,IP,p4,2015.2/NTX64/1311674,file,rpc,2,4,222,483,318788,523588,92,0
7,1464803105,104007315,2016/06/01 18:45:05 104007315,144447,1,user,server,user-sync,IP,p4,2015.2/NTX64/1311674,file,usage,.098s,3,2,0,8,0,0,4332,0
14,1464803105,103663508,2016/06/01 18:45:05 103663508,144447,1,user,server,user-sync,IP,p4,2015.2/NTX64/1311674,file,0,0,0,0,0,0
14,1464803105,100381835,2016/06/01 18:45:05 100381835,144447,1,user,server,user-sync,IP,p4,2015.2/NTX64/1311674,file,0,0,0,0,0,0

gabriel_vasseur
Contributor

If I understood you correctly, it sounds like you need to configure a per-event sourcetype. Have a look at the documentation on overriding sourcetypes per event and let us know if it makes sense.
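
For later readers, the mechanics of a per-event sourcetype are an index-time transform that rewrites MetaData:Sourcetype whenever a regex matches the raw event. A minimal sketch, assuming the track.csv path used further down this thread (the track_log_7 sourcetype name is made up for illustration):

transforms.conf

[set_st_track_7]
REGEX = ^7,
FORMAT = sourcetype::track_log_7
DEST_KEY = MetaData:Sourcetype

props.conf

[source::/p4rotatedlogs/structuredlogs/track.csv]
TRANSFORMS-set_st = set_st_track_7

One stanza like this per event type, all listed on the same TRANSFORMS- line, routes each schema to its own sourcetype. Being index-time, it has to live where parsing happens (indexer or heavy forwarder) and only affects newly indexed data.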


gabriel_vasseur
Contributor

About your older comment: First up, I'm not familiar with REPORT-fields and I have no time to google right now, but did it work at all? Did the Eventtype field get populated with numbers? I wouldn't be surprised if the other fields didn't get populated as I think there is an issue in your regexes. For instance in:

[schema_14]
 REGEX = ^(14)([^,]*)
 FORMAT = Eventtype::$1 timestamp::$2 timestamp2::$3 date::$4 pid::$5 cmdno::$6 user::$7 client::$8 func::$9 host::$10 prog::$11 version::$12 args::$13 filesAdded::$14 fileUpdated::$15 filesDeleted::$16 bytesAdded::$17 bytesUpdated::$18 bytesDeleted::$19

Your regex will match the 14 at the start of the line and assign it to $1, which should then go into Eventtype. But straight after the 14 there is a comma in the data, therefore $2 will be empty. And of course there is no $3, $4, etc... Also be aware that you can't just repeat a single capture group (as in (?:,([^,]*)){18}), because the regex engine only keeps the capture from the last repetition; every field needs its own group. I believe the regex line should be something like:

 REGEX = ^(14),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*)

You basically need to match the comma as a separator and then repeat the whole comma-followed-by-some-non-comma-chars pattern, in its own capture group each time, once for every remaining field (18 of them for the 19-field schema_14 events).

Now, about your second comment: this is closer to the kind of thing I've seen before. If I'm not mistaken, this should set the sourcetype field to "command_log" if the event line starts with 7. Did that work? If yes, then you're mostly there, as you've found a way to separate your initial data into bunches that share the same schema. Just create as many TRANSFORMS as there are schemas. The next step is to define the format of each sourcetype somewhere. I'm not sure how to do that, but that's a much simpler issue than your initial problem.
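
To make that last step concrete: once every event in a sourcetype shares one schema, the format can be defined with a plain delimiter extraction instead of a long regex. A sketch, assuming the command_log sourcetype just mentioned and borrowing the schema_7 field names from further down the thread (this only holds if the field values themselves never contain embedded commas):

props.conf

[command_log]
REPORT-perforce = command_log_fields

transforms.conf

[command_log_fields]
DELIMS = ","
FIELDS = "Eventtype", "timestamp", "timestamp2", "date", "pid", "cmdno", "user", "client", "func", "host", "prog", "version", "args", "tracktype", "timer", "utime", "stime", "io_in", "io_out", "net_in", "net_out", "maxrss", "page_faults"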

I appreciate I'm not being extremely helpful here... I'm afraid it's the blind leading the blind! Don't hesitate to comment back. Hopefully you'll be able to make some progress...


JScordo
Path Finder

So, from my understanding of the per-event sourcetype approach, this is my new props and transforms. I am still confused about where I would set the field names for the events:

transforms.conf

[schema_7]
REGEX = ^(7)([^,]*)
FORMAT = sourcetype::command_log
DEST_KEY = MetaData:Sourcetype

props.conf

[source::/p4rotatedlogs/structuredlogs/track.csv]
TRANSFORMS-fields = schema_7 
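
As written this only re-types event type 7; types 8, 9, and 14 would keep the original sourcetype. The remaining stanzas follow the same pattern; a sketch (the track_log_* sourcetype names and *_st stanza names are made up, chosen to avoid clashing with the REPORT stanzas elsewhere in the thread):

transforms.conf

[schema_8_st]
REGEX = ^8,
FORMAT = sourcetype::track_log_8
DEST_KEY = MetaData:Sourcetype

[schema_9_st]
REGEX = ^9,
FORMAT = sourcetype::track_log_9
DEST_KEY = MetaData:Sourcetype

[schema_14_st]
REGEX = ^14,
FORMAT = sourcetype::track_log_14
DEST_KEY = MetaData:Sourcetype

props.conf

[source::/p4rotatedlogs/structuredlogs/track.csv]
TRANSFORMS-fields = schema_7, schema_8_st, schema_9_st, schema_14_st

The field names for each event type then get attached to its sourcetype with a REPORT- stanza in props.conf, as sketched earlier in the thread.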

gabriel_vasseur
Contributor

Hi JScordo, did you manage to get it working in the end?


JScordo
Path Finder

Yes, I guess a per-event sourcetype is the direction I need to go. I will read into that. The road I have gone down is based on the answer/comments here:
https://answers.splunk.com/answers/316273/field-extracting-lines-from-a-single-file-based-on.html
and this helped me get to these props and transforms. Clearly it isn't working, though.

transforms.conf

[schema_7]
REGEX = ^(7)([^,]*)
FORMAT = Eventtype::$1 timestamp::$2 timestamp2::$3 date::$4 pid::$5 cmdno::$6 user::$7 client::$8 func::$9 host::$10 prog::$11 version::$12 args::$13 tracktype::$14 timer::$15 utime::$16 stime::$17 io_in::$18 io_out::$19 net_in::$20 net_out::$21 maxrss::$22 page_faults::$23

[schema_8]
REGEX = ^(8)([^,]*)
FORMAT = Eventtype::$1 timestamp::$2 timestamp2::$3 date::$4 pid::$5 cmdno::$6 user::$7 client::$8 func::$9 host::$10 prog::$11 version::$12 args::$13 tracktype::$14 recvCount::$15 sendCount::$16 recvBytes::$17 sendBytes::$18 rpc_hi_mark_fwd::$19 rpc_hi_mark_rev::$20 recvTime::$21 sendTime::$22

[schema_9]
REGEX = ^(9)([^,]*)
FORMAT = Eventtype::$1 timestamp::$2 timestamp2::$3 date::$4 pid::$5 cmdno::$6 user::$7 client::$8 func::$9 host::$10 prog::$11 version::$12 args::$13 tracktype::$14 dbName::$15 pagesIn::$16 pagesOut::$17 pagesCached::$18 reorderIntl::$19 reorderLeaf::$20 readLocks::$21 writeLocks::$22 f_gets::$23 f_positions::$24 f_scans::$25 f_puts::$26 f_deletes::$27

[schema_14]
REGEX = ^(14)([^,]*)
FORMAT = Eventtype::$1 timestamp::$2 timestamp2::$3 date::$4 pid::$5 cmdno::$6 user::$7 client::$8 func::$9 host::$10 prog::$11 version::$12 args::$13 filesAdded::$14 fileUpdated::$15 filesDeleted::$16 bytesAdded::$17 bytesUpdated::$18 bytesDeleted::$19

props.conf

[track_log]
SHOULD_LINEMERGE=false
KV_MODE = none
TIME_FORMAT = %Y/%m/%d %H:%M:%S
TZ = +1:00
REPORT-fields = schema_7, schema_8, schema_9, schema_14
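
One hedged footnote on the timestamp side of this config: the human-readable time is the fourth CSV field, sitting behind the event type and two epoch-style numbers, so it may be worth anchoring Splunk's timestamp recognition explicitly. A possible addition to the [track_log] stanza (untested; the prefix regex simply assumes three leading comma-separated numeric fields):

TIME_PREFIX = ^\d+,\d+,\d+,
MAX_TIMESTAMP_LOOKAHEAD = 19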