About DUThibault

DUThibault · ‎01-29-2018

You're not understanding my question. Say I have:

    [source::<some_path>]
    TRANSFORMS-a = some_transform
    sourcetype = <some_sourcetype>

    [<some_sourcetype>]
    TRANSFORMS-b = some_other_transform

I'm hoping the events from <some_path> will undergo TRANSFORMS-a, receive the sourcetype <some_sourcetype> and then (consequently) undergo TRANSFORMS-b.

DUThibault · ‎01-29-2018

I am thinking of merging a variety of sources being monitored by a Universal Forwarder into a single sourcetype for indexing (and later searching) purposes. The sources each have specific pre-processing that needs to be done, and then a bunch of common processing that I can assign to the sourcetype . Suppose I have a [source::<source_spec>] stanza that specifies a number of TRANSFORMS clauses and a sourcetype = <common_sourcetype> clause, and also a [<common_sourcetype>] stanza with its own TRANSFORMS clauses. Will the source have both sets of TRANSFORMS applied? Or will the first set be ignored because the sourcetype clause "overrides" it? If I have a force_local_processing = true clause in the sourcetype stanza, will the Universal Forwarder also process the search-time REPORT and EXTRACT clauses? The FIELDALIAS , EVAL , LOOKUP clauses? I suspect no on both counts. I know SEDCMD clauses are applied at index-time, but are they applied before TRANSFORMS ? Is the order in which they appear in a stanza significant?

DUThibault · ‎01-25-2018

The niketnilay code is incorrect. A zebibyte is not 1152921504606846976 bytes (that's an exbibyte) but rather 1180591620717411303424 bytes. This code should be closer (case() returns the first branch that matches, so the ZiB test has to come first):

    eval $bytes$=case(
        $bytes$>=1180591620717411303424, tostring(round($bytes$/1180591620717411303424,2))+" ZiB",
        $bytes$>=1152921504606846976 AND $bytes$<1180591620717411303424, tostring(round($bytes$/1152921504606846976,2))+" EiB",
        $bytes$>=1125899906842624 AND $bytes$<1152921504606846976, tostring(round($bytes$/1125899906842624,2))+" PiB",
        $bytes$>=1099511627776 AND $bytes$<1125899906842624, tostring(round($bytes$/1099511627776,2))+" TiB",
        $bytes$>=1073741824 AND $bytes$<1099511627776, tostring(round($bytes$/1073741824,2))+" GiB",
        $bytes$>=1048576 AND $bytes$<1073741824, tostring(round($bytes$/1048576,2))+" MiB",
        $bytes$>=1024 AND $bytes$<1048576, tostring(round($bytes$/1024,2))+" KiB",
        $bytes$<1024, tostring($bytes$)+" Bytes")
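As a sanity check on the constants quoted above, the binary-prefix thresholds are just successive powers of 1024. A minimal Python sketch (Python is used only to check the arithmetic; it is not part of the Splunk macro, and the names IEC_UNITS and IEC_BYTES are made up for this illustration):

```python
# Binary (IEC) prefixes are successive powers of 1024 (2**10).
IEC_UNITS = ["KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB"]

# Map each unit to its size in bytes: KiB = 1024**1, MiB = 1024**2, ...
IEC_BYTES = {unit: 1024 ** (i + 1) for i, unit in enumerate(IEC_UNITS)}

for unit, size in IEC_BYTES.items():
    print(f"{unit}: {size}")
```

The EiB entry is 2**60 and the ZiB entry is 2**70, which matches the two values discussed in this post.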

DUThibault · ‎01-25-2018

The format_bytes code yields gibibytes (GiB), not gigabytes (GB). Use either:

    if($bytes$>1073741824, tostring(round($bytes$/1073741824,2))+" GiB", if($bytes$>1048576, tostring(round($bytes$/1048576,2))+" MiB", if($bytes$>1024, tostring(round($bytes$/1024))+" KiB", tostring($bytes$)+" Bytes")))

or:

    if($bytes$>1000000000, tostring(round($bytes$/1000000000,2))+" GB", if($bytes$>1000000, tostring(round($bytes$/1000000,2))+" MB", if($bytes$>1000, tostring(round($bytes$/1000))+" kB", tostring($bytes$)+" Bytes")))
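To make the GiB versus GB distinction concrete, here is a rough Python sketch of the same idea with the base as a parameter. It is an illustration only, not the Splunk macro: the function name and the binary flag are assumptions, it uses >= where the eval expressions use >, and it keeps two decimals for every unit.

```python
def format_bytes(n, binary=True):
    """Return n bytes as a human-readable string.

    binary=True  -> IEC units (KiB, MiB, GiB), base 1024
    binary=False -> SI units (kB, MB, GB), base 1000
    """
    base = 1024 if binary else 1000
    units = ["KiB", "MiB", "GiB"] if binary else ["kB", "MB", "GB"]
    # Walk from the largest unit down and use the first one that fits.
    for power in range(len(units), 0, -1):
        threshold = base ** power
        if n >= threshold:
            return f"{round(n / threshold, 2)} {units[power - 1]}"
    return f"{n} Bytes"


print(format_bytes(1073741824))                # binary units
print(format_bytes(1073741824, binary=False))  # decimal units
```

The same byte count comes out as 1.0 GiB in binary units but 1.07 GB in decimal units, which is the discrepancy this post is about.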

DUThibault · ‎01-22-2018

Trying a different tack. I saw with https://answers.splunk.com/answers/593409/transformsconf-wont-let-me-change-the-sourcetype.html that I can change the sourcetype of events, so I figured I would use the csv source (e.g. .../cpu-*/cpu-nice-*), transform its _raw data into linux:collectd:graphite format, and switch its sourcetype from collectd_csv_cpu_nice to linux:collectd:graphite. But it seems I failed, as the Splunk instance continues to receive just collectd_csv_cpu_nice data in the untransformed format.

To be clear, the csv files have lines like this:

    <unix_timestamp>,<cpu_nice_jiffies>

whereas linux:collectd:graphite has lines like this:

    <host>.cpu-<cpu>.cpu-idle.value <cpu_nice_jiffies> <unix_timestamp>

(Actually it expects percentages in floating point, but my old system (collectd 4.10) cannot supply that, only integer jiffy counts. I'm sure Splunk_TA_linux won't mind...much.)

So I added to props.conf and transforms.conf on the Splunk instance and on the Forwarder:

props.conf

    [collectd_csv_cpu_nice]
    DATETIME_CONFIG =
    HEADER_FIELD_LINE_NUMBER = 1
    INDEXED_EXTRACTIONS = csv
    NO_BINARY_CHECK = true
    SHOULD_LINEMERGE = false
    TIME_FORMAT = %s
    category = Metrics
    description = collectd CSV cpu-nice metric
    disabled = false
    pulldown_type = 1
    REPORT-COLLECTD-CSV-CPU-NUMBER = TRANSFORM-COLLECTD-CSV-CPU-NUMBER
    REPORT-COLLECTD-CSV-CPU-NICE = REPORT-COLLECTD-CSV-CPU-NICE
    REPORT-COLLECTD-CSV-CPU-NICE-PAYLOAD = TRANSFORM-COLLECTD-CSV-CPU-NICE-PAYLOAD
    REPORT-COLLECTD-CSV-CPU-NICE-SOURCETYPE = TRANSFORM-COLLECTD-CSV-CPU-NICE-SOURCETYPE

transforms.conf

    # Extracts the CPU number from the source's enclosing directory name
    [TRANSFORM-COLLECTD-CSV-CPU-NUMBER]
    FORMAT = cpu::$1
    REGEX = ^.*/cpu-([0-9]+)/
    SOURCE_KEY = source

    # Overall input format
    [REPORT-COLLECTD-CSV-CPU-NICE]
    DELIMS = ","
    FIELDS = "unix_timestamp","cpu_nice_jiffies"

    # Rewrites the _raw line to conform to linux:collectd:graphite format
    [TRANSFORM-COLLECTD-CSV-CPU-NICE-PAYLOAD]
    REGEX = (.*?)
    FORMAT = _raw::$host.cpu-$cpu.cpu-idle.value $cpu_nice_jiffies $unix_timestamp

    # Changes the sourcetype to linux:collectd:graphite
    [TRANSFORM-COLLECTD-CSV-CPU-NICE-SOURCETYPE]
    DEST_KEY = MetaData:Sourcetype
    REGEX = (.*?)
    FORMAT = sourcetype::linux:collectd:graphite

I probably wrote the TRANSFORM-COLLECTD-CSV-CPU-NICE-PAYLOAD FORMAT line all wrong. Help?
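For reference, the rewrite I am after can be sketched outside Splunk. This is a minimal Python illustration of the intended csv-to-Graphite mapping, not a Splunk configuration. It assumes the directory layout /var/collectd/csv/<host>/cpu-<n>/cpu-<metric>-<date>, and the names SOURCE_RE and csv_to_graphite are hypothetical. Unlike the FORMAT line above, it takes the metric name from the file name rather than hard-coding cpu-idle.

```python
import re

# Assumed layout: /var/collectd/csv/<host>/cpu-<n>/cpu-<metric>-<date>
SOURCE_RE = re.compile(r"/([^/]+)/cpu-([0-9]+)/cpu-([a-z]+)-[^/]*$")


def csv_to_graphite(source, line):
    """Rewrite '<unix_timestamp>,<value>' as '<path> <value> <timestamp>'."""
    match = SOURCE_RE.search(source)
    if match is None:
        raise ValueError(f"unexpected source path: {source}")
    host, cpu, metric = match.groups()
    timestamp, value = line.strip().split(",")
    return f"{host}.cpu-{cpu}.cpu-{metric}.value {value} {timestamp}"


print(csv_to_graphite(
    "/var/collectd/csv/sv3vm5b/cpu-0/cpu-nice-2018-01-22",
    "1513025715,491259",
))
```

The point of the sketch is that the host and CPU number come from the source path while the value and timestamp come from the event text, so the rewrite needs both inputs.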

DUThibault · ‎01-22-2018

To change the indexed source and/or sourcetype after the fact, you need to export the raw data, delete it from the index, then re-import it. This is explained here: https://answers.splunk.com/answers/39756/change-source-and-sourcetype.html

In summary:

1) Export the logs with incorrect sourcetype so you have the raw, original logs:
    splunk search "index=myindex sourcetype=wrong_source_type" -maxout 0 -output rawdata > raw.logs
   (Optionally also specify earliest=<timestamp> latest=<timestamp> -preview 0 etc.)
2) Delete the logs with incorrect sourcetype:
    index=myindex sourcetype=wrong_source_type | delete
   (Use the same arguments as for the search. You have to add the can_delete role to your account before doing this.)
3) Re-index the raw logs:
    splunk add oneshot raw.logs -host myhost -index myindex -sourcetype correct_sourcetype -rename-source correct_source
   (Without -rename-source, the re-indexed events would have source=raw.logs.)

DUThibault · ‎01-22-2018

Just a note: Not all stanzas in transforms.conf will require REGEX, as extractions can end up listed in that file too (confusing, isn't it?). Also, your REGEX does not need to capture anything since you're not using the capture result. REGEX = .* will do just fine.

DUThibault · ‎01-12-2018

See you here: https://answers.splunk.com/answers/598234/importing-collectd-csv-data-for-consumption-by-spl.html And thanks for helping me!

DUThibault · ‎01-12-2018

Some progress.

    source type: collectd_csv_cpu_idle
    dest app: Search & Reporting
    category: Custom (would Metrics be better?)
    indexed extractions: csv
    timestamp:
        extraction: Advanced
        time zone: auto
        timestamp format: %s
        timestamp fields: (blank)
    Delimited settings:
        field delimiter: comma
        quote character: double quote (unused)
        File preamble: (blank)
        Field names: Line...
        Field names on line number: 1
    Advanced:
        SHOULD_LINEMERGE: false (was true by default but since csv and collectd_http use false this makes more sense)

Then defined some extractions and transformations.

    REPORT-COLLECTD-CSV-CPU-IDLE transformation
        type: delimiter-based
        delimiters: ","
        field list: "unix_timestamp","cpu_idle_jiffies"
        source key: _raw
    TRANSFORM-COLLECTD-CSV-CPU-NUMBER transformation
        type: regex-based
        regular expression: ^.*/cpu-([0-9]+)/
        format: cpu::$1
        source key: source
    collectd_csv_cpu_idle : REPORT-COLLECTD-CSV-CPU-IDLE extraction
        extraction/transform: REPORT-COLLECTD-CSV-CPU-IDLE
    collectd_csv_cpu_idle : REPORT-COLLECTD-CSV-CPU-NUMBER extraction
        extraction/transform: TRANSFORM-COLLECTD-CSV-CPU-NUMBER

The collectd file header was still getting through, so based on answer 586952 I've tried copying the Splunk instance's /opt/splunk/etc/apps/search/local/props.conf and transforms.conf to the universal forwarder's /opt/splunkforwarder/etc/apps/_server_app_<server class>/local/. Since collectd generates new files every day, we'll know tomorrow if this has gotten rid of the headers being read as data (event _raw string "epoch,value").

I can readily extend this pattern of sourcetypes, transformations and extractions to map the rest of the collectd data (df, interface, irq and so on). Now the problem is how to map this into the CIM. According to http://docs.splunk.com/Documentation/AddOns/released/Linux/Configure2 for instance, the Splunk Add-on for Linux expects the sourcetype linux:collectd:http:json but this does not appear in my list of sourcetypes, so I can't even inspect it to know what's in it.

DUThibault · ‎01-12-2018

How do I "link you"? I don't see anything resembling that on my original question's page.

DUThibault · ‎01-12-2018

So I should copy [Splunk Instance]/opt/splunk/etc/apps/search/local/props.conf and transforms.conf to [Splunk Universal Forwarder]/opt/splunkforwarder/etc/apps/_server_app_<server class>/local/ , correct?

DUThibault · ‎01-12-2018

Please clarify: when you say "you need to make sure you put the props/transforms on the forwarder", do you mean a forwarding Splunk instance, or do you mean a Splunk Universal Forwarder?

DUThibault · ‎01-11-2018

Having the Web interface state "default is" sounds like a lie, then. Okay, this is starting to make sense. The process is:

1) Create a transformation (Settings: (Knowledge) Fields: Field transformations: New)
2) Edit its permissions (if needed)
3) Create an extraction (Settings: (Knowledge) Fields: Field extractions: New) that uses the transformation
4) Edit its permissions (if needed)

The transformation:

    destination app: search
    name: TRANSFORM-COLLECTD-CSV-CPU-NUMBER
    type: regex-based
    regular expression: ^.*/cpu-([0-9]+)/
    source key: source

The extraction:

    destination app: search
    name: COLLECTD-CSV-CPU-NUMBER (this will get a REPORT- prefix)
    apply to: sourcetype
    named: collectd_csv_cpu_idle
    type: uses transform
    extraction/transform: TRANSFORM-COLLECTD-CSV-CPU-NUMBER

The extraction will be listed as collectd_csv_cpu_idle : REPORT-COLLECTD-CSV-CPU-NUMBER. I can then create more extractions that use the same transform for other sourcetypes (e.g. collectd_csv_cpu_interrupt : REPORT-COLLECTD-CSV-CPU-NUMBER, collectd_csv_cpu_nice : REPORT-COLLECTD-CSV-CPU-NUMBER, collectd_csv_cpu_softirq : REPORT-COLLECTD-CSV-CPU-NUMBER, collectd_csv_cpu_steal : REPORT-COLLECTD-CSV-CPU-NUMBER, collectd_csv_cpu_system : REPORT-COLLECTD-CSV-CPU-NUMBER, collectd_csv_cpu_user : REPORT-COLLECTD-CSV-CPU-NUMBER, collectd_csv_cpu_wait : REPORT-COLLECTD-CSV-CPU-NUMBER).

DUThibault · ‎01-11-2018

The slashes do not need escaping, and naming the capture group seems redundant (wouldn't the format then become "cpu::$cpu"?).

DUThibault · ‎01-10-2018

I have these events that come with a source attribute something like source = /var/collectd/csv/sv3vm5b/cpu-0/cpu-idle-2018-01-10 and I need to extract the CPU number (the cpu-0 part, which can also be cpu-1 , cpu-2 , or cpu-3 ). So I tried to create (for my sourcetype) a transformation ( Fields: Field transformations: Add new ). The destination app is search , the new field name is cpu , the type is regex-based with the regular expression ^.*/cpu-([0-9]+)/ and the source key source . According to the form, the default format ( <transform_stanza_name>::$1 ) should do just fine so I leave the Format box blank. But it won't save, yielding this error message: Encountered the following error while trying to save: Invalid FORMAT: (I would add a screen capture but I don't have enough karma yet). Help?
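To show what the regular expression is meant to capture, here is a small Python check against the example source path from this post. It only illustrates the pattern; Splunk uses PCRE rather than Python's re module, though for a pattern this simple the two should behave the same. The variable names are made up for the illustration.

```python
import re

# Same pattern as in the transformation: capture the digits after "/cpu-".
CPU_RE = re.compile(r"^.*/cpu-([0-9]+)/")

source = "/var/collectd/csv/sv3vm5b/cpu-0/cpu-idle-2018-01-10"
match = CPU_RE.search(source)

# group(1) is what $1 would refer to in a FORMAT such as cpu::$1
cpu = match.group(1) if match else None
print(cpu)
```

The trailing slash in the pattern is what keeps it from matching the cpu-idle-... file name itself.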

DUThibault · ‎12-15-2017

I’m trying to use the Splunk_TA_linux app (3412) with an old system (CentOS 5 vintage) as the target. Getting collectd to send its observations to Splunk is problematic (the collectd version is too old and I’m limited as to what I can change on the target), so I’ve been forced to set up collectd to merely dump its data locally in csv format, and I intend to have the Universal Forwarder monitor the data dump directory.

The problem is in converting the event formats into what Splunk_TA_linux expects, namely event types such as linux_collectd_cpu, linux_collectd_memory, and so forth. I think I need to define a bunch of new sourcetypes, which will manipulate the events to transform them into the various event types expected. The forwarder is limited to INDEXED_EXTRACTIONS, but that should be enough.

Collectd has been configured to monitor several system metrics, and uses its csv plugin for output. The csv files go in the /var/collectd/csv folder. Collectd then creates a single subfolder, named using <hostname> (in this case, sv3vm5b.etv.lab). There are then a bunch of subfolders for the various metrics: cpu-0, cpu-1, cpu-2, cpu-3, df, disk-vda, disk-vda1, disk-vda2, interface, irq, load, memory, processes, processes-all, swap, tcp-conns-22-local, tcp-conns-111-local, tcp-conns-698-local, tcp-conns-2207-local, tcp-conns-2208-local, tcp-conns-8089-local, uptime. The cpu-* folders are tracking several cpu metrics (idle, interrupt, nice, softirq, steal, system, user, wait). The first metric (CPU idle time) generates daily files, e.g. cpu-idle-2017-12-12, cpu-idle-2017-12-13, etc. This pattern is the same for each metric. The contents of cpu-idle-<date> are:

    epoch,value
    1513025715,491259
    1513025725,492242
    ...

Again, this pattern is the same for the other files: a header line listing the fields (although the names are pretty generic), then regular measurements consisting of a Unix timestamp followed by one to three integer or floating-point values.
What the collectd input plugins measure is documented on the collectd wiki. In collectd JSON or Graphite mode, the Splunk source type is linux:collectd:http:json or linux:collectd:graphite, the event type is linux_collectd_cpu and the data model is "ITSI OS Model Performance.CPU". Splunk_TA_linux's eventtypes.conf ties linux_collectd_cpu to the two source types, so this gives rise to a first question: Will Splunk_TA_linux's eventtypes.conf need tweaking?

Assuming I set the forwarder to monitor /var/collectd/csv/*/cpu-*/cpu-idle-* (can I specify paths using wildcards like that?), I could then set the source type for those daily files as a custom type. The process would be repeated for the various other collectd files and folders, resulting in a slew of custom source types.

    source type: collectd_csv_cpu_idle
    dest app: Search & Reporting (should this be Splunk_TA_linux?)
    category: Metrics
    indexed extractions: csv
    timestamp: auto (this will recognise a Unix timestamp, right?)
    field delimiter: comma
    quote character: double quote (unused)
    File preamble: ^epoch,value$
    Field names: custom

…and that’s where I’m stumped. This expects a comma-separated list of field names. Is the first one _time or is that assumed? The “ITSI OS Model Performance.CPU” documentation has no fields for the jiffy counts (cpu-idle, -interrupt, -nice, -softirq, -steal, -system, -user, -wait are reporting the number of jiffies spent in each of the possible CPU states, respectively idle, IRQ, nice, softIRQ, steal, system, user, wait-IO) but does have cpu_time and cpu_user_percent fields. Isn’t there supposed to be a correspondence? Is Splunk_TA_linux further transforming the collectd inputs to fit them to the data models, so that I need more than just INDEXED_EXTRACTIONS? And what about those fields that can only be extracted from the source paths, like the host (sv3vm5b.etv.lab) and number of CPUs, for instance?
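To pin down the csv layout described in this post (one header line, then a Unix timestamp followed by one to three numeric values), here is a minimal Python sketch of a reader for such a file. It has nothing to do with how Splunk parses the data; the function name read_collectd_csv is hypothetical and the sample rows are the ones quoted above.

```python
import csv
import io


def read_collectd_csv(text):
    """Yield (epoch, [values...]) tuples, skipping the header line."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # discard the header, e.g. "epoch,value"
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        epoch = int(row[0])
        values = [float(v) for v in row[1:]]  # one to three values per row
        yield epoch, values


sample = "epoch,value\n1513025715,491259\n1513025725,492242\n"
rows = list(read_collectd_csv(sample))
for epoch, values in rows:
    print(epoch, values)
```

The header has to be dropped explicitly; otherwise "epoch,value" would be treated as a data row, which is the same problem as the header being indexed as an event.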

DUThibault · ‎12-11-2017

Another approach, particularly when collectd is monitoring a remote system on which a Splunk Universal Forwarder is installed, would be to select the CSV plugin for collectd output. The csv output directory (DataDir setting) would then be monitored by the forwarder.

DUThibault · ‎12-05-2017

And how does one disable the search? At first I thought it would be on the Search or Configure panels of Linux Auditd but no...You have to go to 'Settings: (Knowledge) Searches, reports, and alerts', and then filter by 'App: Linux Auditd Technology Add-On (TA_linux-auditd)'. Locate the "Update learnt_posix_identities KVStore collection" search and then choose Disable from the Edit drop-down.

DUThibault · ‎12-04-2017

First mentioned by damien_chillet, this is by far the simplest approach. It would also be amenable to being made into a macro if I were so inclined.

DUThibault · ‎12-04-2017

DUThibault · ‎12-01-2017

It gets closer, but the fragment:

    | inputlookup myfieldstolookup.csv | table fieldName | rename fieldName as query | format "" "" "" "" "" ""

now yields:

    "Authentication.action" "Authentication.app" "Authentication.dest" "Authentication.src" "Authentication.src_user" "Authentication.tag" date_zone eventtype host index info linecount source sourcetype splunk_server "tag::action" "tag::eventtype"

which again fails to filter the fields. Looks like the quotes get added when "punctuation" (i.e. colons and periods) is present in the field names.

DUThibault · ‎12-01-2017

http://docs.splunk.com/Documentation/Splunk/7.0.0/Knowledge/Usefieldlookupstoaddinformationtoyourevents says "The table in the CSV file should have at least two columns", which is what initially led me to put two columns in there.

DUThibault · ‎12-01-2017

The macro solution is a good one, simpler than inputlookup . By the same token, it may be easier to do: | fields + `date_fields_to_keep` (Although I do realize that the two solutions are not equivalent when the events don't have a uniform field signature)

DUThibault · ‎12-01-2017

I tried:

    | datamodel Authentication Successful_Authentication search
    | search sourcetype=audittrail
    | table * [|inputlookup myfieldstolookup.csv | eval query="searchTerm=".fieldName | table query | format ]
    | fieldsummary

where I used the Lookup Editor app to create myfieldstolookup.csv but I'm not sure what the contents of that file should be. I tried a file that looks like:

    fieldName,fieldNameOut
    Authentication.action,Authentication.action
    ...
    tag::eventtype,tag::eventtype

Far from filtering on the 17 fields I want, it added 21 fields (addr, auid, dev...) to the unfiltered 49-field search. The fragment:

    [|inputlookup myfieldstolookup.csv | eval query="searchTerm=".fieldName | table query | format ]

returns

    ( ("searchTerm=Authentication.action") OR ... )

instead of the expected

    ( (searchTerm=Authentication.action) OR ... )

What am I doing wrong?

DUThibault · ‎12-01-2017

What makes this bug particularly sneaky is that regardless of where you reached the Data Model Editor page from, the "Apps >" menu reverts to "Apps >" so you can't tell whether or not it'll be broken.

Posts	121
Solutions	4
Karma Given	34
Karma Received	10
Member Since	‎06-22-2016

Online Status	Offline
Date Last Visited	‎06-05-2020 02:04 AM

Shouldn't Splunk hire an historian?

Signaling a problem with a Splunk Web page

How to push *.conf to universal forwarders?

How to split an event at index time?

Do TRANSFORMS in a source stanza and a sourcetype ...

Invalid FORMAT when creating a field transformatio...

Importing collectd csv data for consumption by Spl...

Excluding a field name from fields command exclusi...

Common Information Model (CIM) Data Model Editor m...

Where is the logtype source type defined?
