How to convert raw csv data into table format

Splunk_sid · ‎06-18-2024

Hi Team,

We have onboarded csv data into Splunk and each row in csv is ingested into _raw field . I need to bring this back to tabular format and run query against it. Kindly assist.

Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.

Regards,

Sid

yuanliu · ‎06-21-2024

I assume that each CSV file is ingested as one source. In that case, a search by source would retrieve all entries in that file. All you need is to add back headers. But keep in mind:

You cannot get the original order of headers back. Headers will be in ASCII order.
You cannot get the original row order back. Rows will be in matrix ASCII order.
If your original headers contain patterns incompatible with Splunk header standards, those headers will be changed to Splunk headers.

Here, I will use an ordinary CSV as example:

c, b, a
1,2,3
4,,5
,6,7
8,,

You can use the following

index=myindex sourcetype=mysourcetype source=mycsv
| tojson
| appendpipe
    [foreach *
        [eval header = mvappend(header, "<<FIELD>>")]]
| fields header
| where isnotnull(header)
| stats values(header) as header values(_raw) as json
| foreach json mode=multivalue
    [ eval csv = mvappend(csv, "\"" . mvjoin(mvmap(header, if(isnull(spath(<<ITEM>>, header)),"", spath(<<ITEM>>, header))), "\",\"") . "\"")]
| eval csv = mvjoin(header, ",") . "
" . mvjoin(csv, "
")

The mock data will give a csv field with the following value:

a,b,c
"3","2","1"
"5","","4"
"7","6",""
"","","8"

As explained above, much depends on the original CSV. For example,

I took precaution to quote each "cell" even though your original CSV may not use quotation marks.
I did not quote headers, even though your original CSV may have used quotation marks.
With certain column names and/or cell values, additional coding is needed.

You can play with the following data emulation and compare with real data.

| makeresults format=csv data="c, b, a
1,2,3
4,,5
,6,7
8,,"
| foreach *
    [eval _raw = if(isnull(_raw), "", _raw . ",") . if(isnull(<<FIELD>>), "", <<FIELD>>)]
``` the above emulates
index=myindex sourcetype=mysourcetype source=mycsv
```

Splunk_sid · ‎06-18-2024

@inventsekar There are multiple csv files from which data gets loaded into Splunk. So, the _raw will have column headers and other rows for each file. All I need is to convert back into rows and columns format just like what we see in csv. "table" command will not serve the purpose for my scenario.

inventsekar · ‎06-18-2024

Hi @Splunk_sid .. We may need more details from your side.

Your current search query, what table format you are looking for, ...

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !

Splunk_sid · ‎06-19-2024

@inventsekar We have onboarded the data coming from csv files in inputs.conf as below and the data is loaded every day as new csv is created with date stamp.

[monitor:<path>/file_*.csv]

disabled = false

sourcetype = <sourcetype>

index=<index>

With this config, we are getting the data into splunk and each row in csv is loaded as separate event.

Query: index=<index> sourcetype=<sourcetype>.

All we need is to see the data similar to csv ie. we need to have a single line header and corresponding data for those columns (just how we see when we load the csv from Add inputs via Splunk GUI).

As of now, events are like as shown below and it repeats every day for new csv files

6/17/24
3:07:26.000 AM

col1,col2,col3,col4,col5,col6

host = <host>
source =<source>
sourcetype =<sourcetype>

6/17/24
3:07:26.000 AM

data1,data2,data3,data4,data5,data6

host = <host>
source =<source>
sourcetype =<sourcetype>

We need the output in below format when we run query:

col1	col2	col3	col4	col5	col6
data1	data2	data3	data4	data5	data6

Regards,

Sid

inventsekar · ‎06-18-2024

Hi @Splunk_sid

>>> Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.
so you have onboarded the CSV file or not yet?

if you have onboarded the CSV file, then, just use the table command..

index=yourCSVindex source=someSource sourcetype=some | table *

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !

How to convert raw csv data into table format

eval

field extraction

regex

AppDynamics Summer Webinars

SOCin’ it to you at Splunk University

Credit Card Data Protection & PCI Compliance with Splunk Edge Processor

Are you a member of the Splunk Community?

How to convert raw csv data into table format

eval

field extraction

regex

AppDynamics Summer Webinars

SOCin’ it to you at Splunk University

Credit Card Data Protection & PCI Compliance with Splunk Edge Processor