Splunk Search

How to convert raw csv data into table format

Splunk_sid
Explorer

Hi Team,

We have onboarded csv data into Splunk and each row in csv is ingested into _raw field . I need to bring this back to tabular format and run query against it. Kindly assist.

Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.

Regards,

Sid

Labels (3)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

I assume that each CSV file is ingested as one source.  In that case, a search by source would retrieve all entries in that file.  All you need is to add back headers.  But keep in mind:

  1. You cannot get the original order of headers back.  Headers will be in ASCII order.
  2. You cannot get the original row order back.  Rows will be in matrix ASCII order.
  3. If your original headers contain patterns incompatible with Splunk header standards, those headers will be changed to Splunk headers.

Here, I will use an ordinary CSV as example:

 

c, b, a
1,2,3
4,,5
,6,7
8,,

 

You can use the following

 

index=myindex sourcetype=mysourcetype source=mycsv
| tojson
| appendpipe
    [foreach *
        [eval header = mvappend(header, "<<FIELD>>")]]
| fields header
| where isnotnull(header)
| stats values(header) as header values(_raw) as json
| foreach json mode=multivalue
    [ eval csv = mvappend(csv, "\"" . mvjoin(mvmap(header, if(isnull(spath(<<ITEM>>, header)),"", spath(<<ITEM>>, header))), "\",\"") . "\"")]
| eval csv = mvjoin(header, ",") . "
" . mvjoin(csv, "
")

 

The mock data will give a csv field with the following value:

 

a,b,c
"3","2","1"
"5","","4"
"7","6",""
"","","8"

 

As explained above, much depends on the original CSV.  For example,

  • I took precaution to quote each "cell" even though your original CSV may not use quotation marks.
  • I did not quote headers, even though your original CSV may have used quotation marks.
  • With certain column names and/or cell values, additional coding is needed.

You can play with the following data emulation and compare with real data.

 

| makeresults format=csv data="c, b, a
1,2,3
4,,5
,6,7
8,,"
| foreach *
    [eval _raw = if(isnull(_raw), "", _raw . ",") . if(isnull(<<FIELD>>), "", <<FIELD>>)]
``` the above emulates
index=myindex sourcetype=mysourcetype source=mycsv
```

 

0 Karma

Splunk_sid
Explorer

@inventsekar There are multiple csv files from which data gets loaded into Splunk. So, the _raw will have column headers and other rows for each file. All I need is to convert back into rows and columns format just like what we see in csv. "table" command will not serve the purpose for my scenario.

0 Karma

inventsekar
SplunkTrust
SplunkTrust

Hi @Splunk_sid .. We may need more details from your side. 

Your current search query, what table format you are looking for, ...

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !
0 Karma

Splunk_sid
Explorer

@inventsekar We have onboarded the data coming from csv files in inputs.conf as below and the data is loaded every day as new csv is created with date stamp.

[monitor:<path>/file_*.csv]
disabled = false
sourcetype = <sourcetype>
index=<index>
 
With this config, we are getting the data into splunk and each row in csv is loaded as separate event.
Query:  index=<index> sourcetype=<sourcetype>.
 
All we need is to see the data similar to csv ie. we need to have a single line header and corresponding data for those columns (just how we see when we load the csv from Add inputs via Splunk GUI). 
 
As of now, events are like as shown below and it repeats every day for new csv files
 
6/17/24
3:07:26.000 AM
 
col1,col2,col3,col4,col5,col6  

host = <host>
source =<source>
sourcetype =<sourcetype> 

6/17/24
3:07:26.000 AM
 
data1,data2,data3,data4,data5,data6

host = <host>
source =<source>
sourcetype =<sourcetype> 

 

We need the output in below format when we run query:

col1col2col3col4col5col6
data1data2data3data4data5data6

 

Regards,
Sid
0 Karma

inventsekar
SplunkTrust
SplunkTrust

Hi @Splunk_sid 

>>> Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.
so you have onboarded the CSV file or not yet?

 

if you have onboarded the CSV file, then, just use the table command.. 

index=yourCSVindex source=someSource sourcetype=some | table *

 

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !
0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Are you leveraging automation to its fullest potential in your threat detection strategy?Our upcoming Security ...

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...