Splunk Search

How does one extract timestamps and events from a CSV where the column headers are: year/month, someField1, someField2, day1, day2, day3, etc... day31

1sebastinator
Explorer

I am having trouble extracting individual events from a csv file with the data formatted in the following way.
I have tried to look for similar answers online, but can't see any that meet my requirements.

year/month airport total_flights num_flights_day_1 num_flights_day_2 num_flights_day_3 etc....31
202001 NEW YORK 5 1 0 0 0
202001 PARIS 10 0 5 5 0
202001 LONDON 15 6 4 6 0

Any help would be appreciated,
Thanks

1 Solution

to4kawa
Ultra Champion

props.conf

INDEXED_EXTRACTIONS = csv

What's wrong with that?

| makeresults 
| eval _raw="year_mon,airport,total_flights,1,2,3,4
202001,NEW YORK,5,1,0,0,0
202001,PARIS,10,0,5,5,0
202001,LONDON,15,6,4,6,0" 
| multikv forceheader=1 
| table year_mon,airport,total_flights,1,2,3,4 
| rename COMMENT as "This is sample you provide. From here, the logic."
| eval tmp=airport."_".total_flights 
| fields - airport total_flights 
| untable tmp days impliedcount 
| eventstats values(eval(if(days="year_mon",impliedcount,NULL))) as years by tmp 
| eval timestamp=strptime(years.days,"%Y%m%d") 
| eval _time=timestamp ,airport=mvindex(split(tmp,"_"),0), total_flights=mvindex(split(tmp,"_"),1) 
| where isnotnull(timestamp) 
| table _time airport total_flights impliedcount

I don't know your actual header, so adjust the field names as needed.
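The SPL pipeline above is essentially a wide-to-long unpivot. For anyone reshaping the same sample outside Splunk, a rough pandas equivalent (an editor's sketch, not part of the accepted answer) looks like this:

```python
import pandas as pd
from io import StringIO

# Same sample the SPL query uses; day-of-month values are column headers.
raw = """year_mon,airport,total_flights,1,2,3,4
202001,NEW YORK,5,1,0,0,0
202001,PARIS,10,0,5,5,0
202001,LONDON,15,6,4,6,0"""

df = pd.read_csv(StringIO(raw), dtype={"year_mon": str})

# Unpivot: day columns become rows, playing the role of SPL's `untable`.
long_df = df.melt(
    id_vars=["year_mon", "airport", "total_flights"],
    var_name="day",
    value_name="impliedcount",
)

# Build a real timestamp from year_mon + day, like strptime(years.days, "%Y%m%d").
long_df["_time"] = pd.to_datetime(
    long_df["year_mon"] + long_df["day"].str.zfill(2), format="%Y%m%d"
)

print(long_df[["_time", "airport", "total_flights", "impliedcount"]])
```

Zero-padding the day before `to_datetime` avoids relying on lenient single-digit `%d` parsing, which is what the SPL `strptime` call depends on.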


1sebastinator
Explorer

My problem is that the actual year and month are shown in every row of the 1st column, but the actual days are shown as column headers.

At the intersection of each ("day") column header and each (year/month) row is a value. The field name of that value is not shown in the column header, which shows only the day. The field is implied to be a count statistic: the number of occurrences of an event on that day. These occurrences are then categorised by the row values of the 2nd column.

Essentially, this means that every single cell represents a distinct event. How do you parse that?

To make things more complicated, the City and Totals columns would not be distinct events, but would be common to all events on the same row.

Using the input example in my original question, my goal would be to obtain event data like this:

_time::2020/01/01 airport::NEW YORK total::5 impliedcount::1
_time::2020/01/01 airport::PARIS total::10 impliedcount::0
_time::2020/01/01 airport::LONDON total::15 impliedcount::6
_time::2020/01/02 airport::NEW YORK total::5 impliedcount::0
_time::2020/01/02 airport::PARIS total::10 impliedcount::5
_time::2020/01/02 airport::LONDON total::15 impliedcount::4
_time::2020/01/03 airport::NEW YORK total::5 impliedcount::0
_time::2020/01/03 airport::PARIS total::10 impliedcount::5
_time::2020/01/03 airport::LONDON total::15 impliedcount::6
etc...


to4kawa
Ultra Champion

I've put together a query above. Please confirm it works for you.


1sebastinator
Explorer

I was hoping this could be done directly at index time, but I suppose I could save the output as a file and then re-ingest it.

Thank you for this marvellous piece of work! I aspire to be able to manipulate data as you have done.

Appreciated
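If you do go the re-ingest route, the wide CSV can be flattened to one row per (date, airport) event before indexing. A minimal sketch in plain Python, assuming the same column layout as the sample (the file names passed to the function are hypothetical):

```python
import csv
from datetime import datetime

def wide_to_long(src_path, dst_path):
    """Rewrite a wide per-day CSV into one row per (date, airport) event."""
    with open(src_path, newline="") as fin, open(dst_path, "w", newline="") as fout:
        reader = csv.reader(fin)
        header = next(reader)          # year_mon, airport, total_flights, 1, 2, ...
        day_cols = header[3:]          # the per-day count columns
        writer = csv.writer(fout)
        writer.writerow(["_time", "airport", "total_flights", "impliedcount"])
        for year_mon, airport, total, *counts in reader:
            for day, count in zip(day_cols, counts):
                # Combine year/month with the day header into a real date.
                ts = datetime.strptime(f"{year_mon}{int(day):02d}", "%Y%m%d")
                writer.writerow([ts.strftime("%Y-%m-%d"), airport, total, count])
```

The resulting long-format file can then be ingested with a plain `INDEXED_EXTRACTIONS = csv` plus a `TIME_FORMAT` matching the `_time` column.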
