Monitoring Splunk

waterfall saved search

MaryvonneMB
Path Finder

Hi all!

I've got a problem. It's a bit complex to explain, but I'll try to be as clear as possible.
First, I want to generate a file based on 2 different inputs. One of these files is big (about 400,000 lines) and the other has about 100,000 lines. To generate my new file correctly I have to do 4 joins and then an append (to keep the lines that don't match). So the result is a file which is the sum of these 2 inputs (about 500,000 lines).

After that we want to index this file. Currently I build my result file as a CSV from these 2 inputs, which are CSVs too, and then index it.
What my team wants is to use the 2 inputs (which, for other reasons, will already be indexed) to generate the result file and index it directly (so we can delete the intermediate CSV file).

Here is my full search:

|inputlookup firstInputFile.csv
|rex field="fieldA" mode=sed "s/^0{10}//g" 
| rex field="fieldB" mode=sed "s/^0{10}//g" 

| join type=left max=1 "fieldA" [ 
|inputlookup secondInputFile.csv 
|table "fieldC"
|eval "fieldA"='fieldC' 
| eval "new_column"="X"
] 

| join type=left max=1 "fieldB" [
|inputlookup secondInputFile.csv 
|table "fieldC"
|eval "fieldB"='fieldC' 
|eval "new_column"="X"
]

| eval "new_column"=if('new_column'="X","X","Y")

| join type=left max=1 "fieldA" [
| inputlookup thirdInputFile.csv  | search "fieldD"!=""
| table some fields
| eval "fieldA"='fieldD'
]

| join type=left max=1 "fieldB" [
| inputlookup thirdInputFile.csv | search "fieldE"!=""
| table some fields
| eval "fieldB"='fieldE' 
] 

| join type=left max=1 "fieldB" [
| inputlookup thirdInputFile.csv | search "fieldD"!=""
| table some fields
| eval "fieldB"='fieldD'
]

| join type=left max=1 "fieldA" [
| inputlookup thirdInputFile.csv | search "fieldE"!=""
| table some fields
| eval "fieldA"='fieldE'
]

| append [
|inputlookup thirdInputFile.csv 
| rename "fieldD" as "fieldA" "fieldE" as "fieldB" 
| table some fields
]

| fields - some fields
| rex mode=sed field="fieldF" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldG" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldH" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldI" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"
| rex mode=sed field="fieldJ" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field="fieldK" "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"
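
As a side note, if fieldF through fieldK hold dates in YYYYMMDD form, the same day/month/year reformatting can also be written with strptime/strftime instead of a sed-style rex. This is only a sketch of an equivalent approach, one eval per field:

| eval fieldF=strftime(strptime(fieldF, "%Y%m%d"), "%d/%m/%Y")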

When I run this search and add | collect index="myindex" sourcetype="mysourcetype" source="mysource", the job ends during result finalization and Splunk disconnects. If I replace the collect with just a table *, it also crashes. It succeeds if I remove the append and at least one join. So I thought it might be possible to split this big search so that temporary results are consumed by the next part. Since I want to automate this process, I looked at saved searches, but I don't really see how they solve my problem.
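
To make the staging idea concrete, here is a rough sketch only (the lookup name staging_result.csv is made up, and scheduling the second search to run after the first is left to the saved-search configuration): a first scheduled search writes an intermediate result with outputlookup, and a second scheduled search picks it up and finishes the work.

First scheduled search:

|inputlookup firstInputFile.csv
| join type=left max=1 "fieldA" [ |inputlookup secondInputFile.csv | table "fieldC" | eval "fieldA"='fieldC' | eval "new_column"="X" ]
| outputlookup staging_result.csv

Second scheduled search, picking up where the first stopped:

|inputlookup staging_result.csv

followed by the remaining joins, the append, and the final
| collect index="myindex" sourcetype="mysourcetype" source="mysource"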

If anyone has an idea, thanks in advance!


woodcock
Esteemed Legend

Try this (EXACTLY as-is; translate your field names but do not change any commands):

|inputlookup firstInputFile.csv | eval DATASET="LeftJoin"
|appendpipe [|inputlookup secondInputFile.csv 
   | table fieldC
   | eval fieldA=fieldC 
   | eval new_column="X"
]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"

| appendpipe [|inputlookup secondInputFile.csv 
   | table fieldC
   | eval fieldB=fieldC
   | eval new_column="X"
]
| stats values(*) AS * BY fieldB
| search DATASET="LeftJoin"

| eval new_column=coalesce(new_column, "Y")

| appendpipe [| inputlookup thirdInputFile.csv
  | search "fieldD"!=""
  | eval fieldA=fieldD
  | table some fields
]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"

| appendpipe [| inputlookup  thirdInputFile.csv
  | search "fieldE"!=""
  | eval fieldB=fieldE
  | table some fields
]
| stats values(*) AS * BY fieldB
| search DATASET="LeftJoin"

| appendpipe [| inputlookup  thirdInputFile.csv
  | search "fieldD"!=""
  | eval fieldB=fieldD
  | table some fields
]
| stats values(*) AS * BY fieldB
| search DATASET="LeftJoin"

| appendpipe [| inputlookup  thirdInputFile.csv
  | search "fieldE"!=""
  | eval fieldA=fieldE
  | table some fields
]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"

| appendpipe [|inputlookup thirdInputFile.csv 
  | rename fieldD AS fieldA fieldE AS fieldB
  | table some fields
]

| fields - some fields
| rex mode=sed field=fieldF "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldG "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldH "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldI "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"
| rex mode=sed field=fieldJ "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/" 
| rex mode=sed field=fieldK "s/([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])/\3\/\2\/\1/"
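
For what it's worth, here is my reading of why this pattern avoids join (not part of the original post): appendpipe adds the lookup rows to the result set, stats values(*) AS * BY fieldA then merges each lookup row into the main rows sharing the same fieldA value, and search DATASET="LeftJoin" discards the bare lookup rows, leaving only the enriched originals. A toy version, using append with makeresults so it is easy to try:

| makeresults | eval fieldA="k1", DATASET="LeftJoin"
| append [| makeresults | eval fieldA="k1", extra="hello"]
| stats values(*) AS * BY fieldA
| search DATASET="LeftJoin"

This should return a single row carrying both DATASET and extra, i.e. the left row enriched with the right row's fields.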

MaryvonneMB
Path Finder

Thanks a lot, and sorry for my delayed answer. Using your search with lookups as input, the job still ended as before. But after I indexed my first and third inputs, your search performs much better (with no unexpected ending ^^).
