<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Splunk Regex (mail csv data extraction) in Dashboards &amp; Visualizations</title>
    <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557804#M38805</link>
    <description>&lt;P&gt;Well, you fixed my issues, thank you. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I looked around and found some people talking about a summary index. Do you think this would be a good option for me?&lt;/P&gt;</description>
    <pubDate>Wed, 30 Jun 2021 12:47:28 GMT</pubDate>
    <dc:creator>Joannna</dc:creator>
    <dc:date>2021-06-30T12:47:28Z</dc:date>
    <item>
      <title>Splunk Regex (mail csv data extraction)</title>
      <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557772#M38800</link>
      <description>&lt;P&gt;I have a field called file_content on a source type.&lt;BR /&gt;It has a CSV inside.&lt;/P&gt;&lt;P&gt;That is, every event is an email, and every event has a field (file_content) with a CSV inside it. It can't be done with field extraction, as the "file_content" is really hard to find inside the data.&lt;/P&gt;&lt;P&gt;I used a regex query to extract the data, and I have two issues with it:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;1.&lt;/STRONG&gt;&lt;/U&gt; The TicketNumber field comes in both of the forms below, but my regex ignores the " and I get number2 in the next column:&lt;BR /&gt;- ," number , number2 ",&lt;BR /&gt;- ,number,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;U&gt;&lt;STRONG&gt;2.&lt;/STRONG&gt; &lt;/U&gt;It's very slow, as I get one CSV per hour every day, so I wonder whether there is any automation or a better way to do this.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;First I get all lines:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;| makemv delim="&lt;BR /&gt;" file_content&lt;BR /&gt;| mvexpand file_content&lt;BR /&gt;| table file_content _time&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Then I apply the regex per line:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;| rex field=file_content 
"(?P&amp;lt;ContactId&amp;gt;[^\s,]+),(?P&amp;lt;Customernumber&amp;gt;[^\s,]+),(?P&amp;lt;AfterContactWorkDuration&amp;gt;[^\s,]+),(?P&amp;lt;AfterContactWorkEndTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;AfterContactWorkStartTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;AgentInteractionDuration&amp;gt;[^\s,]+),(?P&amp;lt;ConnectedToAgentTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;CustomerHoldDuration&amp;gt;[^\s,]+),(?P&amp;lt;Hierarchygroups_Level1_GroupName&amp;gt;[^\s,]+),(?P&amp;lt;Hierarchygroups_Level2_GroupName&amp;gt;[^\s,]+),(?P&amp;lt;Hierarchygroups_Level3_GroupName&amp;gt;[^\s,]+),(?P&amp;lt;LongestHoldDuration&amp;gt;[^\s,]+),(?P&amp;lt;NumberOfHolds&amp;gt;[^\s,]+),(?P&amp;lt;Routingprofile&amp;gt;[^\s,]+),(?P&amp;lt;Agent&amp;gt;[^\s,]+),(?P&amp;lt;AgentConnectionAttempts&amp;gt;[^\s,]+),(?P&amp;lt;ConnectedToSystemTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;DisconnectTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;InitiationMethod&amp;gt;[^\s,]+),(?P&amp;lt;InitiationTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;LastUpdateTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;NextContactId&amp;gt;[^\s,]+),(?P&amp;lt;PreviousContactId&amp;gt;[^\s,]+),(?P&amp;lt;DequeueTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;Duration&amp;gt;[^\s,]+),(?P&amp;lt;EnqueueTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;Name&amp;gt;[^\s,]+),(?P&amp;lt;TransferCompletedTimestamp&amp;gt;[^\s,]+),(?P&amp;lt;HandleTime&amp;gt;[^\s,]+),(?P&amp;lt;TicketNumber&amp;gt;[^\s,]+),(?P&amp;lt;Account&amp;gt;[^\s,]+),(?P&amp;lt;AccountName&amp;gt;[^\s,]+),(?P&amp;lt;Country&amp;gt;[^\s,]+),(?P&amp;lt;Language&amp;gt;[^\s,]+),(?P&amp;lt;Site&amp;gt;[^\s,]+),(?P&amp;lt;WrapCode&amp;gt;[^\s,]+)"&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;And here is an example of how the data should look like (in 
csv):&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;ContactId,Customernumber,AfterContactWorkDuration,AfterContactWorkEndTimestamp,AfterContactWorkStartTimestamp,AgentInteractionDuration,ConnectedToAgentTimestamp,CustomerHoldDuration,Hierarchygroups_Level1_GroupName,Hierarchygroups_Level2_GroupName,Hierarchygroups_Level3_GroupName,LongestHoldDuration,NumberOfHolds,Routingprofile,Agent,AgentConnectionAttempts,ConnectedToSystemTimestamp,DisconnectTimestamp,InitiationMethod,InitiationTimestamp,LastUpdateTimestamp,NextContactId,PreviousContactId,DequeueTimestamp,Duration,EnqueueTimestamp,Name,TransferCompletedTimestamp,HandleTime,TicketNumber,Account,AccountName,Country,Language,Site,WrapCode&lt;BR /&gt;aaaa-xxxxxx,123456789,90,29/06/2021 01:00,29/06/2021 01:00,111,29/06/2021 01:00,0,country1,xx,yy,90,90,language,dummy,1,29/06/2021 01:00,29/06/2021 01:00,type_x,29/06/2021 01:00,29/06/2021 01:00,,,29/06/2021 01:00,11,29/06/2021 01:00,type_y,29/06/2021 01:00,201,A123,xxx,xxx,country_y,language,type_w,xxxx&lt;BR /&gt;bbbb-xxxxxx,987654321,90,29/06/2021 01:00,29/06/2021 01:00,111+P4,29/06/2021 01:00,0,country1,xx,yy,90,90,language,dummy,1,29/06/2021 01:00,29/06/2021 01:00,type_x,29/06/2021 01:00,29/06/2021 01:00,,,29/06/2021 01:00,11,29/06/2021 01:00,type_y,29/06/2021 01:00,201,"""A123,B123""",xxx,xxx,country_y,language,type_w,xxxx&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jun 2021 09:25:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557772#M38800</guid>
      <dc:creator>Joannna</dc:creator>
      <dc:date>2021-06-30T09:25:51Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex (mail csv data extraction)</title>
      <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557787#M38801</link>
      <description>&lt;P&gt;Your timestamps have spaces in (not accounted for in your expression) and the ticket number optionally has pairs of double quotes. Try something like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex field=file_content "(?P&amp;lt;ContactId&amp;gt;[^\s,]*),(?P&amp;lt;Customernumber&amp;gt;[^\s,]*),(?P&amp;lt;AfterContactWorkDuration&amp;gt;[^\s,]*),(?P&amp;lt;AfterContactWorkEndTimestamp&amp;gt;[^,]*),(?P&amp;lt;AfterContactWorkStartTimestamp&amp;gt;[^,]*),(?P&amp;lt;AgentInteractionDuration&amp;gt;[^\s,]*),(?P&amp;lt;ConnectedToAgentTimestamp&amp;gt;[^,]*),(?P&amp;lt;CustomerHoldDuration&amp;gt;[^\s,]*),(?P&amp;lt;Hierarchygroups_Level1_GroupName&amp;gt;[^\s,]*),(?P&amp;lt;Hierarchygroups_Level2_GroupName&amp;gt;[^\s,]*),(?P&amp;lt;Hierarchygroups_Level3_GroupName&amp;gt;[^\s,]*),(?P&amp;lt;LongestHoldDuration&amp;gt;[^\s,]*),(?P&amp;lt;NumberOfHolds&amp;gt;[^\s,]*),(?P&amp;lt;Routingprofile&amp;gt;[^\s,]*),(?P&amp;lt;Agent&amp;gt;[^\s,]*),(?P&amp;lt;AgentConnectionAttempts&amp;gt;[^\s,]*),(?P&amp;lt;ConnectedToSystemTimestamp&amp;gt;[^,]*),(?P&amp;lt;DisconnectTimestamp&amp;gt;[^,]*),(?P&amp;lt;InitiationMethod&amp;gt;[^\s,]*),(?P&amp;lt;InitiationTimestamp&amp;gt;[^,]*),(?P&amp;lt;LastUpdateTimestamp&amp;gt;[^,]*),(?P&amp;lt;NextContactId&amp;gt;[^\s,]*),(?P&amp;lt;PreviousContactId&amp;gt;[^\s,]*),(?P&amp;lt;DequeueTimestamp&amp;gt;[^,]*),(?P&amp;lt;Duration&amp;gt;[^\s,]*),(?P&amp;lt;EnqueueTimestamp&amp;gt;[^,]*),(?P&amp;lt;Name&amp;gt;[^\s,]*),(?P&amp;lt;TransferCompletedTimestamp&amp;gt;[^,]*),(?P&amp;lt;HandleTime&amp;gt;[^\s,]*),(?P&amp;lt;TicketNumber&amp;gt;((\"[^\"]*\")+|[^\s,]*)),(?P&amp;lt;Account&amp;gt;[^\s,]*),(?P&amp;lt;AccountName&amp;gt;[^\s,]*),(?P&amp;lt;Country&amp;gt;[^\s,]*),(?P&amp;lt;Language&amp;gt;[^\s,]*),(?P&amp;lt;Site&amp;gt;[^\s,]*),(?P&amp;lt;WrapCode&amp;gt;[^\s,]*)"&lt;/LI-CODE&gt;</description>
      <pubDate>Wed, 30 Jun 2021 11:23:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557787#M38801</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2021-06-30T11:23:53Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex (mail csv data extraction)</title>
      <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557797#M38802</link>
      <description>&lt;P&gt;Wow, thank you so much. The remaining issue is now the second one: it's too much data, and I get this error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;command.mvexpand: output will be truncated at 15700 results due to excessive memory usage. Memory threshold of 500MB as configured in limits.conf / [mvexpand] / max_mem_usage_mb has been reached.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Is there anything I can do here, or any other way I can do this extraction?&lt;/DIV&gt;&lt;DIV&gt;For example, I could run a report every day and save the outcome to a lookup, but this wouldn't work as it would be too much data for a lookup. Is there any other solution?&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;</description>
      <pubDate>Wed, 30 Jun 2021 12:21:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557797#M38802</guid>
      <dc:creator>Joannna</dc:creator>
      <dc:date>2021-06-30T12:21:09Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex (mail csv data extraction)</title>
      <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557798#M38803</link>
      <description>&lt;P&gt;Try moving the table command to before the mvexpand; that way, _raw doesn't get copied into every event before being eliminated. Other than that, you could try increasing the limit in limits.conf, or there is a longer way around it described in a post I made a while ago (although this doesn't really solve the memory issue, only the row count issue).&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jun 2021 12:28:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557798#M38803</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2021-06-30T12:28:24Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex (mail csv data extraction)</title>
      <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557802#M38804</link>
      <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/235950"&gt;@Joannna&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Instead of this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makemv delim="
" file_content
| mvexpand file_content
| table file_content _time&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I suggest the solutions below. You can pick either of them:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rename file_content as _raw
| multikv forceheader=1
| table _time _raw&lt;/LI-CODE&gt;&lt;P&gt;OR&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makemv delim="
" file_content
| stats count by _time file_content&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;BR /&gt;KV&lt;BR /&gt;▄︻̷̿┻̿═━一&lt;BR /&gt;&lt;BR /&gt;If any of my replies helps you solve the problem or gain knowledge, an upvote would be appreciated.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jun 2021 12:46:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557802#M38804</guid>
      <dc:creator>kamlesh_vaghela</dc:creator>
      <dc:date>2021-06-30T12:46:35Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex (mail csv data extraction)</title>
      <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557804#M38805</link>
      <description>&lt;P&gt;Well, you fixed my issues, thank you. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I looked around and found some people talking about a summary index. Do you think this would be a good option for me?&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jun 2021 12:47:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557804#M38805</guid>
      <dc:creator>Joannna</dc:creator>
      <dc:date>2021-06-30T12:47:28Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex (mail csv data extraction)</title>
      <link>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557807#M38806</link>
      <description>&lt;P&gt;Thank you, I tested both, and the fastest one is the second approach, which takes 1.321 secs; my original, after moving the table to the top as suggested, took 1.789 secs. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; Thanks&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jun 2021 12:53:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Dashboards-Visualizations/Splunk-Regex-mail-csv-data-extraction/m-p/557807#M38806</guid>
      <dc:creator>Joannna</dc:creator>
      <dc:date>2021-06-30T12:53:49Z</dc:date>
    </item>
  </channel>
</rss>

