<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to extract data from PDF file? in Splunk Dev</title>
    <link>https://community.splunk.com/t5/Splunk-Dev/How-to-extract-data-from-PDF-file/m-p/483137#M8623</link>
    <description>&lt;P&gt;I'm new to Phantom and would like to know how to extract data from a PDF file attached to an email. As I understand it, the workflow is: an email is sent to the mailbox, Phantom ingests the email, and Phantom then creates a vault artifact.&lt;/P&gt;
&lt;P&gt;Is it possible to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;get the PDF file and read its text&lt;/LI&gt;
&lt;LI&gt;identify important data in the PDF, such as IP addresses and URLs&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;If so, how?&lt;/P&gt;</description>
    <pubDate>Sun, 07 Jun 2020 18:46:05 GMT</pubDate>
    <dc:creator>maangellamatini</dc:creator>
    <dc:date>2020-06-07T18:46:05Z</dc:date>
    <item>
      <title>How to extract data from PDF file?</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/How-to-extract-data-from-PDF-file/m-p/483137#M8623</link>
      <description>&lt;P&gt;I'm new to Phantom and would like to know how to extract data from a PDF file attached to an email. As I understand it, the workflow is: an email is sent to the mailbox, Phantom ingests the email, and Phantom then creates a vault artifact.&lt;/P&gt;
&lt;P&gt;Is it possible to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;get the PDF file and read its text&lt;/LI&gt;
&lt;LI&gt;identify important data in the PDF, such as IP addresses and URLs&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;If so, how?&lt;/P&gt;</description>
      <pubDate>Sun, 07 Jun 2020 18:46:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/How-to-extract-data-from-PDF-file/m-p/483137#M8623</guid>
      <dc:creator>maangellamatini</dc:creator>
      <dc:date>2020-06-07T18:46:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract data from PDF file?</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/How-to-extract-data-from-PDF-file/m-p/483139#M8625</link>
      <description>&lt;P&gt;The best way to get IOCs out of a PDF is to use the &lt;A href="https://my.phantom.us/4.6/docs/app_reference/phantom_parser"&gt;Phantom Parser app&lt;/A&gt;.&lt;/P&gt;

&lt;P&gt;The app requires the file to be in the platform's vault (file location). When email is ingested, PDF attachments are normally added to the vault automatically. You then take the vaultId from the file artifact or vault artifact and send it to the parser, which extracts the IOCs from the PDF just as it does for emails.&lt;/P&gt;
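
&lt;P&gt;To make "extract the IOCs" concrete, here is a minimal Python sketch that pulls IPv4 addresses and URLs out of text that has already been extracted from a PDF. It is only an illustration, not the Parser app's implementation: the function name &lt;CODE&gt;extract_iocs&lt;/CODE&gt; and both regular expressions are assumptions made for this example, and it does not read the vault or the PDF itself.&lt;/P&gt;

&lt;PRE&gt;
```python
import ipaddress
import re

# Hypothetical patterns for illustration; this is not the Parser app's code.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_CANDIDATE = re.compile(r"https?://\S+", re.IGNORECASE)


def extract_iocs(text):
    """Return de-duplicated IPv4 addresses and URLs found in text."""
    ips = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            # Rejects strings that only look like an address, e.g. 999.1.1.1
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        if candidate not in ips:
            ips.append(candidate)

    urls = []
    for candidate in URL_CANDIDATE.findall(text):
        # Drop trailing sentence punctuation captured by the loose pattern
        url = candidate.rstrip(".,;:)]")
        if url not in urls:
            urls.append(url)

    return {"ip": ips, "url": urls}


if __name__ == "__main__":
    sample = "Beacon to 10.0.0.5 and http://example.com/payload.exe, see log."
    print(extract_iocs(sample))
```
&lt;/PRE&gt;

&lt;P&gt;The patterns are deliberately loose, so treat the results as candidates to review rather than confirmed indicators.&lt;/P&gt;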

&lt;P&gt;I hope this helps.&lt;/P&gt;</description>
      <pubDate>Thu, 14 Nov 2019 12:19:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/How-to-extract-data-from-PDF-file/m-p/483139#M8625</guid>
      <dc:creator>rgresham_splunk</dc:creator>
      <dc:date>2019-11-14T12:19:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract data from PDF file?</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/How-to-extract-data-from-PDF-file/m-p/483140#M8626</link>
      <description>&lt;P&gt;Thank you, rgresham! This was extremely helpful.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 01:02:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/How-to-extract-data-from-PDF-file/m-p/483140#M8626</guid>
      <dc:creator>maangellamatini</dc:creator>
      <dc:date>2019-11-15T01:02:24Z</dc:date>
    </item>
  </channel>
</rss>

