
How to generate PDF from view in REST API?

kknopp
Path Finder

I am using Splunk 6.1.1 and currently have a form that takes an integer input (foo) and timerange. The URL for this view after entering the values is https://splunk/app/app_name/view_name?earliest=-30d&latest=now&form.foo=1#. I then generate a PDF from this page. Thing is, I need this form for about 650 instances of foo, and I'd like to run it once a month.

I can't schedule the view since it requires input, and I refuse to create 650 instances of the form without input. My hope is to create a Python or Ruby script that calls this view with the required input, generates a PDF from the output, and gives me the document metadata so I can store it elsewhere. Is this a thing? Looking through the API docs, I see lots for scheduled searches, but this isn't a scheduled search. Any assistance would be much appreciated.

1 Solution

cwue
Engager

The /services/pdfgen/render/ endpoint takes a view [as answered here: https://answers.splunk.com/answers/223655/can-i-export-pdf-via-rest.html], but luckily it also takes an input-dashboard-xml parameter, which accepts XML dashboards, as long as all tokens/variables are resolved (!).

To generate a ton of different reports based on the same view/dashboard but different search parameters I wrote a python script.

Using this script, all you have to do is
- save the dashboard code as an .xml file
- figure out which tokens to replace for which report, and keep a list of those tokens for every report
- hand over a JSON list with the tokens and their respective values
{"tokenlist": [{"token": "$example$", "value": "value for individual search"}, {"token": "$example2$", "value": "searchstring"}]}
- run the script, which will send the compatible dashboard code to the API and download the resulting PDF report
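
Since the endpoint only accepts dashboards in which every token is resolved, it may help to check for leftover placeholders before sending anything. The following is a minimal sketch of the token list step, assuming tokens use the $name$ form shown above. The helper names (apply_tokens, unresolved_tokens) and the regular expression are illustrative assumptions, not part of Splunk, and the pattern is only a heuristic (it would not catch tokens with filters such as $tok|s$).

```python
import json
import re

# Heuristic pattern for $name$ style tokens (an assumption, not Splunk's own grammar)
TOKEN_PATTERN = re.compile(r"\$[A-Za-z_][\w.]*\$")


def apply_tokens(dashboard_xml, tokenlist):
    """Replace every listed token with its value and return the new XML string."""
    for entry in tokenlist:
        dashboard_xml = dashboard_xml.replace(entry["token"], entry["value"])
    return dashboard_xml


def unresolved_tokens(dashboard_xml):
    """Return the sorted, de-duplicated $token$ placeholders still present."""
    return sorted(set(TOKEN_PATTERN.findall(dashboard_xml)))


# Example token list in the JSON shape described above
config = json.loads(
    '{"tokenlist": [{"token": "$example$", "value": "index=main foo=1"}]}'
)
xml = "<query>$example$ | stats count by $example2$</query>"

resolved = apply_tokens(xml, config["tokenlist"])
print(resolved)
print(unresolved_tokens(resolved))  # $example2$ was not in the token list
```

If unresolved_tokens returns a non-empty list, the report is probably not ready to be sent to the render endpoint.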

So my generateReport(tokenlist) function looks like this:
(Disclaimer: this is simplified to get the concept across- I stripped error detection for invalid reports, logging, metadata creation and the like - which is crucial if you plan on automatically mailing those reports)

import requests

# splunkuser, splunkpass and tokenlist are assumed to be defined by the caller

# 'rb' works on Python 2; on Python 3 open with 'r' so that .replace() operates on str
with open('dashboardFile.xml','rb') as XMLfile:
    XMLDashboard = XMLfile.read().replace('\n', '')
    XMLDashboard = XMLDashboard.replace('<', '%26lt%3B')  # otherwise the API will complain

# Replace all tokens in the XML code
for t in tokenlist:
    XMLDashboard = XMLDashboard.replace(t['token'], t['value'])

# Send XML code to endpoint, answer should be a pdf file
r = requests.get('https://splunkhost:8089/services/pdfgen/render', auth=(splunkuser, splunkpass), params={'input-dashboard-xml':XMLDashboard,'paper-size':'a4-landscape'})

if r.status_code == 200:
    with open('report_file.pdf', 'wb') as pdffile:
        pdffile.write(r.content)
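
The disclaimer above mentions that error detection for invalid reports was stripped. One simple check, sketched here under assumptions, is to confirm that the response body looks like a PDF before saving it: PDF files begin with the bytes %PDF- and end with an %%EOF marker. The function names are hypothetical, and whether the endpoint can return a non-PDF body with status 200 is not confirmed in this thread, so treat this as a defensive extra rather than a required step.

```python
import os
import tempfile

PDF_HEADER = b"%PDF-"
PDF_TRAILER = b"%%EOF"


def looks_like_pdf(content):
    """Return True if the bytes start with the PDF header and end near an EOF marker."""
    return content.startswith(PDF_HEADER) and PDF_TRAILER in content[-1024:]


def save_report(content, path):
    """Write content to path only if it looks like a PDF; return True on success."""
    if not looks_like_pdf(content):
        return False
    with open(path, "wb") as pdf_file:
        pdf_file.write(content)
    return True


# Stand-ins for r.content: a tiny fake PDF body and a plain error message
fake_pdf = b"%PDF-1.4\n...\n%%EOF\n"
error_body = b"Unable to render PDF."
target = os.path.join(tempfile.gettempdir(), "report_file.pdf")

print(save_report(fake_pdf, target))
print(save_report(error_body, target))
```

In the script above, save_report(r.content, 'report_file.pdf') could replace the plain write inside the status check.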




splunknewbie
Loves-to-Learn Lots

I don't understand why you open the XML file in binary mode; I get a string handling error if I do that.

But I have another problem. I open the XML file like this:

 

replacelist = [("$pool_cust_lic_pool$", "{0}".format(customer_pool_name))]

with open(DASHBOARD_FILE, 'r') as XMLfile:
    XMLDashboard = XMLfile.read().replace('\n', '')

for replacement in replacelist:
    XMLDashboard = XMLDashboard.replace(replacement[0], replacement[1])

r = requests.get(SPLUNKURL, auth=(SPLUNKUSER, SPLUNKPS),
                 params={'input-dashboard-xml': XMLDashboard, 'paper-size': 'a4-landscape'}, verify=False)

 

Part of the dashboard content (query shortened):

<query> foreach "License-pool size in GB" [eval val2='&lt;&lt;FIELD&gt;&gt;'] </query>

Since I have &lt;&lt; and &gt;&gt; in my dashboard XML, I get the following error:

 

An error occured creating weekly reporting for pool auto_generated_pool_download-trial for cw 5
400 Unable to render PDF.<br/><ul><li>Bailing out of Integrated PDF Generation. Exception raised while preparing to render "Untitled" to PDF. StartTag: invalid element name, line 1, column 3533 (&lt;string&gt;, line 1)</li></ul>

 

Column 3533 corresponds to the &lt;&lt;FIELD&gt;&gt; part of the dashboard XML.

The dashboard itself works fine, and I can request it and convert it to PDF via curl:


curl -X POST -u admin:Changeme1 -k 'https://localhost:8089/services/pdfgen/render?input-dashboard=license_test&namespace=search&paper-size=a4-landscape' > test.pdf

 

but it doesn't work via my python script

0 Karma

splunknewbie
Loves-to-Learn Lots

I solved the problem by wrapping my queries in CDATA:

 

<![CDATA[ your search query here ]]>
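
To illustrate why CDATA helps, here is a small sketch using Python's standard xml.etree.ElementTree parser as a stand-in for whatever parser the render endpoint uses. It assumes, consistent with the error above but not confirmed, that the escaped entities end up decoded to raw < and > before the XML is parsed. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is well-formed XML according to ElementTree."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def wrap_query(search):
    """Wrap a search string in a <query> element using a CDATA section."""
    if "]]>" in search:
        raise ValueError("search contains the CDATA terminator ']]>'")
    return "<query><![CDATA[" + search + "]]></query>"


search = "foreach \"License-pool size in GB\" [eval val2='<<FIELD>>']"

# Raw angle brackets inside element text are not well-formed XML
raw = "<query>" + search + "</query>"
print(parses(raw))

# The same search inside CDATA parses, and the text comes back unchanged
wrapped = wrap_query(search)
print(parses(wrapped))
print(ET.fromstring(wrapped).text == search)
```

Because the parser treats everything inside a CDATA section as literal text, the search no longer depends on how entities are escaped or unescaped along the way.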

0 Karma

clamarkv
Explorer

this thread was very helpful in figuring out how to do this, however i did have some problems with the gt (>) and lt(<) operators in my searches. 

to work around this i wrapped them in CDATA tags: 

status<![CDATA[&gt;]]>199 AND status<![CDATA[&lt;]]>300

 

if anyone else is interested, this is my "final" script: 

import requests
import json
import os
from logzero import setup_logger

logger = setup_logger(name="logger", level=20)

splunk_endpoint = os.environ["SPLUNK_ENDPOINT"]
splunk_username = os.environ["SPLUNK_USERNAME"]
splunk_password = os.environ["SPLUNK_PASSWORD"]

with open("dashboard.xml", encoding="utf-8") as xml_file:
    xml_dashboard = xml_file.read()

with open("client_list.json", encoding="utf-8") as json_file:
    client_list = json.loads(json_file.read())

for client in client_list:

    logger.info(f"generating dashboard pdf for {client['name']} [{client['id']}]")

    # replace tokens in dashboard (chain on this_xml_dashboard so both replacements are kept)
    this_xml_dashboard = xml_dashboard.replace("--CLIENT--NAME--", client["name"])
    this_xml_dashboard = this_xml_dashboard.replace("--CLIENT--ID--", client["id"])

    # send xml dashboard to render endpoint
    render_url = f"{splunk_endpoint}/services/pdfgen/render"
    render_response = requests.post(
        render_url,
        auth=(splunk_username, splunk_password),
        params={
            "input-dashboard-xml": this_xml_dashboard.encode(),
            "paper-size": "a4-landscape",
        }
    )

    logger.info(f"render_response: {render_response.status_code}")

    # render endpoint returns a pdf
    if render_response.status_code == 200:
        outfile = (f"./output/dashboard-{client['name'].replace(' ', '_').lower()}.pdf")
        with open(outfile, "wb") as pdffile:
            pdffile.write(render_response.content)
        logger.info(f"wrote {outfile} [{render_response.headers['Content-Length']} bytes]")

 

hope this helps.

0 Karma

conikhil
New Member

Does this work for dynamic tokens such as those set by eval, condition, or init? The tokens here seem to be static replacements that must be substituted before the XML is sent to the API.

0 Karma

clamarkv
Explorer

i am using a local copy of the dashboard because that's what suited my use case.

my scenario was that i had users creating 150+ identical monthly reports (eg copy/pasting a dashboard definition and changing a client identifier) and then scheduling those dashboards to run in the first 6 hours of the month.

the pattern i am using currently pulls a list of client ids from a lookup, iterates over each client id (replacing the client id in the dashboard definition, as you don't appear to be able to provide inputs when asking splunk to generate the dashboard), and generates a pdf.

this process also allows me to process these reports sequentially rather than scheduling these using the splunk scheduling interface (which is not particularly easy to schedule large amounts of reporting with). This reduces a significant amount of load in the environment at the beginning of each month.
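
a minimal sketch of that sequential pattern, assuming a render_one callable that generates one pdf per client id (the function names and the pause parameter are hypothetical, not taken from my actual script):

```python
import time


def render_sequentially(client_ids, render_one, pause_seconds=0.0):
    """Call render_one(client_id) for each id in order, one at a time.

    Returns (succeeded, failed) so failed ids can be retried later.
    """
    succeeded, failed = [], []
    for index, client_id in enumerate(client_ids):
        # optional pause between reports to spread the load
        if index > 0 and pause_seconds > 0:
            time.sleep(pause_seconds)
        try:
            render_one(client_id)
            succeeded.append(client_id)
        except Exception as error:
            failed.append((client_id, str(error)))
    return succeeded, failed


def fake_render(client_id):
    """Stand-in for the real render call; fails for one id to show error handling."""
    if client_id == "c2":
        raise RuntimeError("render failed")


done, errors = render_sequentially(["c1", "c2", "c3"], fake_render)
print(done)
print(errors)
```

only one render request is in flight at a time, and a failure for one client does not stop the rest of the run.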

for those of you looking for a more complete example than what i have provided earlier, i've been using something very similar to this:  splunkcloud scripted dashboard generation (github.com)

hope this helps.

0 Karma

Iriseng94
Explorer

Hi,

I understand that this REST API can generate a PDF report, but I am not sure where the script should live. What if I want to schedule the PDF export daily, weekly, or monthly? Should I schedule a report in Splunk and have its action run the script?

Are there any detailed steps for a newbie?

Thanks in advance.

0 Karma

clamarkv
Explorer

i think it really depends on your use case.

typically i would prefer to use the native splunk scheduling, but 150+ near-identical reports being run on the first of each month, interfering with interactive splunk users and occasionally exhausting cluster resources, was not ideal.

i'm using a scheduled ci-cd process that executes this job in a docker container.

0 Karma


rblack88
New Member

Thanks cwue! In case anyone else gets errors saying that it couldn't find the end of a start tag, adding XMLDashboard=XMLDashboard.replace(' ', '%20') should work!

0 Karma

curtin1061
New Member

Hello,
Does anyone have the full script/xml files needed ... the reference link above is no longer valid.

I'd love to see the python code, I'm not sure where I find the 'dashboardFile.xml' or what this file should look like.

Python and curl examples appreciated.
Craig

0 Karma

cwue
Engager

Hi curtin1061,

it's been a while but I just checked our git and this is the relevant code of my program to download the dashboard as xml.
SplunkUsername, DashboardName, AppName and SessionKey are variables.

import io, requests, json

endpoint = 'https://splunkhost:8089/servicesNS/' + SplunkUsername + '/' + AppName + '/data/ui/views/' + DashboardName
r = requests.get(endpoint, headers={'Authorization': 'Splunk ' + SessionKey}, params={'output_mode': 'json'}, verify=False)
dashboardcode = r.json()['entry'][0]['content']['eai:data']

with io.open('dashboard.xml', 'w', encoding='utf8') as f:
    f.write(dashboardcode)
print("Saved dashboard.xml")
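
The lookup of eai:data above fails with a bare KeyError or IndexError if the response has no matching view. As a rough sketch, that step could be moved into a helper with clearer error messages. The function name is hypothetical and the sample payload below is a hand-written stand-in containing only the fields used above, not a real Splunk response.

```python
def extract_dashboard_xml(payload):
    """Return the dashboard XML from a parsed views response (entry[0].content['eai:data'])."""
    entries = payload.get("entry") or []
    if not entries:
        raise ValueError("response contains no 'entry' items")
    content = entries[0].get("content", {})
    if "eai:data" not in content:
        raise ValueError("first entry has no 'eai:data' field")
    return content["eai:data"]


# Hand-written stand-in for r.json()
sample = {
    "entry": [
        {
            "name": "my_dashboard",
            "content": {"eai:data": "<dashboard><label>Demo</label></dashboard>"},
        }
    ]
}

print(extract_dashboard_xml(sample))
```

With this helper, the line that reads r.json() directly could become dashboardcode = extract_dashboard_xml(r.json()).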
0 Karma

curtin1061
New Member

Hi cwue,
Thanks for this ... is this your internal git repo? or any chance it's public?
I'm a splunk newbie just hacking my way through automating some dashboards. I've done python for a long time (and love it). Any pointers towards python/splunk snippets/code appreciated.
Regards,
Craig

0 Karma

rblack88
New Member

I have not contributed to this repo in several months, but this might help you. https://github.com/BryceKopan/SlackSplunkBot/blob/master/src/RESTSplunkMethods.py

0 Karma

anwarmian
Communicator

This works great. Thanks a lot, cwue. Those who don't have access to "requests" library can use curl command like the following:

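import subprocess
file_out = "/home/splunk/pdf_dashboards/yourpdfile.pdf"

# Url is the render endpoint URL and XMLDashboard the dashboard XML string (both defined elsewhere)
# stdout must be a file object, not a path, so open the output file first
with open(file_out, "wb") as pdf_file:
    subprocess.call(["curl", "-G", "-sku", "your_user_name:your_password", Url, "--data-urlencode", "input-dashboard-xml=" + XMLDashboard, "-d", "namespace=yourAppName", "-d", "paper-size=a4-landscape"], stdout=pdf_file)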

if you are using "--data-urlencode" then you don't have to use the following two lines:
XMLDashboard=XMLfile.read().replace('\n', '')
XMLDashboard = XMLDashboard.replace('<', '%26lt%3B') # otherwise the API will complain
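
For anyone curious what URL encoding does to the dashboard XML, here is a small standard-library sketch. It only shows generic percent-encoding with urllib.parse and makes no claim about what the Splunk endpoint does with the value on the server side. The function name and the sample XML are assumptions for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode


def build_query_string(dashboard_xml, paper_size="a4-landscape"):
    """Percent-encode the render parameters into a URL query string."""
    params = {"input-dashboard-xml": dashboard_xml, "paper-size": paper_size}
    # quote_via=quote encodes spaces as %20 rather than '+'
    return urlencode(params, quote_via=quote)


xml = '<dashboard><label>A &amp; B</label>\n<row/></dashboard>'
query_string = build_query_string(xml)
print(query_string)

# Decoding the query string gives back the original XML, newline included
decoded = parse_qs(query_string)["input-dashboard-xml"][0]
print(decoded == xml)
```

Characters such as <, spaces and newlines are all escaped in one step, which is why no manual replace calls are needed when the client encodes the value itself.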

Good Luck and Thanks to cwue once again.

surekhasplunk
Communicator

Hi @anwarmian and @cwue,

Does this approach resolve the problems with PDF rendering?
Does it retain the same color codes that are used in the dashboard?
Does it retain the same spacing and panel layout per page as the dashboard?

0 Karma

anwarmian
Communicator

Has anyone successfully created a PDF on Linux using the REST API, emailed it to a Windows machine, and checked whether the PDF displays properly? When I created a PDF with the above method on Linux (using Python), it looked fine on Linux, but after I emailed it to Windows, the PDF displayed as blank pages.

0 Karma

JohnRiddoch
Explorer

Thanks both for this help - however, if you want to do it from a shell script, this works:
- save dashboard as "mydash.xml" with appropriate parameters/timeline set
- run this curl command:
- curl -G -sku admin:${mypass} "https://localhost:8089/services/pdfgen/render" --data-urlencode "input-dashboard-xml=$(cat mydash.xml)" -d namespace=search -d paper-size=a4-landscape > mydash.pdf

0 Karma

martin_mueller
SplunkTrust

There is a /services/pdfgen/render endpoint, but it doesn't appear to be documented. You might be able to reverse-engineer how to call it from $SPLUNK_HOME/share/splunk/search_mrsparkle/exposed/js/util/pdf_utils.js#downloadReportFromXML(), that's what gets called underneath when you click the Export PDF button.
