
Intro to Splunk Synthetic Monitoring

CaitlinHalla
Splunk Employee

In our last post, we mentioned that the three key pieces of observability (metrics, logs, and traces) provide invaluable backend system insight so we can detect and resolve failures fast. But how can we proactively prevent system failures before they ever surface to our customers?

Frontend monitoring provides insight into what users experience when interacting with our applications. Splunk provides solutions for Real User Monitoring (RUM) to monitor actual user interactions and Synthetic Monitoring to simulate end-user traffic. While both are critical to observability and Digital Experience Monitoring (DEM), we’ll start with Splunk Synthetic Monitoring in this post. Let’s explore how Splunk Synthetic Monitoring works and what it brings to our observability practice. 

Synthetic Monitoring in Action

Navigating to Synthetics from Splunk Observability Cloud, we land on the Overview page:

[Screenshot: Synthetics Overview page]

Here we can see a list of all of our existing Synthetic tests. We can filter by test type, sort the list by clicking the table headings, or create new tests and global variables (variables shared across Browser and API tests; a good place to store login credentials, for example).

If we select Add new test, we can specify which type of test we want to create: Browser test, API test, or Uptime test.

[Screenshot: Add new test menu with Browser, API, and Uptime test options]

Browser Tests

Browser tests simulate a workflow or set of requests that makes up the user experience and continuously collect performance data on those requests. They can be executed from many devices and from a number of locations around the world to ensure that no matter how users access applications or where they access them from, they'll experience consistent performance. They also create a cool filmstrip of screenshots and a video replay so you can easily review every action executed during the session and its result in the browser. Detectors can be configured to alert on any errors or latency encountered during the test run.

Here we’ve created a simple Browser test: 

[Screenshot: Browser test setup]

This runs in the AWS-hosted Splunk public locations of Frankfurt, London, and Paris on a desktop device with a standard network connection and a 1366 x 768 viewport. Our test runs every minute and cycles through one location at a time (round-robin) rather than hitting all locations concurrently. This particular test doesn't yet have detectors configured, but if we selected Create detector, we could easily set up detectors and alerts for when this test fails:

[Screenshot: New detector configuration]

Browser tests can hit a single page or be transactional with multiple steps. You can use this to evaluate real customer transactions, like logging in to your site or buying a product. In this example test, we have 7 steps that execute different possible paths a user can take when interacting with our e-commerce website. The first step includes multiple transactions. The Home transaction goes to a specified URL for our site. The following Shop transaction executes the provided JavaScript to select a random product from a list of products sold on our site: 

[Screenshot: Browser test steps, part 1 (Home and Shop transactions)]
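To give a sense of what that kind of custom JavaScript step might contain, here's a minimal sketch of a "pick a random product" action. The selector and page structure are assumptions for illustration, not taken from the actual test shown above.

```javascript
// Hypothetical "Run JavaScript" step body for a Shop-style transaction.
// Assumes the storefront renders each product as a link with a .product-card class (placeholder selector).
const products = document.querySelectorAll('a.product-card');
if (products.length === 0) {
  // Throwing fails the step, so the test run surfaces the problem.
  throw new Error('No products found on the page');
}
const randomProduct = products[Math.floor(Math.random() * products.length)];
randomProduct.click(); // navigate to the randomly chosen product page
```
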

Then the product is added to a shopping cart, an order is placed, checkout is confirmed, and the test returns to keep browsing products: 

[Screenshot: Browser test steps, part 2]

You can see that the Place Order transaction contains multiple steps. Different actions and selectors are available within each step. Additionally, steps can be imported via JSON files generated from the Google Chrome Recorder, but we’ll go into this in a separate post. 

One note: you aren’t limited to running these tests on your own sites. It’s possible to run Synthetic tests against other sites, like those of competitors, to benchmark your performance against them.

API Tests

API tests check the transactional functionality and performance of API endpoints. They verify that endpoints are up and running and return the correct data and response codes. Alerts can also be configured on API tests based on any part of the HTTP request or response. 

When we create a new API test, we configure setup steps and request steps by selecting Add requests:

[Screenshot: API test setup, part 1]

[Screenshot: API test setup, part 2]

Setup steps include the actions required to set up the request, and the request step is the actual body of the API request. Validation steps let you check the response body, run JavaScript on the response, and save or extract values from it.

Here we have a test that hits the Spotify API:

[Screenshot: API test that hits the Spotify API]

In the first request, we obtain server-to-server access by providing a Spotify authorization token that we've stored as a global variable for our Synthetic tests. This way, when the token changes, we don't need to edit every test that uses it, just the variable.

In the validation step, the $.access_token is extracted from the Spotify response and saved to a custom variable. This variable (custom.access_token) is then used in a subsequent request to search for a specific track name: 

[Screenshot: API test request that searches for a track name]

In the validation step we can extract values from the response body, assert that the response body or headers contain certain values, assert that the response code is what we expect, and so on.
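For context, outside of Synthetics the same two-request flow looks roughly like the sketch below. It assumes Spotify's client credentials grant and search endpoint; the environment variable names and the track query are placeholders, and in the actual test the token exchange and extraction are handled by the request and validation steps shown above.

```javascript
// Rough equivalent of the API test's two requests, runnable with Node 18+ as an ES module (for top-level await).
// SPOTIFY_CLIENT_ID / SPOTIFY_CLIENT_SECRET and the track query are placeholders.
const credentials = Buffer.from(
  `${process.env.SPOTIFY_CLIENT_ID}:${process.env.SPOTIFY_CLIENT_SECRET}`
).toString('base64');

// Request 1: exchange client credentials for an access token ($.access_token in the validation step).
const tokenResponse = await fetch('https://accounts.spotify.com/api/token', {
  method: 'POST',
  headers: {
    Authorization: `Basic ${credentials}`,
    'Content-Type': 'application/x-www-form-urlencoded',
  },
  body: 'grant_type=client_credentials',
});
const { access_token } = await tokenResponse.json();

// Request 2: use the extracted token (custom.access_token in the test) to search for a track.
const searchResponse = await fetch(
  'https://api.spotify.com/v1/search?q=example%20track&type=track&limit=1',
  { headers: { Authorization: `Bearer ${access_token}` } }
);
const results = await searchResponse.json();
console.log(results.tracks.items[0]?.name);
```
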

Uptime Tests

Uptime tests can be either HTTP tests or Port tests. They don't parse HTML, load images, or run JavaScript; they simply make a request and collect metric data on response time, response code, DNS time (for HTTP tests), and time to first byte. HTTP tests hit a specified URL or endpoint, and Port tests make a TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) request to the specified server port.
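As a rough mental model (not how Splunk actually runs these tests), a TCP Port test boils down to opening a connection to a host and port and timing the result. Here's a minimal sketch using Node's built-in net module; the host and port are placeholders, and a real Uptime test is configured entirely in the Synthetics UI.

```javascript
// Minimal sketch of what a TCP Port test measures: connect time to a host:port.
// example.com:443 is a placeholder target.
const net = require('net');

const start = Date.now();
const socket = net.connect({ host: 'example.com', port: 443, timeout: 5000 });

socket.on('connect', () => {
  console.log(`TCP connect succeeded in ${Date.now() - start} ms`);
  socket.end();
});
socket.on('timeout', () => {
  console.error('TCP connect timed out');
  socket.destroy();
});
socket.on('error', (err) => console.error('TCP connect failed:', err.message));
```
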

Test History

If you open up any of the Synthetic tests from the Overview page, you’ll land on the Test History page. Here you’ll find a summary of performance trends, Key Performance Indicators (KPIs), and recent run results: 

[Screenshot: Test History summary charts]

[Screenshot: Performance KPIs and recent run results]

At the top of the page, line charts show data trends for the last day, 8 days, and 30 days, and the bar chart summarizes the overall results for the given time period. The Performance KPI chart is a customizable visualization with adjustable settings. You can view test run details by selecting any point in the chart or by selecting a test run from the Recent run results table below the KPI chart.

Every Browser test run generates additional charts and metrics, and the interaction between the test runner and the site being tested is represented as a waterfall chart on the test run results page. Browser test run results also include a filmstrip of screenshots and a video of the site loading in real time, so you can see how the page responds and exactly what a user trying to load your site would see. When there's a problem, you can select the APM link next to the related network activity to jump directly into APM and see what in your backend services may have contributed to the issue.

[Screenshot: Browser test run results]

Browser tests capture more than 40 metrics (including Core Web Vitals) that you can use to extend your view into site performance by configuring charts, dashboards, and detectors on top of them.

Wrap Up

You’re now ready to set up your first Browser, API, or Uptime test to find, fix, and proactively prevent performance issues that could affect key user transactions. Don’t yet have Splunk Observability Cloud? Try it out free for 14 days.

