Thursday, 14 May 2026

Introduction to Checkly



How Checkly works

Checkly is a SaaS synthetic monitoring platform — you define "checks" (HTTP requests or browser scripts), Checkly runs them on a schedule from probe locations around the world (or on-demand from CI), records latency/assertions/screenshots, and alerts you when they fail or get slow.
  
  Two main check types:

  - API checks — a single HTTP request with assertions on status, headers, body, response time.
  - Browser checks — a Playwright script run in a real headless Chromium against your deployed app.

There are also multi-step API checks (chained requests, e.g. login → use token → logout) and heartbeat checks (your job pings Checkly, and Checkly alerts if the pings stop).

Heartbeat vs Ping

Heartbeats and pings both detect failures, but they work in opposite directions. A heartbeat is a periodic "I am alive" message that the application itself pushes to the monitor; if the messages stop arriving, the monitor raises an alert. A ping is a request the monitor sends to a server to see whether it is reachable. Heartbeats therefore catch crashed or stalled applications and jobs, while pings catch unreachable hosts and network outages.
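
A minimal sketch of the monitor-side heartbeat decision described above, assuming the monitor stores the time of the last received ping plus an expected period and a grace window. HeartbeatConfig and isHeartbeatOverdue are hypothetical names for illustration, not Checkly APIs.

```typescript
// Hypothetical monitor-side settings for one heartbeat.
interface HeartbeatConfig {
  periodMs: number; // how often the job is expected to send a heartbeat
  graceMs: number;  // extra slack allowed before the monitor alerts
}

// True when the last heartbeat is older than period + grace.
function isHeartbeatOverdue(
  lastPingAtMs: number,
  nowMs: number,
  config: HeartbeatConfig,
): boolean {
  return nowMs - lastPingAtMs > config.periodMs + config.graceMs;
}

// Example: an hourly job with five minutes of grace.
const hourlyJob: HeartbeatConfig = { periodMs: 60 * 60_000, graceMs: 5 * 60_000 };
const lastPingAt = Date.UTC(2026, 4, 14, 9, 0, 0);

// 63 minutes since the last ping: still inside period + grace.
console.log(isHeartbeatOverdue(lastPingAt, lastPingAt + 63 * 60_000, hourlyJob));
// 66 minutes since the last ping: overdue, so the monitor would alert.
console.log(isHeartbeatOverdue(lastPingAt, lastPingAt + 66 * 60_000, hourlyJob));
```

The job never has to be reachable from the outside; it only has to keep sending, which is why heartbeats suit cron jobs and background workers.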

Checks are typically authored as code (Checkly CLI, TypeScript) and pushed to the cloud with checkly deploy. You can tag them (tags: ["auth"]), parametrise them with env vars like ENVIRONMENT_URL, and trigger them on-demand from CI, which is exactly what this PR does with npx checkly trigger --tags=auth.
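
A rough sketch of how tag-based selection could behave, assuming a check is selected when it carries every requested tag. CheckSummary and selectByTags are hypothetical names, not part of the Checkly CLI, and the CLI's actual matching rules should be confirmed in its documentation.

```typescript
// Hypothetical, simplified view of a deployed check.
interface CheckSummary {
  logicalId: string;
  tags: string[];
}

// Keep only the checks that carry all of the requested tags (an assumption).
function selectByTags(checks: CheckSummary[], requiredTags: string[]): CheckSummary[] {
  return checks.filter((check) =>
    requiredTags.every((tag) => check.tags.includes(tag)),
  );
}

const deployed: CheckSummary[] = [
  { logicalId: "auth-health", tags: ["auth"] },
  { logicalId: "auth-login", tags: ["auth", "critical"] },
  { logicalId: "billing-health", tags: ["billing"] },
];

// Selects auth-health and auth-login, skips billing-health.
console.log(selectByTags(deployed, ["auth"]).map((check) => check.logicalId));
```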
  
  Runtime model:
  - Scheduled: every N minutes from chosen regions (e.g. us-east-2, eu-west-1); catches regressions and outages between deploys.
  - Triggered from CI: post-deploy smoke test whose results gate (or just annotate) the deploy.
  - Alerts: Slack/PagerDuty/email on failure, with retries and degraded thresholds to avoid flapping alerts.
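
To make the "degraded threshold" and "avoid flapping" ideas concrete, here is a small sketch under stated assumptions: a run is failing when assertions fail or the response exceeds a hard limit, degraded when it only exceeds a softer limit, and an alert fires only after several consecutive failures. The names (Thresholds, classifyRun, shouldAlert) and the numbers are illustrative, not Checkly's implementation.

```typescript
type RunStatus = "passing" | "degraded" | "failing";

interface Thresholds {
  degradedMs: number; // slower than this counts as degraded
  maxMs: number;      // slower than this counts as failing
}

function classifyRun(
  assertionsPassed: boolean,
  responseTimeMs: number,
  thresholds: Thresholds,
): RunStatus {
  if (!assertionsPassed || responseTimeMs > thresholds.maxMs) return "failing";
  if (responseTimeMs > thresholds.degradedMs) return "degraded";
  return "passing";
}

// Alert only when the most recent `consecutiveFailures` runs all failed,
// so a single blip does not page anyone.
function shouldAlert(history: RunStatus[], consecutiveFailures: number): boolean {
  if (consecutiveFailures < 1 || history.length < consecutiveFailures) return false;
  return history.slice(-consecutiveFailures).every((status) => status === "failing");
}

const thresholds: Thresholds = { degradedMs: 500, maxMs: 1500 };
console.log(classifyRun(true, 300, thresholds));  // passing
console.log(classifyRun(true, 900, thresholds));  // degraded
console.log(classifyRun(true, 2000, thresholds)); // failing
console.log(shouldAlert(["passing", "failing", "failing"], 2)); // true
```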

  ---

  What it would check for this auth API
  
  Given the auth API's surface (login, OAuth, JWT issuance, admin endpoints), realistic auth-tagged checks:

  1. Health endpoint — basic liveness

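  import { ApiCheck, AssertionBuilder } from "checkly/constructs";

  new ApiCheck("auth-health", {
    name: "Auth API – health",
    tags: ["auth"],
    frequency: 1, // minute
    locations: ["us-east-2", "eu-west-1"],
    request: {
      // process.env is read when this file is evaluated (i.e. at deploy time);
      // for a value resolved per run, Checkly's "{{ENVIRONMENT_URL}}" template
      // syntax may be the better fit.
      url: `${process.env.ENVIRONMENT_URL}/health`,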
      method: "GET",
      assertions: [
        AssertionBuilder.statusCode().equals(200),
        AssertionBuilder.responseTime().lessThan(500),
        AssertionBuilder.jsonBody("$.status").equals("ok"),
      ],
    },
  });

  2. Login flow — happy path, returns a JWT

  new ApiCheck("auth-login", {
    name: "Auth API – login returns JWT",
    tags: ["auth"],
    request: {
      url: `${process.env.ENVIRONMENT_URL}/auth/login`,
      method: "POST",
      headers: [{ key: "Content-Type", value: "application/json" }],
      body: JSON.stringify({
        email: process.env.SYNTHETIC_USER_EMAIL,
        password: process.env.SYNTHETIC_USER_PASSWORD,
      }),
      assertions: [
        AssertionBuilder.statusCode().equals(200),
        AssertionBuilder.responseTime().lessThan(1500),
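        AssertionBuilder.jsonBody("$.token").notNull(),
        // AssertionBuilder has no regex comparison that I know of, so the JWT shape
        // is only approximated here. The full structural check,
        // ^eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$, fits better in the
        // multi-step check, where Playwright's expect(token).toMatch() is available.
        AssertionBuilder.jsonBody("$.token").contains("eyJ"),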
      ],
    },
  });
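
The JWT-shape regex above can be read as "three base64url segments separated by dots, the first starting with eyJ" (eyJ is what a JSON header beginning with {" followed by a letter encodes to). A small standalone sketch, with looksLikeJwt as a hypothetical helper name; it checks structure only and does not verify the signature or expiry.

```typescript
// Same pattern as in the check above: header.payload.signature in base64url.
const JWT_SHAPE = /^eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$/;

// Structural test only: says nothing about whether the token is valid.
function looksLikeJwt(token: unknown): boolean {
  return typeof token === "string" && JWT_SHAPE.test(token);
}

console.log(looksLikeJwt("eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.c2ln")); // three segments
console.log(looksLikeJwt("eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0"));      // only two segments
console.log(looksLikeJwt(null));                                        // not a string
```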
  
  3. Login — wrong password returns 401 (negative path)

  Catches the "accidentally accepts anything" class of regression.

  new ApiCheck("auth-login-bad-pw", {
    name: "Auth API – wrong password = 401",
    tags: ["auth"],
    request: {
      url: `${process.env.ENVIRONMENT_URL}/auth/login`,
      method: "POST",
      headers: [{ key: "Content-Type", value: "application/json" }],
      body: JSON.stringify({ email: process.env.SYNTHETIC_USER_EMAIL, password: "wrong" }),
      assertions: [AssertionBuilder.statusCode().equals(401)],
    },
  });
  
  4. Multi-step — login then call protected endpoint

  This is the most useful kind for an auth API, because it proves the token actually works.

  import * as path from "path";
  import { MultiStepCheck } from "checkly/constructs";

  new MultiStepCheck("auth-token-roundtrip", {
    name: "Auth API – token works against /me",
    tags: ["auth"],
    code: { entrypoint: path.join(__dirname, "token-roundtrip.spec.ts") },
  });

  // token-roundtrip.spec.ts
  import { test, expect } from "@playwright/test";
  test("login then /me", async ({ request }) => {
    const login = await request.post(`${process.env.ENVIRONMENT_URL}/auth/login`, {
      data: { email: process.env.SYNTHETIC_USER_EMAIL, password: process.env.SYNTHETIC_USER_PASSWORD },
    });
    expect(login.ok()).toBeTruthy();
    const { token } = await login.json();
    
    const me = await request.get(`${process.env.ENVIRONMENT_URL}/me`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    expect(me.status()).toBe(200);
    const body = await me.json();
    expect(body.email).toBe(process.env.SYNTHETIC_USER_EMAIL);
  });
  
  5. TLS & cert expiry

  A pure config check, useful because a missed certificate rotation is a classic outage cause.

  new ApiCheck("auth-tls", {
    name: "Auth API – TLS cert valid > 14d",
    tags: ["auth"],
    request: {
      url: `${process.env.ENVIRONMENT_URL}/health`,
      method: "GET",
      assertions: [AssertionBuilder.statusCode().equals(200)],
    },
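    // Checkly reports certificate expiry for HTTPS targets. The expiry threshold
    // (14 days here) is an alerting setting rather than an assertion; in the CLI
    // constructs it appears to live on the alert channel (sslExpiry /
    // sslExpiryThreshold), so confirm where to set it for your setup.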
  });
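
The "valid > 14d" rule in the check name boils down to date arithmetic on the certificate's notAfter date. A minimal sketch, assuming the expiry date has already been obtained from somewhere; daysUntilExpiry and certNeedsAttention are hypothetical helper names, not Checkly APIs.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days remaining before the certificate expires (negative if already expired).
function daysUntilExpiry(notAfter: Date, now: Date): number {
  return Math.floor((notAfter.getTime() - now.getTime()) / MS_PER_DAY);
}

// True when the certificate is inside the warning window (default: 14 days).
function certNeedsAttention(notAfter: Date, now: Date, thresholdDays = 14): boolean {
  return daysUntilExpiry(notAfter, now) <= thresholdDays;
}

const today = new Date("2026-05-14T00:00:00Z");
console.log(daysUntilExpiry(new Date("2026-06-14T00:00:00Z"), today));    // 31
console.log(certNeedsAttention(new Date("2026-06-14T00:00:00Z"), today)); // false
console.log(certNeedsAttention(new Date("2026-05-28T00:00:00Z"), today)); // true
```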
  
  6. Browser check — full login UX

  Runs against the front-end but exercises the auth API end-to-end including redirects, cookies, CSRF.

  import * as path from "path";
  import { BrowserCheck } from "checkly/constructs";

  new BrowserCheck("auth-ui-login", {
    name: "Login UI works",
    tags: ["auth"],
    code: { entrypoint: path.join(__dirname, "login.spec.ts") },
  });

  // login.spec.ts
  import { test, expect } from "@playwright/test";
  test("user can sign in", async ({ page }) => {
    await page.goto(process.env.ENVIRONMENT_URL!);
    await page.getByLabel("Email").fill(process.env.SYNTHETIC_USER_EMAIL!);
    await page.getByLabel("Password").fill(process.env.SYNTHETIC_USER_PASSWORD!);
    await page.getByRole("button", { name: "Sign in" }).click();
    await expect(page.getByText("Dashboard")).toBeVisible({ timeout: 10_000 });
  });

  7. OAuth callback reachability

  This doesn't fully exercise the Google/Microsoft flow (those need real consent), but it checks that the
  callback endpoint responds correctly to a request with no code, which confirms the route and handler are wired.

  new ApiCheck("auth-oauth-google-callback-shape", {
    name: "Auth API – Google OAuth callback exists",
    tags: ["auth"],
    request: {
      url: `${process.env.ENVIRONMENT_URL}/auth/google/callback`,
      method: "GET",
      assertions: [
        // 400 for missing `code`, not 404/500 — proves handler is mounted
        AssertionBuilder.statusCode().equals(400),
      ],
    },
  });
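
The reasoning behind asserting on 400 can be spelled out as a small decision table. This is a sketch under the assumption that the handler rejects a missing code with 400; diagnoseCallbackStatus and its labels are hypothetical names for illustration.

```typescript
type CallbackDiagnosis =
  | "handler-mounted"  // route exists and validated the request
  | "route-missing"    // nothing is registered at this path
  | "handler-crashing" // route exists but the handler errored
  | "unexpected";      // anything else deserves a manual look

function diagnoseCallbackStatus(status: number): CallbackDiagnosis {
  if (status === 400) return "handler-mounted";
  if (status === 404) return "route-missing";
  if (status >= 500 && status <= 599) return "handler-crashing";
  return "unexpected";
}

for (const status of [400, 404, 500, 302]) {
  console.log(status, diagnoseCallbackStatus(status));
}
```

If the real handler redirects to an error page instead of returning 400, the assertion in the check would need to expect that redirect instead.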

Intro to QA with Headless Browsers

Headless browsers are used in QA to run automated browser tests without a visible window (GUI). Because nothing is drawn to a screen, they usually start faster and consume fewer resources, which makes rapid, parallel testing in CI/CD pipelines practical and suits them to high-volume functional and regression testing.

Key Reasons for Using Headless Browsers in QA:
  • Faster Execution: With no window to draw and composite on screen, tests typically run faster. The browser still parses CSS and computes layout.
  • CI/CD Integration: They are ideal for server-side environments where a GUI is unavailable, allowing automated tests to run after every code commit.
  • Lower Resource Usage: They typically consume less RAM and CPU, allowing for higher parallelization (running many tests simultaneously) without overloading hardware.
  • Automated Functional Testing: They can accurately simulate user actions such as clicking buttons, submitting forms, and navigating pages.
  • Regression Testing: Due to speed and efficiency, they are well suited to running large suites of regression tests to ensure new changes haven't broken existing functionality.
Common options for headless testing include Chrome and Firefox in headless mode, driven by automation libraries such as Puppeteer and Playwright.

Headless browsers parse, compile, and execute the exact same underlying code as standard browsers, but they skip the final step of painting pixels to a physical screen.

What Headless Browsers Still Do
  • Construct the DOM: They parse HTML into a full Document Object Model tree.
  • Apply Styling: They process CSS and calculate layout, element positions, and visibility.
  • Execute JavaScript: They run a full JS engine (like V8 in Chrome) to handle AJAX, animations, and frontend logic.
  • Manage Network Traffic: They make real HTTP requests, store cookies, and handle API responses.

How QA Verifies Visuals Without a Display
  • Layout Queries: Code checks if elements are present, hidden, or overlapping by querying their coordinates.
  • Computed Styles: Scripts verify specific CSS properties, like checking if a button color is exactly rgb(0, 0, 255).
  • Virtual Screenshots: The browser renders the page into an in-memory buffer, allowing QA tools to save PNGs or perform pixel-by-pixel visual regression comparisons.
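
Two of the techniques above reduce to simple arithmetic once the browser has handed over the data. The sketch below assumes bounding boxes shaped like { x, y, width, height } and screenshots already decoded into raw RGBA bytes; boxesOverlap and pixelDiffRatio are hypothetical helpers, not functions from any testing framework.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Layout query: two rectangles overlap when they intersect on both axes.
// Boxes that only touch at an edge do not count as overlapping.
function boxesOverlap(a: Box, b: Box): boolean {
  return (
    a.x < b.x + b.width &&
    b.x < a.x + a.width &&
    a.y < b.y + b.height &&
    b.y < a.y + a.height
  );
}

// Visual regression: fraction of pixels that differ between two RGBA buffers.
function pixelDiffRatio(baseline: Uint8Array, candidate: Uint8Array): number {
  if (baseline.length !== candidate.length || baseline.length % 4 !== 0) {
    throw new Error("Buffers must be RGBA data of identical size");
  }
  const pixelCount = baseline.length / 4;
  if (pixelCount === 0) return 0;
  let differing = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    const samePixel =
      baseline[i] === candidate[i] &&
      baseline[i + 1] === candidate[i + 1] &&
      baseline[i + 2] === candidate[i + 2] &&
      baseline[i + 3] === candidate[i + 3];
    if (!samePixel) differing++;
  }
  return differing / pixelCount;
}

const button: Box = { x: 0, y: 0, width: 10, height: 10 };
const banner: Box = { x: 5, y: 5, width: 10, height: 10 };
console.log(boxesOverlap(button, banner)); // true: the banner covers part of the button

const before = new Uint8Array([255, 0, 0, 255, 0, 0, 255, 255]); // red, blue
const after = new Uint8Array([255, 0, 0, 255, 0, 0, 254, 255]);  // red, slightly-off blue
console.log(pixelDiffRatio(before, after)); // 0.5: one of two pixels changed
```

In practice a tolerance is usually applied to the diff ratio, since anti-aliasing and font rendering can shift a few pixels between runs.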

To tailor headless-browser testing to our workflow, we need to answer:
  • Which testing framework are we using (e.g., Playwright, Selenium, Cypress)?
  • Are we trying to catch functional bugs or visual layout glitches?
  • Do our tests run on a local machine or on a CI/CD server (e.g., GitHub Actions, Jenkins)?

---