Pushly Crawler

This page explains how the Pushly crawler identifies itself, which IP addresses it uses, and how to configure your infrastructure to allow it.

User agent

All crawler requests use the following user agent:

PushlyCrawler/1.0 (+https://documentation.pushly.com/faq/crawler)

We recommend allowing this user agent in any bot protection, WAF, or CDN rules you maintain for your site.

IP addresses

The Pushly crawler originates traffic from the following IP addresses:

52.0.101.14
34.213.44.134

If you require IP-based allowlisting, you can allow these IPs in your firewall or WAF configuration.

If your organization requires stricter controls, we recommend allowing one of the following:

  • The user agent PushlyCrawler/1.0, or

  • The IPs 52.0.101.14 and 34.213.44.134, or

  • Both, for maximum reliability
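If you enforce the check in your own application layer rather than in a WAF, the strictest option above (requiring both identifiers) can be sketched as follows. This is a minimal illustration; the function name and signature are ours, not part of any Pushly SDK:

```python
# Published identifiers for the Pushly crawler
PUSHLY_UA_PREFIX = "PushlyCrawler/1.0"
PUSHLY_IPS = {"52.0.101.14", "34.213.44.134"}

def is_pushly_crawler(user_agent: str, remote_ip: str) -> bool:
    """Return True when a request presents both Pushly identifiers.

    Requiring both the user agent and a known source IP is the
    strictest of the options above; change `and` to `or` if you
    only want to match on one of them.
    """
    return user_agent.startswith(PUSHLY_UA_PREFIX) and remote_ip in PUSHLY_IPS
```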


What the crawler does

The Pushly crawler:

  • Fetches HTML content from your site to power content suggestions inside the Pushly platform

  • Only accesses pages that are relevant to your Pushly configuration and in-product features

  • Does not index your content for public search engines

  • Does not block ads and does not attempt to bypass your users’ ad blockers

  • Runs as a server-side integration, not from end-user browsers

If you ever want to stop the crawler completely, you can block either the user agent PushlyCrawler/1.0 or the IPs listed above.


robots.txt behavior

Because this crawler exists to provide first-party, in-platform functionality for your own authenticated users (your internal staff using Pushly), it does not treat robots.txt as an access-control mechanism.

  • Changes to robots.txt will not stop the crawler

  • To block or allow the crawler, use user-agent or IP-based rules instead

This design avoids situations where a broad Disallow: / robots rule intended for public search engines accidentally breaks your internal Pushly workflows.
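For example, even a site-wide disallow like the following will not affect the Pushly crawler, while still applying to public search engine bots:

```
User-agent: *
Disallow: /
```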


Allowlisting examples

The following examples show how you can allow the Pushly crawler in common infrastructure setups. You should adapt these patterns to your specific configuration and security policies.

Cloudflare (WAF custom rule)

You can create a WAF rule that allows the Pushly crawler based on user agent and/or IP.

Example rule expression:
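A sketch of such an expression, matching on user agent or source IP (adapt to your zone and security policies):

```
(http.user_agent contains "PushlyCrawler/1.0")
or (ip.src in {52.0.101.14 34.213.44.134})
```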

Actions you might configure:

  • Allow or Skip (bypass) security features such as JS challenges or bot checks for matching requests

  • Apply the rule only to specific hostnames or paths if needed

Akamai (WAF / security configuration)

In Akamai, you can create a match condition for the Pushly crawler:

  • Match on Header: User-Agent contains PushlyCrawler/1.0

  • Optionally also match on source IP being in:

    • 52.0.101.14

    • 34.213.44.134

Then:

  • Assign a lower-threat or allow action to this traffic

  • Exclude it from aggressive bot or JS challenge rules

Refer to your Akamai property configuration UI for the exact steps, as naming and structure may vary between setups.

Fastly (VCL example)

In Fastly, you can use custom VCL to identify and allow the Pushly crawler.

Example snippet:
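A minimal sketch using an ACL for the published IPs plus a User-Agent check. The `X-Pushly-Crawler` header name is illustrative; use whatever flag your downstream rules expect:

```vcl
acl pushly_crawler {
  "52.0.101.14";
  "34.213.44.134";
}

sub vcl_recv {
  if (req.http.User-Agent ~ "PushlyCrawler/1\.0" || client.ip ~ pushly_crawler) {
    # Flag the request so later logic (e.g., bot checks) can skip it
    set req.http.X-Pushly-Crawler = "1";
  }
}
```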

You can also limit this to specific services or hostnames if desired.

Nginx (example configuration)

You can use Nginx to allow the Pushly crawler via user agent and/or IP.

Example:
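A minimal sketch using `map` and `geo` to flag the crawler. The variable names are illustrative; wire them into your existing `limit_req`, `deny`, or bot-protection logic:

```nginx
# Match the published user agent
map $http_user_agent $pushly_ua {
    default        0;
    ~PushlyCrawler 1;
}

# Match the published source IPs
geo $pushly_ip {
    default       0;
    52.0.101.14   1;
    34.213.44.134 1;
}
```

You can then reference `$pushly_ua` and `$pushly_ip` to exempt matching requests from rate limits or access rules.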

Adjust this config to match your existing security and routing setup.


Troubleshooting

If Pushly reports that it cannot fetch your content or you see 403 responses being returned to the crawler:

  1. Verify that your WAF or bot protection is not blocking:

    • User agent: PushlyCrawler/1.0

    • IPs: 52.0.101.14, 34.213.44.134

  2. Check for rules that:

    • Enforce JS challenges

    • Block “unknown bots”

    • Challenge or block based on reputation of the IPs above

  3. Add an explicit allow/skip rule for:

    • The user agent string, and/or

    • The IP addresses
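One quick way to test your user-agent rules is to send a request with the crawler's user agent yourself. Note that this only exercises user-agent matching, since the request comes from your own IP; the URL below is a placeholder:

```
curl -I -A "PushlyCrawler/1.0 (+https://documentation.pushly.com/faq/crawler)" \
  "https://example.com/some-page"
```

A 403 or challenge response here suggests a rule is still blocking the user agent.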

If issues persist, please contact Pushly support and include:

  • Example URLs that are failing

  • Timestamps (with timezone)

  • Any relevant WAF or firewall logs showing blocked requests from the crawler IPs or UA
