Nemofy
Routing engine

One routing brain for every event, every channel.

Pick a strategy, score providers, branch on tenant tier, fail over when a vendor goes down, retry with backoff. All configured in the admin, never coded into your applications.

Live routing
Events / min: 12,478 (+4.2%)

Sources
  • API: /notifications
  • Kafka: events.notif
  • Webhook: ingest

Routing engine: Trigger · Health · Strategy

Providers
  • Twilio: 412ms
  • SendGrid: 298ms
  • FCM: 1.2s
  • Slack: 190ms

Strategy · priority · per-tenant
FCM degraded · failing over to APNs
Strategies

Pick how the next provider gets chosen.

The strategy is set per channel and per tenant. Change it in the admin and the next event picks up the new rule. No redeploy, no migration, no application restart.

  • Strategy per channel, per tenant, per environment.
  • Live changes take effect on the next event.
  • Per-provider scoring weights you can dial in.
  • Audit log of every strategy change with the actor.
  • Combine multiple strategies into a pipeline.

Priority

Send through the preferred provider. Fall back only on failure or saturation.

Round-robin

Even rotation across N providers when no preference applies.

Region

Geo-aware per locale. Route MA traffic through INWI / MGate, EU through Twilio.

Tenant tier

Premium tenants get the low-latency provider pool; standard tenants get the cost-optimised one.
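
To make the four strategies concrete, here is a minimal sketch of how each one could choose a provider. It is an illustration only, not Nemofy's implementation: the `Provider` type, its fields, and every function name are hypothetical, and in the product these choices are configured in the admin rather than written in code.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Provider:
    """Hypothetical provider record; field names are assumptions."""
    name: str
    regions: frozenset = frozenset()
    pool: str = "cost-optimised"
    healthy: bool = True


def pick_priority(providers):
    """Priority: first healthy provider in preference order."""
    for provider in providers:
        if provider.healthy:
            return provider
    return None


class RoundRobin:
    """Round-robin: even rotation across the healthy providers."""

    def __init__(self):
        self._counter = count()

    def pick(self, providers):
        healthy = [p for p in providers if p.healthy]
        if not healthy:
            return None
        return healthy[next(self._counter) % len(healthy)]


def pick_region(providers, region):
    """Region: first healthy provider that serves the event's region."""
    for provider in providers:
        if provider.healthy and region in provider.regions:
            return provider
    return None


def pick_tier(providers, tier):
    """Tenant tier: premium uses the low-latency pool, others the cost-optimised one."""
    pool = "low-latency" if tier == "premium" else "cost-optimised"
    return pick_priority([p for p in providers if p.pool == pool])


providers = [
    Provider("Twilio", frozenset({"EU"}), pool="low-latency"),
    Provider("MGate", frozenset({"MA"}), pool="cost-optimised"),
]
```

With this sample list, `pick_region(providers, "MA")` returns the MGate entry and `pick_tier(providers, "premium")` returns the Twilio entry.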

SMS chain · tenant acme-prod

live
  1. Primary

    Twilio

    SMS · low-latency

    Healthy
  2. Secondary

    MGate

    SMS · regional fallback

    Healthy
  3. Tertiary

    Vonage

    SMS · cost-optimised

    Healthy
Health probe · every 5s · 0 paged operators
Failover chain

Vendor down? The chain takes over.

Primary, secondary, tertiary, all per channel and per tenant. Health probes pull an unhealthy provider out of rotation before the next event picks it. When the primary recovers, traffic shifts back without an operator step.

  • Health probes every 5 seconds, per provider.
  • Chain configured in the admin, not in code.
  • Automatic swap back when the primary stabilises.
  • Per-channel chains, so SMS and email fail over separately.
  • Manual override available from the admin or API.
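
The chain behaviour described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the product's code: `ChainEntry`, `FailoverChain`, and their methods are hypothetical names, a single probe result flips health (a real system would likely wait for several consecutive results before declaring a provider stable), and the probe scheduler itself is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChainEntry:
    """One provider in the chain; names here are illustrative only."""
    name: str
    healthy: bool = True


@dataclass
class FailoverChain:
    """Ordered chain: primary first, then secondary, then tertiary."""
    entries: List[ChainEntry]
    override: Optional[str] = None  # manual override set by an operator

    def record_probe(self, name: str, ok: bool) -> None:
        """Store the latest health probe result for a provider."""
        for entry in self.entries:
            if entry.name == name:
                entry.healthy = ok

    def current(self) -> Optional[str]:
        """Provider the next event would use, or None if all are down."""
        if self.override is not None:
            return self.override
        for entry in self.entries:
            if entry.healthy:
                return entry.name
        return None


chain = FailoverChain([ChainEntry("Twilio"), ChainEntry("MGate"), ChainEntry("Vonage")])
chain.record_probe("Twilio", False)   # primary fails a probe
after_failure = chain.current()       # "MGate"
chain.record_probe("Twilio", True)    # primary recovers
after_recovery = chain.current()      # "Twilio"
```

Because `current()` always walks the chain from the top, traffic returns to the primary as soon as it is marked healthy again, with no operator step.
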
Retry with backoff

Transient failures retry. Without melting the queue.

Exponential backoff with jitter, capped at a per-channel ceiling. A flapping provider is retried less and less often as the gap grows, so one bad vendor never takes the whole pipeline down with it.

  • Exponential backoff, configurable per channel.
  • Jitter prevents thundering-herd retry storms.
  • Max attempts capped per event class.
  • Dead-letter queue for events that exhaust the chain.
  • Per-event retry metrics in the audit trail.

Exponential backoff

base 1s · max 32s · jitter ±20%
  • 1s · try 1
  • 2s · try 2
  • 4s · try 3
  • 8s · try 4
  • 16s · try 5
  • 32s · try 6
Per channel, per tenant · After try 6 → DLQ
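
The schedule above (base 1s, max 32s, jitter ±20%, six tries) can be reproduced with a short sketch. The function names and parameters are hypothetical, and one detail is an assumption: the jittered value is clamped to the ceiling again, so a delay never exceeds the cap.

```python
import random


def backoff_delay(attempt, base=1.0, cap=32.0, jitter=0.20, rng=random):
    """Delay in seconds for a 1-based attempt number.

    The nominal delay doubles each attempt (base * 2**(attempt - 1)) up to
    `cap`. Jitter then scales it by a random factor in [1 - jitter, 1 + jitter],
    and the result is clamped to `cap` (an assumption of this sketch).
    """
    nominal = min(cap, base * 2 ** (attempt - 1))
    jittered = nominal * (1 + rng.uniform(-jitter, jitter))
    return min(cap, jittered)


def schedule(max_attempts=6, **kwargs):
    """Delays for attempts 1..max_attempts; after the last one the event goes to the DLQ."""
    return [backoff_delay(n, **kwargs) for n in range(1, max_attempts + 1)]


nominal_schedule = schedule(jitter=0.0)  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

The jitter is what keeps many events that failed at the same moment from retrying in lockstep: each one draws its own delay around the nominal value.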

Routing rules · order-events

3 rules · first-match
  1. IF tenant.tier == 'premium' AND channel == 'sms' (Matched)

    strategy: priority · pool: low-latency

  2. ELSE IF payload.region IN ('MA','TN','DZ')

    strategy: region · pool: MA-providers

  3. ELSE *

    strategy: round-robin · pool: default

IF / ELSE rules

Branch on the things that actually change.

Tenant tier, channel, payload fields, time of day. Branches are first-match. Order them in the admin once and the audit trail explains every routing decision back to the rule that won.

  • First-match branching, ordered visually.
  • Conditions on payload, tenant, channel, time.
  • Every match recorded with the rule ID and timestamp.
  • Test a rule against historical events before publishing.
  • Per-tenant rule sets, version-controlled.
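
First-match evaluation is simple to sketch: walk the ordered rules and stop at the first whose condition holds. The example below mirrors the three order-events rules shown on this page. The `Rule` type, the flat event dictionary, and its keys (`tenant_tier`, `channel`, `region`) are assumptions made for illustration, not the product's rule format.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a predicate plus the strategy and pool it selects."""
    rule_id: int
    condition: Callable[[dict], bool]
    strategy: str
    pool: str


RULES = [
    Rule(1,
         lambda e: e.get("tenant_tier") == "premium" and e.get("channel") == "sms",
         strategy="priority", pool="low-latency"),
    Rule(2,
         lambda e: e.get("region") in ("MA", "TN", "DZ"),
         strategy="region", pool="MA-providers"),
    Rule(3,
         lambda e: True,  # ELSE *: catch-all
         strategy="round-robin", pool="default"),
]


def route(event: dict, rules=RULES) -> Optional[Rule]:
    """Return the first rule whose condition matches; order decides ties."""
    for rule in rules:
        if rule.condition(event):
            return rule
    return None


# A premium SMS event from MA satisfies rules 1 and 2; rule 1 wins because it is first.
winner = route({"tenant_tier": "premium", "channel": "sms", "region": "MA"})
```

Returning the whole rule, rather than just the strategy, is what lets an audit entry carry the rule ID of the match.
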
Decision audit

Every routing decision, written down.

For every event, the audit trail records the strategy that ran, the rule that matched, the provider chosen, and the outcome. When someone asks “why did this notification go that way?”, you point at a row.

  • Per-event timeline: chosen, sent, retried, succeeded.
  • Filter by tenant, channel, provider, outcome.
  • p95 latency per provider, live.
  • Counter of events using the fallback in the last hour.
  • Export the trail to your warehouse on a schedule.
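
As a rough picture of what one audit row might hold and how the filters and the p95 figure could be derived from it, consider the sketch below. The `Decision` fields and helper names are hypothetical, the percentile uses the nearest-rank method (one of several common definitions), and the sample rows are the five entries from the live decisions panel on this page.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """One audit row; field names are assumptions for illustration."""
    timestamp: str
    event_type: str
    tenant: str
    provider: str
    via: str
    latency_ms: int
    outcome: str


def filter_decisions(rows, **criteria):
    """Keep rows whose fields equal every given criterion, e.g. tenant='atlas'."""
    return [r for r in rows
            if all(getattr(r, key) == value for key, value in criteria.items())]


def p95_latency(rows):
    """Nearest-rank 95th percentile of latency in ms, or None for no rows."""
    values = sorted(r.latency_ms for r in rows)
    if not values:
        return None
    rank = math.ceil(0.95 * len(values))
    return values[rank - 1]


rows = [
    Decision("10:42:18.412", "ORDER_SHIPPED", "acme-prod", "Twilio", "priority", 412, "sent"),
    Decision("10:42:18.407", "PASSWORD_RESET", "atlas", "SendGrid", "rule #1", 298, "sent"),
    Decision("10:42:18.401", "ORDER_DELAYED", "northwind", "MGate", "failover", 612, "sent"),
    Decision("10:42:18.395", "OTP_REQUEST", "acme-prod", "Twilio", "priority", 388, "sent"),
    Decision("10:42:18.387", "MARKETING_BLAST", "northwind", "Mailgun", "round-robin", 1184, "retry"),
]

twilio_p95 = p95_latency(filter_decisions(rows, provider="Twilio"))  # 412
fallback_count = len(filter_decisions(rows, via="failover"))         # 1
```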

Routing decisions · live

streaming
  • 10:42:18.412

    ORDER_SHIPPED · acme-prod

    Twilio via priority

    412ms · sent
  • 10:42:18.407

    PASSWORD_RESET · atlas

    SendGrid via rule #1

    298ms · sent
  • 10:42:18.401

    ORDER_DELAYED · northwind

    MGate (fallback) via failover

    612ms · sent
  • 10:42:18.395

    OTP_REQUEST · acme-prod

    Twilio via priority

    388ms · sent
  • 10:42:18.387

    MARKETING_BLAST · northwind

    Mailgun via round-robin

    1184ms · retry

See it routing your traffic.