
Billable metrics

Overview

Billable metrics are what you charge your customers for.

Common examples of billable metrics are:

  • number of API calls
  • number of uploaded/downloaded bytes
  • number of characters in a text for a Text-to-Speech API
  • internal currency like "coins", "tokens" or "credits"
  • etc.

Nadles allows you to offer pre-paid and usage-based pricing.

With pre-paid pricing, your customers buy a certain volume of billable metric units. With Nadles, you can let your customers adjust the purchase quantity, which is particularly useful in combination with tiered pricing and volume discounts.

With usage-based pricing, Nadles tracks the actual usage of billable metrics per customer and charges for that usage.

When Nadles API Gateway receives an API call, it calculates how many billable metric units the call uses and checks whether the total usage exceeds the limit. If the limit is not exceeded, the gateway continues processing the request. If the limit is exceeded, the gateway returns 429 Too Many Requests.

Nadles API Gateway performs the following steps in order to check metrics limits and record the usage:

graph TD
    input[GET /endpoint1]

    is_response_needed{"Is response
        needed for
        calculations?"}

    proxy_and_wait[Proxy request and wait for response]

    quota_condition_met{"Does current call use
    billable metrics units?"}

    calculate_usage[Calculate metrics usage]
    limit_exceeded{"Limit exceeded?"}
    is_hard_limit{"Is there a limit?"}
    return_429[Return 429 Too Many Requests]
    record_usage[Record usage]
    skip_quota_check[Skip and continue]
    proxy[Continue request processing]

    input --> is_response_needed
    is_response_needed -->|Yes| proxy_and_wait
    is_response_needed -->|No| quota_condition_met
    proxy_and_wait --> quota_condition_met
    quota_condition_met -->|No| skip_quota_check
    quota_condition_met -->|Yes| calculate_usage
    calculate_usage --> is_hard_limit
    is_hard_limit -->|No| record_usage
    is_hard_limit -->|Yes| limit_exceeded
    limit_exceeded --> |No| record_usage
    limit_exceeded --> |Yes| return_429
    skip_quota_check ----> proxy
    record_usage --> proxy

More about each step below.
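
The decision flow in the diagram can be sketched in a few lines of JavaScript. This is only an illustration: `processCall`, the `quota` and `call` shapes, and the interpretation of "limit exceeded" as `used + quantity > limit` are assumptions made for this sketch, not part of any Nadles API. The "wait for the response" branch is omitted for brevity.

```javascript
// Hypothetical model of the gateway decision flow shown in the diagram.
// quota: { limit: number | null, used: number }  (null means "no limit")
// call:  { usesQuota: boolean, quantity: number }
function processCall(quota, call) {
  if (!call.usesQuota) {
    // The call does not use billable metric units: skip and continue.
    return { status: "continue", recorded: 0 };
  }
  const hasLimit = quota.limit !== null;
  // Assumption: the limit counts as exceeded when this call would push usage past it.
  if (hasLimit && quota.used + call.quantity > quota.limit) {
    return { status: 429, recorded: 0 }; // 429 Too Many Requests
  }
  quota.used += call.quantity; // record usage
  return { status: "continue", recorded: call.quantity };
}

const quota = { limit: 3, used: 2 };
console.log(processCall(quota, { usesQuota: true, quantity: 1 }).status); // continue
console.log(processCall(quota, { usesQuota: true, quantity: 1 }).status); // 429
```

The second call is rejected because the first one already brought the recorded usage up to the limit of 3.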


Warning

The documentation below is outdated and is currently being updated.

Adding quotas

Click "Add quota".

Configure the new quota.


Quota name

This name will be shown to customers on the checkout page. Usually, the quota name is the name of the product metric it represents, e.g. "API calls", "Processed images", "Synthesized characters", etc.

Note

There is no need to include the time period in the name. Nadles will display it to your customers separately.

- Processed images per month
+ Processed images

Check out "What do I charge for?" for more product metric examples.


Quota label

A label for the quota. It can contain only letters, numbers and the underscore character (_).

Used internally, can't be changed after the quota is created. Make sure quotas in different products representing the same product metric have the same label — it will allow your customers to seamlessly switch between products.
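
Because the label cannot be changed later, it may be worth checking it before creating the quota. The following is a minimal sketch of such a check. `isValidQuotaLabel` is a hypothetical helper, and it assumes that "letters" means ASCII letters, which the text above does not state explicitly.

```javascript
// Assumption: only ASCII letters, digits and underscores are allowed.
const QUOTA_LABEL_PATTERN = /^[A-Za-z0-9_]+$/;

// Hypothetical helper, not part of Nadles.
function isValidQuotaLabel(label) {
  return QUOTA_LABEL_PATTERN.test(label);
}

console.log(isValidQuotaLabel("processed_images")); // true
console.log(isValidQuotaLabel("processed images")); // false (contains a space)
```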


Quota

The number of quota units included in the product.


Is hard limit

If the checkbox is checked, Nadles API Gateway will start rejecting calls to ALL API endpoints associated with the quota once the quota limit is exceeded.

If the checkbox is left unchecked, there will be no hard limit on the metric and the customer can make unlimited calls. Nadles will still record usage in cases when the quota usage condition (see below) is met. The recorded usage will be processed by the billing engine.


Endpoints

Here you select endpoints associated with the quota.

Only calls to selected endpoints will use the quota volume.

Example

Say, you have an API with the following endpoints:

  • myfirstapi.example/image/compress
  • myfirstapi.example/image/resize

And you want to have two separate quotas in the product:

  • 100 compressed images per month
  • 200 resized images per month

To achieve that, create two quotas and associate the /image/compress endpoint with the first quota and the /image/resize endpoint with the second one.

With that, calls to /image/compress will only use the quota of compressed images and not affect the limit on resized images, while calls to /image/resize will only use the resized images quota.
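
The separation between the two quotas can be illustrated with a small in-memory model. The `quotas` object, its labels and the `recordCall` function are hypothetical and exist only for this sketch. The default of 1 unit per call is taken from the section below.

```javascript
// Hypothetical model of the two quotas from the example above.
const quotas = {
  compressed_images: { limit: 100, used: 0, endpoints: ["/image/compress"] },
  resized_images: { limit: 200, used: 0, endpoints: ["/image/resize"] },
};

// Record one unit against every quota associated with the called endpoint.
function recordCall(path) {
  for (const quota of Object.values(quotas)) {
    if (quota.endpoints.includes(path)) {
      quota.used += 1;
    }
  }
}

recordCall("/image/compress");
recordCall("/image/resize");
recordCall("/image/resize");
console.log(quotas.compressed_images.used, quotas.resized_images.used); // 1 2
```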


Quota usage configuration

After you've added a new quota, you can configure exactly how each call to the associated endpoints uses the quota volume.

By default, each call to the associated endpoint uses 1 unit of quota volume.

This can be changed, and Nadles is capable of calculating the quota usage for the current request dynamically.

It means that different calls to the same endpoint can consume a different number of quota units.


Quantity used by a single call

Each endpoint associated with a quota has a field called "Quantity used by a single call".

This field defines how many quota units are used by each call to this endpoint.


Static quota usage

By default, the field is set to "1", meaning that each call uses 1 quota unit.

Tip

When Nadles API Gateway receives a call to an endpoint, it records usage of 1 quota unit by default.

If you want calls to different endpoints to use a different number of quota units, set the respective values in the "Quantity used by a single call" field for each endpoint.


Dynamic quota usage

In addition to static quota usage configuration, Nadles API Gateway is capable of calculating quota usage dynamically for each API call.

For that, instead of a static number like 1 or 2, you can specify a JavaScript expression that evaluates to an integer representing the quota usage for the current API call, based on the request and/or response attributes.

See the examples below.

Example #1

The endpoint GET /prompt/{LLM_MODEL} allows sending textual prompts to two different AI models: gpt3 and gpt4.

The API provider would like calls to gpt4 to consume 2 quota units and calls to gpt3 to consume 1 quota unit.

To achieve that, the expression should evaluate to 2 if the path parameter LLM_MODEL equals gpt4, and to 1 otherwise.

E.g.:

path.params.LLM_MODEL == "gpt4" ? 2 : 1


Nadles API Gateway will check the value of the path parameter and record the usage of 2 quota units if it's gpt4, and 1 quota unit otherwise.
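
To try the expression locally before saving it, one option is to evaluate it against a hand-written `path` object. The `evaluate` helper below is a hypothetical stand-in built on the standard `Function` constructor. How Nadles evaluates expressions internally is not described here, so treat this only as a way to check the logic.

```javascript
// Hypothetical local helper: evaluates an expression string with
// `path`, `request` and `response` in scope. Not a Nadles function.
function evaluate(expression, context) {
  const fn = new Function("path", "request", "response", `return (${expression});`);
  return fn(context.path, context.request, context.response);
}

const expression = 'path.params.LLM_MODEL == "gpt4" ? 2 : 1';

console.log(evaluate(expression, { path: { params: { LLM_MODEL: "gpt4" } } })); // 2
console.log(evaluate(expression, { path: { params: { LLM_MODEL: "gpt3" } } })); // 1
```

Any value other than "gpt4" falls into the "otherwise" branch and uses 1 unit.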

Example #2

The endpoint POST /process accepts a JSON array of elements to process.

The request body looks like:

[
    { "data": "ZDU2OWZlODQtODdiZS00YzZjLTk5ODktYTdjNWRjMmQ5NWJj" },
    { "data": "YTQ5NGUyNWMtNDI2NS00MjkzLWJmYWEtNzY5MjQxZjhlYjI1" },
    { "data": "YWZiOTZhNTAtMWE1Zi00Zjg4LWJmMGMtMWVhODQ2ODY3NmVj" }
]

The API provider would like to record quota usage equal to the number of elements in the input array. The request body in this example should use 3 quota units.

The JavaScript expression should parse the request body as JSON and return the number of elements:

JSON.parse(request.body).length
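
The expression can be checked against a sample body in plain JavaScript. The sketch below assumes that `request.body` is the raw body as a string. The `data` values are shortened placeholders, not the ones from the example above.

```javascript
// Assumption: request.body holds the raw request body as a string.
const request = {
  body: JSON.stringify([{ data: "a" }, { data: "b" }, { data: "c" }]),
};

const usage = JSON.parse(request.body).length;
console.log(usage); // 3

// In plain JavaScript, a body that is not valid JSON makes JSON.parse throw.
let parseFailed = false;
try {
  JSON.parse("not json");
} catch (error) {
  parseFailed = error instanceof SyntaxError;
}
console.log(parseFailed); // true
```

How the gateway treats an expression that throws is not covered in this section, so it may be worth validating the request body in the backend as well.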


Calculating usage in the backend

Your API backend can tell Nadles directly how much quota the current request has used.

For instance, if you bill for CPU seconds used to process the request, and Nadles is unable to infer that value from the request/response on its own, you can calculate it in your API backend for each request and then tell Nadles the actual value it should record by adding a custom header to the response.

The header name doesn't matter. You can use any valid header name, e.g. X-Consumed-Cpu-Seconds.

The expression in this case can be:

response.headers["x-consumed-cpu-seconds"]


And that's it. Nadles will wait for the response and record the value from the X-Consumed-Cpu-Seconds header as quota usage for the current request.
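
On the backend side, producing such a header could look roughly like the sketch below. `buildResponse` and the response-like object are hypothetical. Only `process.cpuUsage` is a real Node.js API. Rounding up to whole seconds is an assumption made here so that the value is an integer.

```javascript
// Hypothetical backend helper: runs `work`, measures the CPU time it took,
// and returns a response-like object carrying the usage in a custom header.
function buildResponse(work) {
  const before = process.cpuUsage();
  const result = work();
  const { user, system } = process.cpuUsage(before); // microseconds since `before`
  const cpuSeconds = Math.ceil((user + system) / 1e6); // assumption: round up to whole seconds
  return {
    statusCode: 200,
    headers: { "x-consumed-cpu-seconds": String(cpuSeconds) },
    body: JSON.stringify(result),
  };
}

const response = buildResponse(() => ({ ok: true }));

// Header values are strings in plain JavaScript, so this sketch converts explicitly.
const usage = Number(response.headers["x-consumed-cpu-seconds"]);
console.log(Number.isInteger(usage)); // true
```

Whether the gateway converts the header string to a number on its own is not stated here. If it does not, wrapping the expression in `Number(...)` or `parseInt(..., 10)` is one way to make the result an integer.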

Tip

There are several variables you can use in your expressions. See "Expression variables" below.


Quota usage condition

This field contains a JavaScript expression, which tells Nadles whether quota usage should be counted for the current request or not.

The expression must evaluate to a boolean value.

If the expression evaluates to true, quota usage is recorded. If the expression evaluates to false, Nadles doesn't record quota usage for the current API call.

Example

The most common use case for this field is to make Nadles record quota usage only if the response status code is 200.

For that, the Quota usage condition expression will be:

response.statusCode == 200


Configured this way, Nadles will record quota usage only if the response returned HTTP status code 200 OK.
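
The effect of the condition on recorded usage can be sketched as follows. `unitsToRecord` is a hypothetical function written for illustration. It simply combines the condition from the example with a per-call quantity.

```javascript
// Hypothetical illustration: usage is recorded only when the condition is true.
function unitsToRecord(response, quantity) {
  const conditionMet = response.statusCode == 200;
  return conditionMet ? quantity : 0;
}

console.log(unitsToRecord({ statusCode: 200 }, 3)); // 3
console.log(unitsToRecord({ statusCode: 500 }, 3)); // 0
console.log(unitsToRecord({ statusCode: 201 }, 3)); // 0
```

Note that the condition in the example matches only 200 exactly. If every successful response should count, a range check such as `response.statusCode >= 200 && response.statusCode < 300` may be a better fit.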

Tip

There are several variables you can use in your expressions. See "Expression variables" below.


Expression variables

The variables described below allow you to configure Nadles API Gateway's behavior in a flexible way.

You can use these variables in quota quantity and usage condition expressions.

Path parameters

path.params.* — placeholder values specified in the endpoint URL.

Note

Placeholder names are case-sensitive.

Example

If an endpoint URL is /resource/{resourceId}

and the HTTP request URL is /resource/801d49c2-ca05-42b1-97af-baf0ddf36ba3,

then there will be a variable path.params.resourceId with value "801d49c2-ca05-42b1-97af-baf0ddf36ba3".

path.params.resourceId // "801d49c2-ca05-42b1-97af-baf0ddf36ba3"
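
To make the mapping from URL template to `path.params` concrete, here is a rough sketch of how placeholder values could be extracted. `extractPathParams` is a hypothetical function and is not how Nadles is known to implement matching. It only demonstrates the resulting values and the case sensitivity of placeholder names.

```javascript
// Hypothetical matcher: turns "/resource/{resourceId}" into a regular
// expression with named groups and returns the captured values.
function extractPathParams(template, url) {
  const placeholder = /^\{[A-Za-z_][A-Za-z0-9_]*\}$/;
  const pattern = template
    .split(/(\{[A-Za-z_][A-Za-z0-9_]*\})/)
    .map((part) =>
      placeholder.test(part)
        ? `(?<${part.slice(1, -1)}>[^/]+)` // placeholder becomes a named group
        : part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") // escape literal text
    )
    .join("");
  const match = new RegExp(`^${pattern}$`).exec(url);
  return match ? { ...match.groups } : null;
}

const params = extractPathParams(
  "/resource/{resourceId}",
  "/resource/801d49c2-ca05-42b1-97af-baf0ddf36ba3"
);
console.log(params.resourceId); // 801d49c2-ca05-42b1-97af-baf0ddf36ba3
console.log(params.resourceid); // undefined (names are case-sensitive)
```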

Client IP address

request.remote_addr — Client IP address.


Request headers

request.headers['header-name'] — Request header values.

Warning

Header names are lower-cased.

Example

request.headers['content-type'] == 'application/json'

Request query string parameters

request.query['query_string_parameter_name'] — Request query string parameters.

Warning

Query string parameter names are case-sensitive.

Example

request.query['page'] > 100

Request body

request.body — Request body.

Example

request.body.length > 1000

Response status code

response.statusCode — HTTP status code of the response from the upstream.

Example

response.statusCode == 200

Response headers

response.headers['header-name'] — Response header values.

Warning

Header names are lower-cased.

Example

response.headers['content-type'] == 'application/json'

Response body

response.body — Raw response body.

Example

response.body.length > 0
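
The example expressions above can be tried together against a sample context. All values below are made up for illustration. Only the variable names come from this section. The sketch also assumes that query string values and bodies are strings, which is not stated explicitly.

```javascript
// Hypothetical sample context; only the variable names come from the documentation.
const path = { params: { resourceId: "801d49c2-ca05-42b1-97af-baf0ddf36ba3" } };
const request = {
  remote_addr: "203.0.113.7",
  headers: { "content-type": "application/json" },
  query: { page: "101" }, // assumption: query values are strings
  body: '{"items":[1,2,3]}',
};
const response = {
  statusCode: 200,
  headers: { "content-type": "application/json" },
  body: '{"ok":true}',
};

// Each property holds the result of one example expression from this section.
const results = {
  isJsonRequest: request.headers['content-type'] == 'application/json',
  pageAbove100: request.query['page'] > 100, // "101" is coerced to a number by >
  largeRequestBody: request.body.length > 1000,
  statusIs200: response.statusCode == 200,
  isJsonResponse: response.headers['content-type'] == 'application/json',
  hasResponseBody: response.body.length > 0,
};
console.log(results);
```

With these sample values, every check is true except `largeRequestBody`, because the sample body is far shorter than 1000 characters.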