# Cumul.io

## General
The Airbyte Cumul.io destination connector allows you to stream data into Cumul.io from any Airbyte Source.
Cumul.io is an Embedded analytics SaaS solution that enables other SaaS companies to grow with an engaging customer analytics experience, seamlessly embedded in their product. Cumul.io's intuitive, low-code interface empowers business users with insight-driven actions in record time without straining engineering resources from the core product.
## Getting started
In order to use the Cumul.io destination, you'll first need to create a Cumul.io account (if you don’t already have one). After logging in to Cumul.io, you can generate an API key and token in your Profile -> API Tokens. To set up the destination connector in Airbyte, you'll need to provide the following Cumul.io properties:
- "Cumul.io API Host URL": the API host URL for the Cumul.io environment where your Cumul.io account resides (i.e. `https://api.cumul.io` for EU multi-tenant users, `https://api.us.cumul.io/` for US multi-tenant users, or a VPC-specific address). This property depends on the environment in which your Cumul.io account was created (e.g. if you signed up via `https://app.us.cumul.io/signup`, the API host URL would be `https://api.us.cumul.io/`).
- "Cumul.io API key": a Cumul.io API key (see above how to generate an API key-token pair)
- "Cumul.io API token": the corresponding Cumul.io API token (see above how to generate an API key-token pair)
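For reference, a destination configuration might look like the following. The values are placeholders and the exact property keys may differ between connector versions, so treat this as an illustrative sketch rather than the authoritative spec:

```json
{
  "api_host": "https://api.cumul.io",
  "api_key": "<your-api-key>",
  "api_token": "<your-api-token>"
}
```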
As soon as you've connected a source and the first stream synchronization has succeeded, the desired Dataset(s) will be available in Cumul.io to build dashboards on (Cumul.io's "Getting started" Academy course is a good way to get familiar with the platform). Depending on the configured sync mode, subsequent synchronizations will either replace or append data in these datasets.
If you have any questions or want to get started with Cumul.io, don't hesitate to reach out via our contact page.
## Connector overview

### Sync modes support
| Sync modes | Supported? (Yes/No) | Notes |
| --- | --- | --- |
| Full Refresh - Append | Yes | / |
| Full Refresh - Replace | Yes | / |
| Incremental Sync - Append | Yes | / |
| Incremental - Append + Deduped | No | Cumul.io's data warehouse does not support dbt (yet). |
### Airbyte Features support

| Feature | Supported? (Yes/No) | Notes |
| --- | --- | --- |
| Namespaces | Yes | (Highly recommended) A concatenation of the namespace and stream name is used as a unique identifier for the related Cumul.io dataset (via tags), ensuring that subsequent synchronizations can target the same dataset. Use this property to ensure that identically named destination streams from different connections do not collide! |
| Reset data | Yes | Existing data in a dataset is not deleted upon resetting a stream in Airbyte; however, the next synchronization batch will replace all existing data. This ensures that the dataset is never empty (e.g. upon disabling the synchronization), which would otherwise result in "No data" upon querying it. |
### Airbyte data types support

| Airbyte data types | Remarks |
| --- | --- |
| Array & Object | To support a limited amount of insights, this connector stringifies data values of type Array or Object (as recommended by Airbyte), since Cumul.io supports neither storing nor querying such data types. For analytical purposes, it is recommended to unpack these values into separate rows or columns (depending on the use case) before pushing the data to Cumul.io! |
| Time with(out) timezone | While these values are stored as-is in Cumul.io, they should be interpreted as hierarchy* (i.e. text/string, see Cumul.io's data types Academy article). Alternatively, you could either provide a (default) date and timezone for these values, or unpack them into separate columns (e.g. hour, minute, second) before pushing the data to Cumul.io. |
| Timestamp without timezone | Cumul.io does not support storing timestamps without a timezone; these values will be interpreted as UTC timestamps. |
| Number & Integer with NaN, Infinity, or -Infinity values | While these values are stored as-is in Cumul.io, they do not support numeric aggregations such as sum, avg, etc. (using such aggregations on these values will likely cause unexpected behavior). Ideally, convert such values into meaningful ones (e.g. no value, 0, a specific sentinel value, etc.) before pushing the data to Cumul.io. |
| Boolean | Boolean values are stringified (as recommended by Airbyte) and result in a hierarchy column type (i.e. text/string, see Cumul.io's data types Academy article). You could use Cumul.io's hierarchy translations (see this Academy article) to assign translations to true and false that are meaningful to business users in the column's context. |
| All other data types | Should be supported and correctly interpreted by Cumul.io's Data API service*. |

*Note: Cumul.io's automatic typing might initially misinterpret this type of data due to its format (see Possible future improvements below); in that case, you can manually change the column type in the Cumul.io UI.
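Since several of the conversions above are best done before the data reaches Cumul.io, it can help to see them spelled out. The following is an illustrative Python sketch, not part of the connector: `preprocess_value` is a hypothetical helper that stringifies arrays, objects, and booleans, and replaces non-finite numbers with `None` so numeric aggregations stay meaningful.

```python
import json
import math


def preprocess_value(value):
    """Prepare a record value for analytical use in Cumul.io.

    - Booleans, arrays, and objects are stringified (as the connector does).
    - NaN / Infinity are replaced with None to avoid breaking sum/avg etc.
    """
    if isinstance(value, bool):
        return json.dumps(value)  # True -> "true", False -> "false"
    if isinstance(value, (list, dict)):
        return json.dumps(value)  # stored and queried as text only
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return None  # non-finite numbers break numeric aggregations
    return value


record = {"tags": ["a", "b"], "active": True, "score": float("nan"), "count": 3}
cleaned = {k: preprocess_value(v) for k, v in record.items()}
# cleaned == {"tags": '["a", "b"]', "active": "true", "score": None, "count": 3}
```

Unpacking arrays and objects into separate rows or columns (as recommended above) is still preferable to stringification when the nested values themselves need to be analyzed.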
### Output schema in Cumul.io

Each replicated stream from Airbyte will output data into a corresponding dataset in Cumul.io. Each dataset will initially have an `Airbyte - <namespace><stream_name>` English name, which can be further adapted in Cumul.io's UI or even via its API. If a request to push a batch of data fails, the connector will gracefully retry pushing the batch up to three times, with backoff intervals of 5, 10, and 20 minutes, respectively.
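The retry behavior described above can be sketched as follows. This is illustrative rather than the connector's actual implementation: `push` is a placeholder for the real Cumul.io Data API call, and only the retry/backoff schedule mirrors the documented behavior.

```python
import time

BACKOFF_MINUTES = [5, 10, 20]  # pause before retries 1, 2, and 3


def push_batch_with_retries(push, batch, backoff_minutes=BACKOFF_MINUTES):
    """Attempt `push(batch)` once, then retry up to three times on failure."""
    last_error = None
    for pause_minutes in [0] + list(backoff_minutes):
        if pause_minutes:
            time.sleep(pause_minutes * 60)  # back off before the next attempt
        try:
            return push(batch)
        except Exception as exc:
            last_error = exc
    raise last_error  # all retries exhausted
```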
The connector will associate one or more of the following tags with each dataset:

- `[AIRBYTE - DO NOT DELETE] - <namespace><stream_name>`: this tag is used to retrieve the dataset ID and its current columns from Cumul.io, and is associated with the dataset after the first batch of data is written to a new dataset.
- `[AIRBYTE - DO NOT DELETE] - REPLACE DATA`: this tag is associated with a dataset when it should be reset (i.e. its existing data should be replaced, see Reset data under Airbyte Features support above). If this tag is present on a dataset, the first batch of data of the next synchronization will replace all existing data.
As noted in the tag names, it is important never to remove these tags from the dataset(s), nor to set them manually on other datasets. Doing so might break existing or new synchronizations!
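To illustrate how the namespace keeps identically named streams apart (see Namespaces above), here is a hypothetical sketch of the tag construction. The exact concatenation format simply follows the placeholders documented above and should be treated as illustrative:

```python
# Illustrative only: mirrors the tag placeholders documented above.
DATASET_TAG_PREFIX = "[AIRBYTE - DO NOT DELETE] - "
REPLACE_DATA_TAG = "[AIRBYTE - DO NOT DELETE] - REPLACE DATA"


def dataset_tag(namespace: str, stream_name: str) -> str:
    """Unique identifier tag for a stream's dataset (namespace + stream name)."""
    return f"{DATASET_TAG_PREFIX}{namespace}{stream_name}"


# Identically named streams from different connections yield distinct tags:
tag_a = dataset_tag("connection_a_", "users")
tag_b = dataset_tag("connection_b_", "users")
```

Without a namespace, both `users` streams would resolve to the same tag and therefore target the same Cumul.io dataset, which is exactly the collision the Namespaces feature prevents.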