Merged
5 changes: 5 additions & 0 deletions documentation/.vuepress/components/LinkableChoices.vue
@@ -93,6 +93,11 @@ export default {
}
}

h3 {
  font-size: 1rem;
  text-align: center;
}

img {
  width: 35px;
  height: 35px;
22 changes: 22 additions & 0 deletions documentation/.vuepress/public/install/screen.svg
26 changes: 26 additions & 0 deletions documentation/.vuepress/public/install/template.svg
4 changes: 4 additions & 0 deletions documentation/.vuepress/public/install/world.svg
92 changes: 68 additions & 24 deletions documentation/dataInsertion/README.md
@@ -1,51 +1,95 @@
# Inserting data into DebiAI
# Inserting Data into DebiAI

Being a data visualization application, providing the project data to DebiAI is a required step.
DebiAI is a data visualization application, so providing it with your project data is a required step.

## Requirements

### A DebiAI instance
### A Running DebiAI Instance

You will need to have a running DebiAI instance to insert you project data to. (see [Installation](../introduction/gettingStarted/installation/README.md))
You need a running DebiAI instance to insert your project data. (See [Installation](../introduction/gettingStarted/installation/README.md))

### Data
### Data Format Requirements

The data you want to analyze with DebiAI will need to respect a specific format.
The data you want to analyze in DebiAI must follow a specific format.

- **CSV like format**
- **CSV-like Format**

If your data can be represented in an array like format, adding them to DebiAI will be easy. The data can also support different levels of nesting (see [unfolding columns](../dashboard/unfolding/)).
If your data is structured in an array-like format, adding it to DebiAI is straightforward. DebiAI also supports different levels of nesting (see [Unfolding Columns](../dashboard/unfolding/)).

- **Data types**
- **Supported Data Types**

DebiAI supports the following data types:

- `num`: numerical values
- `str`: string values
- `bool`: boolean values
- `array`: array of values (see [unfolding columns](../dashboard/unfolding/))
- `dict`: dictionary of values (see [unfolding columns](../dashboard/unfolding/))
- `array`: arrays of values (see [Unfolding Columns](../dashboard/unfolding/))
- `dict`: dictionary objects (see [Unfolding Columns](../dashboard/unfolding/))
- `None`: missing values

Dates are supported by DebiAI, you can provide them as strings.
Dates are supported and should be provided as strings.

- **Missing values**
- **Handling Missing Values**

DebiAI supports data with missing values (`None`, `NaN` or `null` values) since 0.29.0. The missing values will be displayed as `null` by widgets that support them. Statistics about missing values will be displayed in the dashboard.
Since version 0.29.0, DebiAI supports missing values (`None`, `NaN`, or `null`). Widgets that support missing values will display them as `null`, and statistics about missing data will be available in the dashboard.

- **Samples size**
- **Sample Size Limitations**

It is not recommended to provide more than 2.000.000 samples, as it will take a long time to process. We are working on improving this limit.
Providing more than **2,000,000 samples** is not recommended, as it may significantly increase processing time. We are actively working on improving this limitation.
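
As a rough illustration of the format requirements above, the following sketch maps the values of one sample to the DebiAI type names listed earlier. The `debiai_type` helper and the column names are hypothetical and are not part of DebiAI or its Python module; the sketch only shows how a CSV-like row, including a date given as a string and a missing value, could line up with the supported types.

```python
import math


def debiai_type(value):
    """Return the DebiAI type name matching a Python value (hypothetical helper)."""
    # Missing values: None or a float NaN
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "None"
    # bool is checked before numbers because bool is a subclass of int
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "num"
    if isinstance(value, str):
        return "str"
    if isinstance(value, (list, tuple)):
        return "array"
    if isinstance(value, dict):
        return "dict"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


# One sample of a CSV-like dataset; the column names are made up for the example
sample = {
    "image_id": "img_0001",
    "brightness": 0.82,
    "is_night": False,
    "captured_at": "2024-03-15T10:30:00",  # dates are provided as strings
    "bounding_box": [12, 40, 88, 120],
    "weather": {"rain": True, "visibility_km": 4.5},
    "speed": None,  # missing value
}

column_types = {name: debiai_type(value) for name, value in sample.items()}
print(column_types)
```

Running a check like this over a few rows before insertion is one way to spot columns with unsupported or mixed types early.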

## There is currently two ways to insert data into DebiAI:
## Methods for Inserting Data into DebiAI

- ### [Python module](pythonModule/README.md#python-module)
There are currently two ways to insert data into DebiAI:

The main way to add provide the project data to the application is through the DebiAI Python module.
The module was designed to be used directly in your Python workflow, to add model results directly after its evaluation for example.
<img src="/debiai_architecture.png" alt="DebiAI architecture" width="400"/>

- ### [Data providers](dataProviders/README.md#data-providers)
<LinkableChoices :choices="[
{
title: '1. Data Providers',
description: 'Make DebiAI directly access your project data',
imageLink: '/getStarted/data.svg',
elementIdDestination: '_1-data-providers-recommended'
},
{
title: '2. Python Module',
description: 'Directly insert data from your Python workflow',
imageLink: '/install/python.svg',
elementIdDestination: '_2-python-module'
}
]"
/>

A DebiAI data provider is a REST service that will expose your project to DebiAI.
DebiAI will directly ask for the data from your project making the data loading process very quick and customizable. Unlike the DebiAI Python module, the provided data won't have to be duplicated in the DebiAI application.
### **1. [Data Providers](dataProviders/README.md#data-providers) (Recommended)**

Making a data provider is the most efficient way to make your project data accessible to DebiAI, no matter the data base that your project is using.
A **DebiAI Data Provider** is a service that exposes your project data to DebiAI. This method allows DebiAI to directly retrieve metadata from your project, making data loading **fast** and **customizable**.

✅ **Key benefits**:

- No need to upload or duplicate data in DebiAI.
- Always up to date with the latest project data.
- Works with any files or databases used by your project.

⚠️ **Limitations**:

- Requires a custom implementation to expose your data.

To simplify implementation, you can use the [DebiAI Data Provider Python module](https://github.com/debiai/easy-data-provider).

### **2. [Python Module](pythonModule/README.md#python-module)**

You can also insert data directly from your Python workflow using the [DebiAI Python module](https://github.com/debiai/py-debiai). This is useful for integrating new data or model results immediately after generation.

✅ **Key benefits**:

- Easier to implement.

⚠️ **Limitations**:

- Requires data duplication in DebiAI, increasing load time.
- Data updates must be done manually.

While easier to implement, this method is less efficient than using a Data Provider.

---

By following the recommended **Data Provider** approach, you get the most efficient integration of your project data with DebiAI.
66 changes: 33 additions & 33 deletions documentation/dataInsertion/dataProviders/README.md
@@ -1,50 +1,50 @@
# Data providers
# Data Providers

Making a data provider is the most efficient way to make your project data accessible to DebiAI.
Creating a **Data Provider** is the most efficient way to make your project data accessible to DebiAI.

A data provider is a service that you create that can respond to the data requests of DebiAI. This service can be made in **any language**, can use **any kind of databases** and can be hosted on **any platform** as long at the DebiAI data-provider's API is respected. So unlike the [Debiai Python module](../pythonModule/README.md), your project data won't be duplicated in DebiAI and **DebiAI will always analyze the latest data**.
A **Data Provider** is a service that responds to DebiAI's data requests. It can be implemented in **any language**, use **any database**, and be hosted on **any platform**, as long as it follows the **DebiAI Data Provider API**.

### How does it work?
Unlike the [DebiAI Python module](../pythonModule/README.md), this method **does not duplicate data** in DebiAI, ensuring that DebiAI always analyzes the latest version of your data.

DebiAI will ask your data provider to return the data that it needs to display the dashboard:
## How It Works

- Available Project lists
- Available data IDs
- Project data
- Available models and results
- Data selections (optional)
DebiAI queries your Data Provider to retrieve information for the dashboard:

DebiAI will also be able to send the data selections made by the user.
- **Project lists:** available projects
- **Data IDs:** available samples
- **Project data:** actual data used for analysis
- **Model results:** available models and outputs (optional)
- **Data selections:** user-defined data selections (optional)

### Pros and cons
Additionally, DebiAI can forward actions made by the user back to the provider:

- **Pros**:
- DebiAI will always analyze the latest data
- Your data will not be duplicated in DebiAI
- You can use any languages and databases
- You can host your data provider on any platform
- Better for long term projects
- **Cons**:
- You need to create a data provider (you can start with our [data provider templates](./quickStart.md#creation-of-a-data-provider))
- You need to respect the DebiAI data-provider's API (we made it as simple as possible)
- **Project deletion**
- **Model deletion**
- **Selection creation and deletion**
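
The exchange described above can be pictured with a minimal in-memory sketch. Everything here is an assumption made for illustration: the `InMemoryProvider` class, its method names, and the data layout are hypothetical and do not reproduce the routes or payloads of the actual DebiAI Data Provider API. A real provider would expose equivalent operations as a service.

```python
class InMemoryProvider:
    """Hypothetical provider backed by a plain dict (illustration only)."""

    def __init__(self, projects):
        # projects: {project_id: {"samples": {id: row}, "selections": {name: [ids]}}}
        self._projects = projects

    def list_projects(self):
        # Project lists: which projects are available
        return sorted(self._projects)

    def list_sample_ids(self, project_id):
        # Data IDs: which samples exist in a project
        return sorted(self._projects[project_id]["samples"])

    def get_samples(self, project_id, sample_ids):
        # Project data: the rows for the requested IDs only
        samples = self._projects[project_id]["samples"]
        return {sample_id: samples[sample_id] for sample_id in sample_ids}

    def list_selections(self, project_id):
        return sorted(self._projects[project_id]["selections"])

    def create_selection(self, project_id, name, sample_ids):
        # A selection made by the user, stored on the provider side
        self._projects[project_id]["selections"][name] = list(sample_ids)

    def delete_selection(self, project_id, name):
        del self._projects[project_id]["selections"][name]

    def delete_project(self, project_id):
        del self._projects[project_id]


provider = InMemoryProvider({
    "demo": {
        "samples": {
            "s1": {"speed": 12.5, "label": "car"},
            "s2": {"speed": 3.1, "label": "bike"},
        },
        "selections": {},
    }
})

print(provider.list_projects())
print(provider.list_sample_ids("demo"))
print(provider.get_samples("demo", ["s2"]))
provider.create_selection("demo", "slow", ["s2"])
print(provider.list_selections("demo"))
```

Because the data stays in the provider's own store, nothing is copied into DebiAI; each request reads whatever the store currently holds.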

### The API
## Pros & Cons

The Data-providers API as been described with OpenAPI 3.0.
✅ **Pros**:

- [Data-providers API Swagger documentation](https://petstore.swagger.io/?url=https://raw.githubusercontent.com/debiai/data-provider-nodejs-template/main/data-provider-API.yaml)
- [Data-providers API yaml file](https://github.com/debiai/data-provider-nodejs-template/blob/main/data-provider-API.yaml).
- **Always up to date** – DebiAI always analyzes the latest data.
- **No data duplication** – Saves storage space.
- **Flexibility** – Works with any programming language and database.
- **Platform-independent** – Can be hosted anywhere.
- **Ideal for medium- to long-term projects**.

### Speed
⚠️ **Cons**:

The speed at which your data loads into DebiAI depends on how quickly your data provider can provide them. So it depends on the size of the data and the speed of your database.
- Requires an initial **custom implementation**, but it's a one-time setup. To simplify implementation, you can use the [DebiAI Data Provider Python module](https://github.com/debiai/easy-data-provider).

The quicker your data provider is, the quicker your data will be available in DebiAI.
## Performance Considerations

### Getting started
The **speed of data loading** into DebiAI depends on how quickly your Data Provider responds. This is influenced by:

To create your first data provider, read our [Quick start](quickStart/README.md).
- **Data size** – Larger datasets take longer to load.
- **Database performance** – A fast database speeds up response times.

::: warning Limitations
- The interface between data-providers and DebiAI is not yet stable, so the API is likely to change in the future.
:::
Optimizing your Data Provider ensures **faster** data retrieval in DebiAI.
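
One possible optimization, sketched here under the assumption that your provider can serve samples by ID, is to read from the database in fixed-size batches so that each query stays bounded regardless of dataset size. The `batched` helper and the batch size are hypothetical choices for the example, not something prescribed by DebiAI.

```python
from itertools import islice


def batched(ids, size):
    """Yield lists of at most `size` IDs, preserving their order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


sample_ids = [f"s{i}" for i in range(1, 8)]  # made-up IDs for the example

for batch in batched(sample_ids, 3):
    # A real provider would run one database query per batch here
    print(batch)
```

The right batch size depends on your database and row width, so it is worth measuring rather than guessing.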

## Getting Started

To create your first Data Provider, check out our [Quick Start Guide](quickStart/README.md).