Context

EDITO hosts its own cloud computing cluster, allowing users to run services and processes without having to think too much about the underlying infrastructure. The EDITO Data Lake, on the other hand, is not a data lake in the usual sense: it is the "data access" component of EDITO, composed of both the EDITO data storage and the EDITO data catalogue.

📌 Note: Quota errors
You may encounter quota errors when running jobs on EDITO (for example, the error Onyxia API Error: PUT /api/my-lab/app Response: 500).

Here are three things to check:

Have you requested computing resources from the platform?
If so, do you still have resources available? You can check on the "My Services" or "My Processes" pages.
If so, you may have ghost jobs (jobs that were not properly terminated) that are draining your resources. You can either terminate them yourself from a Jupyter service using Helm commands, or contact the user support using the widget in the bottom right corner.
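
The cleanup of ghost jobs could be scripted roughly as follows from a Jupyter service. This is only a minimal sketch: it assumes that the `helm` CLI is available in the service and that each ghost job corresponds to a Helm release in your project namespace. The helper names are illustrative and the release name in the usage note is hypothetical.

```python
import shutil
import subprocess


def list_releases_command():
    """Command that lists the Helm releases of the current namespace, one name per line."""
    return ["helm", "list", "--short"]


def uninstall_command(release):
    """Command that uninstalls a single Helm release by name."""
    if not release or release.startswith("-"):
        raise ValueError(f"invalid release name: {release!r}")
    return ["helm", "uninstall", release]


def run(command):
    """Run a command and return its standard output; requires the tool on the PATH."""
    if shutil.which(command[0]) is None:
        raise RuntimeError(f"{command[0]} is not installed in this environment")
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

For instance, `run(list_releases_command())` would return the release names, and `run(uninstall_command("my-ghost-release"))` would remove one of them. Uninstalling a release deletes the corresponding service, so review the list before removing anything.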

Computing cluster

Elastic cluster

As mentioned above, EDITO is designed to spare users from having to worry about its underlying infrastructure (e.g. virtual machines, CPUs, GPUs). The cluster is configured to optimize computational resource usage at all times by automatically scaling the number of computation nodes up and down.

For example, when a user starts a service, it runs by default on one of these nodes. The system automatically chooses a node that has enough resources (CPU, RAM, disk storage) to host the new service.

📌 Note: if no active nodes have enough resources, a new node will be automatically provisioned to host the service. In case of dynamic provisioning, users can experience some latency before their new service is up and running.
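
The placement behaviour described above can be pictured with a small simulation. This is a simplified sketch, not the actual scheduler: it assumes a first-fit strategy, tracks only CPU and RAM, and uses the CPU node size and maximum node count from the node configuration table as defaults. The names `Node` and `place_service` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cpu_m: int      # free CPU, in mCPU
    ram_gb: float   # free RAM, in GB


def place_service(nodes, cpu_m, ram_gb, node_cpu_m=8000, node_ram_gb=32.0, max_nodes=30):
    """Place a service on the first node with enough free resources.

    Returns (node_index, provisioned), where provisioned is True when a new
    node had to be added to the cluster to host the service.
    """
    if cpu_m > node_cpu_m or ram_gb > node_ram_gb:
        raise ValueError("request exceeds the capacity of a single node")
    for index, node in enumerate(nodes):
        if node.cpu_m >= cpu_m and node.ram_gb >= ram_gb:
            node.cpu_m -= cpu_m
            node.ram_gb -= ram_gb
            return index, False
    if len(nodes) >= max_nodes:
        raise RuntimeError("cluster is already at its maximum node count")
    # No active node fits: provision a fresh node (this is where latency appears).
    nodes.append(Node(node_cpu_m - cpu_m, node_ram_gb - ram_gb))
    return len(nodes) - 1, True


nodes = [Node(cpu_m=1000, ram_gb=4.0)]
print(place_service(nodes, 500, 2.0))    # (0, False): fits on the existing node
print(place_service(nodes, 4000, 16.0))  # (1, True): a new node is provisioned
```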

Resources requests and limits

Requests are the minimum guaranteed amount of a resource that is reserved for a service/process.

Limits, on the other hand, are the maximum amount of a resource to be used by a service/process. This means that the service/process can never consume more than the memory amount or CPU amount indicated.

📌 Note: a service/process can use more resources than its request. However, it is not allowed to use more than its resource limit. For CPU resources, the limit acts as a threshold: your service/process is throttled when it reaches it.

On the other hand, when a service/process tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation, with an out of memory (OOM) error.
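
The difference between the two kinds of limits can be summarised in a few lines. This is a toy model under simplified assumptions (a single process and instantaneous measurements), and the function names are illustrative.

```python
def cpu_granted(demand_m, limit_m):
    """CPU actually granted, in mCPU: demand above the limit is throttled, not killed."""
    return min(demand_m, limit_m)


def memory_outcome(usage_gb, limit_gb):
    """Memory above the limit is not throttled: the process is terminated (OOM)."""
    return "OOM-killed" if usage_gb > limit_gb else "running"


print(cpu_granted(1500, 1000))  # 1000: the service keeps running, only slower
print(memory_outcome(5, 4))     # OOM-killed
print(memory_outcome(3, 4))     # running
```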

Virtual CPUs, milli CPUs

The computational resource a service needs to run on the cluster is expressed in mCPU (milli CPU).

The mCPU is used in cloud resources to express the amount of CPU time a service requests. 1000 mCPU correspond to 1 vCPU (virtual CPU; for practical purposes, a vCPU can be considered a CPU), so requesting 500 mCPU means the service requests 50% of the time of one CPU. Because the request is expressed as CPU time rather than as a number of cores, all of the CPU’s cores can be used at any time, which allows parallel and multi-threaded computation. In particular, requesting less than 1000 mCPU does not prevent you from running a multi-threaded service.
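
As a worked example, the conversion can be written out as follows. The sketch assumes the CPU time budget is spread evenly across threads, which real workloads rarely do exactly; the function names are illustrative.

```python
def mcpu_to_vcpu(mcpu):
    """Convert milli CPUs to virtual CPUs (1000 mCPU = 1 vCPU)."""
    return mcpu / 1000


def cpu_seconds(mcpu, wall_seconds):
    """CPU time budget over a wall-clock window, summed over all threads."""
    return mcpu_to_vcpu(mcpu) * wall_seconds


def per_thread_share(mcpu, threads):
    """Average fraction of a core each thread gets if the budget is split evenly."""
    return mcpu_to_vcpu(mcpu) / threads


print(mcpu_to_vcpu(500))         # 0.5
print(cpu_seconds(500, 60))      # 30.0 seconds of CPU time per minute
print(per_thread_share(500, 4))  # 0.125 of a core per thread, on average
```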

CPU nodes and GPU nodes

Currently, two types of computational nodes are available in the EDITO cloud computing cluster: CPU nodes and GPU nodes. GPU nodes have specific resources, such as VideoRAM. To use a GPU, a user needs to start a service or a process that is configured for GPUs.

Current node configurations

The following table summarizes the current cloud computing cluster configuration:

Node type	vCores	RAM (GB)	Web Disk Storage SSD (GB)	VideoRAM	min. node count	max. node count
CPU	8	32	128		1	30
GPU	16	112	320	48	1	4

This configuration is arbitrary and does not reflect the capacity of the platform once fully operational.
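
To get a feel for what these node counts mean in aggregate, the table can be turned into a quick calculation. The figures are copied from the table above; the dictionary layout and function name are illustrative.

```python
NODE_TYPES = {
    # name: (vCores, RAM in GB, min node count, max node count)
    "CPU": (8, 32, 1, 30),
    "GPU": (16, 112, 1, 4),
}


def capacity_range(node_type):
    """Return ((min vCores, min RAM), (max vCores, max RAM)) for a node type."""
    vcores, ram_gb, min_nodes, max_nodes = NODE_TYPES[node_type]
    return (vcores * min_nodes, ram_gb * min_nodes), (vcores * max_nodes, ram_gb * max_nodes)


print(capacity_range("CPU"))  # ((8, 32), (240, 960))
print(capacity_range("GPU"))  # ((16, 112), (64, 448))
```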

Please contact the EDITO User Support using the widget at the bottom right if you need nodes with higher capabilities.

Distributed computing frameworks

EDITO architecture supports distributed computing frameworks, such as Dask or Spark, that services or processes can rely on.

Please contact the EDITO User Support using the widget at the bottom right if you need access to a particular framework.

Quotas

EDITO is public and shares resources funded by the European Commission. To prevent abuse, users are subject to usage quotas. The following table summarizes the current quotas for personal and group projects:

Project kind	CPU vCores	RAM (GB)	Web Disk Storage SSD (GB)	GPU vCores	max. pod count
Personal project	8	32	50	1	50
Group project	16	64	100	1	100

📌 Note: “pods” are entities in which services and processes run. For simplicity, one can consider a service or process needs one pod to run.

If your services or processes never launch, face performance issues, or are stopped without any explicit action on your part, the root cause may be these restrictions. Please contact the EDITO User Support using the widget at the bottom right if you need higher quotas.
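
Before launching several services, you can estimate whether they fit within a project's quotas. The sketch below uses the CPU, RAM and pod figures from the quota table and assumes one pod per service. Whether the platform counts requests or limits against the quota is not stated here, so treat the result as indicative. The function name is illustrative.

```python
QUOTAS = {
    # project kind: (CPU in mCPU, RAM in GB, max pod count)
    "personal": (8000, 32, 50),
    "group": (16000, 64, 100),
}


def exceeded_quotas(kind, services):
    """Return the quotas exceeded by a list of (cpu_m, ram_gb) services.

    Assumes each service runs in exactly one pod.
    """
    cpu_quota, ram_quota, pod_quota = QUOTAS[kind]
    exceeded = []
    if sum(cpu for cpu, _ in services) > cpu_quota:
        exceeded.append("cpu")
    if sum(ram for _, ram in services) > ram_quota:
        exceeded.append("ram")
    if len(services) > pod_quota:
        exceeded.append("pods")
    return exceeded


planned = [(4000, 16), (4000, 16), (500, 2)]
print(exceeded_quotas("personal", planned))  # ['cpu', 'ram']
print(exceeded_quotas("group", planned))     # []
```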

Data Lake

As mentioned above, the EDITO Data Lake is not a data lake in the classical sense; it is rather the “data access” component of EDITO, composed of both the EDITO data storage and the EDITO data catalogue.

Data storage

EDITO provides elastic cloud object storage that lets users store personal, group and public data.

As a user, you have access to a personal storage that only you can manage (you can make part of it publicly accessible). There is also group storage, managed by the group members, and a “public” storage, managed by the EDITO Team, in which everything is public.

Although it does not run on Amazon Web Services, the object storage is compatible with the AWS S3 API:

Storage kind	Technology	Governance / Management / Ownership
Personal project	One S3 bucket	User
Group project	One S3 bucket	Group members
Public	Several S3 buckets	Administrators (for now)

Owners of personal or group project storage decide on the visibility of its content; they can share data or make them publicly available. Learn more about interacting with your storage here.
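
Because the storage exposes an S3-compatible API, a generic S3 client such as `boto3` can in principle be pointed at it. The sketch below assumes `boto3` is installed; the endpoint URL, bucket name and credentials are placeholders and are not real EDITO values, and the helper names are illustrative.

```python
def s3_client_kwargs(endpoint_url, access_key, secret_key, session_token=None):
    """Build the keyword arguments for boto3.client("s3", ...).

    endpoint_url points boto3 at the S3-compatible service instead of AWS.
    """
    kwargs = {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    if session_token:
        kwargs["aws_session_token"] = session_token
    return kwargs


def list_keys(bucket, prefix="", **client_kwargs):
    """List object keys under a prefix (requires the boto3 package)."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("s3", **client_kwargs)
    paginator = client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

A call would then look like `list_keys("my-bucket", "data/", **s3_client_kwargs("https://s3.example.org", "ACCESS_KEY", "SECRET_KEY"))`, with every value replaced by the credentials and endpoint of your own project.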

Quotas

EDITO is public and shares resources funded by the European Commission. To prevent abuse, users are subject to usage quotas. The following table summarizes the current quotas for personal and group projects, as well as for the public storage:

Storage kind	Volume amount (GB)
Personal project	20
Group project	20
Public	N/A

Please contact the EDITO User Support using the widget at the bottom right if you need higher quotas.

External data storage

External S3-compatible storage can be configured in the project settings, so that you can work with it seamlessly alongside (or instead of) the EDITO data storage.

Learn more about connecting to external storage here.

Data catalogue

Referencing data

The EDITO data catalogue can reference data inside EDITO data storage or external data.

For example, it references Copernicus Marine data that are actually stored and managed by the Copernicus Marine Service. Conversely, it also references the ARCO versions of EMODnet data, which are stored in the public data storage of EDITO.

You can also reference data (hosted on your personal storage or on any external service, platform or project) by interacting with the Data API.

Browsing/searching data

You can browse and search for data in the EDITO data catalogue graphically with the EDITO viewer or programmatically with the Data API.
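
A programmatic search might be prepared along these lines. This is a sketch under explicit assumptions: it supposes that the Data API exposes a STAC-style `GET /search` endpoint accepting `bbox`, `datetime` and `limit` parameters, which this page does not state, and the base URL is a placeholder. Check the Data API documentation before relying on it.

```python
from urllib.parse import urlencode


def build_search_url(base_url, bbox=None, datetime_range=None, limit=10):
    """Build a STAC-style GET /search URL (assumed convention, placeholder base URL)."""
    params = {"limit": limit}
    if bbox is not None:
        # bbox is (west, south, east, north) in degrees
        params["bbox"] = ",".join(str(value) for value in bbox)
    if datetime_range is not None:
        # datetime_range is a (start, end) pair of ISO 8601 strings
        params["datetime"] = "/".join(datetime_range)
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


print(build_search_url("https://data-api.example.org", bbox=(-10, 35, 5, 50), limit=5))
# https://data-api.example.org/search?limit=5&bbox=-10%2C35%2C5%2C50
```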

Localisation

Currently, the main EDITO cloud resources are provided by CloudFerro in the Warsaw region (WAW3-1).

What's next?

If you have any questions, problems, or suggestions, please feel free to contact us via chat using the widget available at the bottom right of the page.

Funded by the European Union