Service
Updated on 14 January 2026
Beacon data lakes for federated, high-performance access to large data collections
Project Engineer at MARIS B.V.
Delft, Netherlands
About
Beacon is designed for very fast, real-time access to data subsets from large collections, returning a single harmonised file on the fly. The software can read datasets in a wide variety of file formats (NetCDF, Parquet, Zarr, and Beacon Binary Format), stored either locally or on S3-compatible object stores. Users can subset with SQL or JSON queries against individual datasets, multiple datasets at once, or entire collections of datasets.
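To make the idea of a JSON subset query concrete, here is a minimal sketch in Python. It only builds a request and does not send it. The field names (`select`, `filters`, `output`), the `/api/query` path, the column names, and the `localhost` address are all hypothetical placeholders chosen for illustration, not Beacon's documented schema; consult the Beacon documentation for the real query format.

```python
import json
import urllib.request


def build_subset_query(columns, ranges, output_format="parquet"):
    """Build a JSON-serialisable subset query.

    NOTE: the keys used here are a hypothetical schema, not Beacon's own.
    """
    return {
        "select": list(columns),
        "filters": [
            {"column": name, "min": low, "max": high}
            for name, (low, high) in ranges.items()
        ],
        "output": {"format": output_format},
    }


def make_request(base_url, query):
    """Wrap the query in a POST request (the endpoint path is an assumption)."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical column names and a bounding box over part of the North Sea.
query = build_subset_query(
    columns=["time", "latitude", "longitude", "temperature"],
    ranges={"latitude": (50.0, 54.0), "longitude": (2.0, 8.0)},
)
request = make_request("http://localhost:8080", query)
# Passing `request` to urllib.request.urlopen would send it; omitted here.
```

Keeping query construction separate from transport means the same payload could be reused from a notebook or a script, whichever client library ends up sending it.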
It has been widely and successfully demonstrated in various European projects, such as Blue-Cloud2026, FAIR-EASE and ENVRI-HUB Next, and is now also being used in Dutch national projects.
Beacon is written in Rust and C, chosen for their low-level control and for better performance than Python-based or traditional database systems. It runs on any platform via Docker containers and consists of a REST API for data querying and index management, combined with core libraries that enable fast data indexing and search.
In addition, Beacon helps make your data collection more interoperable by including mappings and allowing harmonisation with other sources on the fly. In this context it would fit well in the EOSC federation, providing a data access layer on top of multidisciplinary data collections.
From a provider perspective, setting up a Beacon instance for your data collection is simple. The easiest and fastest way to get an instance up and running is to use the Beacon Docker Compose file, available at: Beacon Docker Compose. Connecting Beacon to an existing S3 bucket requires only two additional environment variables: “AWS_ENDPOINT”, which tells Beacon the URL of the S3 provider, and “BEACON_S3_BUCKET”, which tells Beacon which bucket to use as the data collection for subsetting. With these in place, an instance can be set up in less than a minute. An example is also available at: Beacon Docs: S3 Object Storage. Once set up, your Beacon instance is immediately accessible through various entry points, such as Jupyter Notebooks or a newly developed user interface called Beacon Studio.
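The two S3 settings could be prepared as follows. This is a minimal sketch, assuming the variables are handed to the container through an environment file, which Docker Compose supports in general. Only the variable names `AWS_ENDPOINT` and `BEACON_S3_BUCKET` come from the description above; the endpoint URL, the bucket name, and the helper functions are hypothetical.

```python
from urllib.parse import urlparse


def beacon_s3_env(endpoint, bucket):
    """Return the two S3-related variables after basic sanity checks."""
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"endpoint must be an http(s) URL, got {endpoint!r}")
    if not bucket:
        raise ValueError("bucket name must not be empty")
    return {"AWS_ENDPOINT": endpoint, "BEACON_S3_BUCKET": bucket}


def render_env_file(env):
    """Render the variables as KEY=VALUE lines, one per line."""
    return "".join(f"{key}={value}\n" for key, value in env.items())


# Placeholder values; replace them with your provider's URL and bucket name.
env = beacon_s3_env("https://s3.example.org", "my-data-collection")
env_text = render_env_file(env)
print(env_text, end="")
```

The rendered text can be saved to a file and referenced from the compose setup. How the published Beacon compose file actually reads these variables is not stated here, so check it before relying on this layout.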
If you want to know more about Beacon, feel free to contact me.
Applies to
- Service Catalogues, Interoperability, & Integration
- Integrating scientific data repositories
- Federated Compute & Storage
- VRE
- Scientific workflows and services