CurvineIO/curvine

 
 

English | 简体中文 | Deutsch | Español | français | 日本語 | 한국어 | Português | Русский

Curvine is a high-performance, concurrent distributed cache system written in Rust, designed for low-latency and high-throughput workloads.

📚 Documentation Resources

For more detailed information, please refer to:

Use Case

  • Case 1: Training acceleration
  • Case 2: Model distribution
  • Case 3: Hot table data acceleration
  • Case 4: Shuffle acceleration
  • Case 5: Multi-cloud data caching

🚀 Core Features

  • Multi-Cloud Support: Curvine is compatible with object storage services from multiple cloud providers as its underlying storage layer, enabling transparent data migration across different vendors' object storage platforms.
  • Cloud-Native: Curvine supports CSI-based cloud-native integration with Kubernetes, enabling deployment and management of Curvine clusters via Helm charts.
  • Multi-tier Cache: Supports multi-tier cache strategies across memory, SSD, and HDD.
  • POSIX Semantic Support: Curvine delivers comprehensive POSIX semantic compatibility, implementing a high-performance FUSE layer to facilitate the manipulation of distributed cached data as if it were local disk storage.
  • Compatibility with S3 and HDFS Protocols: The system supports both S3 and HDFS read/write interfaces, facilitating seamless integration with artificial intelligence and big data technology ecosystems.
  • High Performance: Curvine applies zero-copy techniques at several points in its data read/write pipeline and relies on asynchronous operations. Its core engine is written in Rust.
  • Raft Consensus: Uses the Raft algorithm to ensure the master's data consistency and high availability.
  • Monitoring and Metrics: Curvine features a comprehensive built-in observability metrics system, facilitating detailed monitoring of the performance of each component.
  • Web Interface: Provides a web management interface for convenient system monitoring and management.
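
To make the multi-tier idea concrete, here is a toy Python sketch of a read path that checks the fastest tier first, promotes hits to the top tier, and demotes least-recently-used entries downward. It is not Curvine's implementation: the `TieredCache` class, the tier names, and the entry-count capacities are all hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class TieredCache:
    """Toy multi-tier cache (hypothetical design); tiers are ordered fastest first."""

    def __init__(self, capacities):
        # capacities: list of (tier_name, max_entries), fastest tier first.
        self.tiers = [(name, cap, OrderedDict()) for name, cap in capacities]

    def _insert(self, key, value, level):
        """Insert at `level`; on overflow, demote the LRU entry one tier down."""
        if level >= len(self.tiers):
            return  # pushed past the slowest tier: the entry is evicted
        _, cap, store = self.tiers[level]
        store[key] = value
        store.move_to_end(key)
        if len(store) > cap:
            old_key, old_value = store.popitem(last=False)
            self._insert(old_key, old_value, level + 1)

    def put(self, key, value):
        """Store a value in the fastest tier, dropping any stale copy first."""
        for _, _, store in self.tiers:
            store.pop(key, None)
        self._insert(key, value, 0)

    def get(self, key):
        """Return (value, tier_name_where_found), or (None, None) on a miss."""
        for level, (name, _, store) in enumerate(self.tiers):
            if key in store:
                value = store.pop(key)
                self._insert(key, value, 0)  # promote the hit to the fastest tier
                return value, name
        return None, None
```

With `TieredCache([("memory", 1), ("ssd", 1), ("hdd", 1)])`, inserting three keys pushes the oldest one down to the `hdd` tier, and the next `get` for it moves it back to `memory`.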

📦 System Requirements

  • Rust 1.86+
  • Linux or macOS (Limited support on Windows)
  • FUSE library (for file system functionality)

Officially Supported Linux Distributions

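| OS Distribution | Kernel Requirement | Tested Version | Dependencies |
|-----------------|--------------------|----------------|--------------|
| CentOS 7        | ≥3.10.0            | 7.6            | fuse2-2.9.2  |
| CentOS 8        | ≥4.18.0            | 8.5            | fuse3-3.9.1  |
| Rocky Linux 9   | ≥5.14.0            | 9.5            | fuse3-3.10.2 |
| RHEL 9          | ≥5.14.0            | 9.5            | fuse3-3.10.2 |
| Ubuntu 22       | ≥5.15.0            | 22.04          | fuse3-3.10.5 |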
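
As a quick pre-flight check, the minimum kernel versions listed above can be compared against the running kernel. The following Python sketch is not part of Curvine; the helper names `parse_kernel` and `kernel_ok` are hypothetical, and only the version numbers come from the table.

```python
import platform
import re

# Minimum kernel versions copied from the supported-distributions table.
MIN_KERNEL = {
    "CentOS 7": "3.10.0",
    "CentOS 8": "4.18.0",
    "Rocky Linux 9": "5.14.0",
    "RHEL 9": "5.14.0",
    "Ubuntu 22": "5.15.0",
}


def parse_kernel(release):
    """Extract (major, minor, patch) from a string like '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part or 0) for part in match.groups())


def kernel_ok(distro, release=None):
    """True if `release` (default: the running kernel) meets the listed minimum."""
    release = release or platform.release()
    return parse_kernel(release) >= parse_kernel(MIN_KERNEL[distro])
```

For example, `kernel_ok("Ubuntu 22")` checks the current machine, while `kernel_ok("Rocky Linux 9", "4.18.0-348.el8.x86_64")` checks an explicit release string.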

🛠 Build Instructions

This project requires the following dependencies. Please ensure they are installed before proceeding:

📋 Prerequisites

You can either:

  1. Use the pre-configured curvine-docker/compile/Dockerfile_rocky9 to build a compilation image
  2. Use this Dockerfile as a reference to create a compilation image for another operating system version
  3. Use the curvine/curvine-compile image published on Docker Hub

🚀 Build Steps (Linux - Ubuntu/Debian example)

Using make to build:

    # Build all modules
    make all

    # Build core modules only: server client cli
    make build ARGS="-p core"

    # Build fuse and core modules
    make build ARGS="-p core -p fuse"

Using build.sh directly:

    # Build all modules
    sh build/build.sh

    # Display command help
    sh build/build.sh -h

    # Build core modules only: server client cli
    sh build/build.sh -p core

    # Build fuse and core modules
    sh build/build.sh -p core -p fuse

Building Docker images:

    # Build using the curvine-compile:latest Docker image
    make docker-build

    # Or build using the curvine-compile:build-cached Docker image,
    # which already caches most dependency crates
    make docker-build-cached

After a successful build, the target file is generated in the build/dist directory. This file is the Curvine installation package, which can be used for deployment or for building images.

🖥️ Start a single-node cluster

    cd build/dist

    # Start the master node
    bin/curvine-master.sh start

    # Start the worker node
    bin/curvine-worker.sh start

Mount the file system

    # The default mount point is /curvine-fuse
    bin/curvine-fuse.sh start
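
Once the file system is mounted, ordinary file I/O should work against the mount point, in line with the POSIX support described under Core Features. The Python sketch below writes a small file and reads it back; the `roundtrip` helper and the file name are hypothetical, and the demo at the bottom uses a temporary directory as a stand-in so it runs without a cluster.

```python
import os
import tempfile


def roundtrip(mount_point, name="hello.txt", payload=b"hello curvine\n"):
    """Write a file under `mount_point` with plain file I/O and read it back."""
    path = os.path.join(mount_point, name)
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)  # clean up the test file
    return data == payload


if __name__ == "__main__":
    # Without a running cluster, a temporary directory stands in for the mount.
    with tempfile.TemporaryDirectory() as tmp:
        print(roundtrip(tmp))
```

On a machine with a mounted cluster, the call would be `roundtrip("/curvine-fuse")`, using the default mount point mentioned above.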

View the cluster overview:

bin/cv report

Access the file system using HDFS-compatible commands:

    bin/cv fs mkdir /a
    bin/cv fs ls /

Access Web UI:

http://your-hostname:9000 

Curvine uses TOML-formatted configuration files. An example configuration is located at conf/curvine-cluster.toml. The main configuration items include:

  • Network settings (ports, addresses, etc.)
  • Storage policies (cache size, storage type)
  • Cluster configuration (number of nodes, replication factor)
  • Performance tuning parameters

🏗️ Architecture Design

Curvine adopts a master-slave architecture:

  • Master Node: Responsible for metadata management, worker node coordination, and load balancing.
  • Worker Node: Responsible for data storage and processing.
  • Client: Communicates with the Master and Worker nodes via RPC.

The system uses the Raft consensus algorithm to ensure metadata consistency and supports multiple storage strategies (memory, SSD, HDD) to optimize performance and cost.
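
The division of roles can be sketched as a toy in-process model in Python: the master keeps only metadata (which worker holds each path), workers hold the data, and the client asks the master before talking to a worker. This is an illustration under heavy simplification, not Curvine's code: the class and method names are hypothetical, direct method calls stand in for RPC, round-robin placement stands in for load balancing, and Raft replication of the master is omitted.

```python
import itertools


class Worker:
    """Stores file data; stands in for a worker node."""

    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.blocks = {}

    def write(self, path, data):
        self.blocks[path] = data

    def read(self, path):
        return self.blocks[path]


class Master:
    """Holds metadata only: the mapping from path to worker id."""

    def __init__(self, workers):
        self.workers = {w.worker_id: w for w in workers}
        self.locations = {}
        # Round-robin placement as a stand-in for real load balancing.
        self._next_worker = itertools.cycle(sorted(self.workers))

    def allocate(self, path):
        """Pick a worker for a new path, or return the one already assigned."""
        if path not in self.locations:
            self.locations[path] = next(self._next_worker)
        return self.workers[self.locations[path]]

    def locate(self, path):
        """Return the worker holding `path`; raises KeyError if unknown."""
        return self.workers[self.locations[path]]


class Client:
    """Asks the master where data lives, then talks to that worker."""

    def __init__(self, master):
        self.master = master

    def write(self, path, data):
        self.master.allocate(path).write(path, data)

    def read(self, path):
        return self.master.locate(path).read(path)
```

With two workers, writing `/a` and then `/b` through a `Client` places them on different workers, and the data itself never passes through the `Master` object.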

📈 Performance

Curvine performs excellently in high-concurrency scenarios and supports:

  • High-throughput data read and write
  • Low-latency operations
  • Large-scale concurrent connections

Contributing

Please read the Curvine contribution guidelines.

📜 License

Curvine is licensed under the Apache License 2.0.
