Show HN: Garnet – Distributed Python on Kubernetes

Garnet (https://garnet.ai) is a framework to run distributed and parallel Python on Kubernetes. We make it easy to scale from laptop to cluster, without requiring any devops work.

Garnet plugs into any existing Python environment (notebook, IDE, CI/CD pipelines, custom apps) using a lightweight client library, and lets users scale popular data/ML libraries (including pandas, NumPy, scikit-learn and XGBoost) as well as custom code to a remote cluster for execution. Under the hood, we natively wire up Dask with Kubernetes for scheduling, which brings advantages such as dynamic resource allocation, ephemeral clusters and autoscaling, while abstracting away the devops complexity for developers and data scientists.

Our vision is to build the orchestration layer for distributed computing in Python. We're starting off with Dask, and aim to support other frameworks such as Ray and Modin in the future.

Some of the cool use cases we've seen from our users are:

- Bursting to the cloud from local dev environments for data- and memory-intensive computations
- Standing up managed Dask and Kubernetes clusters programmatically in 3-4 lines of Python, without any devops knowledge (Docker, k8s, etc.)
- Parallelizing existing codebases in pandas, NumPy, scikit-learn, XGBoost, etc. with minimal code changes, without opting for a more complex system such as Spark

Currently, users can download our client library and run Dask workloads on our fully managed cluster (see docs). We're working on adding the capability to run Garnet on your own cloud (Kubernetes) in an upcoming release.

We're excited to hear your feedback, and what you'd like to see. Please drop your email on our website so we can get in touch directly.

January 22, 2021 at 04:49PM
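
To make the "parallelize with minimal code changes" idea concrete, here is a minimal sketch of the general pattern. It is not Garnet's client API, which the post does not show. It uses only the standard library's `concurrent.futures` as a local stand-in for a cluster; Dask's distributed `Client` exposes `submit` and `map` methods modeled on the same interface, though they return futures that are gathered afterwards. The function names (`summarize`, `split`, `mean_serial`, `mean_parallel`) are hypothetical and chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(chunk):
    """Per-chunk work: return (count, total) so partial results can be combined."""
    return len(chunk), sum(chunk)


def split(values, n_chunks):
    """Split a list into at most n_chunks contiguous, roughly equal pieces."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def mean_serial(values):
    """Original single-process version of the computation."""
    count, total = summarize(values)
    return total / count


def mean_parallel(values, n_workers=4):
    """Same computation, fanned out over a pool of workers.

    The thread pool is a local stand-in (an assumption of this sketch);
    on a real cluster the map step would go through a cluster client.
    """
    chunks = split(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(summarize, chunks))
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return total / count


if __name__ == "__main__":
    data = list(range(1, 1001))
    print(mean_serial(data), mean_parallel(data))
```

The per-chunk function is unchanged between the serial and parallel versions; only the code that splits the data, maps over the chunks, and combines the partial results is new. That is the kind of small change the use cases above describe.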