Train a machine learning model too large to fit on one machine by distributing its parameters across a cluster
Run gradient boosted trees or logistic regression on massive datasets using your existing Hadoop or Spark infrastructure
Train graph neural networks for node classification or link prediction on large-scale graph data
Integrate Angel into an existing Spark data pipeline for the model training step without rebuilding your infrastructure
Requires a running Hadoop YARN cluster; not suitable for single-machine or laptop use.
Angel is a distributed machine learning platform developed jointly by Tencent and Peking University. Its core purpose is training machine learning models on very large datasets, particularly when the model itself has more parameters than would fit on a single machine. It grew out of Tencent's internal experience with data at the scale of a major internet company.

The system is built around the Parameter Server architecture. In simple terms, the model's parameters (the numbers that get adjusted during training) are split across many server machines, while separate worker machines process the training data and send updates back. This split allows training on datasets and model sizes that would be impractical on a single computer.

Angel runs on YARN, the resource management layer commonly used in Hadoop clusters. It also integrates with Spark, a popular distributed data processing engine, through a component called Spark on Angel. Teams already using Spark for data pipelines can therefore use Angel for the model training step without rebuilding their infrastructure from scratch.

The repository includes a long list of algorithms. On the traditional machine learning side it covers logistic regression, support vector machines, factorization machines, k-means clustering, gradient boosted decision trees, and others. There is also a graph computing module with algorithms for ranking nodes by importance, detecting communities, finding common connections between nodes, and training graph neural networks for tasks like node classification or link prediction.

The project is open source under the Apache 2.0 license and is hosted by the Linux Foundation's LF AI & Data Foundation, formerly the LF Deep Learning Foundation. Several academic papers have been published about the system and its components, including work presented at major database and machine learning research venues.
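To make the Parameter Server idea described above more concrete, here is a minimal single-process sketch in Python. It is not Angel's API and does not run on a cluster: the names ParameterShard, ShardedModel, worker_step, and train are hypothetical, the "servers" and "workers" are plain objects and function calls, and the model is a small linear regression chosen only for illustration.

```python
# Toy, single-process simulation of the parameter-server pattern.
# All names below are hypothetical and are NOT part of Angel's API.
from typing import List, Tuple

Example = Tuple[List[float], float]  # (features, target)


class ParameterShard:
    """One 'server': owns a contiguous slice of the model's parameters."""

    def __init__(self, start: int, size: int) -> None:
        self.start = start
        self.values = [0.0] * size

    def pull(self) -> List[float]:
        return list(self.values)

    def push(self, grads: List[float], lr: float) -> None:
        # Plain gradient-descent update on the slice this shard owns.
        for i, g in enumerate(grads):
            self.values[i] -= lr * g


class ShardedModel:
    """Routes pull/push requests to the shard that owns each parameter."""

    def __init__(self, dim: int, num_shards: int) -> None:
        base, extra = divmod(dim, num_shards)
        self.shards: List[ParameterShard] = []
        start = 0
        for s in range(num_shards):
            size = base + (1 if s < extra else 0)
            self.shards.append(ParameterShard(start, size))
            start += size

    def pull(self) -> List[float]:
        weights: List[float] = []
        for shard in self.shards:
            weights.extend(shard.pull())
        return weights

    def push(self, grads: List[float], lr: float) -> None:
        for shard in self.shards:
            end = shard.start + len(shard.values)
            shard.push(grads[shard.start:end], lr)


def worker_step(model: ShardedModel, partition: List[Example], lr: float) -> None:
    """One 'worker' iteration: pull weights, compute a local gradient, push it."""
    weights = model.pull()
    grads = [0.0] * len(weights)
    for features, target in partition:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for j, x in enumerate(features):
            grads[j] += error * x / len(partition)
    model.push(grads, lr)


def train(model: ShardedModel, partitions: List[List[Example]],
          lr: float = 0.1, rounds: int = 500) -> List[float]:
    for _ in range(rounds):
        for partition in partitions:  # each partition stands in for one worker
            worker_step(model, partition, lr)
    return model.pull()


# Usage: data generated from y = 2*x0 - 1*x1 + 0.5*x2, split across two workers.
partitions = [
    [([1.0, 0.0, 0.0], 2.0), ([0.0, 1.0, 0.0], -1.0)],
    [([0.0, 0.0, 1.0], 0.5), ([1.0, 1.0, 1.0], 1.5)],
]
model = ShardedModel(dim=3, num_shards=2)
weights = train(model, partitions)
print([round(w, 3) for w in weights])
```

In this sketch no single shard holds the full weight vector, and each worker only sees its own data partition. A real system such as Angel adds what the toy leaves out: network communication, scheduling on the cluster, fault tolerance, and support for sparse or partial pulls of very large models.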
Verify against the repo before relying on details.