explaingit

tkipf/gcn

7,386 Python Audience · researcher Complexity · 4/5 Setup · moderate

TLDR

The original research code for Graph Convolutional Networks, a machine learning method that classifies nodes in a graph by combining each node's own features with those of its neighbors. It includes three citation network datasets, so you can run it immediately.

Mindmap

mindmap
  root((GCN))
    What it does
      Node classification
      Semi-supervised learning
      Graph representation
    Models
      Standard GCN
      Chebyshev variant
      No-graph baseline
    Datasets
      Cora citations
      Citeseer citations
      Pubmed citations
    Tech
      Python
      TensorFlow
      Adjacency matrix

Things people build with this

USE CASE 1

Reproduce the GCN paper results on the Cora, Citeseer, or Pubmed citation datasets straight from the repo

USE CASE 2

Train a node classification model on your own graph data by providing an adjacency matrix and feature table

USE CASE 3

Compare the main GCN model against the Chebyshev polynomial variant and a no-graph baseline

USE CASE 4

Adapt the code to classify entire graphs rather than individual nodes using the graph-level data loader
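For use case 2, the inputs could be assembled roughly as in the sketch below. It is an illustration only: `build_inputs` and the toy graph are hypothetical names and data, it uses dense NumPy arrays for readability, and the repo's own loading code and expected file formats may differ.

```python
import numpy as np

def build_inputs(num_nodes, edges, features, known_labels, num_classes):
    """Assemble an adjacency matrix, a feature table, and a label table.

    Hypothetical helper for illustration; not part of the repo.
    """
    # Symmetric adjacency matrix for an undirected graph.
    adj = np.zeros((num_nodes, num_nodes), dtype=np.float32)
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0

    # One row of features per node.
    feats = np.asarray(features, dtype=np.float32)

    # One-hot labels; rows of unlabeled nodes stay all-zero.
    labels = np.zeros((num_nodes, num_classes), dtype=np.float32)
    # Boolean mask marking which nodes have a known label.
    mask = np.zeros(num_nodes, dtype=bool)
    for node, cls in known_labels.items():
        labels[node, cls] = 1.0
        mask[node] = True
    return adj, feats, labels, mask

# Toy graph: 4 papers, 3 citation links, labels known for papers 0 and 3.
adj, feats, labels, mask = build_inputs(
    num_nodes=4,
    edges=[(0, 1), (1, 2), (2, 3)],
    features=[[1, 0], [1, 1], [0, 1], [0, 1]],
    known_labels={0: 0, 3: 1},
    num_classes=2,
)
```

The mask is what makes the setup semi-supervised: only the rows where `mask` is true would contribute to the training loss, while every node still takes part in the graph.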

Tech stack

Python, TensorFlow

Getting it running

Difficulty · moderate Time to first run · 30min

Requires an older TensorFlow 1.x version. Running on TF 2.x needs compatibility mode or code adaptation.

In plain English

This repository contains the original code for Graph Convolutional Networks (GCN), a machine learning technique introduced in a 2017 research paper by Thomas Kipf and Max Welling. The code is written in Python and targets TensorFlow 1.x, an older version of the framework. The paper was accepted at ICLR 2017, a major machine learning conference.

The core idea is classifying nodes in a graph when you only have labels for some of those nodes, a setup called semi-supervised learning. A graph here means any data that has items and connections between them: social networks where people are nodes and friendships are edges, or academic citation networks where papers are nodes and citations are edges. The model learns to assign each node a category by looking at both the node's own properties and the properties of its neighbors in the graph.

To use it with your own data, you provide three pieces: a table describing which nodes are connected to which (an adjacency matrix), a table of features for each node, and a table of labels for the nodes you already know. The repository ships with three academic citation datasets (Cora, Citeseer, and Pubmed), so you can run the demo immediately without gathering data yourself.

The code offers three model variants. The main GCN model is the one from the paper. A second variant uses Chebyshev polynomials, a different mathematical approach to spreading information across the graph. A third is a simple baseline network that does not use the graph structure at all, which is useful for comparison. There is also support for classifying entire graphs rather than individual nodes, which requires a slightly different data setup.

This repository is primarily a research artifact. It is the reference implementation accompanying a specific paper, not a general-purpose library. Researchers and students studying graph-based machine learning commonly use it as a starting point or a benchmark.
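To make "combining each node's features with those of its neighbors" concrete, here is a minimal NumPy sketch of the two-layer forward pass described in the paper, Z = softmax(Â ReLU(Â X W0) W1), where Â is the adjacency matrix with self-loops added and symmetrically normalized. It is not the repo's TensorFlow code: the function names are hypothetical, the weights are random rather than trained, and the toy graph is made up.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2: add self-loops, then normalize by degree."""
    a_tilde = adj + np.eye(adj.shape[0])
    deg = a_tilde.sum(axis=1)            # >= 1 for every node thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def gcn_layer(a_hat, h, weight, activation=None):
    """One graph convolution: average over neighbors, then a linear map."""
    out = a_hat @ h @ weight
    return activation(out) if activation else out

# Toy chain graph 0-1-2-3 with 2 features per node and 3 output classes.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.array([[1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)

rng = np.random.default_rng(0)
w0 = rng.normal(size=(2, 8))   # input features -> hidden units (untrained)
w1 = rng.normal(size=(8, 3))   # hidden units -> classes (untrained)

a_hat = normalize_adjacency(adj)
hidden = gcn_layer(a_hat, x, w0, relu)
probs = softmax(gcn_layer(a_hat, hidden, w1))   # one row of class probabilities per node
```

Each multiplication by `a_hat` mixes a node's representation with those of its direct neighbors, so stacking two layers lets information travel two hops. Dropping `a_hat` from both layers gives an ordinary two-layer network, which corresponds in spirit to the no-graph baseline.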

Copy-paste prompts

Prompt 1
How do I prepare a custom adjacency matrix and node feature matrix to train the GCN model on my own graph dataset?
Prompt 2
Walk me through running the GCN demo on the Cora dataset and explain how to read the accuracy output
Prompt 3
What changes do I need to make to switch from node classification to classifying whole graphs with this code?
Prompt 4
How does the Chebyshev model in this repo differ from the standard GCN model and when should I prefer it?
Prompt 5
How do I adapt this TensorFlow 1.x GCN code to run in TensorFlow 2.x without rewriting the model logic?

Verify against the repo before relying on details.