explaingit

apple/ml-sharp

8,341 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

TLDR

SHARP is Apple's research tool that generates realistic novel viewpoints of a scene from a single photograph using 3D Gaussian splatting, producing a full 3D representation in under a second on a GPU.

Mindmap

mindmap
  root((repo))
    What it does
      Single image to 3D
      Novel view synthesis
    How it works
      Neural network
      3D Gaussian splatting
      Real-world scale
    CLI Tool
      sharp command
      Input image folder
      Output 3D files
    Requirements
      Python
      PyTorch
      NVIDIA GPU for video
    Outputs
      3D Gaussian files
      Rendered video

Things people build with this

USE CASE 1

Generate 3D scene representations from single photos at near-real-time speed for use in augmented reality or 3D content pipelines.

USE CASE 2

Reproduce and extend Apple's SHARP research paper results using the provided command-line tool and pretrained model weights.

USE CASE 3

Feed SHARP output files into existing 3D Gaussian rendering tools to produce video walkthroughs of a scene from a single input photo.

USE CASE 4

Benchmark SHARP against other single-image novel-view synthesis methods on standard image quality metrics.
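
The source does not say which image quality metrics the benchmarks use. As one illustration, peak signal-to-noise ratio (PSNR) is a common way to compare a rendered view against a ground-truth photo. The sketch below is a minimal pure-Python version for images given as flat lists of 8-bit values. The function name `psnr` and the toy pixel values are hypothetical and are not taken from the SHARP repo.

```python
import math


def psnr(rendered, reference, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are flat lists of pixel values in [0, max_value].
    Higher is better; identical images give infinity.
    """
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 grayscale images, flattened row by row (illustrative values only).
reference = [52, 55, 61, 59]
rendered = [54, 55, 60, 57]
print(f"PSNR: {psnr(rendered, reference):.2f} dB")
```

A real comparison would load full-resolution renders and held-out photos and would likely report several metrics side by side; check the paper for the ones the authors actually use.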

Tech stack

Python, PyTorch, CUDA, 3D Gaussian Splatting

Getting it running

Difficulty · hard | Time to first run · 1h+

Video rendering requires an NVIDIA GPU. Model weights are downloaded automatically on first run.
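
Before attempting the video rendering step, it can help to check what the local machine offers. The sketch below is a hypothetical preflight helper, not part of the SHARP repo. It reports the Python version, whether PyTorch is installed, and whether PyTorch can see a CUDA device. It assumes PyTorch is importable as `torch` and treats a missing install as "not available" rather than as an error.

```python
import importlib.util
import sys


def preflight():
    """Return a dict describing the local environment (hypothetical helper)."""
    report = {
        "python": ".".join(str(p) for p in sys.version_info[:3]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back `False`, expect the video rendering step to be unavailable; the note above only ties the NVIDIA GPU requirement to that step.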

In plain English

SHARP is a research project from Apple that takes a single photograph as input and generates realistic images of the same scene from nearby camera angles. In other words, you give it one picture, and it produces what the scene would look like from slightly different positions, creating a sense of three-dimensional depth from a flat image.

A trained neural network looks at the photo and predicts a three-dimensional representation of the scene using a technique called 3D Gaussian splatting. This representation stores the scene as a large collection of small fuzzy blobs in three-dimensional space, each with color and opacity information. Once that representation is built, a rendering engine can produce new viewpoints in real time. The whole process from photo to 3D representation takes under a second on a standard graphics card, which the paper describes as three orders of magnitude faster than previous approaches. The representation uses real-world scale, so camera movements correspond to actual distances rather than arbitrary units. The output files are compatible with existing 3D Gaussian rendering tools.

The project accompanies a research paper and includes a command-line tool called sharp. After installing the Python dependencies, you point it at a folder of input images and it writes the resulting 3D representation to an output folder. The model weights are downloaded automatically on the first run. A separate render command can then produce video along a camera path, though that step currently requires an NVIDIA GPU.

The authors report that SHARP improves on previous methods by measurable amounts on several image quality benchmarks. The code and model are released under separate licenses, each with its own terms.
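
To make the "small fuzzy blobs" description above concrete, the sketch below models a blob as a position, a size, a color, and an opacity, then blends the blobs along a single viewing ray from front to back. It is a simplified illustration of the general 3D Gaussian splatting idea, not SHARP's code. The `Blob` class, the spherical falloff, the use of meters, and the sample values are all assumptions made for this example; real renderers use stretched, rotated Gaussians projected onto the image plane.

```python
import math
from dataclasses import dataclass


@dataclass
class Blob:
    """One simplified Gaussian: spherical, with a constant color (assumption)."""
    center: tuple   # (x, y, z), assumed to be in meters
    sigma: float    # spread of the blob, in the same units
    color: tuple    # (r, g, b), each in [0, 1]
    opacity: float  # peak opacity in [0, 1]


def alpha_at(blob, point):
    """Opacity the blob contributes at a 3D point (Gaussian falloff)."""
    d2 = sum((p - c) ** 2 for p, c in zip(point, blob.center))
    return blob.opacity * math.exp(-0.5 * d2 / blob.sigma ** 2)


def composite_ray(blobs, origin, direction):
    """Blend blobs along a ray, nearest first (front-to-back alpha compositing).

    `direction` must be a unit vector.
    """
    def depth(blob):
        # Distance of the blob center along the ray.
        return sum((c - o) * d for c, o, d in zip(blob.center, origin, direction))

    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for blob in sorted(blobs, key=depth):
        t = depth(blob)
        if t <= 0:
            continue  # blob is behind the camera
        # Evaluate the blob at the point on the ray closest to its center.
        point = tuple(o + t * d for o, d in zip(origin, direction))
        alpha = alpha_at(blob, point)
        for i in range(3):
            color[i] += transmittance * alpha * blob.color[i]
        transmittance *= 1.0 - alpha
    return tuple(color), transmittance


# Two illustrative blobs on the camera axis: blue in front, red behind.
blobs = [
    Blob(center=(0.0, 0.0, 2.0), sigma=0.1, color=(1.0, 0.0, 0.0), opacity=0.5),
    Blob(center=(0.0, 0.0, 1.0), sigma=0.1, color=(0.0, 0.0, 1.0), opacity=0.5),
]
pixel, remaining = composite_ray(blobs, origin=(0.0, 0.0, 0.0), direction=(0.0, 0.0, 1.0))
print(pixel, remaining)
```

Because the blob positions carry physical units, shifting `origin` sideways by a few centimeters stands in for a small physical camera move, which is the kind of nearby viewpoint the text describes.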

Copy-paste prompts

Prompt 1
I have a folder of input photos and want to run Apple's SHARP tool to generate a 3D Gaussian splat representation. Walk me through the Python install steps and the exact sharp CLI command to use.
Prompt 2
Using the SHARP repo, show me how to run the render command to produce a video along a camera path from the 3D output, and what NVIDIA GPU requirements I need.
Prompt 3
I want to integrate SHARP into a Python pipeline that takes a single photo and outputs a 3D file I can load in a Gaussian splat viewer. Show me the CLI invocations or Python API calls I need.
Prompt 4
Explain what 3D Gaussian splatting is in plain terms and how SHARP uses it to go from a single flat image to a 3D scene representation in under a second.

Verify against the repo before relying on details.