Processing pipeline
A cryo-EM experiment produces thousands of movie files. To turn those raw frames into a set of picked, extracted particles ready for 3D reconstruction, the data must pass through a well-defined sequence of processing steps. Magellon automates the early, compute-heavy stages and lets you track, inspect, and re-run any step from the Jobs panel.
The 30-second mental model
A cryo-EM dataset is thousands of 2D images of the same molecule frozen at random orientations. Each image is one projection of the 3D molecular density. The pipeline’s job is to clean and characterise those images so the downstream reconstruction can figure out each molecule’s orientation and back-project everything into a 3D map.
Everything else is making that work in the presence of noise, motion, and imaging artefacts.
The pipeline at a glance
| # | Step | What it does | Magellon today |
|---|---|---|---|
| 1 | Motion correction | Aligns the frames in each movie to remove beam-induced specimen motion | Automated via motioncor plugin (MotionCor2/3) |
| 2 | CTF estimation | Fits the microscope’s contrast transfer function per micrograph | Automated via ctf plugin (CTFFIND4) |
| 3 | Square detection | Locates grid squares in low-magnification overview images | Automated via ptolemy plugin (ONNX model) |
| 4 | Hole detection | Locates ice holes in medium-magnification images | Automated via ptolemy plugin (ONNX model) |
| 5 | FFT | Computes the power-spectrum FFT of each micrograph | Automated (always-on reference plugin) |
| 6 | Particle picking | Finds particle coordinates in each micrograph | Automated via topaz (CNN-based) or template-picker |
| 7 | Micrograph denoising | Denoises micrographs using a trained CNN | Automated via Topaz denoising backend |
| 8 | Particle extraction | Cuts and normalises a stack of particle boxes | Automated via stack-maker plugin |
| 9 | 2D classification | Clusters extracted particles by appearance; removes junk | Automated via can-classifier (CAN + MRA) |
| 10+ | 3D reconstruction onwards | Initial model → refinement → polishing → postprocess | Not yet automated; run externally (RELION, cryoSPARC) |
Steps 1–9 are what Magellon orchestrates today. Steps 10 and beyond remain in external tools; Magellon stores their outputs as session artifacts for browsing.
What a microscope session produces
A single Krios session (12–48 hours) typically contains:
| Item | Typical size or count | Description |
|---|---|---|
| Movies | 50–500 frames, 50 MB to several GB each | Raw detector output, one movie per acquisition position |
| Micrographs | 1,000–10,000 per session | Each micrograph captures roughly 100–1,000 particles |
| Gain reference | 1 file per session | Per-pixel sensitivity correction applied during import |
When Magellon imports a session it reads these files from the configured
data path (MAGELLON_GPFS_PATH) and creates a session record. Large
payloads (movies, micrographs) stay on the shared filesystem; only
metadata and task results travel over the message bus.
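As a rough illustration of that split, a task message might carry only a path and a few small fields, never pixel data. The sketch below is hypothetical: the `TaskMessage` fields, the `/gpfs` root, the file names, and the JSON encoding are assumptions made for illustration, not Magellon's actual message schema.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import PurePosixPath


@dataclass
class TaskMessage:
    """Hypothetical bus message: small metadata plus a path, no pixel data."""
    session: str
    step: str
    input_path: str  # location on the shared filesystem


def build_task(data_root: str, session: str, step: str, filename: str) -> TaskMessage:
    # The movie itself stays on disk; only its path goes into the message.
    path = PurePosixPath(data_root) / session / filename
    return TaskMessage(session=session, step=step, input_path=str(path))


task = build_task("/gpfs", "session_001", "motioncor", "movie_0001.tif")
payload = json.dumps(asdict(task))
print(len(payload), "bytes on the bus, regardless of movie size")
```

The message size depends only on the length of the strings, so a multi-gigabyte movie costs the bus no more than a small one.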
Step-by-step details
1. Motion correction
Each movie’s frames suffer from beam-induced specimen motion: the specimen drifts 5–50 Å during the exposure. Summing the frames naively blurs the image. Motion correction aligns the frames first, producing a single, sharper micrograph.
Inputs: movie stack + gain reference
Outputs: one aligned .mrc micrograph per movie
Plugin: motioncor — wraps MotionCor2/3; GPU-accelerated
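To make the align-then-sum idea concrete, here is a minimal NumPy sketch that estimates a whole-frame integer shift for each frame by phase correlation and sums the shifted frames. It is a deliberately simplified stand-in, not the algorithm MotionCor2/3 uses; it ignores gain correction, sub-pixel shifts, local (patch-based) motion, and dose weighting. The function names are illustrative.

```python
import numpy as np


def estimate_shift(reference, frame):
    """Integer (dy, dx) roll that brings `frame` into register with `reference`."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(frame))
    cross_power /= np.abs(cross_power) + 1e-12  # keep phase only
    corr = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= n // 2 else int(p) - n for p, n in zip(peak, corr.shape))


def align_and_sum(frames):
    """Align every frame to the first one, then sum them into one micrograph."""
    reference = frames[0]
    shifts = [estimate_shift(reference, f) for f in frames]
    aligned = [np.roll(f, s, axis=(0, 1)) for f, s in zip(frames, shifts)]
    return np.sum(aligned, axis=0), shifts


# Synthetic "movie": the same random image drifted by known amounts.
rng = np.random.default_rng(0)
truth = rng.normal(size=(64, 64))
movie = [np.roll(truth, d, axis=(0, 1)) for d in [(0, 0), (3, -2), (-5, 4)]]
micrograph, shifts = align_and_sum(movie)
print(shifts)
```

Each recovered shift is the negative of the drift applied to that frame, which is why rolling by it undoes the motion.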
2. CTF estimation
The microscope intentionally defocuses the image to increase contrast, which introduces a sinusoidal modulation in Fourier space (the Contrast Transfer Function). Every downstream step needs to know each micrograph’s CTF to correctly weight and combine signal.
Inputs: aligned micrograph
Outputs: defocus, astigmatism, and CTF goodness-of-fit per micrograph
Plugin: ctf — wraps CTFFIND4; multiple backends (fast, GPU, external)
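For intuition about what is being fitted, the sketch below evaluates a one-dimensional CTF in a commonly quoted form, with the phase given by defocus and spherical aberration plus a constant amplitude-contrast term. Sign conventions differ between packages, astigmatism and envelope functions are left out, and the default values (300 kV, 2.7 mm spherical aberration, 0.07 amplitude contrast) are typical assumptions rather than Magellon settings, so treat this as illustrative and not as what the ctf plugin computes.

```python
import numpy as np


def electron_wavelength(kv):
    """Relativistic electron wavelength in angstroms for a voltage in kV."""
    volts = kv * 1e3
    return 12.2643 / np.sqrt(volts + 0.97845e-6 * volts**2)


def ctf_1d(k, defocus, kv=300.0, cs_mm=2.7, amplitude_contrast=0.07):
    """CTF at spatial frequency k (1/angstrom).

    defocus is in angstroms, positive for underfocus (assumed convention).
    """
    lam = electron_wavelength(kv)
    cs = cs_mm * 1e7  # millimetres to angstroms
    phase = np.pi * lam * k**2 * (defocus - 0.5 * cs * lam**2 * k**2)
    return -np.sin(phase + np.arcsin(amplitude_contrast))


k = np.linspace(0.0, 0.3, 7)        # spatial frequencies in 1/angstrom
print(ctf_1d(k, defocus=20000.0))   # 2 micrometres underfocus
```

The oscillations get faster at higher spatial frequency and at higher defocus, which is what produces the ring pattern seen in a micrograph's power spectrum.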
The CTF fit quality score is stored as micrograph metadata. You can filter out poor micrographs (high astigmatism, low confidence) in the session view before dispatching particle picking, which reduces junk picks downstream.
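The same filter can be sketched in code. Everything specific here is hypothetical: the metadata keys (`defocus_u`, `defocus_v`, `fit_score`), the thresholds, and the three example records are made up for illustration and do not reflect Magellon's stored field names or recommended cut-offs.

```python
def keep_micrograph(meta, max_astigmatism=500.0, min_fit_score=0.1):
    """Return True when a micrograph's CTF metadata passes both thresholds."""
    # Astigmatism taken as the difference between the two defocus axes (angstroms).
    astigmatism = abs(meta["defocus_u"] - meta["defocus_v"])
    return astigmatism <= max_astigmatism and meta["fit_score"] >= min_fit_score


# Invented example records, not real measurements.
micrographs = [
    {"name": "mic_001", "defocus_u": 18200.0, "defocus_v": 18050.0, "fit_score": 0.21},
    {"name": "mic_002", "defocus_u": 25100.0, "defocus_v": 23900.0, "fit_score": 0.19},
    {"name": "mic_003", "defocus_u": 15400.0, "defocus_v": 15350.0, "fit_score": 0.04},
]
kept = [m["name"] for m in micrographs if keep_micrograph(m)]
print(kept)
```

Here `mic_002` is rejected for astigmatism and `mic_003` for a low fit score, so only `mic_001` would go on to particle picking.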
3 & 4. Square and hole detection
At low magnification the microscope acquires overview images showing the
grid squares. At medium magnification it captures the individual ice
holes within each square. Magellon’s ptolemy plugin uses ONNX-based
computer vision to locate both automatically, driving the acquisition
target selection pipeline.
Plugin: ptolemy — one plugin, two categories (square_detection
and hole_detection)
5. FFT
A fast Fourier transform of each aligned micrograph produces a power-spectrum thumbnail, the classic “Thon ring” image used to visually verify CTF quality. The FFT plugin is always-on and its output appears immediately in the image viewer for every micrograph.
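A power-spectrum thumbnail of this kind can be approximated in a few lines of NumPy. This is a minimal sketch under assumptions, not the FFT plugin's code: the log scaling, the mean subtraction, and the block-averaging used to shrink the image are choices made here for illustration, and the input is assumed to be at least `size` pixels along each axis.

```python
import numpy as np


def power_spectrum_thumbnail(micrograph, size=64):
    """Log-scaled, centred power spectrum, binned down to size x size pixels."""
    centred = micrograph - micrograph.mean()  # suppress the zero-frequency spike
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(centred))) ** 2
    log_spectrum = np.log1p(spectrum)
    bin_y, bin_x = log_spectrum.shape[0] // size, log_spectrum.shape[1] // size
    trimmed = log_spectrum[: bin_y * size, : bin_x * size]
    # Average each bin_y x bin_x block into one thumbnail pixel.
    return trimmed.reshape(size, bin_y, size, bin_x).mean(axis=(1, 3))


# Synthetic test image: a single horizontal cosine wave, 8 cycles across.
x = np.arange(256)
image = np.tile(np.cos(2 * np.pi * 8 * x / 256), (256, 1))
thumb = power_spectrum_thumbnail(image)
print(thumb.shape)
```

For the cosine test image the thumbnail shows two bright spots placed symmetrically about the centre; a real micrograph shows concentric Thon rings instead.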
6 & 7. Particle picking and denoising
Particle picking scans each aligned micrograph for blob-shaped signals
that match the expected particle size and produces a coordinate file
(x, y) per micrograph. Magellon ships two pickers:
| Backend | Method | Best for |
|---|---|---|
| topaz | Trained CNN (Topaz) | General-purpose; works without a reference template |
| template-picker | Cross-correlation template matching | When you already have a good 2D template |
The Topaz backend also supports micrograph denoising as a companion step: subsequent picks can run on the denoised micrographs, which tends to improve the resulting coordinates.
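The cross-correlation approach behind template matching can be sketched with an FFT-based correlation followed by greedy peak picking. This is an illustrative simplification, not the template-picker implementation: it uses one unrotated template, circular correlation, and a crude square exclusion zone, and the function names and parameters are hypothetical.

```python
import numpy as np


def cross_correlate(micrograph, template):
    """Circular cross-correlation; score[y, x] rates a template whose top-left corner is at (y, x)."""
    padded = np.zeros(micrograph.shape, dtype=float)
    th, tw = template.shape
    padded[:th, :tw] = template - template.mean()
    f_img = np.fft.fft2(micrograph - micrograph.mean())
    return np.fft.ifft2(f_img * np.conj(np.fft.fft2(padded))).real


def pick_particles(micrograph, template, n_picks, min_distance):
    """Greedily take the n_picks best peaks, returned as (x, y) box centres."""
    score = cross_correlate(micrograph, template)
    th, tw = template.shape
    picks = []
    for _ in range(n_picks):
        y, x = np.unravel_index(np.argmax(score), score.shape)
        picks.append((int(x) + tw // 2, int(y) + th // 2))
        # Blank out the neighbourhood so the same particle is not picked twice.
        y0, x0 = max(0, y - min_distance), max(0, x - min_distance)
        score[y0 : y + min_distance + 1, x0 : x + min_distance + 1] = -np.inf
    return picks


# Synthetic micrograph: two Gaussian blobs on an empty background.
yy, xx = np.mgrid[-4:5, -4:5]
template = np.exp(-(yy**2 + xx**2) / 8.0)
image = np.zeros((128, 128))
image[20:29, 30:39] += template
image[80:89, 90:99] += template
picks = pick_particles(image, template, n_picks=2, min_distance=10)
print(picks)
```

A practical picker would also search over in-plane rotations and apply a score threshold instead of a fixed number of picks.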
8. Particle extraction
Extraction cuts a square box (e.g. 256 × 256 px) around each picked
coordinate, normalises the contrast, and writes all boxes to a single
.mrcs particle stack. The stack is the input to 2D classification.
Plugin: stack-maker — thin wrapper around the vendored extraction
algorithm from the Magellon algorithm library
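The cut-and-normalise step can be sketched as follows. This is a minimal illustration, not the stack-maker algorithm: it normalises each box to zero mean and unit variance over the whole box (real pipelines often normalise against the background outside the particle), silently skips picks too close to the edge, and returns an in-memory array instead of writing an .mrcs file. The function name is hypothetical.

```python
import numpy as np


def extract_particles(micrograph, coords, box_size):
    """Cut a box around each (x, y) centre, normalise it, and stack the results."""
    half = box_size // 2
    height, width = micrograph.shape
    boxes = []
    for x, y in coords:
        x0, y0 = x - half, y - half
        # Skip picks whose box would cross the micrograph edge.
        if x0 < 0 or y0 < 0 or x0 + box_size > width or y0 + box_size > height:
            continue
        box = micrograph[y0 : y0 + box_size, x0 : x0 + box_size].astype(np.float32)
        std = box.std()
        boxes.append((box - box.mean()) / (std if std > 0 else 1.0))
    if not boxes:
        return np.empty((0, box_size, box_size), dtype=np.float32)
    return np.stack(boxes)


rng = np.random.default_rng(1)
micrograph = rng.normal(loc=100.0, scale=5.0, size=(100, 100))
stack = extract_particles(micrograph, [(50, 50), (2, 2), (98, 50)], box_size=16)
print(stack.shape)
```

Only the first coordinate survives here, because the other two boxes would extend past the micrograph boundary.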
9. 2D classification
Picked particles always include some junk: ice contamination, broken molecules, neighbouring molecules accidentally cropped. 2D classification clusters all particles into K groups by appearance. Bad groups (featureless blobs, ice rings, edge artefacts) are dropped, leaving a clean particle set. Typical retention: 30–70 % of initial picks.
Plugin: can-classifier — Convolutional Autoencoder + Multi-Reference
Alignment (CAN+MRA)
Steps beyond 2D classification — initial 3D model generation, 3D refinement, CTF refinement, Bayesian polishing, and postprocessing — are typically run in RELION or cryoSPARC. Magellon stores and displays the results but does not yet automate these later stages.
Monitoring progress
Each automated step dispatches work as tasks visible in the Jobs panel. One import creates one job containing one task per micrograph per step. The Jobs panel shows:
- Per-step progress bars
- Individual task status (pending / running / completed / failed)
- Live log output from the plugin processing each task
- Output file locations on the shared filesystem
Failed tasks can be individually retried from the Jobs panel without re-running the whole import.
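The job and task relationship described above can be modelled in a few lines. This is a toy model under assumptions, not Magellon's data model: the `Task` class, its fields, and the two helper functions are hypothetical and only illustrate the "one task per micrograph per step" fan-out and the selective retry.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Task:
    """Hypothetical task record: one micrograph processed by one step."""
    micrograph: str
    step: str
    status: str = "pending"  # pending / running / completed / failed


def build_job(micrographs, steps):
    # One task per (micrograph, step) pair.
    return [Task(m, s) for m, s in product(micrographs, steps)]


def retry_failed(tasks):
    """Reset only the failed tasks to pending; everything else is left alone."""
    retried = [t for t in tasks if t.status == "failed"]
    for task in retried:
        task.status = "pending"
    return retried


job = build_job(["mic_001", "mic_002", "mic_003"], ["motioncor", "ctf"])
for task in job:
    task.status = "completed"
job[3].status = "failed"  # pretend one task failed

retried = retry_failed(job)
print(len(job), "tasks in the job;", len(retried), "retried")
```

Three micrographs and two steps give six tasks, and the retry touches only the one that failed.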
Shared filesystem requirement
Metadata travels over the message bus; large files (movies, micrographs, particle stacks) travel over the shared filesystem. CoreService and every plugin container must mount the same path. In the default Docker Compose setup this is a bind mount; on HPC clusters it is typically GPFS, Lustre, or BeeGFS.
If a plugin container can write /magellon/home/<session>/motioncor/file.mrc
but CoreService cannot read that path, results will silently disappear.
Verify the shared mount before running your first import — see
Directory Structure for the
expected layout.
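One way to check the mount is to write a probe file from one side and look for it from the other. The two helpers below are a hypothetical sketch, not a Magellon utility: the function names and the `.mount_probe_` file name are assumptions, and each function has to be run inside the relevant container against the path that container sees.

```python
import uuid
from pathlib import Path


def write_probe(shared_root):
    """Run on the writer side (e.g. a plugin container): drop a uniquely named probe file."""
    token = uuid.uuid4().hex
    probe = Path(shared_root) / f".mount_probe_{token}"
    probe.write_text(token)
    return token


def check_probe(shared_root, token):
    """Run on the reader side (e.g. CoreService): True only if the same file is visible."""
    probe = Path(shared_root) / f".mount_probe_{token}"
    return probe.is_file() and probe.read_text() == token
```

Call `write_probe` in a plugin container, pass the returned token to `check_probe` in CoreService, and delete the probe file afterwards. A `False` result means the two containers are not looking at the same directory.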
See also
- Plugins: architectural overview of the plugin system
- Plugin categories: full reference of every category and its current backends
- Managing plugins: start/stop/scale plugins, inspect logs, troubleshoot
- Data import: how to bring a session into Magellon