Falcon Optimizer
Where Frequency Meets Geometry
Experience learning as art, through trajectories, spectra and structure
Frequency Masking
Energy-aware spectral filtering adapts gradient updates through 2D FFT analysis
Rank-1 Updates
Low-rank approximations preserve essential gradient directions via power iteration
Orthogonal Projection
Gram-Schmidt orthogonalization ensures decorrelated parameter updates
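The three building blocks above can each be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the Falcon implementation: the function names, the `keep_energy` threshold and the iteration count are hypothetical, and how Falcon combines these steps inside an update is not described on this page.

```python
import numpy as np

def energy_mask_filter(grad, keep_energy=0.95):
    """Keep the strongest 2D FFT coefficients that together hold
    `keep_energy` of the spectral energy and zero out the rest.
    (Hypothetical thresholding rule, chosen for illustration.)"""
    spectrum = np.fft.fft2(grad)
    energy = np.abs(spectrum) ** 2
    order = np.argsort(energy, axis=None)[::-1]          # strongest first
    cumulative = np.cumsum(energy.ravel()[order])
    n_keep = int(np.searchsorted(cumulative, keep_energy * cumulative[-1])) + 1
    mask = np.zeros(energy.size, dtype=bool)
    mask[order[:n_keep]] = True
    filtered = np.fft.ifft2(spectrum * mask.reshape(energy.shape))
    # The cutoff can split a conjugate pair, so drop the small imaginary part.
    return filtered.real

def rank1_power_iteration(grad, n_iter=20, seed=0):
    """Approximate the leading singular triplet of `grad` by power
    iteration and return the rank-1 matrix sigma * u v^T."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(grad.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = grad @ v
        u /= np.linalg.norm(u) + 1e-12
        v = grad.T @ u
        v /= np.linalg.norm(v) + 1e-12
    sigma = u @ grad @ v
    return sigma * np.outer(u, v)

def gram_schmidt(columns, eps=1e-10):
    """Orthonormalize the columns of a matrix with modified Gram-Schmidt,
    skipping columns that are (numerically) linearly dependent."""
    q = np.zeros_like(columns, dtype=float)
    k = 0
    for j in range(columns.shape[1]):
        w = columns[:, j].astype(float)
        for i in range(k):
            w = w - (q[:, i] @ w) * q[:, i]              # remove component along q_i
        norm = np.linalg.norm(w)
        if norm > eps:
            q[:, k] = w / norm
            k += 1
    return q[:, :k]
```

Each function takes a 2D array, such as a gradient reshaped to a matrix, and returns a new array without modifying its input.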
The Art of Optimization
What is Falcon?
Falcon (Frequency-Aware Low-rank Conditioning Optimizer) is a novel optimization algorithm that combines frequency domain analysis with low-rank matrix approximations for deep neural network training.
In CIFAR-10 experiments with VGG11, Falcon reaches 90.33% accuracy, comparable to AdamW (90.28%) and Muon (90.49%).
Paper: "FALCON: Frequency-Adaptive Learning with Conserved Orthogonality and Noise Filtering" (GitHub)
Why Visualize?
Understanding optimization algorithms requires more than equations. This interactive suite lets you explore how different optimizers traverse loss landscapes, how frequency filtering shapes gradients, and how training dynamics evolve over time.
Each visualization reveals a unique perspective on the geometry of learning, transforming abstract mathematics into tangible insight.
Interactive Visualizations
3D Trajectory Viewer
Explore optimizer paths across loss landscapes with interactive 3D visualization
Frequency Filter Explorer
Draw custom filters and visualize 2D FFT transformations in real-time
Training Dynamics
Track loss curves, accuracy evolution, and adaptive scheduling across epochs
Experimental Results
Complete analysis of CIFAR-10 experiments with all figures and metrics
SVD Explorer
Interactive rank-k matrix approximation demonstration
Network Diagram
Layer-wise optimization strategy visualization
"In the dance of gradients and frequencies, patterns emerge—
each optimizer a unique choreography across the loss landscape."
— The Mathematics of Learning
SVD Explorer
A rank-1 approximation keeps only the largest singular value and its pair of singular vectors, preserving the most important structure of the matrix while discarding the rest.
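For readers who want to reproduce the idea outside the explorer, a rank-k approximation can be sketched with NumPy's SVD. The helper name `rank_k_approximation` is an assumption made for this example and is not taken from the Falcon code.

```python
import numpy as np

def rank_k_approximation(matrix, k):
    """Best rank-k approximation in the Frobenius norm (Eckart-Young):
    keep the k largest singular values and their singular vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]

# Usage: compare a small matrix with its rank-1 approximation.
matrix = np.arange(12, dtype=float).reshape(3, 4) + np.eye(3, 4)
approx = rank_k_approximation(matrix, k=1)
singular_values = np.linalg.svd(matrix, compute_uv=False)

# The approximation error equals the energy in the discarded singular values.
error = np.linalg.norm(matrix - approx)
discarded = np.sqrt(np.sum(singular_values[1:] ** 2))
print(f"error={error:.6f} discarded={discarded:.6f}")
```

Raising `k` lowers the error until, at full rank, the original matrix is recovered.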
Network Architecture
Input: AdamW
Conv1: Spectral Filtering
Conv2: Spectral Filtering
FC1: Orthogonal Projection
FC2: Orthogonal Projection
Output: AdamW
Falcon adaptively applies different update strategies to different layers based on their characteristics and training phase.
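The layer-to-strategy mapping in the diagram can be written as a small dispatch function. This is a sketch under stated assumptions: it selects by layer name only, the function name and the strategy strings are hypothetical, and the dependence on training phase mentioned above is left out because this page does not specify it.

```python
def pick_update_strategy(layer_name):
    """Map a layer name to the update strategy shown in the diagram.
    (Name-prefix matching is an assumption made for this example.)"""
    name = layer_name.lower()
    if name.startswith("conv"):
        return "spectral_filtering"       # convolutional layers
    if name.startswith("fc"):
        return "orthogonal_projection"    # fully connected layers
    return "adamw"                        # everything else, e.g. input and output

# Usage: build the plan for the layers listed in the diagram.
layers = ["Input", "Conv1", "Conv2", "FC1", "FC2", "Output"]
plan = {layer: pick_update_strategy(layer) for layer in layers}
for layer, strategy in plan.items():
    print(f"{layer}: {strategy}")
```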