Machine learning has scaled on the back of massively parallel hardware and extensive datasets, yet sequential bottlenecks in core algorithms have historically limited that scaling. Dynamical systems, central to recurrent models such as RNNs and to sampling methods such as MCMC, are a prime example: their states must ostensibly be computed one step at a time. This PhD dissertation, by Xavier Gonzalez, fundamentally challenges that paradigm.
Beyond Sequential Bottlenecks: Parallelizing Dynamics
This work demonstrates that dynamical systems can indeed be parallelized across their sequence length. The core innovation reframes sequential evaluation as the solution of a system of nonlinear equations via Newton's method: each Newton step reduces to a linear recurrence, which can be evaluated with a parallel associative scan. This bypasses the inherent step-by-step dependency, a significant theoretical and practical leap. The research presented in Gonzalez's dissertation addresses two critical limitations of prior parallel Newton methods: inefficiency and instability.
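To make the reframing concrete, here is a minimal scalar sketch, not the dissertation's implementation: the toy map `f`, the function names, and the sequential prefix loop standing in for a true parallel scan are all illustrative assumptions. Treating the whole trajectory of s_t = f(s_{t-1}) as the root of the residual F_t(s) = s_t − f(s_{t-1}), each Newton step solves a linear recurrence e_t = J_t·e_{t-1} + r_t, which composes associatively and is therefore scannable in parallel.

```python
import numpy as np

def f(s):
    return np.tanh(s) + 0.1          # toy stable dynamics (|f'| < 1)

def fprime(s):
    return 1.0 / np.cosh(s) ** 2     # derivative of tanh(s) + 0.1

def combine(left, right):
    # Composition of affine maps x -> a*x + b. This operation is associative,
    # so all prefixes could be computed by a parallel scan in O(log T) depth.
    a1, b1 = left
    a2, b2 = right
    return a2 * a1, a2 * b1 + b2

def parallel_newton(s0, T, iters=15):
    s = np.zeros(T)                  # guess for the whole trajectory s_1..s_T
    for _ in range(iters):
        prev = np.concatenate(([s0], s[:-1]))
        r = f(prev) - s              # Newton residual for s_t = f(s_{t-1})
        J = fprime(prev)             # Jacobians of the linearized recurrence
        J[0] = 0.0                   # s_0 is given; e_1 has no predecessor
        # Prefix-combine the affine maps (J_t, r_t). A real implementation
        # would replace this loop with a parallel associative scan.
        acc = (1.0, 0.0)
        e = np.empty(T)
        for t in range(T):
            acc = combine(acc, (J[t], r[t]))
            e[t] = acc[1]
        s = s + e                    # Newton update for the full trajectory
    return s
```

Because `combine` is associative, the inner loop is exactly the shape of computation that libraries expose as a parallel scan (e.g. `jax.lax.associative_scan`); the loop here is sequential only for readability.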
Stability and Convergence Guarantees for Parallel Newton Methods
Methodologically, the thesis introduces scalable and stable parallel Newton methods by incorporating quasi-Newton and trust-region techniques. Quasi-Newton variants improve speed and memory usage, while trust-region methods provide enhanced stability. Theoretically, the framework unifies various fixed-point iterations, including Picard and Jacobi iterations, as special cases. Crucially, it establishes a linear convergence rate that depends on approximation accuracy and stability. A precise condition, rooted in dynamical stability, is identified: the largest Lyapunov exponent determines whether parallelization provably accelerates the evaluation of a dynamical system. This provides a firm theoretical foundation for the applicability and performance of these parallel techniques.
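The stability condition can be illustrated numerically. The sketch below is an assumption-laden toy, not a result from the thesis: it estimates the largest Lyapunov exponent of a scalar map s_t = f(s_{t-1}) as the long-run average of log|f'(s_t)| along a trajectory. A negative exponent indicates contractive dynamics, the regime in which parallel-in-time iterations are expected to converge quickly; a positive exponent (chaos) is the regime without such a guarantee.

```python
import numpy as np

def lyapunov_exponent(f, fprime, s0, T=1000, burn=100):
    # Average log|f'(s_t)| along a trajectory, after a burn-in period.
    s = s0
    total = 0.0
    for t in range(T):
        s = f(s)
        if t >= burn:
            total += np.log(abs(fprime(s)))
    return total / (T - burn)

# Contractive toy map: trajectories collapse toward the fixed point at 0,
# so the estimate is negative (roughly log 0.5).
lam_stable = lyapunov_exponent(lambda s: 0.5 * np.tanh(s),
                               lambda s: 0.5 / np.cosh(s) ** 2,
                               s0=0.1)

# Chaotic logistic map (r = 3.9): nearby trajectories diverge, so the
# estimate is positive.
lam_chaotic = lyapunov_exponent(lambda s: 3.9 * s * (1 - s),
                                lambda s: 3.9 - 7.8 * s,
                                s0=0.3)
```

The sign of the estimated exponent is the quantity the dissertation's condition hinges on; the particular maps and the simple trajectory-average estimator here are chosen only for illustration.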