

Tanush Shaska, Oakland University
Title: A Linear-Logical Semantics of Graded Neural Networks
Abstract:
We develop a resource-sensitive categorical semantics for graded neural architectures, indexing representations over an arbitrary abelian group $G$. These networks live in the symmetric monoidal closed category of bounded $G$-graded vector spaces, with layers modeled as parameterized degree-preserving morphisms.
A key result is that implicit duplication is categorically forbidden. Operations such as multi-head attention and broadcasting therefore require explicit use of the exponential modality, constructed as the graded symmetric algebra. This provides a model of Multiplicative Exponential Linear Logic and interprets Exponentially Graded Transformers as neural realizations of the associated comonad. Learning is formalized by lifting into the Dialectica category, where reverse-mode automatic differentiation becomes a strong monoidal functor satisfying a Grade Invariance theorem that keeps gradients aligned with the routing topology over $G$. The framework reveals linear logic as an intrinsic structural feature of graded neural computation. Joint work with Valeria de Paiva.
Fabian Ruehle, Northeastern University
Title: ML Explorations in Low-Dimensional Topology
Abstract:
The $\widehat{Z}$-invariants of 3-manifolds are infinite $q$-series that simultaneously serve as topological invariants, BPS invariants, and conjectural quantum modular forms, yet the precise topological content they encode remains largely mysterious.
First, I will explain the construction of a large dataset of $\widehat{Z}$-invariants. Then, I will introduce the interpretable machine learning pipeline we developed to extract topological information from these series. The pipeline uses a regression-based interpretability framework operating on the embedding layer paired with feature scoring techniques.
We show that neural networks disentangle classification tasks into a “homological” and a “topological” axis, and study cobordism invariants and modular properties of $\widehat{Z}$-invariants. I will summarize how this interpretable machine learning pipeline generates precise, falsifiable mathematical conjectures.
Title: Tameness of Strongly Simply Connected Algebras
Abstract:
For the class of strongly simply connected finite-dimensional algebras, the representation type can be studied using the quadratic Tits form, similarly to the hereditary case. We present a concrete framework for detecting and constructing tame algebras in this class.
The main motivation for finding such algebras is the possibility of extracting Krull–Schmidt type invariants of their finitely generated modules.
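As a small illustration of the hereditary case mentioned above, the Tits form of a quiver is q(x) = sum_i x_i^2 - sum_{i -> j} x_i x_j. The sketch below evaluates it and searches a small box of dimension vectors. The function names are hypothetical, the extra terms that relations contribute for non-hereditary algebras are not modeled, and a finite search is only a sanity check, not a proof of definiteness.

```python
from itertools import product


def tits_form(num_vertices, arrows, x):
    """Evaluate q(x) = sum_i x_i^2 - sum over arrows (i -> j) of x_i * x_j."""
    if len(x) != num_vertices:
        raise ValueError("dimension vector has the wrong length")
    vertex_part = sum(xi * xi for xi in x)
    arrow_part = sum(x[i] * x[j] for i, j in arrows)
    return vertex_part - arrow_part


def min_on_box(num_vertices, arrows, bound):
    """Smallest value of q over nonzero vectors with entries in 0..bound."""
    candidates = product(range(bound + 1), repeat=num_vertices)
    return min(
        tits_form(num_vertices, arrows, x) for x in candidates if any(x)
    )


# Quivers on two vertices with one, two, and three parallel arrows.
a2 = [(0, 1)]
kronecker = [(0, 1), (0, 1)]
three_arrows = [(0, 1), (0, 1), (0, 1)]

for name, arrows in [("A2", a2), ("Kronecker", kronecker), ("3 arrows", three_arrows)]:
    print(name, min_on_box(2, arrows, bound=3))
```

For connected quivers, a positive definite form corresponds to finite representation type and a positive semidefinite, non-definite form to tame type; the Kronecker quiver, where q(x, y) = (x - y)^2, is the standard tame example.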
Title: On Temporal Analogues of Tits Geometries, Small-World Graphs, and Post-Quantum Cryptography with AI Instruments
Abstract:
Artificial intelligence technology can be used for the construction of generators of pseudo-random or genuinely random sequences of elements from a selected commutative ring. This technique can be used for generating temporal analogues of Tits geometries in computer memory.
These combinatorial structures can be useful in constructing new families of small-world graphs. We present such families defined by equations over arbitrary finite commutative rings with nontrivial multiplicative groups.
These structures can be used in post-quantum cryptography instead of lattices or linear codes. We also present new cryptosystems defined in terms of temporal graphs and symbolic computations.
Chong Li
Title: Beyond Computation: A Hardware-Agnostic Approach to Systematic Cost Analysis for Distributed LLMs
Abstract: Large Language Models (LLMs) have become the de-facto backbone of modern AI solutions. Yet Transformer-based architectures are not only computation-intensive but also memory- and communication-heavy. While chip-vendor optimizations largely target computational throughput, they often leave the burdens of memory and communication to AI designers. In this talk, we present a systematic, hardware-agnostic method that maps an LLM directly to a profiling-free, analytical cost model of its distributed execution. From the model architecture alone, one can quantify computation, memory, and communication costs—bypassing any physical hardware measurement—and thereby recast the search for an efficient parallelization strategy as an optimization problem over a structured space of distributed mappings. Advanced parallelization strategies can thus be evaluated at design time, well before any code runs, saving costly machine hours and reducing the carbon footprint of going beyond computation.
16:00-16:45
Lenore Mullin
Title: Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels
Abstract: The attention mechanism is the dominant computational bottleneck in modern transformer-based AI, yet its standard implementation incurs quadratic memory traffic in the sequence length — a fundamental inefficiency that cannot be resolved by FLOP-reduction alone, since DRAM accesses cost 100–1000× more energy than arithmetic operations on contemporary hardware.
We present a "Mathematics of Arrays" (MoA) reformulation of scaled dot-product attention and its numerically stable softmax, deriving a *Denotational Normal Form* (DNF) that eliminates all intermediate arrays, including the implicit transposed-key buffer and every softmax temporary, by algebraic construction. The derivation proceeds via systematic psi-reduction of a PyTorch reference implementation through four Omega steps, each eliminating one category of intermediate storage, and is verified numerically at full double-precision floating-point against PyTorch on concrete inputs. The complete DNF accesses only original Q, K, V elements by multi-index, achieving $O(n d_k + n d_v)$ data movement versus $O(n^2 + n d_k + n d_v)$ for the standard implementation.
We further show that for batched, multi-head inputs of shape $\langle B, h, n, d_k \rangle$, the Omega operator naturally partitions all higher-rank operations into independent 2-D sub-problems via Q(+.×)Ω⟨2,2⟩(transposeΩ⟨2⟩)(K), directly exposing batch- and head-level parallelism without additional scheduling logic. We contrast this with compiler-based approaches, arguing that no compiler can provide provable semantic equivalence across all targets; MoA's algebraic pipeline from Python to DNF to Operational Normal Form to a semantic subset of C to FPGA/ASIC (demonstrated in prior hardware co-design work) constitutes a verified, end-to-end path with predictive performance bounds computable from array shapes alone.
A predictive performance model projects 2–100× speedup and 2–50× energy reduction, with the advantage growing linearly as $n/d_k$ at exascale sequence lengths.
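To make the memory-traffic claim concrete, here is a minimal Python sketch. It is not the MoA derivation or its DNF; it swaps in a generic streaming softmax (running maximum and running sum) to show that each attention output row can be computed from Q, K, V elements alone, without storing the n × n score matrix. Function names are hypothetical.

```python
import math


def attention_reference(Q, K, V):
    """Standard attention: materializes the full n x n score matrix."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    scores = [
        [scale * sum(q * k for q, k in zip(q_row, k_row)) for k_row in K]
        for q_row in Q
    ]
    out = []
    for row in scores:
        row_max = max(row)  # subtract the max for numerical stability
        weights = [math.exp(s - row_max) for s in row]
        total = sum(weights)
        out.append([
            sum(w * v_row[c] for w, v_row in zip(weights, V)) / total
            for c in range(len(V[0]))
        ])
    return out


def attention_streaming(Q, K, V):
    """Same result, holding one scalar score at a time per query row."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    out = []
    for q_row in Q:
        running_max = -math.inf
        running_sum = 0.0
        acc = [0.0] * len(V[0])
        for k_row, v_row in zip(K, V):
            score = scale * sum(q * k for q, k in zip(q_row, k_row))
            new_max = max(running_max, score)
            # Rescale what was accumulated under the old maximum;
            # exp(-inf) is 0.0, which handles the first key.
            correction = math.exp(running_max - new_max)
            weight = math.exp(score - new_max)
            running_sum = running_sum * correction + weight
            acc = [a * correction + weight * v for a, v in zip(acc, v_row)]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out


Q = [[1.0, 0.0], [0.5, -1.0], [2.0, 1.0]]
K = [[0.3, 0.7], [-1.2, 0.4], [0.9, 0.1]]
V = [[1.0, 2.0, 3.0], [0.0, -1.0, 4.0], [2.5, 0.5, -2.0]]

ref = attention_reference(Q, K, V)
out = attention_streaming(Q, K, V)
print(max(abs(a - b) for r, o in zip(ref, out) for a, b in zip(r, o)))
```

The streaming version keeps only a running maximum, a running sum, and one output row of length d_v per query, so its working storage does not grow with n.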
Gaétan Hains
Title: Non-Uniform Quantization of Neuron Activations in Deep Neural Networks for Object Detection
Abstract:
We present an analysis of the effect of numerical precision reduction, or quantization, on the statistical distribution of neuron activation values. The statistical data is obtained from a layer-wise sampling of neuron activations in the PVAnet and Yangyi deep neural networks.
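The abstract does not say which quantization scheme is used. As a hedged illustration of the difference between uniform and non-uniform quantization of activation samples, the sketch below builds a uniform codebook and, as one assumed non-uniform choice, a quantile-based codebook. All function names and the synthetic sample are hypothetical.

```python
def uniform_codebook(samples, levels):
    """Evenly spaced levels between the smallest and largest sample."""
    if levels < 2:
        raise ValueError("need at least two levels")
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / (levels - 1)
    return [lo + i * step for i in range(levels)]


def quantile_codebook(samples, levels):
    """One level per equal-population bin (an assumed non-uniform scheme)."""
    ordered = sorted(samples)
    n = len(ordered)
    return [ordered[min(n - 1, int((i + 0.5) * n / levels))] for i in range(levels)]


def quantize(values, codebook):
    """Map each value to its nearest codebook entry."""
    return [min(codebook, key=lambda c: abs(c - v)) for v in values]


def mean_squared_error(original, quantized):
    return sum((a - b) ** 2 for a, b in zip(original, quantized)) / len(original)


# Synthetic, skewed "activations": most values are close to zero.
samples = [10.0 * (i / 99) ** 4 for i in range(100)]

for name, build in [("uniform", uniform_codebook), ("quantile", quantile_codebook)]:
    codebook = build(samples, 4)
    error = mean_squared_error(samples, quantize(samples, codebook))
    print(name, [round(c, 3) for c in codebook], error)
```

The quantile codebook places more levels where the samples are dense, which is the basic motivation for non-uniform schemes; which codebook gives the lower error depends on the distribution and the error measure.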
9:00-9:45
Theau Blanchard
Title: Finsler-Randers Metric Learning for Direction-Aware Latent Space Exploration
Abstract: Deep neural networks extract powerful latent representations from images, enabling fast and efficient downstream analysis. However, these latent spaces lack explicit geometric structure, which can lead to poor interpolation fidelity and misleading extrapolation. Latent exploration techniques assume symmetric transition costs, an assumption that often fails for real-world data. Being symmetric, they cannot capture directional processes such as disease progression, where certain transitions are impossible (e.g. tumor growth dynamics or cognitive improvement in Alzheimer's disease). Although Riemannian metrics extend Euclidean geometry by incorporating curvature, they remain symmetric. We show that modeling asymmetry, and not curvature alone, directly addresses these problems. To this end, we propose a new Finsler-Randers metric learning framework. It incorporates direction-dependent distances, enabling the computation of trajectories that respect monotonic sequential constraints. We demonstrate its potential for interpolation and extrapolation on cross-sectional datasets.
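The asymmetry of a Randers metric can be seen in a few lines. A Randers norm has the form F(v) = sqrt(v^T M v) + ω · v, with M symmetric positive definite and ω^T M^{-1} ω < 1 so that F stays positive. The sketch below is a 2-D illustration with fixed, hand-picked M and ω; it does not reproduce the learned metric of the talk, and the function names are hypothetical.

```python
import math


def randers_norm(v, M, omega):
    """F(v) = sqrt(v^T M v) + omega . v for a tangent vector v."""
    dim = len(v)
    quad = sum(v[i] * M[i][j] * v[j] for i in range(dim) for j in range(dim))
    drift = sum(w * vi for w, vi in zip(omega, v))
    return math.sqrt(quad) + drift


def is_valid_randers_2d(M, omega):
    """Check omega^T M^{-1} omega < 1 for a 2 x 2 matrix M.

    Assumes M is symmetric positive definite (not verified here).
    """
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    inv = [[M[1][1] / det, -M[0][1] / det],
           [-M[1][0] / det, M[0][0] / det]]
    dual = sum(omega[i] * inv[i][j] * omega[j] for i in range(2) for j in range(2))
    return dual < 1.0


M = [[1.0, 0.0], [0.0, 1.0]]  # Riemannian (symmetric) part
omega = (0.5, 0.0)            # drift term that breaks the symmetry

print(is_valid_randers_2d(M, omega))
print(randers_norm((1.0, 0.0), M, omega))   # cost of moving along omega
print(randers_norm((-1.0, 0.0), M, omega))  # cost of moving against omega
```

With ω = 0 the norm reduces to the symmetric Riemannian case; a nonzero ω makes a step in one direction cost more than the same step reversed, which is how direction-dependent costs enter.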
9:45-10:30
Jason M. Gross
Title: Employing Abstract and Dynamical Techniques to Enhance AI Cognition and Navigation
Abstract: Our research program aims to give AI systems, particularly those driving robotics, a far greater role for techniques that are systematic and subtle rather than reliant on sheer trial and error. In this vein, the system we have conceived addresses two problems in robotics and AI that are separate but bear, at least indirectly, on each other. The first is the Limited Transfer Ability Problem: machines cannot easily recognize objects by rearranging or reconfiguring elements based on abstract relations, a method that humans employ unconsciously to rapidly acquire knowledge about the world. The second is the limited capacity of machines to adapt spontaneously to changeable and unpredictable environments while minimizing risk and danger. To attack the first problem, we will devise a system embedding a scheme that processes both possible and actual geometric features and discrete relations with respect to their affordances. The scheme uses an algorithm defined in part by an intra-cognitive game between differing conceptual strategies (drawing on complexity theory) to minimize losses. This process of sifting through and rating different permutations yields probability matrices that feed into a Hopfield network, a type of neural network that optimizes how memories are stored and recalled. We hypothesize that this approach will surpass the state of the art in this niche within AI. Our experiment will test the robot’s capabilities by presenting it with novel scenarios and scenes, calling on it to recognize and consider unfamiliar trajectories. Our hypothesis for the second problem is that a system combining chaotic attractors with reinforcement learning in a particular way will advance the navigational capabilities of robots.
We propose a system in which the solutions to the equations of motion are influenced by estimates of optimal trajectories. These estimates are computed by an ensemble of chaotic and triadic attractors that track internal and environmental interdependencies and sensitivities, so that the system can autonomously calibrate its own operational settings and parameters in the face of changeable and unpredictable topographies and obstacles. These solutions will then be fed into a reward-based reinforcement learning algorithm grounded in the Bellman equation to further optimize the selection of trajectories. The experiment for this hypothesis will test the robot’s capacity to navigate mazes and other varied or rough terrains that present an array of challenges.
10:30 Coffee Break
11:00-11:45
Thomas Oliver
Title: Data Representation in Number Theory
Abstract: Before applying machine learning or AI to number theory (or to any subject) we must first decide how the underlying objects should be represented as input to an algorithm. But what is the most meaningful representation of an arithmetic object? Is an integer N best viewed simply as an integer, as a vector of residues modulo primes p, or through structures attached to N, such as the coefficients of an elliptic curve of conductor N? Is an elliptic curve a five-dimensional vector arising from a Weierstrass equation, an infinite-dimensional vector determined by its L-function, or a vector field derived from its L-function together with its twists? Different representations reveal different patterns, and the choice of representation may determine what AI systems are able to learn and discover about arithmetic phenomena.
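One of the representations mentioned above, an integer N as a vector of residues modulo primes p, can be sketched in a few lines of Python. The function names and the choice of the first five primes are illustrative assumptions, not taken from the talk.

```python
def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # Only primes up to sqrt(candidate) need to be tried.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def residue_vector(n, primes):
    """Represent n by its residues modulo each prime."""
    return [n % p for p in primes]


primes = first_primes(5)            # [2, 3, 5, 7, 11]
print(residue_vector(12, primes))   # [0, 0, 2, 5, 1]
print(residue_vector(2322, primes)) # same vector: 2322 = 12 + 2 * 3 * 5 * 7 * 11
```

By the Chinese remainder theorem, this vector determines N only modulo the product of the chosen primes, so the representation hides the size of N while exposing its divisibility pattern. Which of these a learning algorithm sees is exactly the kind of choice the abstract is about.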
11:45-12:30
Iyad Assad Nekka
Title: Deep Learning for Anomaly Detection in Dynamic Graphs: A Taxonomy and Comprehensive Survey
Abstract: Anomaly detection in dynamic graphs has emerged as one of the most active research areas at the intersection of graph theory, temporal modeling, and deep learning. This paper presents a taxonomy of deep learning methods for this task, tracing the scientific lineage from mathematical foundations through computer science and machine learning to the six major deep learning architectural families. We establish a formal capability analysis distinguishing architectures by their ability to detect temporal versus structural anomalies, and identify Graph Neural Networks as the leading family natively suited for both dimensions through their message-passing mechanism. We catalog all surveyed methods organized by architectural combination and attributed to precise subfamily levels, identify seven systematic literature gaps representing unexplored directions, and propose a standardized benchmarking agenda. The taxonomy is accompanied by a complete visual diagram designed to serve as a reference for the research community.