AI Summary • Published on Dec 30, 2025
Sequential structure is a fundamental aspect of natural cognition, encompassing language, movement, and decision-making, and it is equally central to artificial intelligence tasks. There is a clear need for frameworks that evaluate sequence learning and processing in a domain-agnostic manner while linking directly to formal theories of computation. Existing software tools for symbolic sequence processing, cognitive modeling, and computational benchmarking are often narrow in focus, poorly integrated across disciplines, or poorly maintained. In particular, current benchmarks frequently lack explicit control over the complexity of temporal dependencies, which limits their utility for systematically probing the capabilities and limitations of artificial learning systems.
The authors present SymSeqBench, an open-source, modular Python framework with two primary components. First, SymSeq defines, generates, and rigorously analyzes structured symbolic sequences and their associated tasks. It provides core data structures for grammar and sequence generation, a diverse library of artificial-language generators, and a comprehensive suite of analysis metrics at the token, string, string-set, and grammar levels. A crucial feature of SymSeq is its ability to generate structurally parameterized regular grammars with finely controllable complexity, quantified primarily by the information-theoretic measure of topological entropy (TE). Second, SeqBench is a versatile pipeline that transforms these abstract symbolic sequences into concrete, task-ready datasets. It offers flexible mappings between symbolic representations and various embedded formats (e.g., one-hot encodings, images, audio samples) and supports a wide array of custom transformations, including controlled temporal perturbations. The framework supports both language recognition tasks (determining whether an input string conforms to a grammar) and language transduction tasks (mapping input sequences to output sequences according to defined transformations). A `SeqWrapper` class provides a standardized interface for loading complete datasets and experimental setups via YAML configurations, and the system can also process user-provided data, inferring underlying grammars where applicable. The methodology is grounded in formal language theory, providing a principled way to conceptualize and standardize experiments across diverse domains.
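For intuition on how topological entropy quantifies grammar complexity: for a regular language, TE is the logarithm of the spectral radius (largest eigenvalue magnitude) of the accepting automaton's transition-count matrix, i.e., the asymptotic growth rate of the number of distinct strings. A minimal sketch using numpy; the automaton and function names are illustrative, not SymSeq's actual API:

```python
import numpy as np

def topological_entropy(transition_counts: np.ndarray) -> float:
    """Topological entropy of a regular grammar, computed as log2 of
    the spectral radius of the automaton's transition-count matrix,
    where entry (i, j) counts the symbols taking state i to state j.
    """
    eigenvalues = np.linalg.eigvals(transition_counts)
    spectral_radius = float(np.max(np.abs(eigenvalues)))
    return float(np.log2(spectral_radius))

# Illustrative 3-state automaton with parallel transitions.
A = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0],
              [1.0, 0.0, 0.0]])
print(f"TE = {topological_entropy(A):.3f} bits/symbol")  # ~0.667
```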
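Similarly, a toy illustration of the recognition-task pipeline: strings drawn from a simple regular grammar are labeled as grammatical or perturbed and mapped to one-hot arrays, the simplest of the embedded formats mentioned above. All names here are hypothetical stand-ins, not SeqBench's interface:

```python
import random
import re
import numpy as np

ALPHABET = ["A", "B"]
GRAMMAR = re.compile(r"(AB)+")  # toy regular language: AB, ABAB, ...

def one_hot(string: str) -> np.ndarray:
    """Map a symbolic string to a (length, |alphabet|) one-hot array."""
    index = {s: i for i, s in enumerate(ALPHABET)}
    out = np.zeros((len(string), len(ALPHABET)), dtype=np.float32)
    for t, symbol in enumerate(string):
        out[t, index[symbol]] = 1.0
    return out

def sample_recognition(length: int = 8) -> tuple[np.ndarray, int]:
    """Return (one-hot input, label): a grammatical string, or one
    with a single symbol flipped, which breaks the alternation."""
    s = "AB" * (length // 2)
    if random.random() < 0.5:
        i = random.randrange(len(s))
        s = s[:i] + ("B" if s[i] == "A" else "A") + s[i + 1:]
    label = int(bool(GRAMMAR.fullmatch(s)))
    return one_hot(s), label

dataset = [sample_recognition() for _ in range(100)]
```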
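Finally, the YAML-driven setup might resemble the following; the schema shown is a guess at the flavor of such a configuration, not the actual `SeqWrapper` format:

```python
import yaml  # PyYAML

# Hypothetical configuration in the spirit of the YAML-driven
# SeqWrapper interface; every key here is illustrative only.
CONFIG = """
task: recognition            # or: transduction
grammar:
  type: regular
  target_entropy: 1.5        # desired topological entropy
embedding: one_hot           # or: image, audio
perturbations:
  temporal_jitter: 0.1
splits:
  train: 0.8
  test: 0.2
"""

config = yaml.safe_load(CONFIG)
assert config["task"] in {"recognition", "transduction"}
print(config["grammar"]["target_entropy"])
```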
SymSeqBench provides a systematic and versatile benchmarking framework applicable to both biologically inspired models and neuromorphic architectures. In comparative evaluations of biologically plausible spiking network models (CS and AKF) on n-step memory and context-resolution tasks, the AKF model consistently achieved superior performance, particularly in handling ambiguous states and longer non-adjacent dependencies (NADs). Interestingly, increased filler variability in NAD tasks improved AKF performance, mirroring observations in human cognition. For neuromorphic systems, SymSeqBench enabled systematic assessment of LIF, adLIF, GRU, Mamba, and Transformer architectures on sequences with adjustable spatial and temporal complexity, built from several base datasets (SHD, SSC, GSC). Adaptive LIF networks proved advantageous for learning complex spatio-temporal dependencies. While artificial neural network (ANN) baselines generally achieved higher accuracies, especially Mamba on more complex sequences, a noticeable performance gap persisted at the highest complexities even for these strong baselines. Furthermore, the multi-scale analysis tools of SymSeqBench were applied to empirical animal behavioral datasets. These analyses revealed distinct patterns of complexity and contextual depth across species: zebrafish navigation and finch song likely involve supra-regular, hierarchical generative processes with higher topological entropy, whereas mouse, seal, and turtle behaviors were more consistent with regular or low-order Markovian grammars.
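To make the NAD setup concrete: a non-adjacent dependency item takes the form A X1…Xn B, where the initial symbol A predicts the final symbol B across a run of uninformative fillers, and "filler variability" refers to the size of the filler alphabet. A hypothetical generator sketch, not the framework's own:

```python
import random

def make_nad_string(pairs, fillers, n_fillers=3):
    """Build an A X1..Xn B string: the initial symbol A predicts the
    final symbol B across n uninformative filler symbols X."""
    a, b = random.choice(pairs)
    return [a, *random.choices(fillers, k=n_fillers), b]

# Filler variability is controlled by the filler alphabet size.
pairs = [("a1", "b1"), ("a2", "b2")]
low_variability = [f"x{i}" for i in range(2)]
high_variability = [f"x{i}" for i in range(24)]
print(make_nad_string(pairs, high_variability))
# e.g. ['a2', 'x7', 'x19', 'x3', 'b2']
```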
SymSeqBench establishes a unified, cognitively inspired, and theoretically principled framework for generating, transforming, and analyzing symbolic sequences with controllable complexity. This tool directly addresses the fragmentation in current research resources by offering standardized, cross-domain benchmarks for evaluating temporal processing in both biological and artificial neural networks. It enables researchers to systematically investigate task complexity, learning transfer, and generalization across diverse experimental paradigms, thereby forging crucial links between seemingly disparate cognitive phenomena. By providing the capability to generate user-constrained formal grammars of desired complexity and incorporating pre-built, cognitively inspired paradigms like Non-Adjacent Dependency (NAD) learning and Artificial Grammar Learning (AGL), SymSeqBench fills a critical void in synthetic data generation. It serves as a valuable diagnostic tool for pinpointing limitations in existing architectures and guiding the development of more effective temporal learning mechanisms, offering a unique approach to evaluating large language models with systematic complexity control and clear ties to human cognitive performance. Its modular design, which decouples task formulation from symbol embeddings, along with its comprehensive suite of analysis tools, positions SymSeqBench as a versatile resource across computational neuroethology, cognitive neuroscience, machine learning, and artificial intelligence. Future enhancements are planned to support supra-regular grammars, integrate additional established datasets, and improve web-based interfaces, ultimately fostering community contributions and accelerating methodological innovation across scientific disciplines.