about this log
Notes, drafts, and working claims from hands-on work in efficient sequence models, bioacoustics, olfaction, and research taste.
speech and music models
Hands-on speech and music modeling work: codec language models, hybrid recurrent-attention backbones, memory caching, and training reports.
-
Completed 12k Run: Early Speech Emergence in a 21.9M OLMo-Hybrid Speech LM
The full 12k-step A100 run of a 21.9M OLMo-hybrid speech codec LM on LJ Speech finished cleanly, reaching EMA val loss 3.8207 and perplexity 45.63, with clearly stronger samples...
-
MC-LA vs Hybrid MC-LA 1:3: Live Training (In Progress)
Two memory-cached linear attention variants training on 25k FMA tracks. Early results: the hybrid nearly matches full MC-LA on loss while running 1.9x faster. GRM gating is...
-
Memory Caching for SSMs: From Paper to Implementation
Applying Memory Caching (MC) from arXiv:2602.24281 to Mamba and linear attention for music generation — a novel extension the original paper never tested. Architecture decisions...
-
SSMs for Music Generation: Baseline Experiments
Baseline comparison of Transformer vs Hybrid Mamba-Attention (1:3) for autoregressive music generation over DAC tokens. The hybrid matches the transformer on all codebook metrics...
bioacoustics and olfaction
Representation-learning projects for biological signals: animal vocalization encoders and topology audits of learned odor embeddings.
-
Udani: Depth-Recurrent Hybrids for Bioacoustic Representation Learning
Udani is a bioacoustic representation-learning codebase built around masked acoustic prediction, fused Gated DeltaNet blocks, local attention, and a depth-recurrent 3:1 hybrid...
-
Topological Signals in Learned Odor Embeddings
A representation audit of OpenPOM and chemical baselines: robust H1 signal appears in learned odor embeddings, but the topology is not unique to POM.
essays and theory
Short conceptual pieces that inform the research taste behind this work: compression, curiosity, aesthetics, and model-building.