22/12/2025
Take a word like "champagne". Our brain does interesting things in order to store it in our mental lexicon — that is, our memory of words. "Champagne" is formed by several phonemes and, still, it comprises a single morpheme, a single unit. Hence, our linguistic apparatus can treat it as such when making sense of sentences, for example.
Using Dehaene and colleagues’ definition, let’s take a “chunk” to be a “group of contiguous items that frequently recurs as a whole and that are therefore usefully encoded as a single group by the nervous system” (2015; see Ref. 2). As they explain, a sequence of events may be grouped together as a “chunk” by the brain and stored for further processing.
One interesting question, then, would be what cues the brain uses to segment continuous sequences into smaller units. Thinking of spoken sentences, for example, we listen to a continuous acoustic signal and — effortlessly — segment it into words. In written texts, periods help us identify when the sentence ends. Likewise, blank spaces cue our eyes to word boundaries. In both cases, we find our units — sentences and words, respectively.
Now, let’s put language aside and consider more general experiences. How does the brain look for patterns in any auditory stimuli?
- - -
Electrophysiological data open a window onto several aspects of neural processing. Using such data, Najman and colleagues (2025; see Ref. 1) addressed the question of which class of probabilistic models the brain uses to encode sequences of events.
Participants listened to a samba-like rhythm made of hand claps, with a small portion of the sounds omitted. The sequences of auditory stimuli were random. The innovation lies in how the authors tackled the question, given that samba has a pattern of strong beats, weak beats and silent units.
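To make the idea of a random, samba-like sequence more concrete, here is a minimal Python sketch. It is an illustration only, not the stimulus used in the paper: the base pattern, the omission probability, the rule that only weak beats can be dropped, and the function name `samba_like_sequence` are all assumptions made for this example.

```python
import random

# Symbols for the three kinds of units mentioned in the text.
STRONG, WEAK, SILENT = 2, 1, 0


def samba_like_sequence(n_cycles, omit_prob=0.2,
                        pattern=(STRONG, WEAK, SILENT, WEAK), seed=None):
    """Repeat a base rhythm, randomly replacing weak beats with silence.

    Assumptions (hypothetical, not taken from the paper):
    - `pattern` is the repeated base rhythm;
    - only weak beats may be omitted, each with probability `omit_prob`.
    """
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_cycles):
        for symbol in pattern:
            if symbol == WEAK and rng.random() < omit_prob:
                sequence.append(SILENT)  # the clap is dropped
            else:
                sequence.append(symbol)
    return sequence


if __name__ == "__main__":
    print(samba_like_sequence(3, seed=0))
```

In this sketch the strong beat always survives, so it stays a reliable landmark even when the rest of the rhythm is unpredictable.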
A novel procedure was introduced to investigate whether and how the brain forms clusters of data based on probabilistic features of the stimuli. The paper unveils “hidden features” that could not be retrieved by a context tree — a subject of a previous post (see Ref. 3). The results show that the brain uses “the occurrence of the strong beat to identify the structure embedded in the sequence of stimuli”.
The key statistical role of strong beats, specifically, is highlighted.
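As a toy illustration of how a strong beat could serve as a boundary marker, the Python sketch below cuts a sequence into chunks, starting a new chunk at every strong beat. This is not the clustering procedure from the paper; the symbol coding (2 = strong, 1 = weak, 0 = silent), the example sequence and the function name `chunk_at_strong_beats` are assumptions made for this example.

```python
STRONG = 2  # assumed coding: 2 = strong beat, 1 = weak beat, 0 = silent unit


def chunk_at_strong_beats(sequence, strong=STRONG):
    """Split a sequence into chunks, opening a new chunk at each strong beat."""
    chunks = []
    current = []
    for symbol in sequence:
        # A strong beat closes the chunk in progress (if any) and opens a new one.
        if symbol == strong and current:
            chunks.append(current)
            current = []
        current.append(symbol)
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    example = [2, 1, 0, 1, 2, 0, 0, 1, 2, 1, 0, 0]  # made-up sequence
    print(chunk_at_strong_beats(example))
    # prints [[2, 1, 0, 1], [2, 0, 0, 1], [2, 1, 0, 0]]
```

Each chunk begins with the same landmark, even though what follows it varies from chunk to chunk.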
It turns out that sequences of auditory stimuli, just like words in sentences, may also contain their own cues, markers for boundaries, and these can be seen in EEG data recorded from the prefrontal cortex. Moreover, the researchers’ new clustering procedure allowed them to group sets of EEG data by their intrinsic law. Once again, interesting results come along with methodological innovation that can benefit the whole field.
- - -
How each chunk is treated as a single unit depends on the phenomenon being observed, that is, on the neural or mental process in question. As usual, there is much to be elucidated. We know that, in some cases, units can be merged at a higher level to form another chunk (another unit), meaning the brain can process information in a hierarchical fashion, as it does with language when building sentences, to recall our initial example.
- - -
Now, broadly speaking, several probabilistic models can be recruited to parse random sequences like these. Like mathematical lenses, they help the brain calibrate its “vision” accordingly. Hidden patterns come to light; claps get more predictable. Auditory sequences become samba.
That is it for this week. Don’t forget to check the paper — links below!
- - -
Chunking | The “Black Box of Science” series
References:
1. Najman et al. (2025) - The ‘design features’ of language revisited. https://doi.org/10.1371/journal.pcbi.1012765
2. Dehaene et al. (2015) - The Neural Representation of Sequences: From Transition Probabilities to Algebraic Patterns and Linguistic Trees. https://doi.org/10.1016/j.neuron.2015.09.019
3. NeuroMat (2025) - "Waltz in the Dark | The “Black-Box of Science” Series", Facebook post. https://www.facebook.com/share/p/1GPnV1JvoQ/