(Dina Lipkind, Anja T. Zai, Alexander Hanuschkin, Gary F. Marcus, Ofer Tchernichovski & Richard H. Hahnloser 1 Nov 2017)
Songbirds, being skilled vocal learners18,19,20,21, provide an opportunity for studying how errors are assigned and minimized during the learning of complex motor sequences. A young zebra finch (Taeniopygia guttata) imitating an adult tutor has to match a series of spectrally distinct sounds (syllables) performed in a precise order (Fig. 1b). Zebra finches are capable of adjusting their developing song towards its target in a variety of ways, including morphing the spectral (phonological) structure of song syllables22,23,24,25, generating and adding novel syllables to their song23, 25, 26, and rearranging the positions of existing syllables26, 27. How then do they cope with the complexity of selecting the appropriate combination of operations that would reduce the mismatch between their own song and the target?
A possible way to reduce computational complexity could be to optimize one aspect of the task, while ignoring the costs of the other. At one extreme, the task could be reduced to assigning each syllable in the bird’s song to the temporally corresponding syllable in the target song (Fig. 1c, left). Such a strategy would minimize sequence rearrangements, at the cost of possibly large phonological adjustments. Although this hypothesis has not been directly tested, several previous findings suggest that songbirds may not be using global alignment between song and target as a learning strategy. These include the observation that individual syllables are recognizable in developing zebra finch song before the correct sequence is apparent28; the existence of an early developmental phase in which repetitions of a single “proto-syllable” differentiate towards multiple targets22, 24, 25, 29, 30; the fact that many songbird species perform variable syllable sequences as adults (e.g., nightingales, starlings and Bengalese finches); and the ability of zebra finches to match a target exclusively through syllable rearrangements, without changing phonology26. An alternative strategy, therefore, could be to assign song syllables to target syllables in a manner that minimizes phonological distances, while ignoring combinatorial distances (Fig. 1c, middle). Such phonological greediness would increase the number of ensuing sequence changes and thus the overall sequencing cost26. An intermediate strategy could be to seek a trade-off between minimizing structural and temporal errors, for example by independently matching parts of the song sequence (such as phonology in bigrams or trigrams27) to parts of the target sequence
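
The contrast between the two extreme strategies can be illustrated with a toy calculation. The sketch below is not the analysis used in this study; it assumes, purely for illustration, that each syllable is summarized by a single hypothetical acoustic feature, that phonological cost is the summed absolute feature difference, and that sequencing cost is the number of syllables assigned to a target position other than their current one. The minimum-phonology assignment is found by brute force over permutations, which is only practical for short songs.

```python
from itertools import permutations

# Toy model (all names and values are illustrative assumptions):
# each syllable is one number, e.g. a pitch-like acoustic feature.

def phonological_cost(song, target, assignment):
    """Summed feature distance when song[i] is assigned to target[assignment[i]]."""
    return sum(abs(song[i] - target[j]) for i, j in enumerate(assignment))

def sequencing_cost(assignment):
    """Number of syllables whose assigned target position differs from their own."""
    return sum(1 for i, j in enumerate(assignment) if i != j)

def positional_assignment(song):
    """Global alignment: each syllable is assigned to the target at the same position."""
    return tuple(range(len(song)))

def min_phonology_assignment(song, target):
    """Assignment minimizing phonological cost alone, ignoring sequence (brute force)."""
    return min(permutations(range(len(target))),
               key=lambda a: phonological_cost(song, target, a))

# The song contains the right sounds, but in the wrong order.
song = [1.0, 5.0, 3.0]
target = [3.0, 1.0, 5.0]

aligned = positional_assignment(song)
greedy = min_phonology_assignment(song, target)

for name, a in (("positional", aligned), ("min-phonology", greedy)):
    print(name, a,
          "phonological cost:", phonological_cost(song, target, a),
          "sequencing cost:", sequencing_cost(a))
```

In this toy case the positional assignment requires no rearrangement but large changes to every syllable, whereas the minimum-phonology assignment requires no phonological change but relocates every syllable, which is the trade-off described above.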