Researchers have simulated sensory sound coding for artificial intelligence by modelling human hearing.
Scientists at Peter the Great St. Petersburg Polytechnic University (SPbPU), Russia, have come closer to creating a digital system within artificial intelligence that processes speech and models human hearing in real-life sound environments, for example when several people talk simultaneously during a conversation. The researchers have simulated sensory sound coding by modelling the mammalian auditory periphery.
Artificial intelligence equipped with human hearing
According to the experts, the human nervous system processes information in the form of neural responses. The peripheral nervous system, which includes analysers (particularly the visual and auditory ones), provides perception of the external environment. These analysers are responsible for the initial transformation of external stimuli into a stream of neural activity, and peripheral nerves ensure that this stream reaches the highest levels of the central nervous system.
This allows an individual to reliably recognise the voice of a speaker in an extremely noisy environment. At the same time, the researchers note, existing speech processing systems are not effective enough and require powerful computational resources.
To solve this problem, the research was conducted by experts of the Measuring Information Technologies Department at SPbPU. The study is funded by the Russian Foundation for Basic Research.
During the study, the researchers developed methods for acoustic signal recognition based on peripheral coding. The scientists partially reproduce the processes performed by the nervous system while processing information and integrate them into a decision-making module, which determines the type of the incoming signal.
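The article does not give the implementation details of the auditory periphery model. As a rough illustration of the general idea of peripheral coding, the toy sketch below converts a sound into a bank of frequency-channel "firing rate" envelopes, loosely mimicking cochlear frequency analysis and hair-cell transduction. The filterbank design, channel count, and compressive nonlinearity are assumptions for illustration, not the SPbPU model:

```python
import numpy as np

def peripheral_code(signal, fs, n_channels=8):
    """Toy peripheral encoder (illustrative only, not the SPbPU model):
    log-spaced frequency channels, envelope extraction per channel,
    and cube-root compression of the resulting activity."""
    # Centre frequencies spaced logarithmically, as on the cochlea
    cfs = np.geomspace(100, fs / 2 * 0.8, n_channels)
    t = np.arange(len(signal)) / fs
    channels = []
    for cf in cfs:
        # Shift the band around cf down to DC, then low-pass smooth:
        # the magnitude of the result is the channel's envelope
        shifted = signal * np.exp(-2j * np.pi * cf * t)
        baseband = np.convolve(shifted, np.ones(64) / 64, mode="same")
        env = np.abs(baseband)
        # Compressive nonlinearity, akin to the ear's dynamic-range compression
        channels.append(np.cbrt(env))
    return np.array(channels)  # shape: (n_channels, n_samples)

fs = 8000
t = np.arange(0, 0.05, 1 / fs)
tone = np.sin(2 * np.pi * 440 * t)  # a pure tone stands in for a phoneme
code = peripheral_code(tone, fs)
print(code.shape)
```

For a 440 Hz tone, the channel whose centre frequency lies closest to 440 Hz carries the strongest activity, which is the kind of frequency-selective response pattern a downstream classifier can exploit.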
Anton Yakovenko, project lead, explains: “The main goal is to give the machine human-like hearing, to achieve the corresponding level of machine perception of acoustic signals in the real-life environment.”
According to Yakovenko, responses to vowel phonemes produced by the auditory nerve model created by the scientists served as the source dataset. The data were processed by a special algorithm, which performed structural analysis to identify the neural activity patterns the model used to recognise each phoneme.
The proposed approach combines self-organising neural networks and graph theory.
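The article does not describe the networks in detail. As a minimal sketch of the self-organising-map half of such an approach, the code below trains a tiny SOM on two synthetic clusters of feature vectors standing in for the response patterns of two phoneme classes; everything here (grid size, learning schedule, synthetic data) is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(4, 4), epochs=30, lr0=0.5, sigma0=1.5):
    """Tiny self-organising map: units on a 2-D grid compete for each
    input; the winner and its grid neighbours move toward the input."""
    h, w = grid
    weights = rng.normal(size=(h * w, data.shape[1]))
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink over time
        lr = lr0 * (1 - epoch / epochs)
        sigma = sigma0 * (1 - epoch / epochs) + 0.5
        for x in rng.permutation(data):
            # Best-matching unit: the unit whose weights are closest to x
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            nb = np.exp(-d2 / (2 * sigma ** 2))  # Gaussian neighbourhood
            weights += lr * nb[:, None] * (x - weights)
    return weights

# Two synthetic "response pattern" clusters standing in for phoneme classes
a = rng.normal([0, 0, 0], 0.1, size=(50, 3))
b = rng.normal([3, 3, 3], 0.1, size=(50, 3))
w = train_som(np.vstack([a, b]))

# After training, each class activates its own region of the map
bmu_a = np.argmin(((w - a.mean(0)) ** 2).sum(axis=1))
bmu_b = np.argmin(((w - b.mean(0)) ** 2).sum(axis=1))
print(bmu_a, bmu_b)
```

A graph-theoretic step, as in the reported approach, could then operate on the trained map, for instance by linking units whose weight vectors lie within a threshold distance and treating the connected components of that graph as phoneme categories; the exact analysis the SPbPU team used is not specified in the article.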
According to the scientists, analysis of the responses of the auditory nerve fibres made it possible to identify vowel phonemes correctly under significant noise exposure and outperformed the most common methods for parameterisation of acoustic signals.
The SPbPU researchers believe that the methods developed should help create a new generation of neurocomputer interfaces, as well as provide better human-machine interaction.