By calculating the indices over the full frequency range and mapping three of them to the RGB values of a pixel, diurnal patterns over weeks, months or even years can be viewed in an “Extended Acoustic Summary Image” (Fig 2).
Fig 2. Example of an Extended Acoustic Summary Image from Towsey et al. 2014, covering the months March to October 2013. Each pixel's RGB value represents the values of three acoustic indices for a 1 minute recording.
The false-colour spectrogram is only as informative as the indices it is generated from, and this will vary from task to task. The approach is promising for time-series data from a single point, but many other possibilities exist, such as interactive resynthesis of audio features.
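The index-to-pixel mapping can be sketched in a few lines of Python. This is a minimal illustration, not the method of Towsey et al.: it assumes the three index series have already been computed (one value per one-minute recording), and it uses simple min-max scaling over the whole series, which is an assumption made here for brevity. The function names `normalise`, `summary_image` and `to_ppm` are hypothetical.

```python
def normalise(values):
    """Rescale a non-empty list of index values linearly to integers in 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no contrast; map it to 0.
        return [0] * len(values)
    return [round(255 * (v - lo) / (hi - lo)) for v in values]


def summary_image(red_index, green_index, blue_index, minutes_per_row=1440):
    """Build rows of (R, G, B) pixels, one pixel per one-minute recording.

    With minutes_per_row=1440 each row is one day, so stacked rows
    line up the same time of day vertically.
    """
    if not (len(red_index) == len(green_index) == len(blue_index)):
        raise ValueError("the three index series must have equal length")
    pixels = list(zip(normalise(red_index),
                      normalise(green_index),
                      normalise(blue_index)))
    return [pixels[i:i + minutes_per_row]
            for i in range(0, len(pixels), minutes_per_row)]


def to_ppm(rows):
    """Serialise the image as plain-text PPM (P3); a short last row is padded with black."""
    width = max(len(row) for row in rows)
    lines = ["P3", f"{width} {len(rows)}", "255"]
    for row in rows:
        padded = row + [(0, 0, 0)] * (width - len(row))
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in padded))
    return "\n".join(lines) + "\n"
```

For a single 3 hour recording, `minutes_per_row=180` would give one row per recording. Which index goes to which colour channel is a free choice and strongly affects what the image makes visible.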
(How) can feature visualisation – or sonification/resynthesis – provide a meaningful perceptual summary of large audio archives? Can ecologically relevant features be distinguished, such as animal sounds (biophony), weather (geophony) and machinery (technophony)? Can silent and distorted files be made apparent? Might interactive perceptualisation afford deeper insights?
Two 3 hour dawn chorus recordings are available, made in different habitats at the same time. Each is segmented into 180 one-minute mono WAV files. The recordings start 90 minutes before sunrise, capturing the onset of the dawn chorus. Besides a bird chorus of increasing density, there are some sheep, various engines (planes, cars) and a thunderstorm followed by rain.
UK dawn chorus recordings on Figshare
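As a first pass at making silent and distorted files apparent, each one-minute file can be reduced to two numbers: its RMS level and the fraction of samples at full scale. The sketch below uses only the Python standard library. It assumes 16-bit mono WAV files (the bit depth of the recordings is not stated here), and the thresholds and the names `file_stats` and `flag_file` are hypothetical starting values, not tuned settings.

```python
import array
import math
import sys
import wave


def file_stats(path):
    """Return (rms, clipped_fraction) for a 16-bit mono WAV file, both in 0..1."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch assumes 16-bit mono WAV")
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0, 0.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768
    clipped = sum(1 for s in samples if s <= -32768 or s >= 32767) / len(samples)
    return rms, clipped


def flag_file(path, silence_rms=0.001, clip_fraction=0.01):
    """Classify a file as 'silent', 'clipped' or 'ok' using two assumed thresholds."""
    rms, clipped = file_stats(path)
    if rms < silence_rms:
        return "silent"
    if clipped > clip_fraction:
        return "clipped"
    return "ok"
```

Running `flag_file` over the 180 files of one recording gives a sequence of labels that could be drawn as a strip alongside any other visualisation. Clipping is only one kind of distortion; wind noise or handling noise would need different features.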
One of the many excruciating features of Trump’s presidential election campaign was his constant interruption of Hillary Clinton during the presidential debate. Recent research published in the journal Social Sciences (Blair-Loy et al. 2017) reveals similar gender differences in interruptions of academic job talks. Automatic analysis of speaker characteristics, such as gender, would be a powerful tool for studying conversational dynamics in oral history, gender studies and numerous other humanities disciplines.
Could machine listening, in combination with supervised learning or even unsupervised clustering, be used to discriminate between voices in an interview in order to identify conversational dynamics?
Participants are invited to consider which audio features and/or machine learning methods might be best applied.
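One possible starting point, offered only as a sketch: estimate a pitch value for each short frame of audio and let a two-cluster k-means split the frames into a lower-pitched and a higher-pitched group. Pitch is chosen here purely for illustration; it is a crude proxy, since speakers' pitch ranges overlap, and richer features such as MFCCs may well do better. The function names, the 75 to 400 Hz search range and the frame length are all assumptions, and the autocorrelation peak picking is deliberately simple.

```python
def frame_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one frame from its autocorrelation peak.

    Returns None when no positive peak is found (for example, a silent frame).
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def two_means(values, iterations=20):
    """1-D k-means with k=2; returns (labels, centres), label 0 being the lower centre."""
    centres = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
                  for v in values]
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centres[k] = sum(members) / len(members)
    return labels, centres


def cluster_speakers(samples, sample_rate, frame_len=400):
    """Label each non-overlapping frame 0 (lower pitch), 1 (higher pitch) or None (no pitch)."""
    pitches = [frame_pitch(samples[i:i + frame_len], sample_rate)
               for i in range(0, len(samples) - frame_len + 1, frame_len)]
    voiced = [p for p in pitches if p is not None]
    if not voiced:
        return [None] * len(pitches)
    labels = iter(two_means(voiced)[0])
    return [next(labels) if p is not None else None for p in pitches]
```

Runs of identical labels approximate speaker turns, and places where the label flips after only a few frames are candidate interjections. The clusters are unlabelled: nothing in this sketch says which cluster belongs to which speaker, and overlapping speech is not handled.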
Trump vs Clinton Presidential Election Campaign 2016.
For exploration, an audio file of Trump’s interruptions of Clinton is available here.
Python implementation of Acoustic Indices for soundscape analysis https://github.com/sandoval31/Acoustic_Indices
App on Appstore
Video on Vimeo
Video on Vimeo
Described in Mital, P. K., & Grierson, M. (2010) Mining Unlabeled Electronic Music Databases through 3D Interactive Visualization of Latent Component Relationships.
Towsey, Michael, Liang Zhang, Mark Cottman-Fields, Jason Wimmer, Jinglan Zhang, and Paul Roe. “Visualization of long-duration acoustic recordings of the environment.” Procedia Computer Science 29 (2014): 703-712.
Blair-Loy, Mary, Laura E. Rogers, Daniela Glaser, Y. L. Wong, Danielle Abraham, and Pamela C. Cosman. “Gender in Engineering Departments: Are There Gender Differences in Interruptions of Academic Job Talks?” Social Sciences 6, no. 1 (2017): 29.