Three screenshots of the Merlin app, recording and identifying bird song.

Since Dawn Chorus Day I’ve been noticing bird song more than ever. This has led to wondering which birds are singing. I recognise a very small number. I’ve tried a couple of apps and my favourite so far is Merlin.

Merlin identifies bird sounds using breakthroughs in machine learning technology to recognize species based on spectrograms—visual representations of sounds.
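
Not Merlin’s actual pipeline, of course, but to get a feel for what a spectrogram is, here is a minimal Python sketch (assuming scipy and matplotlib are installed, and a recording saved as birdsong.wav, which is a hypothetical file name) that turns audio into the kind of picture a classifier works from:

    # A minimal spectrogram sketch - just the general idea, not Merlin's code.
    # Assumes a recording saved as "birdsong.wav" (hypothetical file name).
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import signal
    from scipy.io import wavfile

    rate, samples = wavfile.read("birdsong.wav")
    if samples.ndim > 1:
        samples = samples[:, 0]  # keep one channel if the file is stereo

    # Short-time Fourier transform: slice the audio into short windows and
    # measure the energy in each frequency band, window by window.
    freqs, times, power = signal.spectrogram(samples, fs=rate, nperseg=1024)

    plt.pcolormesh(times, freqs, 10 * np.log10(power + 1e-10), shading="auto")
    plt.xlabel("Time (s)")
    plt.ylabel("Frequency (Hz)")
    plt.show()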

Sound ID.

Rather delightfully you see the names and thumbnails of the birds the app recognises. These are highlighted each time the bird is heard. Even better, you get the same effect playing back the audio. Hopefully this will lead to me being able to recognise a few more bird songs without the app.

How accurate the app is I do not know, but I have seen most of the birds it has identified nearby.

AudioMoth is a low-cost, full-spectrum acoustic logger, based on the Gecko processor range from Silicon Labs. Just like its namesake the moth, AudioMoth can listen at audible frequencies, well into ultrasonic frequencies. It is capable of recording uncompressed audio to microSD card at rates from 8,000 to 384,000 samples per second and can be converted into a full-spectrum USB microphone.
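
Those sample rates line up with the Nyquist rule of thumb: a recorder can only capture frequencies up to half its sampling rate, so 384,000 samples per second reaches ultrasound up to 192 kHz. A tiny illustration (the in-between rates and the comparison figures are my own rough additions, not AudioMoth specs):

    # Nyquist rule of thumb: highest capturable frequency = sample_rate / 2.
    # 8,000 and 384,000 are the AudioMoth's quoted endpoints; the middle
    # rates are just common values added for comparison.
    for sample_rate in (8_000, 48_000, 192_000, 384_000):
        print(f"{sample_rate:>7} samples/s -> up to {sample_rate / 2000:.0f} kHz")

    # Human hearing tops out near 20 kHz, and many bat calls sit around
    # 20-120 kHz, so the top rate is comfortably "full-spectrum".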

Looks a bit tricky to actually buy at the moment, but an interesting device.

BBC Radio 4 – Word of Mouth, Chatbots 1

Like lots of other folk I’ve been reading plenty about Large Language Models, AI & Chatbots and playing with some of the toys.

I really liked Professor Bender’s approach and method. I also found this a very easy listen; my mind has tended to wander off when reading blog posts about AI. She is very clear on the “not intelligent” point and the risks associated with chatbots trained on large piles of language.

And specifically the things that they’re predicting is what would be a plausible next word given all the preceding words here and then again and then again and again.

And so that’s linguistically interesting that once you get to billions of words of text, there’s enough information in there just in the distribution of words to stick with things that are both grammatical and seemingly coherent.

So that’s a cool observation and it’s dangerous because we tend to react to grammatical, fluent, coherent, seeming text as authoritative and reliable and valuable.

So instead of talking about automatic speech recognition, I prefer to talk about automatic transcription because that describes what we’re using it for and doesn’t attribute any cognition to the system that is doing the task for us.2


  1. I subscribe to the RSS feed of this BBC radio programme as a podcast; a pity you can’t find the feed on the webpage.
  2. Ironically I used Aiko to get the text of the podcast for the quotes: “transcription is powered by OpenAI’s Whisper model running locally on your device”. There is a small sketch of using Whisper directly below.
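
Aiko doesn’t publish its internals beyond that note, but the same model is available as the open-source whisper Python package. A minimal local transcription sketch, with a hypothetical episode file name:

    # Minimal local transcription with the open-source "whisper" package -
    # the same model family Aiko credits, though not Aiko's actual code.
    import whisper

    model = whisper.load_model("base")              # small model; runs on CPU
    result = model.transcribe("word_of_mouth.mp3")  # hypothetical episode file
    print(result["text"])
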
Listened Feeding the People in Wartime Britain: From National Kitchens to British Restaurants by Jeremy Cherfas from eatthispodcast.com
At the same time, during both World War One and World War Two, there were concerted efforts to feed people. It started with centrally cooked meals that people took home to eat, but soon blossomed into a far-reaching network of government-run restaurants.

Another really interesting episode from Jeremy Cherfas. I’d never heard of British Restaurants, which in the midst of WW2 were more common than McDonald’s is now. Where they came from, what happened in different places, and the possibility of a return were all covered.

Ian McMillan celebrates spectral spaces, the pulse of the body, and the power of repetition, in a Verb which showcases emerging talent – new sound designers from the Sound First scheme ….

Ian is joined by the songwriter, producer and sound designer Benbrick, the poet, playwright and performer Hannah Silva, and Sound First participant Noah Lawson, to explore what sound design can bring to poems, and what sounds are buried in poems themselves.

Really enjoyed listening to this; perhaps I am a wee bit more tuned into sound after Sunday.