Experiments in sound
Wikipedia says the fundamental frequency of speech ranges from 85 to 155 Hz for men and from 165 to 255 Hz for women. That set me thinking.
- What are the limits to our hearing?
- How do sounds differ?
- How can we synthesise speech?
What are the limits to our hearing?
Kids can hear frequencies from 20 Hz to 20 kHz, while adults hear only up to 12-14 kHz (Frequency Range of Human Hearing).
To check the lower frequency limit, I created an MP3 with one tone per frequency from 1 Hz to 100 Hz, one second each. Just play the sound and see when you start hearing something. (Of course, whether you can hear it also depends on your speaker’s volume, the ambient noise, etc.) I could hear nothing for the first 40 seconds, so I can’t hear frequencies lower than 40 Hz.
PS: Don’t be worried if you don’t hear anything for a while. You’re not supposed to! Keep the volume at full level, though.
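I don’t have the script that made the original MP3 handy, but here’s a minimal sketch of how such a sweep can be generated in Python with numpy and scipy. It writes one sine tone per frequency to a WAV file; the file name and 44.1 kHz sample rate are my choices, and converting WAV to MP3 is a separate step (e.g. with ffmpeg).

```python
import numpy as np
from scipy.io import wavfile

RATE = 44100  # samples per second

def tone_sequence(freqs_hz, seconds_each=1.0, rate=RATE):
    """Concatenate one sine tone per frequency, each seconds_each long."""
    t = np.arange(int(rate * seconds_each)) / rate
    chunks = []
    for f in freqs_hz:
        tone = np.sin(2 * np.pi * f * t)
        # Fade in/out over 10 ms to avoid clicks at the joins
        fade = int(0.01 * rate)
        tone[:fade] *= np.linspace(0, 1, fade)
        tone[-fade:] *= np.linspace(1, 0, fade)
        chunks.append(tone)
    signal = np.concatenate(chunks)
    return (signal * 32767 * 0.8).astype(np.int16)  # 16-bit PCM, with headroom

# 1 Hz to 100 Hz, one second per frequency
wavfile.write('low-frequencies.wav', RATE, tone_sequence(range(1, 101)))
```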
To check the upper frequency limit, I created this MP3 with one tone per frequency from 1 kHz to 20 kHz, in 1 kHz steps of one second each. Just play the sound and see when you stop hearing anything. I couldn’t hear anything beyond 14 seconds, so I can’t hear frequencies beyond 14 kHz.
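The same hypothetical helper covers the high-frequency sweep, stepping up 1 kHz at a time (this reuses tone_sequence from the sketch above):

```python
# 1 kHz to 20 kHz, one second per frequency
wavfile.write('high-frequencies.wav', RATE,
              tone_sequence(range(1000, 20001, 1000)))
```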
How do sounds differ?
I took this audio file of someone reciting vowels and plotted a spectrogram (below). A spectrogram plots time on the X-axis and frequency on the Y-axis, with the intensity at each frequency shown by colour.
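Here’s one way to plot such a spectrogram in Python. This is a sketch using matplotlib’s specgram; it assumes the recording has been converted to a WAV file, and vowels.wav is a stand-in name for it.

```python
import matplotlib.pyplot as plt
from scipy.io import wavfile

rate, samples = wavfile.read('vowels.wav')   # placeholder file name
if samples.ndim > 1:                          # mix stereo down to mono
    samples = samples.mean(axis=1)

# NFFT controls frequency resolution: 1024 samples is ~23 ms at 44.1 kHz
plt.specgram(samples, NFFT=1024, Fs=rate, noverlap=512, cmap='viridis')
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.title('Spectrogram of the vowel recording')
plt.colorbar(label='Intensity (dB)')
plt.show()
```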
Some observations:
- All the vowels have evenly spaced bars. (In this case they’re all multiples of something around 120 Hz: the harmonics of the speaker’s fundamental. There’s a toy sketch after this list that adds such harmonics back up.)
- ‘u’ has the lowest frequency mix. ‘a’ spans from low to high. ‘i’ has a bit of low and a bit of high, but nothing in the middle. ‘ai’ and ‘au’ look like ‘a’ followed by ‘i’ and ‘u’ respectively.
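If the bars really are just multiples of one fundamental, we should be able to fake a rough vowel-like sound by adding up sine waves at multiples of ~120 Hz with different strengths. Here’s a toy sketch; the harmonic weights are invented for illustration (loosely like the low-heavy ‘u’ pattern), not measured from the recording.

```python
import numpy as np
from scipy.io import wavfile

RATE = 44100
F0 = 120                          # rough fundamental seen in the spectrogram
t = np.arange(RATE * 2) / RATE    # two seconds of audio

# Made-up amplitude profile: strong low harmonics, weak high ones
weights = [1.0, 0.8, 0.5, 0.2, 0.1, 0.05]
signal = sum(w * np.sin(2 * np.pi * F0 * (i + 1) * t)
             for i, w in enumerate(weights))
signal /= np.abs(signal).max()    # normalise before converting to 16-bit

wavfile.write('fake-vowel.wav', RATE, (signal * 32767 * 0.8).astype(np.int16))
```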
How can we synthesise speech?
I don’t know yet. There are lots of speech synthesisers, but they sound robotic. I’m trying to see whether knowing what sounds look like improves things. I’ll let you know if I do well.