Photophone (Development Page)
In his 2018 graduation year, E. McAuliffe developed a photophone for his Medium ISP (Video, Blog).
Filtering appears to be the key to optimum clarity. The human vocal band lies roughly between 300 Hz and 3 kHz, making a band-pass filter the recommended solution. Here are some resources:
In an Arduino Forum post, Grumpy Mike recommends this setup for an audio amplifier using the TDA7052A IC.
Volume can be controlled as shown in the TDA7052A datasheet.
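Returning to the band-pass filter for the vocal band: the sketch below shows one way to estimate component values and check the expected response. It assumes a simple first-order RC high-pass followed by a buffered first-order RC low-pass, with cutoffs at 300 Hz and 3 kHz and an arbitrarily chosen 10 kΩ resistor. None of these choices come from the original project; the function names are hypothetical, and a real build may need a steeper (higher-order) filter.

```python
import math

def capacitor_for_cutoff(resistance_ohms, cutoff_hz):
    """Capacitance in farads for a first-order RC cutoff, f_c = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * cutoff_hz)

def bandpass_gain(freq_hz, low_cut_hz=300.0, high_cut_hz=3000.0):
    """Magnitude response of an ideal first-order high-pass followed by an
    ideal first-order low-pass (stages assumed buffered, so they do not load
    each other)."""
    ratio_hp = freq_hz / low_cut_hz
    hp = ratio_hp / math.sqrt(1.0 + ratio_hp ** 2)
    lp = 1.0 / math.sqrt(1.0 + (freq_hz / high_cut_hz) ** 2)
    return hp * lp

R = 10_000.0  # assumed resistor value in ohms (illustrative only)
c_high_pass = capacitor_for_cutoff(R, 300.0)   # sets the 300 Hz lower edge
c_low_pass = capacitor_for_cutoff(R, 3000.0)   # sets the 3 kHz upper edge

print(f"High-pass capacitor: {c_high_pass * 1e9:.1f} nF")
print(f"Low-pass capacitor:  {c_low_pass * 1e9:.2f} nF")

# Compare the gain at mains hum, mid-band, and well above the voice band.
for f in (50.0, 300.0, math.sqrt(300.0 * 3000.0), 3000.0, 20_000.0):
    print(f"{f:8.1f} Hz -> gain {bandpass_gain(f):.3f}")
```

With these assumed values the capacitors come out near 53 nF and 5.3 nF, so the nearest standard parts would shift the cutoffs slightly. A first-order design only attenuates gently outside the band, which is why this is a starting point for experimentation rather than a finished filter.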