Pronunciation Station


An ultrasound overlay video for linguistics is a video presentation that superimposes an ultrasound image of the moving tongue over simultaneously recorded external video of the speaker, making the tongue's movement visible inside the speaker’s mouth. Ultrasound overlay visualizations have proven effective in teaching both linguistics and language pronunciation; however, the overlay effect can currently be achieved only by mixing and editing the videos in post-production.


The objective of this ECE Capstone project is to create a Pronunciation Station: a system that mixes and displays ultrasound overlays automatically in real time, providing live biofeedback for learners. The completed system will include the following features:

  • An ultrasound machine captures the speaker’s tongue movement while an external video camera simultaneously films the same speaker in side view.
  • The video feeds from the ultrasound machine and the external camera are mixed and displayed in real time, with the tongue superimposed over the external image and the two videos scaled and rotated as needed to match. The tongue image is also coloured pink.
  • The mixed live video can be recorded for later playback.
  • A pre-recorded video can also be selected and displayed in a window on screen as the model for the learner.
  • Scalable guiding lines may be superimposed over the video images, to provide references for the learners.
  • The system is portable and can be set up and run with minimal training for the user.
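
The mixing step in the feature list (tinting the tongue image pink and superimposing it over the external video) can be illustrated with a small sketch. This is not the team's implementation; it is a minimal per-pixel example using only the Python standard library. It assumes frames are stored as nested lists, that the ultrasound frame is grayscale (0-255), and that the two frames have already been scaled and rotated to match. The function names, the pink colour value, and the `max_alpha` opacity are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: names, colour value, and opacity are assumptions.

PINK = (255, 105, 180)  # assumed RGB value for the tongue tint


def tint_pink(gray, pink=PINK):
    """Map a grayscale intensity (0-255) to a pink-tinted RGB pixel."""
    return tuple(channel * gray // 255 for channel in pink)


def blend_pixel(base, gray, max_alpha=0.8):
    """Alpha-blend one tinted ultrasound pixel over one camera pixel.

    Opacity scales with ultrasound brightness, so dark (empty) regions
    of the ultrasound image leave the camera pixel unchanged.
    """
    alpha = max_alpha * gray / 255
    tinted = tint_pink(gray)
    return tuple(round((1 - alpha) * b + alpha * t) for b, t in zip(base, tinted))


def overlay(camera, ultrasound, top, left):
    """Return a copy of the camera frame with the ultrasound frame
    blended in, its top-left corner placed at (top, left).

    camera:     rows of (r, g, b) tuples
    ultrasound: rows of grayscale integers, assumed pre-aligned
    """
    out = [row[:] for row in camera]
    for r, us_row in enumerate(ultrasound):
        for c, gray in enumerate(us_row):
            y, x = top + r, left + c
            # Skip ultrasound pixels that fall outside the camera frame.
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = blend_pixel(out[y][x], gray)
    return out
```

For example, `overlay(camera, ultrasound, top=1, left=1)` returns a new frame and leaves `camera` untouched. A real-time system would more likely perform the same blend on whole frames with a video or image-processing library, and would also need the scaling and rotation step that this sketch leaves out.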

    The Team


      Dr. Strang Burton, Department of Linguistics

    ECE Capstone Team

    • Juliana Hamada
    • Josh Ng
    • Cindy Ling
    • Derek Tam