Showing posts with label captions. Show all posts

Saturday, February 20, 2010

Improved Conference Captions from Amazon Mechanical Turk (2)


After my initial experiments last month, I applied to the FreeBSD Foundation for funds to pay for additional human editing of the machine-generated YouTube transcripts. The screenshot on the left shows an example HIT (Human Intelligence Task) available on Amazon Mechanical Turk.

The task description on the left is based on a template I created with three variables: $VIDEO_URL, $VIDEO_TITLE, and $CAPTIONS_URL. New HITs are then created by uploading a CSV file with three columns for each of those variables, e.g.

VIDEO_URL,VIDEO_TITLE,CAPTIONS_URL
http://www.youtube.com/watch?v=mMmbjJI5su0,"BSD v. GPL, Jason Dixon, NYCBSDCon 2008",http://people.FreeBSD.org/~murray/improved-captions-bsdvsgpl.sbv
http://www.youtube.com/watch?v=Pe8LdJpBGJ4,"Isolating Cluster Jobs for Performance and Predictability, Brooks Davis (DCBSDCon 2009)",http://people.FreeBSD.org/~murray/improved-captions-isolatingcluster.sbv
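
A CSV file like the one above can also be generated from a script rather than typed by hand. The following is only a minimal sketch in Python: the helper name build_hit_csv is made up for illustration, and the only things taken from the template are the three column names. The csv module takes care of quoting titles that contain commas.

```python
import csv
import io

# Column names match the three template variables.
FIELDS = ["VIDEO_URL", "VIDEO_TITLE", "CAPTIONS_URL"]


def build_hit_csv(rows):
    """Return CSV text with a header row and one HIT per data row.

    build_hit_csv is a hypothetical helper, not part of any Mechanical Turk tool.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # fields containing commas are quoted automatically
    return buf.getvalue()


rows = [
    {
        "VIDEO_URL": "http://www.youtube.com/watch?v=mMmbjJI5su0",
        "VIDEO_TITLE": "BSD v. GPL, Jason Dixon, NYCBSDCon 2008",
        "CAPTIONS_URL": "http://people.FreeBSD.org/~murray/improved-captions-bsdvsgpl.sbv",
    },
]

print(build_hit_csv(rows), end="")
```

Writing the returned text to a file gives something that should be suitable for the same upload step described above.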


Using this method, I created 12 HITs for the first pass of editing, offering between $9 and $14 per video. A slightly modified template with the same three variables was then used for a second pass, at about $7 per video, to further polish the first-pass transcripts.

The template has grown more detailed over the past month in response to the many minor ways in which workers submitted less-than-perfect transcripts. As far as I can tell, the SBV file format used by YouTube captions is not formally specified anywhere, but the 60-character maximum width and simple format can be verified in submitted transcripts with a few Emacs macros.
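
For anyone who would rather not use Emacs, the same kind of check could be scripted. The sketch below is in Python and rests on an assumption about the layout, since the format is not formally specified: each caption block is taken to start with a timing line of the form H:MM:SS.mmm,H:MM:SS.mmm, followed by one or more text lines, with a blank line between blocks. The function name check_sbv is made up for illustration.

```python
import re

MAX_WIDTH = 60  # maximum caption line width mentioned above

# Assumed timing line layout (not an official specification):
#   H:MM:SS.mmm,H:MM:SS.mmm
TIMING_RE = re.compile(r"^\d+:\d{2}:\d{2}\.\d{3},\d+:\d{2}:\d{2}\.\d{3}$")


def check_sbv(text, max_width=MAX_WIDTH):
    """Return a list of (line_number, message) problems found in SBV-like text."""
    problems = []
    expect_timing = True  # the first line of every block should be a timing line
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            expect_timing = True  # a blank line ends the current caption block
            continue
        if expect_timing:
            if not TIMING_RE.match(line.strip()):
                problems.append((number, "expected a timing line"))
            expect_timing = False
        elif len(line) > max_width:
            problems.append((number, f"caption line is {len(line)} characters"))
    return problems


# Made-up sample: the second caption is one character too wide.
sample = (
    "0:00:01.000,0:00:04.500\n"
    "Welcome to the talk.\n"
    "\n"
    "0:00:04.500,0:00:09.000\n"
    + "x" * 61 + "\n"
)

for line_number, message in check_sbv(sample):
    print(f"line {line_number}: {message}")
```

An empty result only means the file matches the assumed layout and width limit; it says nothing about whether the words are right.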

The transcript files have been checked into the FreeBSD Doc CVS Repository. The full list of videos with human-edited English language transcripts is:

Sunday, January 10, 2010

Improved Conference Captions from Amazon Mechanical Turk

Just wanted to send a quick note that three of the popular videos from the BSD Conferences YouTube channel have been updated with human-edited English language caption files. These offer a significant improvement over the machine-generated captions I wrote about last month.

The following videos have been updated:


I've also posted three simple captions text files, which provide the times and text in a very simple ASCII format, in case anyone wants to provide a diff to fix any remaining mistakes in the captions.
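
If you don't have a diff tool handy, a unified diff between the posted file and your corrected copy can be produced with Python's standard difflib module. This is only a sketch: the function name caption_diff, the file name captions.txt, and the sample caption text are all made up for illustration.

```python
import difflib


def caption_diff(original, corrected, name="captions.txt"):
    """Return a unified diff (as one string) between two versions of a captions file."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        corrected.splitlines(keepends=True),
        fromfile=f"{name}.orig",
        tofile=name,
    )
    return "".join(diff)


# Made-up example of a typical machine transcription mistake and its fix.
original = (
    "0:00:01.000,0:00:04.500\n"
    "free BSD colonel internals\n"
)
corrected = (
    "0:00:01.000,0:00:04.500\n"
    "FreeBSD kernel internals\n"
)

print(caption_diff(original, corrected), end="")
```

The timing line is left untouched, so the diff shows only the corrected text.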

The transcriptions were done with the help of the industrious workers behind Amazon Mechanical Turk. The three transcripts above represent at least 6 person-hours of work, and quite possibly twice that much. They were completed for less than $50 by leveraging the timing information from the free machine-generated captions and using Mechanical Turk for the editing. This is less than one tenth of the cost of a commercial transcription service.

What is the quality of these captions in other languages when automatically translated with YouTube? Are there any other videos for which captions would be particularly useful?

AsiaBSDCon is coming up in March, and I hope to have things streamlined by then such that videos with both Japanese and English captions can be added to the channel shortly after the conference.

Tuesday, December 22, 2009

Machine generated captions for BSD conference videos

One of the most frequent requests I've received since launching the BSD Conferences YouTube channel last year has been for captions in Spanish, Russian, Chinese, and other languages. I was excited last month when Google announced automatic captions for YouTube videos using speech recognition. This feature is still highly experimental, but I am happy to report that it has been enabled for the BSD Conferences channel. In combination with the much more mature automatic translation feature, this means that captions are now available in over 50 languages, from Afrikaans to Vietnamese, for most of the 73 videos in the BSD Conferences channel.

Because the automatic captions are still experimental, accurately transcribing highly technical content spoken by a diverse set of international speakers remains a significant challenge. If you are interested in helping to correct any of the English transcripts, I would be happy to provide you with a simple text file of the transcription, with each entry giving the start and end time for the caption to be displayed, and the caption text. One advantage of the machine transcription is that the most time-consuming part of manually creating captions, synchronizing the timing of the text with the speech, has been done automatically. Even when the technical words are mangled, the timing information in the automatic captions files can be leveraged to make the process of manually improving the captioning much easier.

The experimental automatic captions are only available directly from the video watch pages, and not from channel pages or other views. For example, visit www.youtube.com/watch?v=nwbqBdghh6E to see one of our most popular videos, Kirk McKusick speaking on FreeBSD Kernel Internals. Hover over the triangle at the bottom right of the video, then over the CC submenu and select "Transcribe Audio". You can then choose to "Translate Captions" into a different language as well.