Sunday, October 24, 2010

FreeBSD at GSoC Mentor Summit

As in previous years, Google held a "Mentor Summit" to bring together representatives from the open source organizations that participated in the Google Summer of Code to share experiences of what worked, what didn't, and generally learn from each other about shepherding students through the program. The mentor summit is always run Unconference-style and it is a great opportunity to meet, learn, and socialize with the many other open source organizations.

In addition to several hours of face-to-face FreeBSD-related catch-up with Brooks Davis over pizza and beer, I particularly enjoyed catching up with old colleagues and learning about the current state of a variety of other open source projects I use, such as R, Boost, NTP, and Ganeti.

This weekend Brooks and I were the only FreeBSD representatives. Given that I'm local and Google sponsors the travel of 2 representatives from each open source organization, it's quite unfortunate we couldn't get another FreeBSD mentor here this year. I would strongly encourage some of the other mentors who have never participated in this forum to volunteer to represent FreeBSD next year. This program has funded approximately 117 students to work on FreeBSD over the past 5 years, and the mentor summit is the best way I know of to improve the experience for students and open source projects next year.

Thanks again to all the FreeBSD mentors who worked with students this summer. I hope to see some of you at the post-GSoC Mentor Summit next year...

Wednesday, September 15, 2010

FreeBSD Summer of Code Students Highlighted on Google Blog

As in previous years, I've posted a summary of FreeBSD Project participation in Google Summer of Code on the Google Open Source Blog.

By my count we have now mentored at least 117 students on FreeBSD development through this program. As in previous years, it was tough to identify a few student projects to highlight given how much cool work is going on here. My list is certainly not complete, but at least a few other people mentioned that Efstratios Karatzas, Zheng Liu, and David Forsythe had done a lot of excellent work this summer. Hats off to them, to all the students and mentors this summer, and to Brooks and Robert for serving as administrators of this whole thing for us.

Tuesday, August 10, 2010

BSDCan on Google's Open Source Blog

A coworker of mine, Kirk Russell, just posted an excellent summary of BSDCan through the years on the Google Open Source Blog.

I wasn't able to make it to BSDCan this year due to family commitments, but I did make it to another open source conference later this summer that I also wrote about on Google's open source blog.

Kirk and I haven't worked closely together but we both do our best at evangelizing BSD and open source inside our respective corners of the company. It's great to see his post about all the excellent work happening in the BSD community on a corporate blog.

Sunday, May 30, 2010

Kirk McKusick on Journaling Soft Updates in FreeBSD

Dr. Kirk McKusick has produced a high-quality recording of his talk on Journaled Soft-Updates at BSDCan 2010. This is the 92nd BSD conference video in the BSD Conferences YouTube channel.

Saturday, May 29, 2010

AsiaBSDCon 2010 Videos

The videos from AsiaBSDCon 2010 are now available on the BSD Conferences YouTube channel. The full list of 17 AsiaBSDCon videos includes:

Thanks to Hiroki Sato and the other organizers of AsiaBSDCon for running a successful conference and uploading these videos. Some of these videos were previously available on ustream but are not currently accessible there. The YouTube channel provides automatic machine-generated captions in ~50 languages, fast streaming, and a total of over 90 videos from conferences over the past ~3 years.

Tuesday, April 6, 2010

FreeBSD Tech Talk @ Google

Longtime FreeBSD developer Luigi Rizzo from the University of Pisa came to Google last week to visit with Sam Leffler and me, and he agreed to give a talk about some of his work on link emulation and packet scheduling.

This marks the second FreeBSD video in the Google Tech Talks channel in addition to the 70+ videos in the BSD Conferences channel. Enjoy.

Saturday, February 20, 2010

Improved Conference Captions from Amazon Mechanical Turk (2)

After my initial experiments last month, I applied to the FreeBSD Foundation for funds to pay for additional human editing of the YouTube machine-generated transcripts. The screenshot on the left shows an example HIT (Human Intelligence Task) available on Amazon Mechanical Turk.

The task description on the left is based on a template I created with three variables: $VIDEO_URL, $VIDEO_TITLE, and $CAPTIONS_URL. New HITs are then created by uploading a CSV file with three columns for each of those variables, e.g.

VIDEO_URL,VIDEO_TITLE,CAPTIONS_URL
,"BSD v. GPL, Jason Dixon, NYCBSDCon 2008",
,"Isolating Cluster Jobs for Performance and Predictability, Brooks Davis (DCBSDCon 2009",
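For anyone who wants to generate that upload file rather than type it by hand, here is a minimal sketch using Python's standard csv module. It is not the script I used; the function name build_hit_csv and the example.invalid URLs are hypothetical placeholders. The one point it illustrates is that titles containing commas must be quoted, which the csv module does automatically.

```python
import csv
import io

# Column names match the three template variables named in the post.
FIELDS = ["VIDEO_URL", "VIDEO_TITLE", "CAPTIONS_URL"]


def build_hit_csv(rows):
    """Return CSV text with one header row and one data row per HIT."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical placeholder URLs; substitute the real video and caption links.
example_rows = [
    {
        "VIDEO_URL": "https://example.invalid/video1",
        "VIDEO_TITLE": "BSD v. GPL, Jason Dixon, NYCBSDCon 2008",
        "CAPTIONS_URL": "https://example.invalid/captions1.sbv",
    },
]

hit_csv = build_hit_csv(example_rows)
print(hit_csv)
```

Writing to an in-memory buffer keeps the sketch easy to test; to produce a file for upload, write the returned text to disk instead.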

Using this method I created 12 HITs for the first pass of editing, for which I offered between $9 and $14 per video. A slightly modified template with the same three variables was used to pay ~$7 per video for a second pass that further polished the first-pass transcripts.

The template has gotten more detailed over the past month in response to all of the minor ways that workers submitted less-than-perfect transcripts. The actual SBV file format used by YouTube captions is not formally specified anywhere as far as I can tell, but the 60-character maximum width and simple format can be verified in submitted transcripts with a few Emacs macros.
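The same check can be scripted instead of done with editor macros. The sketch below is an illustration, not the macros I used. Since the format is not formally specified, it assumes each caption block is a timing line of the form H:MM:SS.mmm,H:MM:SS.mmm followed by one or more text lines, with blank lines between blocks. The names check_sbv, TIMING, and MAX_WIDTH are hypothetical.

```python
import re

MAX_WIDTH = 60  # maximum caption line width mentioned in the post

# Assumed shape of a timing line: start,end as H:MM:SS.mmm
TIMING = re.compile(r"^\d+:\d{2}:\d{2}\.\d{3},\d+:\d{2}:\d{2}\.\d{3}$")


def check_sbv(text, max_width=MAX_WIDTH):
    """Return a list of (line_number, message) problems found in SBV text."""
    problems = []
    expect_timing = True  # first non-blank line of each block is a timing line
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            expect_timing = True  # a blank line ends the current block
            continue
        if expect_timing:
            if not TIMING.match(line.strip()):
                problems.append((number, "expected timing line"))
            expect_timing = False
            continue
        if len(line) > max_width:
            problems.append((number, f"caption line is {len(line)} chars"))
    return problems


sample = (
    "0:00:01.000,0:00:04.000\n"
    "Welcome to the talk.\n"
    "\n"
    "0:00:04.000,0:00:09.000\n"
    + "x" * 61 + "\n"
)
problems = check_sbv(sample)
print(problems)
```

An empty list means the transcript passed both checks under these assumptions; anything else points at the line a worker needs to fix.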

The transcript files have been checked into the FreeBSD Doc CVS Repository. The full list of videos with human-edited English language transcripts is:

Sunday, January 10, 2010

Improved Conference Captions from Amazon Mechanical Turk

Just wanted to send a quick note that three of the popular videos from the BSD Conferences YouTube channel have been updated with human-edited English language caption files. These offer a significant improvement over the machine-generated captions I wrote about last month.

The following videos have been updated:

I've also posted three caption text files which provide the times and text in a very simple ASCII format, in case anyone wants to provide a diff to fix any remaining mistakes in the captions.

The transcriptions were done with the help of the industrious workers behind Amazon Mechanical Turk. The three transcripts above represent at least 6 person-hours of work, and easily twice that. They were completed for less than $50 by leveraging the timing information from the free machine-generated captions and using Mechanical Turk for the editing. This is less than 1/10th of the cost of a commercial transcription service.

What is the quality of these captions in other languages when automatically translated with YouTube? Are there any other videos for which captions would particularly be useful?

AsiaBSDCon is coming up in March, and I hope to have things streamlined by then such that videos with both Japanese and English captions can be added to the channel shortly after the conference.