Sunday, March 27, 2011

A curriculum for learning computer programming in WoW

(Updated: See note at bottom of post) What can you learn from the MMORPG World of Warcraft? A lot, it turns out. This is one of my conclusions from taking CMS.863J -- Computer Games and Simulations for Investigation and Education, taught by Eric Klopfer and Jason Haas. It's one of the classes offered this semester by MIT's Comparative Media Studies department, and is also cross-listed under Course 11 (MIT's Urban Studies department). Aside from me and one other graduate student, everyone taking the class is an undergraduate, many of them in Course 6.

As indicated by the name of the class, the curriculum centers on computer games and educational theory relating to games and learning. However, like every other media- and entrepreneurship-focused class I have taken at MIT, there is also a heavy focus on creating and building (Mens et Manus is the institute's motto). In February, after getting up to level 13 in World of Warcraft and conducting quests with classmates in-world, we split into teams with the mission of building a sample curriculum around a specific topic area, using WoW as the learning environment.

You may find it strange that a game like WoW can be used to study real-world topics, but there is actually a fairly long history of players using virtual worlds to study or understand various phenomena. Edward Castronova's oft-cited 2001 paper on virtual world economies opened a floodgate of research and academic discussion relating to massively multiplayer online games. World of Warcraft specifically has been used to study epidemiology, as well as topics related to learning. In 2008, a paper by Constance Steinkuehler and Sean Duncan in the Journal of Science Education and Technology studied scientific reasoning in thousands of WoW forum posts. They found that 86% of the posts contained "social knowledge construction," or "the collective development of understanding, often through joint problem solving and argumentation."

As expected, many of these posts consisted of discussions around certain classes of characters, spells, weapons, etc. However, 10% of the total posts studied used model-based reasoning, including mathematical models to explain some phenomenon. An example cited in the paper showed one player who apparently reverse-engineered damage algorithms in WoW to compare the abilities of priests vs. mages:

By intuition, you should notice a problem... but I’ll give you the numbers anyways 

For Mindflay, SW:P, and presumpably VT [3 priest spells]: Damage = (base_spell_damage + modifier * damage_gear) * darkness * weaving * shadowform * misery

For Frostbolt [mage spell]: Average Damage = (base_spell_damage + (modifier + empowered frost) * damage_gear) * (1 * (1 - critrate - winter's chill - empowered frost) + (1.5 + ice shards) * (critrate + winter’s chill + empowered frost)) * piercing ice

mindflay = (426 + 0.45 * dam) * 1.1 * 1.15 * 1.15 * 1.05
= 650.7 + 0.687 * dam
frostbolt = (530 + (0.814 + 0.10) * dam) * ((1 - crit - 0.10 - 0.05) + (1.5 + 0.5) * (crit + 0.10 + 0.05)) * 1.06
= (530 + 0.914 * dam) * ((0.85 - crit) + 2 * (crit + 0.15)) * 1.06
= 0.968 * (dam + 579.7) * (crit + 1.15)
Please notice the 0.687 versus the 0.968. That's the scaling factor. 
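The arithmetic in the quoted post can be checked with a few lines of Python. This is only a sketch for following the player's algebra: the constants are copied from the post, the function and variable names are my own, and nothing here comes from the game's actual code.

```python
# Constants copied from the quoted forum post; names are illustrative only.

def mindflay(dam):
    """Priest spell damage as modeled in the post."""
    return (426 + 0.45 * dam) * 1.1 * 1.15 * 1.15 * 1.05


def frostbolt(dam, crit):
    """Average mage spell damage as modeled in the post."""
    base = 530 + (0.814 + 0.10) * dam
    non_crit_share = 1 - crit - 0.10 - 0.05
    crit_share = crit + 0.10 + 0.05
    return base * (1 * non_crit_share + (1.5 + 0.5) * crit_share) * 1.06


# Scaling factors: how much each formula gains per point of spell damage gear.
mindflay_scaling = 0.45 * 1.1 * 1.15 * 1.15 * 1.05   # about 0.687
frostbolt_scaling = (0.814 + 0.10) * 1.06             # about 0.969 (0.968 in the post)

print(round(mindflay_scaling, 3), round(frostbolt_scaling, 3))
```

The frostbolt factor is further multiplied by (crit + 1.15) in the simplified formula, so the gap between the two spells grows as critical strike chance rises.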

After students in our class had gotten the hang of WoW, there were discussions around how the game could be used to study trigonometry and group psychology. Our team, consisting of Andrew Hsiao and Michele Pratusevich (both course sixers) and myself, opted to use WoW as a platform for high school students to learn basic computer programming concepts, using a simple scripting language called Lua that can be run from within the WoW chatbox. Here's the curriculum that we developed:

Teaching Computer Science Through WoW Scripts

Michele and Andrew were the true domain experts here, and designed all four of the CS class exercises and most of the assessment. I concentrated on the introduction and theory section, and managed to create one of the assessment scripts (the simple currency exchange/variables exercise).

WoW computer programming curriculum review

I personally believe the curriculum we designed is a very effective and engaging method for introducing basic computer science concepts to high school students. Once they've established a WoW player account and learned some of the basic game functionality, it's easy to experiment with the code and try new functions from the WoW API. Our instructor said the curricula that the student groups developed will be passed to testers at NYU, but I don't know if we'll get any feedback on how our CS curriculum fared. If you are a teacher and try it, or are a solo learner who wants to study basic computer science concepts while playing WoW, please feel free to give it a spin and let me know how it works.

My next assignment for CMS.863J, with a different group: Creating a board game to teach fundamental electrical engineering concepts. You can see our test board here, and I'll try to post more information about the project as development progresses in April.

Update: An NYU class has reviewed our curriculum. Their comments can be seen on the post, "A curriculum for programming, reviewed"

Saturday, March 19, 2011

Solutions for the academic/mass media divide

"Why don't journalists link to primary sources?" The question was posed on Hacker News, and was based on a Guardian article lamenting sloppy reporting in the Daily Mail and Telegraph.

It prompted me to write the following response in the Hacker News thread:

If you asked most reporters whether they used primary sources, they would say yes, and point to the interviews that they conduct.

But if you were to point out that primary sources also include published research, almost to a man or woman they would say A) they don't have the time to read it, B) they don't have access to the journals, or C) they are not aware the research exists. A few might concede D) even if they had access, they wouldn't be able to understand the research, which points to the fact that most journalists didn't major in science/technology in college and academic writing can be difficult to penetrate.

Of the above factors, I think C presents an opportunity for academics and startup publishers. On the academic side, it's pretty clear that the traditional method of reaching out to reporters via press releases and personal contacts is becoming less viable as newsrooms cut staff and the remaining writers have less time to network/talk with sources (travel budgets to attend conferences are very restricted these days) and write up stories based on those encounters.

Some researchers have seized upon blogging as a great way to not only reach their peers, but also a wider audience, and of course, other media (including journalists, specialist blogs, etc.). Group blogs written by researchers and experts are another great way to highlight new research and discuss ideas, too. Terra Nova ( http://terranova.blogs.com/ ) is one example focused on virtual worlds; I am sure the audience here knows of many others.

But the problem with individual and group blogs is they are still largely unknown outside of a relatively small group of people. In order to make a mass audience connection, there needs to be a way for these ideas to be presented in newspapers and television reports (which is how many people still learn about the world around them), or on media websites.

Arrangements to republish blog content, or for the blog authors to prepare easy-to-understand summary reports for a mass media audience, are possibilities, but the processes and incentives need to be worked out -- preferably in a way that takes the load off of editors, who don't have the time to find the right bloggers and deal with the freelance contracts and payment issues. One startup idea would be to create a "marketplace" to match publishers who are seeking an informed report about a specific scientific topic (for instance, how a boiling water reactor works) with researchers and bloggers who are qualified to write it. Another avenue for a startup would be to set up a "science wire service" which prepares timely, relevant coverage (including blogs, video, and features) about new research and developments every day. Media companies could subscribe to the service and editors could browse the service and use as much as they like, just as they do with Reuters, Bloomberg, AP, etc.

As for the specific issue of not including links, this partly relates to the awareness and access issues mentioned above, but also to the fact that content management systems used at many newspapers and magazines are optimized for print publishing, not online publishing. Inserting links typically has to be done *after* the article has been written, often by different editors or producers who know how to use WordPress/Drupal/homegrown tools. I think there's a startup opportunity here as well, but unfortunately it also requires a rethinking of newsroom processes and control.

The debate reminded me of my "Source Blocks" idea from 2008. It never caught on, for the simple reason that most writers (including me) are too lazy to manually include them. But that could be an opportunity for another new media product ...

Wednesday, March 16, 2011

The challenges of creating a mobile educational app based on Linked Data


Earlier this month, my iPod touch flashed a warning message that the provisioning profile on the test application our team (Sloan Fellow Mads, Course 6 undergraduate Yod, and myself) had designed for 6.898 in the fall was about to expire. Before it did, I decided to make a quick video showing the basic design and functionality of our educational app for the iPhone and other iOS devices:

Video: Knowton demonstrated:


While the app was ostensibly designed to teach young children geography facts, the purpose of building it was to show how Linked Data could be used to make an educational application on a mobile device. Mads' original concept was to have an open-ended exploratory app that would let children freely jump from one object to an associated fact. For instance, a child interested in a monkey might see a picture and read some information about it, including the facts that it lives in a tree and likes to eat bananas. At that point, the child could choose to learn about either trees or fruit.

This idea is eminently suited to Linked Data, which is essentially a distributed, global-scale database built around Semantic Web standards such as RDF, Turtle/N3 and SPARQL, shared definitions, and links between repositories. There is an enormous collection of Semantic Web-based data already available, ranging from Wikipedia information to Creative Commons-licensed photos.

I suggested narrowing the focus to geography, as presenting facts about animals and their habitats could be tied to a specific learning outcome. I also designed a rudimentary user interface and flow (see wireframe below), which was eventually adopted for the exploration part of the app. Yod designed the basic game flow and built several code repositories, including the mobile app (using the iPhone SDK) and a Web app that let editors (us) submit information such as photos and descriptions. Mads devised a business plan.

In a perfect Semantic Web world, it wouldn't be necessary to have the Web app for editors, as SPARQL queries on consistently structured graphs could build the data store, with only a minimum of cleanup and selection (such as choosing the most suitable photos). But we quickly discovered that DBPedia, a popular source of country-level information for local fauna and landmarks, was incomplete. Freebase filled in many of the information gaps, but the data varied so much from country to country that the only practical way to prepare it for the mobile app was to use the Web interface that Yod created. For geography and many animal photos, we used a source that one of the guest lecturers in class had mentioned, Ookaboo, which contained Creative Commons and public domain photos. Others were sourced from Flickrwrapper using a feature in Yod's Web application.
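To give a flavor of what such a query might look like, here is a minimal Python sketch that builds (but does not send) a SPARQL request for an abstract and thumbnail. It is not the code from our project. It assumes the DBpedia ontology properties `dbo:abstract` and `dbo:thumbnail` and the public endpoint URL; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: DBpedia's public SPARQL endpoint. Nothing is sent in this sketch.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_fact_query(resource_name, lang="en", limit=1):
    """Build a SPARQL query for one resource's abstract and optional thumbnail."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?abstract ?thumbnail WHERE {{
  dbr:{resource_name} dbo:abstract ?abstract .
  OPTIONAL {{ dbr:{resource_name} dbo:thumbnail ?thumbnail . }}
  FILTER (lang(?abstract) = "{lang}")
}}
LIMIT {limit}
""".strip()


def build_request_url(query):
    """Encode the query as a GET request URL using the standard 'query' parameter."""
    return DBPEDIA_ENDPOINT + "?" + urlencode({"query": query})


query = build_fact_query("Monkey")
url = build_request_url(query)
print(query)
```

The `OPTIONAL` clause is the interesting part: it lets a query succeed even when a resource has no photo, which is exactly the kind of gap that pushed us toward manual curation.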

But for good "people" photos that could not be easily accessed in Flickrwrapper using basic search strings, I had to resort to finding Creative Commons-licensed (CC-SA) photos on Flickr itself and copying and pasting URLs into the Web app. Even if we had been able to use Linked Data without the manual workarounds, there is no way we would have been able to run live queries from the mobile app -- not only are mobile network connections unreliable, but we discovered that many of the sites have high latency and/or frequent downtime (DBPedia especially!). As an alternative, Yod built a database that loaded onto the app and was instantly accessible by users.

On demo day on December 7, all of the 6.898 teams gathered in a CSAIL conference room at the Stata Center. Tim Berners-Lee and a group of outside judges watched our demos and listened to our business pitches. TBL's quick assessment of the projects is in the video at the bottom of this post, but we approached him afterwards to ask him about the curation problem. He suggested some AI alternatives. For instance, if Linked Data sources identified "China" as alternately being a country or a person, he said the app could choose the most suitable definition based on the number of returned sites in competing Google searches.
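The search-count heuristic could be sketched roughly as follows. The hit counts are made-up placeholders passed in as a dictionary, since this example does not call any search API, and the function name is hypothetical.

```python
def pick_sense(candidates, hit_counts):
    """Return the candidate meaning with the most search hits.

    Candidates missing from hit_counts are treated as having zero hits;
    ties go to whichever candidate is listed first.
    """
    return max(candidates, key=lambda sense: hit_counts.get(sense, 0))


senses = ["China (country)", "China (person)"]

# Placeholder numbers for illustration only, not real search results.
made_up_hits = {"China (country)": 1_000_000, "China (person)": 2_000}

chosen = pick_sense(senses, made_up_hits)
print(chosen)
```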

TBL asked about photos in Flickrwrapper. Could Flickr ratings be used to choose better-quality photos? Yod said no. TBL suggested that some geocoded logic could be used to get the best Big Ben photo. "Make sure it's 300 meters west at a certain time during the day," he said, and then joked: "But how can you be sure that it's not a photo with Aunt Jenny in the frame?" He speculated that an algorithm could help choose photos based on contrast or some other value.
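One simple way to score photos "based on contrast" is RMS contrast, the standard deviation of grayscale pixel intensities. The sketch below is my own illustration of that idea, not something TBL specified; it works on plain lists of 0-255 grayscale values so that it needs no imaging library, and the file names and pixel values are invented.

```python
from statistics import pstdev


def rms_contrast(gray_pixels):
    """RMS contrast: standard deviation of intensities scaled to the 0..1 range."""
    scaled = [value / 255 for value in gray_pixels]
    return pstdev(scaled)


def rank_by_contrast(photos):
    """Return photo names ordered from highest to lowest contrast."""
    return sorted(photos, key=lambda name: rms_contrast(photos[name]), reverse=True)


# Invented sample data: a washed-out photo and a high-contrast one.
sample_photos = {
    "foggy.jpg": [120, 125, 130, 128],
    "crisp.jpg": [0, 255, 0, 255],
}

ranking = rank_by_contrast(sample_photos)
print(ranking)
```

Contrast alone would not solve the Aunt Jenny problem, of course; it only filters out flat or murky shots.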

Video: Tim Berners-Lee reviews the 6.898/Linked Data Ventures class projects (Knowton comments at 2:50)



Other posts about my MIT Sloan Fellows experience:

Wednesday, March 09, 2011

A robot for healthcare

Demonstrated today in our Media Ventures class: Autom. The speaker is Cory Kidd, formerly of the Media Lab, now an entrepreneur based in Hong Kong, where software talent is cheaper and local government incentives are available (including free rent) to set up his business.





Tuesday, March 01, 2011

Social TV poster #1: PeoplePixPlaces

(Update: This concept has evolved further and turned into a final project called WorldTV, complete with a software demo and video) From the Social TV class I'm taking this semester at the MIT Media Lab: A social TV application based on news. I came up with PeoplePixPlaces, a Web-based application that gives a window into local news, using geocoded video, pictures, and tweets, as well as individual users’ own social lenses. The poster explains the concept in more detail:

social TV

The genesis of the idea predates MAS 571. Last semester in 6.898 (Linked Data Ventures), I proposed a similar project, PixPplPlaces. The one-sheet vision:


“People want to know a lot about their own neighborhoods.”

- Rensselaer Polytechnic Institute Professor Jim Hendler, discussing Semantic Web-based services in Britain, 10/18/2010

While superficial mashups that plot data about crime, celebrity sightings, or restaurants on street maps have been around for years, there is no service that takes geotagged tweets, photos, and videos, as well as associated semantic context, and plots it on a map according to the time the information was created. The idea behind PixPplPlaces:

• Index some publicly available location-based social media data in a Semantic Web-compatible form
• Plot the data by time (12:25 pm on 10/24/2010) and location (Lat 42.33565, Long -71.13366) on existing Linked Data geo resources
• Bring in other existing Linked Data resources (DBPedia, rdfabout U.S. Census, etc.) that can help describe the area or other aspects of what's going on, based on the indexed social media data

Potential business models:

• Professional services: News organizations can embed PPP mashups of specific neighborhoods on their websites, add location-based businesses who are their ad clients, or use the tool as an information resource for journalists -- what was the scene at the site of a fire on Monday evening, just before the fire broke out? Lawyers, insurance companies, and others might be interested in using this for investigations.
• Advertising services: A suggestion from Reed - "a source of ads/offers in Linked Data format - for the sustainability argument as a business. Maybe in the project you can develop an open definition that would let multiple providers publish ads in the right format that you could scrape/aggregate and then present to end users? If you demonstrate a click-wrap CPC concept you might be able to mock it up by scraping ads from Google Maps or just fake it."

To be researched:
• Is social media geodata (geotagged Flickr photos, geolocated Tweets) precise enough to be plotted on a map?
• Should this be a platform or a service?
• How can the data be scraped, indexed, or made into "good" Semantic Web information?
• Would any professional organization -- news, legal, insurance -- pay for it?
• How viable is the advertising model in a crowded field chasing a (currently) small pool of clients?

The Semantic Web requirements for the 6.898 project and emphasis on tweets and photos gave the tool a different flavor than the Social TV version; in addition, I didn't consider the possibility of using "social lenses" to filter the contributions of people in the user's social circle. But for both projects, I recognized that the business case is weak, not only in terms of revenue, but also in terms of maintaining a competitive advantage if open platforms and standards are used.

Incidentally, I first had the idea for a geocode-based application for user-generated content back in 2005 or 2006. My essay Meeting The Second Wave explains the original idea:

In the second wave of new media evolution, content creators and other 'Net users will not be able to manually tag the billions of new images and video clips uploaded to the 'Net. New hardware and software technologies will need to automatically apply descriptive metadata and tags at the point of creation, or after the content is uploaded to the 'Net. For instance, GPS-enabled cameras that embed spatial metadata in digital images and video will help users find address- and time-specific content, once the content is made available on the 'Net. A user may instruct his news-fetching application to display all public photographs on the 'Net taken between 12 am and 12:01 am on January 1, 2017, in a one-block radius of Times Square, to get an idea of what the 2017 New Year's celebrations were like in that area. Manufacturers have already designed and brought to market cameras with GPS capabilities, but few people own them, and there are no news applications on the 'Net that can process and leverage location metadata -- yet.

Other types of descriptive tags may be applied after the content is uploaded to the 'Net, depending on the objects or scenes that appear in user-submitted video, photographs, or 3D simulations. Two Penn State researchers, Jia Li and James Wang, have developed software that performs limited auto-tagging of digital photographs through the Automatic Linguistic Indexing of Pictures project. In the years to come, autotagging technology will be developed to the point where powerful back-end processing resources will categorize massive amounts of user-generated content as it is uploaded to the 'Net. Programming logic might tag a video clip as "violence," "car," "Matt Damon," or all three. Using the New Year's example above, a reader may instruct his news-fetching application to narrow down the collection of Times Square photographs and video to display only those autotagged items that include people wearing party hats.
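The Times Square query from the essay boils down to a filter on time and distance. Here is a minimal Python sketch under stated assumptions: the photo records, example.com URLs, approximate Times Square coordinates, and the 100-meter stand-in for "one block" are all invented for illustration, and distance is computed with the standard haversine formula.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass
class Photo:
    url: str
    taken_at: datetime
    lat: float
    lon: float


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine formula, spherical Earth)."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def photos_near(photos, lat, lon, radius_m, start, end):
    """Keep photos taken in [start, end) and within radius_m of (lat, lon)."""
    return [
        p for p in photos
        if start <= p.taken_at < end
        and distance_m(p.lat, p.lon, lat, lon) <= radius_m
    ]


# Approximate coordinates, assumed for illustration.
TIMES_SQUARE = (40.7580, -73.9855)

photos = [
    Photo("https://example.com/a.jpg", datetime(2017, 1, 1, 0, 0, 30), 40.7580, -73.9855),
    Photo("https://example.com/b.jpg", datetime(2017, 1, 1, 0, 5, 0), 40.7580, -73.9855),
    Photo("https://example.com/c.jpg", datetime(2017, 1, 1, 0, 0, 30), 42.33565, -71.13366),
]

matches = photos_near(
    photos, *TIMES_SQUARE, radius_m=100,
    start=datetime(2017, 1, 1, 0, 0), end=datetime(2017, 1, 1, 0, 1),
)
print([p.url for p in matches])
```

Only the first photo passes both tests: the second was taken too late, and the third was taken hundreds of kilometers away.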

For the Social Television class, we have to submit two more ideas in poster sessions. I may end up posting some of them to this blog ...