Wednesday, March 16, 2011
The challenges of creating a mobile educational app based on Linked Data
Earlier this month, my iPod touch flashed a warning message that the provisioning profile on the test application our team (Sloan Fellow Mads, Course 6 undergraduate Yod, and me) had designed for 6.898 in the fall was about to expire. Before it did, I decided to make a quick video showing the basic design and functionality of our educational app for the iPhone and other iOS devices:
Video: Knowton demonstrated:
While the app was ostensibly designed to teach young children geography facts, the purpose of building it was to show how Linked Data could be used to make an educational application on a mobile device. Mads' original concept was to have an open-ended exploratory app that would let children freely jump from one object to an associated fact. For instance, a child interested in a monkey could see a picture and read some information about it, including the facts that it lives in a tree and likes to eat bananas. At that point, the child could choose to learn about either trees or fruit.
This idea is eminently suited to Linked Data, which is essentially a distributed, global-scale database built around Semantic Web standards such as RDF, Turtle/N3, and SPARQL, along with shared definitions and links between repositories. There is an enormous collection of Semantic Web-based data already available, ranging from Wikipedia information to Creative Commons-licensed photos.
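To make the "jump from one object to an associated fact" idea concrete, here is a minimal sketch in Python. It is not the app's actual code: the triples, the predicate names (`livesIn`, `eats`, `isA`, `growsIn`), and both helper functions are hypothetical, and a real app would load its triples from RDF sources rather than hard-coding them.

```python
# Hypothetical in-memory triples in (subject, predicate, object) form.
# A real app would obtain these from Linked Data sources instead.
TRIPLES = [
    ("monkey", "livesIn", "tree"),
    ("monkey", "eats", "banana"),
    ("banana", "isA", "fruit"),
    ("tree", "growsIn", "forest"),
]


def facts_about(subject, triples=TRIPLES):
    """Return (predicate, object) pairs for every triple about `subject`."""
    return [(p, o) for s, p, o in triples if s == subject]


def next_topics(subject, triples=TRIPLES):
    """Return the objects a child could jump to from `subject`, in source order."""
    return [o for _, o in facts_about(subject, triples)]


print(next_topics("monkey"))  # ['tree', 'banana']
```

Each object returned can itself be looked up as a subject, which is what lets the exploration continue from monkey to banana to fruit.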
I suggested narrowing the focus to geography, as presenting facts about animals and their habitats could be tied to a specific learning outcome. I also designed a rudimentary user interface and flow (see wireframe below), which was eventually adopted for the exploration part of the app. Yod designed the basic game flow and built several code repositories, including the mobile app (using the iPhone SDK) and a Web app that let editors (us) submit information such as photos and descriptions. Mads devised a business plan.
In an ideal Semantic Web world, it wouldn't be necessary to have the Web app for editors, as SPARQL queries on consistently structured graphs could build the data store, with only a minimum of cleanup and selection (such as choosing the most suitable photos). But we quickly discovered that DBPedia, a popular source of country-level information for local fauna and landmarks, was incomplete. Freebase filled in many of the information gaps, but there were so many differences from country to country that the only practical way to prepare the data for the mobile app was to use the Web interface that Yod created. For geography and many animal photos, we used a source that one of the guest lecturers in class had mentioned, Ookaboo, which contained Creative Commons and public domain photos. Others were sourced from Flickrwrapper using a feature in Yod's Web application.
But for good "people" photos that could not be easily found in Flickrwrapper using basic search strings, I had to resort to finding Creative Commons-licensed (CC-SA) photos on Flickr itself and copying and pasting their URLs into the Web app. Even if we had been able to use Linked Data without the manual workarounds, there is no way we would have been able to run live queries from the mobile app: not only are mobile network connections unreliable, but we discovered that many of the sites have high latency and/or frequent downtime (DBPedia especially!). As an alternative, Yod built a database that loaded onto the app and was instantly accessible by users.
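The trade-off described above, a curated local store instead of live queries, can be sketched in Python with the standard `sqlite3` module. This is an illustration only: the post does not say which database engine was used, and the table layout, function names, descriptions, and `example.org` URLs below are all made up.

```python
import sqlite3


def build_store(rows):
    """Create an in-memory SQLite database and load curated facts into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts ("
        "country TEXT, topic TEXT, description TEXT, photo_url TEXT)"
    )
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def facts_for_country(conn, country):
    """Look up all facts for one country without touching the network."""
    cur = conn.execute(
        "SELECT topic, description FROM facts WHERE country = ? ORDER BY topic",
        (country,),
    )
    return cur.fetchall()


# Placeholder rows; the descriptions and URLs are invented for illustration.
SAMPLE_ROWS = [
    ("China", "panda", "Giant pandas eat bamboo.", "https://example.org/panda.jpg"),
    ("China", "Great Wall", "A long fortification.", "https://example.org/wall.jpg"),
    ("Kenya", "lion", "Lions live in prides.", "https://example.org/lion.jpg"),
]

store = build_store(SAMPLE_ROWS)
print(facts_for_country(store, "China"))
```

Because every lookup is a local query, the app keeps working when the network or a remote endpoint is down. The cost is that the data is only as fresh as the last time the editors rebuilt the store.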
On demo day on December 7, all of the 6.898 teams gathered in a CSAIL conference room at the Stata Center. Tim Berners-Lee and a group of outside judges watched our demos and listened to our business pitches. TBL's quick assessment of the projects is in the video at the bottom of this post, but we approached him afterwards to ask him about the curation problem. He suggested some AI alternatives. For instance, if Linked Data sources identified "China" as alternately being a country or a person, he said the app could choose the most suitable definition based on the number of returned sites in competing Google searches.
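A rough Python sketch of that suggestion follows. The function name `choose_sense` and the `hit_count` callback are hypothetical, and the counts are invented stand-ins rather than real search results; a real version would need some search service to supply them.

```python
def choose_sense(term, senses, hit_count):
    """Pick the sense whose combined search query returns the most results.

    `hit_count` is a caller-supplied function mapping a query string to a
    result count; it stands in for whatever search service is available.
    """
    if not senses:
        raise ValueError("at least one candidate sense is required")
    return max(senses, key=lambda sense: hit_count(f"{term} {sense}"))


# Made-up counts for illustration only; they are not real search results.
FAKE_COUNTS = {"China country": 1000, "China person": 10}


def fake_hit_count(query):
    """Return the invented count for `query`, or 0 if it is unknown."""
    return FAKE_COUNTS.get(query, 0)


print(choose_sense("China", ["country", "person"], fake_hit_count))  # country
```

Passing the counting function in as a parameter keeps the selection logic separate from the search provider, so the heuristic can be tried offline with canned numbers.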
TBL asked about photos in Flickrwrapper. Could Flickr ratings be used to choose better-quality photos? Yod said no. TBL suggested that some geocoded logic could be used to get the best Big Ben photo. "Make sure it's 300 meters west at a certain time during the day," he said, and then joked: "But how can you be sure that it's not a photo with Aunt Jenny in the frame?" He speculated that an algorithm could help choose photos based on contrast or some other value.
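The geocoded filter TBL described might look something like the Python sketch below. Everything specific here is an assumption: the photo dictionaries with `lat`, `lon`, and `hour` keys, the tolerances, the daylight window, and the approximate Big Ben coordinates. It uses the standard haversine and initial-bearing formulas, and it does nothing about Aunt Jenny.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (m) and initial bearing (degrees clockwise
    from north) when travelling from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return distance, bearing


def is_candidate(photo, landmark, target_m=300, tol_m=100,
                 bearing=270, tol_deg=30, hours=range(10, 16)):
    """True if the photo was taken roughly `target_m` meters from the
    landmark, in roughly the `bearing` direction (270 = west), during `hours`."""
    dist, brg = distance_and_bearing(
        landmark["lat"], landmark["lon"], photo["lat"], photo["lon"])
    # Smallest angular difference between the two bearings, in [0, 180].
    off = abs((brg - bearing + 180) % 360 - 180)
    return (abs(dist - target_m) <= tol_m
            and off <= tol_deg
            and photo["hour"] in hours)


# Approximate coordinates for Big Ben; the photo records are hypothetical.
BIG_BEN = {"lat": 51.5007, "lon": -0.1246}
west_noon = {"lat": 51.5007, "lon": -0.1289, "hour": 12}
east_noon = {"lat": 51.5007, "lon": -0.1203, "hour": 12}

print(is_candidate(west_noon, BIG_BEN), is_candidate(east_noon, BIG_BEN))
```

A filter like this only narrows the pool of candidates. Choosing among the survivors by contrast or some other image measure, as TBL speculated, would be a separate step.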
Video: Tim Berners-Lee reviews the 6.898/Linked Data Ventures class projects (Knowton comments at 2:50)
Other posts about my MIT Sloan Fellows experience: