Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Technology

News and information from the Wikimedia Foundation’s Technology department.

Language engineering news: Bugs fixed in Universal Language Selector, and a new IPA keyboard layout

Imagine a world in which every single human being can easily select the language of the website that they are reading.

One of the bugs that were fixed: not all elements of the Universal Language Selector’s user interface were using web fonts.

That’s what the Wikimedia Foundation’s Language Engineering team has been working on through the Universal Language Selector (ULS): a reusable user interface component for comfortable selection of the most appropriate language out of a long list of available options. It integrates new features from Project Milkshake, a set of portable JavaScript tools for internationalizing any web application with web fonts, keyboard layouts and a robust mechanism for loading translations.

The Universal Language Selector is already used on translatewiki.net and on the new Wikidata project, two massively multilingual communities of software translators and data curators, who are testing this feature in an actual production environment, and reporting many bugs. After coming back from the Bangalore Developer Camp, the team set out to fix the last major bugs in the ULS, and most notably:

Now, all buttons use web fonts and are readable.

Currently, the Universal Language Selector supports 68 keyboard layouts and 44 web fonts, and both numbers are growing. New fonts and keyboards are added according to the needs of reader and editor communities around the world.

In other news:

  • We held Language Engineering office hours on November 21.
  • Web fonts support was deployed to the Persian Wikipedia, but unfortunately reverted after the users found several issues with font rendering. The team hopes to fix the problems and deploy web fonts again, for the benefit of all the users who do not have good fonts installed on their computers and devices.
  • Niklas Laxström created the first test release of the MediaWiki Language Extension Bundle, an easy-to-install package of stable versions of several MediaWiki extensions that improve its multilingual support. It keeps your MediaWiki site’s interface translations up-to-date and includes “language skills” boxes, rich locale data, easy translation of content pages and site interface, and the aforementioned UniversalLanguageSelector, which helps users select the language.
  • I created a keyboard mapping for easy typing in the International Phonetic Alphabet (IPA), based on the SIL IPA layout. The IPA is very commonly used as a pronunciation guide in Wikipedia and Wiktionary, and the deployment of the Universal Language Selector will make typing in IPA easier. Other IPA layouts may be easily added, for example X-SAMPA. You are very welcome to try this layout in translatewiki.net: click any text field, and select the English language and the SIL IPA layout in the keyboard layout pop-up.

A screenshot of MediaWiki with jquery.ime and the word ‘milkshake’ written in IPA.
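To give a rough idea of how a keyboard mapping of this kind can work, here is a minimal Python sketch of a rule-based input method. The key sequences in `RULES` are made up for illustration and are not taken from the actual SIL IPA layout or from the jquery.ime rule format; the function name `transliterate` is also hypothetical.

```python
# Illustrative key sequences only; these are NOT the real SIL IPA rules.
RULES = {
    "s=": "ʃ",  # hypothetical: "s" followed by "=" gives esh
    "n>": "ŋ",  # hypothetical: "n" followed by ">" gives eng
    "e<": "ɛ",  # hypothetical: "e" followed by "<" gives open e
    "i=": "ɪ",  # hypothetical: "i" followed by "=" gives small capital i
}

# Length of the longest key sequence, so longer rules are tried first.
MAX_LEN = max(len(keys) for keys in RULES)


def transliterate(typed):
    """Replace known key sequences with IPA symbols, longest match first."""
    out = []
    i = 0
    while i < len(typed):
        for size in range(MAX_LEN, 0, -1):
            chunk = typed[i:i + size]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule matched at this position: keep the character as typed.
            out.append(typed[i])
            i += 1
    return "".join(out)


print(transliterate("mi=lks=e<k"))  # prints "mɪlkʃɛk" with the rules above
```

Adding another layout, such as one modelled on X-SAMPA, would in this sketch only mean supplying a different rule table.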

The team’s next sprint marks the beginning of a new release, during which we’ll start implementing a major overhaul of the user interface of translatewiki.net.

Amir E. Aharoni, Software Engineer (Internationalization)

Wikipedia Engineering DevCamp sees a lot of energy and contributions in Bangalore

On November 9-11, the Wikimedia Foundation held a developer meetup in Bangalore, India. The gathering provided an opportunity for India-based developers to work with the Foundation’s engineering teams on several projects, such as JavaScript-based language engineering tools, and mobile applications with PhoneGap and LAMP technologies.

The DevCamp focused on Language Engineering, Mobile development and User interaction and experience design (UI/UX). It was attended by more than 85 developers, UX/UI designers, Wikimedians and translators. The work sessions focused on developing various Wikimedia mobile apps as well as language tools. The first day of the DevCamp kicked off on Friday with tutorials on Developing mobile applications with PhoneGap by Brion Vibber and How to internationalize your code by me. An interactive Q&A concluded the day, with many challenging and interesting questions about both tutorials.

The second day started off with Santhosh Thottingal introducing Project Milkshake (the team’s JavaScript-based internationalization libraries) and the Universal Language Selector currently under development. The mobile team introduced various mobile projects like native mobile apps, mobile front-end, and VUMI-based feature phone apps that power Wikipedia Zero. Interaction designer Pau Giner introduced design projects and guided new contributors. People started selecting projects they were interested in and teamed up with Wikimedia engineers. It was exciting to see some contributors make their first-ever open-source commits during the DevCamp. People continued to hack throughout the two days.

The final day of the DevCamp started with stand-up updates from all participants, and ended with demos and presentations of 18 projects by 25 presenters. One of the loveliest updates was presented by Lakshmi, who learned to type in her language, Malayalam, using the typing tools that Wikimedia engineers have developed.

A screenshot of a mathematical formula rendered using the MathJax library, with a context menu in the Tamil language.

Accomplishments at the DevCamp include contributions to language engineering projects, where contributors added unit tests to jquery.ime (the input method library for multiple language scripts), submitted bug fixes, and tested and actively reported bugs in jquery.ime and the Universal Language Selector. Another highlight was Brion Vibber’s integration of the Universal Language Selector, WebFonts and support for language variants into the Wikipedia mobile app. One of the contributors, Ershad, built a Google Chrome extension based on the input method library jquery.ime and won a Wikimedia shoulder bag for it. Other highlights include patches submitted to MathJax (a library used to render mathematical equations on HTML pages) by Aditya Ravi Shankar and me to add internationalization support.


On the mobile platform, Swayam made enhancements to the Translate proofreading mobile app. Other mobile apps developed at the DevCamp include a Commons uploader and an app to track recent changes. Patches were also submitted to MobileFrontend and an iOS client library, and a first working version of the Wikipedia FirefoxOS app was produced.

On the UI/UX design projects, participants worked on ideas for redesigning the translatewiki.net home page, a mobile Universal Language Selector, and Commons discovery and triaging apps. Here’s a complete list of demonstrations that were made at the Bangalore DevCamp; you are welcome to join the coding fun!

All in all, the DevCamp maintained a high energy level throughout the three days and produced a lot of new code, bug fixes, input keymaps, unit tests, mobile apps, translation UI and mobile designs, and positive collaboration across the board.

Amir E. Aharoni, Software Engineer (Internationalization)

Group photo on the lawn of the IIM Bangalore.

Lead our development process as a product adviser or manager

Would you like to decide how Wikimedia sites work? You can be a product adviser or a product manager, as a volunteer, and guide the work of Wikimedia Foundation developers.

What is a product manager? As Howie Fung, the head of WMF’s product team, recently explained, when we create things on our websites or mobile applications that readers or editors would use,

there are a basic set of things that need to happen when building a product….
  1. Decide what to build
  2. Design it
  3. Build it
  4. Measure how it’s used (if you want to improve the product)
Roughly speaking, that’s how we organize our teams when it comes to building features. Product Managers decide what features to build, Designers design the feature, Developers build the feature, and Analysts measure how the features perform.

So, a product manager works with the designers, developers, and analysts to identify and solve user problems, while representing the users’ point of view. As Fung put it,

there should be someone responsible for ensuring that the various ideas come together into a coherent whole, one that addresses the problem at hand. That responsibility lies with the Product Manager.

Why do you need volunteers? While the Wikimedia Foundation has hired full-time product managers for the most pressing features our engineers are developing, that leaves us with several ongoing projects that don’t get enough product management. The WMF needs your help to: track the progress of these improvements; comment on tasks or proposals; reach out to the Wikimedia reader and contributor communities to ask for feedback via wikis, mailing lists, and IRC; help developers see what users’ needs are; and set priorities on bugs and features, thus deciding what developers ought to work on next. Here are a few of those activities:

  • File storage, especially regarding Wikimedia Commons. Engineers have been trying to improve our storage system using the Swift distributed filestore but need your help to make sure we do it right.
  • Prioritizing shell requests. When Wikimedians request configuration changes to the wikis, systems administrators can use help understanding which of them are urgent and which of them don’t actually have the necessary consensus.
  • Operations requests from the community. It’s not just shell requests. Right now we have 93 open bugs requesting attention from our systems administrators, and those requests could use prioritization and organization.
  • Data dumps. Wikimedia offers many ways to download Wikimedia data at dumps.wikimedia.org. Your help would improve tools related to import, or conversion to SQL for import, to make it easier for others to use these datasets.
  • Wikimedia Labs. The sandboxes in Wikimedia Labs will host bots, tools, and test and development environments; can you organize the advice on the roadmap and what those communities will need?
  • Admin tools development: WMF engineer Chris Steipp works on tools to help fight vandalism and spam, including major bugfixes and minor feature development to make the lives of stewards and local sysops a little easier. What’s most urgent on his TODO list?

Volunteer product manager Jack Phoenix put together a detailed roadmap that was incredibly useful to guide the work of Wikimedia engineers on features like anti-spam tools.

Has anyone tried this? The first Wikimedia volunteer product manager was User:Jack Phoenix, who created the admin tools roadmap this summer, detailing a rationale for what should be done when. Jack originally signed up because:

this is just something that I know pretty well and hence why I want to be a part of this project and the team….
I want editors to be able to focus on editing — content creation, tweaking, fine-tuning… — instead of having to play whack-a-mole against spambots and vandals all the time. I have plenty of experience in playing whack-a-spambot, and I’m hoping to use that experience to improve WMF sites and also third-party sites…

It’s perfectly fine for the role of volunteer product manager to be a time-limited engagement. For example, Jack did amazing work for three months creating the roadmap. In retrospect, Jack Phoenix has estimated that to manage a product as broad as the admin tools suite, and to do it well, would take at least an hour per day if not two or three; due to time constraints, Jack has now stepped down from the role and is seeking a successor. Thanks for laying the groundwork, Jack! While we’re sad to see Jack go, we’re thankful for the roadmap and we continue to benefit from it.

If that kind of commitment sounds too burdensome, consider becoming a volunteer product adviser first. You’d do some of the same tasks as a product manager, to help check that the feature we’re building actually meets Wikimedians’ needs, and give your own opinion as well. But there wouldn’t be ownership or leadership attached, and the time commitment would be lighter.

What next? The goal of the Engineering Community Team is to have at least two Wikimedia volunteers engaged in product management work by the end of December. Talk with us and check out whether this is something you’d like to try!

To get involved, contact Sumana Harihareswara or Guillaume Paumier.

Sumana Harihareswara
Engineering Community Manager

Writing Malayalam on Wikipedia, just like with pen and paper

Lakshmi Valsalakumari is an IT professional who wants to expand her horizons. She attended the recent Wikimedia Developers Camp in Bangalore and had this story to tell:

A man and a woman working together at a laptop computer

Lakshmi with Santhosh Thottingal, the lead developer of Wikimedia’s font and keyboard tools

I have been an Information Technology professional working with well-known software organizations over the last 15 years. While IT has been keeping me busy, productive and happy, I have also all along harbored an interest in history and the humanities. I have recently decided to pursue these interests full-time, joining a research program at the Centre of Exact Humanities, International Institute of Information Technology, Hyderabad, India.

With my recent shift into academics and research, I have been referencing Wikipedia quite a bit in the last two to three months, and I have been amazed at the sheer magnitude of information found on it. While I have been reading Wikipedia pages extensively, I had never considered editing them, not even in English, the language I reference Wikipedia most in, and the one I use most on computers.

Editing and contributing content in Malayalam, my mother tongue, had not really occurred to me either—Malayalam being a language I hardly used on my computers—until I attended the Bangalore Wikimedia Dev Camp.

I have tried typing Malayalam using my regular browser, but I have not been very happy with the effect. This was not the way I liked to see Malayalam written and rendered, so I had not made any further efforts to write Malayalam online. At the camp, I met Santhosh and Manoj—avid Malayalam Wikipedia contributors—and they persuaded me to give it another shot.

The first step was to download the Meera Unicode font for Malayalam, then to change my default browser to one of those that can render Meera well (I tried out Google Chrome; Firefox was even better, I was told), and then to try out typing Malayalam using the regular English keyboard.

I liked what I saw. When I typed the suggested key combinations, even complicated Malayalam letter combinations were being rendered the way I would have written them using pen and paper. I tried more and more combinations—ta, tha, tta, Ta, tma, thra, tya, zha—and was pleased with the effect. This was fun!

The words "Catalonia" and "Lakshmi" typed in Latin transliteration and in Malayalam letters

Demos of how transliteration keyboards for Malayalam work

Soon, I was creating my first article. I noticed that on the main Wikipedia page, an article on Barcelona mentioned Catalonia as a red link, meaning that no further information was available in the Malayalam Wikipedia on it, whereas there was plenty of information on the same subject in the English Wikipedia. Manoj guided me through the steps as I created my first page in the Malayalam Wikipedia, copied the template information over from the English article and saved the heading, trying to get it right in Malayalam. I viewed my saved efforts, and with a sense of achievement, I went to grab a coffee.

Back online with my coffee, I was surprised to find a message on the article Talk page—someone had already posted a comment on the page I had just saved, chiding me for the lack of content and references. “This will drive away people from Wikipedia,” the post read. “Please ensure I get enough content on the page!”

Man, that was fast! I had no idea people were watching and following Wikipedia edits this closely. Manoj encouraged me to type more, so I returned to my effort. While I was getting comfortable with the typing, I was still grappling for suitable words in Malayalam for the content I was reading in English. Manoj suggested Olam, an online dictionary, and sure enough, I was able to find several of the Malayalam equivalents I was searching for.

And so, I typed on. Again, to my surprise, I found people editing the content and giving helpful suggestions even as I was still typing—one person told me to leave native names as such and not translate those, and another formatted some of the changes. By the end of the day, I had posted a decent amount of info, although there remained much more to be added.

I was happy with my day’s work. I had never imagined that using Malayalam on my computer and editing the Malayalam Wikipedia content would be such a pleasant and enjoyable experience, one that I was actually looking forward to!

Another point I must mention here is the sheer volume of Malayalam content that I have started seeing online, on Wikipedia pages and elsewhere. This must be due to the attention paid to this field of languages, literature and culture online by movements like Wikimedia. In 2005, I remember searching online for a well-known Malayalam lullaby Omanathingalkkidavo by Irayimman Thampi, but could not find anything. I had then resorted to the memories of my immediate relatives to try and pen the forgotten lyrics. Now, when I search for the same, the amount of material that comes up on that lullaby is amazing!

My heart-felt appreciation to Wikipedia and all its online community members who have made all of this possible. I hope to be part of this movement myself and do my bit toward furthering easy availability of multi-lingual content online.

Lakshmi Valsalakumari


The Wikimedia Language Engineering team is developing technologies that make it possible for speakers of all languages to contribute to Wikipedia in their language as easily and naturally as possible. Lakshmi’s story is an example of how these technologies enable people to develop reference and educational content that makes Wikipedia useful to people in the whole world. These technologies are deployed in Wikipedias in most languages of India, and more languages and projects are being added all the time.

Amir E. Aharoni, Software Engineer (Internationalization)

OpenSource Language Summit

The Wikimedia Foundation and Red Hat co-organized an Open Source Language Summit in Pune, India on November 6-7, 2012. The summit focused on language tools and technology development to support languages on Wikipedia, the Web, Linux and other Open Source platforms.

Santhosh Thottingal presenting his talk on jquery.ime

In total, 45 core language technology developers, open source contributors, typographers and technology evangelists from the Wikimedia Language Engineering and Mobile teams, Red Hat, Mozilla Foundation, KDE, GNOME, translatewiki.net and other open source projects participated in sessions and work sprints on internationalization and localization features supporting various open source projects on the web and Linux. After brief introductory talks, we focused our work on font support, input method tools, language search, and web and localisation standards.

Highlights: 

The event had short talks on the following topics:

Selected achievements

The following people won prizes for their code contributions during the event:

  • Anish Patil ported the Universal Language Selector’s cross-language search algorithm to GNOME language search.
  • Aravinda VK wrote a set of FontForge Python wrappers to make changes to fonts programmatically. Aravinda also fixed a few bugs in the Kannada Gubbi font for the HarfBuzz rendering engine and wrote a Kannada KGP keymap for jquery.ime.
  • G Karunakar added a Hindi InScript keyboard layout to Firefox OS Gaia.

Other accomplishments included:

  • Kushal Das added patches to deploy the Universal Language Selector on http://www.mozilla.org, and also a patch for a bug on the Mozilla localization platform.
  • Alolita, Sankarshan, Runa and Satish discussed APIs for various translation workflows and put together an initial specification.
  • Rajeesh Nambiar, Hussain KH, Ani Peter, Praveen A and Pravin Satpute fixed and filed upstream bugs for Malayalam, Kannada, Gujarati and Punjabi fonts with HarfBuzz.
  • Parag Nemade added InScript2 keyboards for Sanskrit, Nepali, Marathi and Konkani to jquery.ime.
  • Ankit Gadgil wrote over 200 unit tests for Marathi and Hindi input methods in jquery.ime.
  • Yuvaraj Pandian, Pau Giner, Arun Ganesh and Siebrand Mazeland developed an initial version of an Android-native app for translatewiki.net for translation reviews.
  • Pau Giner conducted user testing of new translation prototypes with translators. Arun Ganesh created an icon for gnome-transliteration.

You can browse through tweets and more notes from the event. Happy reading!

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

Apply for the FOSS Outreach Program for Women internships

Women interested in open source software activities are encouraged to apply for the 2013 FOSS Outreach Program for Women internships. This initiative is organized by the GNOME Foundation and supported by the Wikimedia Foundation and other leading software projects, including Mozilla, OpenStack, Fedora and more. Successful applicants will be granted $5,000 USD for three months of remote work on a specific software project, and will be supported by a mentor related to the project. The Wikimedia Foundation is granting 3 internship positions, with a possibility to expand depending on the proposals received.

Outreach poster for internships

If you are into software development and you are available for a full-time internship between 2 January and 2 April 2013, then this program is for you. Apply also if you don’t code but are fluent in other technical activities, like testing, writing documentation, systems administration or developer advocacy. College students from the Southern Hemisphere who have a school summer break during most of this time are particularly encouraged to apply. The internships are remote and it doesn’t matter where you live. As long as you have the equipment and an Internet connection, you are set to apply.

Getting started is easy:

  1. Find an interesting project among the existing proposals, or suggest your own. We have stellar products like the MediaWiki CMS, extensions like the Visual Editor or Upload Wizard, the mobile web and apps, and plenty more.
  2. Once you have decided on a project, talk to a mentor to get a small task, and complete it. This practical step will be the most important piece of your application.
  3. Apply by December 3rd!

Do not hesitate to contact us with questions and draft proposals at any point. The sooner the better: we can point you in the right direction and save you time and work. Joining the developer community channels will also help.

Check the list of proposed Wikimedia projects. Learn more about how the program works and about the other organizations joining.

One of the priorities of the Wikimedia Foundation is to increase women’s participation and involvement in all our initiatives. Technical activities are the area where the gender disparity is most acute. We are very happy to support the FOSS Outreach Program for Women and we are looking forward to seeing its results.

You can help promote this initiative among your colleagues, at universities, and with women’s associations and technical communities in general. Spread the word!

Quim Gil, Technical Contributor Coordinator

Translate Wikidata’s user interface and open it to the world

Wikidata is one of the most important and exciting innovations in the world around Wikipedia. To make it accessible to a wide range of users, it needs its user interface to be translated to as many languages as possible, and you can help.

At the first stage, already partly enabled, Wikidata stores “interwiki links”, i.e. page metadata that connect articles about the same topic on different language versions of Wikipedia. Historically, these interwiki links have been duplicated and stored in each of the pages they linked together. With Wikidata, the list of pages about the same topic is centralized.
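The difference between the two storage models can be sketched in a few lines of Python. The data layout and the names `per_page_links`, `central_items` and `links_for` are assumptions made for illustration; they do not reflect Wikidata's actual data model or API.

```python
# Old model: each page stores its own copy of the links to every other language.
per_page_links = {
    ("en", "Catalonia"): {"ca": "Catalunya", "es": "Cataluña"},
    ("ca", "Catalunya"): {"en": "Catalonia", "es": "Cataluña"},
    ("es", "Cataluña"): {"en": "Catalonia", "ca": "Catalunya"},
}

# Centralized model: one record per topic lists the page title in each language.
# The key "item-1" is a placeholder identifier invented for this example.
central_items = {
    "item-1": {"en": "Catalonia", "ca": "Catalunya", "es": "Cataluña"},
}


def links_for(language, title):
    """Look up a page's links to other languages from the central records."""
    for titles in central_items.values():
        if titles.get(language) == title:
            return {lang: t for lang, t in titles.items() if lang != language}
    return {}


# Both models give the same answer, but the central one stores each title once.
assert links_for("en", "Catalonia") == per_page_links[("en", "Catalonia")]
```

In this sketch, adding an article in a new language means one change to the central record instead of an edit to every existing page on the topic.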

The next goal of Wikidata is to store not only page metadata like interwiki links, but also common data that is repeated in all languages, such as census data for cities and dates of birth and death of famous authors.

Practically all the projects that are related to Wikipedia are massively multilingual, but Wikidata is especially so: it stores common data with the goal of displaying it efficiently in all languages.

The very useful and famous CIA World Factbook site has tables of data about all countries in the world, but the labels are only written in English. Now imagine a site with such tables, but with the ability to display the labels in any language and not just English: that’s what Wikidata aims to become.
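As a rough illustration of language-independent data with translatable labels, the sketch below stores a value once and looks up its label in the reader's language. The entity name, the population figure, the fallback to English and the function `render_row` are all assumptions for this example, not a description of how Wikidata is implemented.

```python
# One language-independent value, stored once (placeholder figure).
facts = {"example-country": {"population": 1000000}}

# Labels for each property, translated separately per language.
labels = {
    "population": {"en": "Population", "es": "Población", "fr": "Population"},
}


def render_row(entity, prop, language, fallback="en"):
    """Render one table row, using the fallback label if no translation exists."""
    translations = labels[prop]
    label = translations.get(language, translations[fallback])
    return f"{label}: {facts[entity][prop]}"


print(render_row("example-country", "population", "es"))  # Población: 1000000
print(render_row("example-country", "population", "ml"))  # Population: 1000000
```

The second call shows why translating labels matters: until a label exists in the reader's language, this sketch can only fall back to English.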

In the near future, the translation of such table labels will be done on the Wikidata website itself. In the meantime, you can help by translating the user interface displayed by the software running Wikidata.

Translation of the Wikidata software is done on translatewiki.net, the same translation platform used to translate Wikipedia’s interface. Wikidata relies on three main components that need translating: Wikibase – Repo, Wikibase – Client and Wikibase – Lib.

Wikipedia made encyclopedic articles open and accessible; Wikidata is about to do the same for statistics and other structured information. To ensure that people speaking your language can benefit from the immense potential of Wikidata, and contribute to its success, please join us today and help us translate it.

Thank you!

Amir Aharoni
Software Engineer (Internationalization)

Introducing Wikipedia’s new HTML5 video player

A new video player has been enabled on Wikipedia and its sister sites, and it comes with the promise of bringing free educational videos to more people, on more devices, in more languages.

The player is the same HTML5 player used in the Kaltura open-source video platform. It has been integrated with MediaWiki (the software that runs Wikimedia sites like Wikipedia) through an extension called TimedMediaHandler. It replaces an older Ogg-only player that has been in use since 2007.

The new player supports closed captions in multiple languages.

Based on HTML5, the new player plays audio and video files on wiki pages. It brings many new features, like advanced support for closed captions and other timed text. By allowing contributors to transcribe videos, the new player is a significant step towards accessibility for hearing-impaired Wikipedia readers. Captions can easily be translated into many languages, thus expanding their potential audience.

TimedMediaHandler also comes with other useful features, like support for the royalty-free WebM video format. Support for WebM makes it possible to seamlessly import videos encoded to that format, such as freely-licensed content from YouTube’s massive library.

Even further behind the scenes, TimedMediaHandler adds support for server-side transcoding, i.e. the ability to convert from one video format to another, in order to deliver the appropriate video stream to the user depending on their bandwidth and the size of the player. For example, support for mobile formats is available, although it is not currently enabled.
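The selection step described above can be sketched as follows. The list of derivatives, their bitrates and the function `pick_derivative` are hypothetical values chosen for illustration; TimedMediaHandler's actual derivative settings and selection logic are not described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivative:
    name: str
    height: int  # frame height in pixels
    kbps: int    # approximate bitrate needed for smooth playback


# Hypothetical transcodes of one source video, smallest first.
DERIVATIVES = [
    Derivative("160p", 160, 200),
    Derivative("360p", 360, 700),
    Derivative("480p", 480, 1200),
    Derivative("720p", 720, 2500),
]


def pick_derivative(player_height, bandwidth_kbps):
    """Pick the largest stream the connection can sustain, stopping once
    a stream is at least as tall as the player."""
    best = DERIVATIVES[0]  # always fall back to the smallest stream
    for candidate in DERIVATIVES:
        if candidate.kbps > bandwidth_kbps:
            continue  # too heavy for this connection
        best = candidate
        if candidate.height >= player_height:
            break  # already covers the player size
    return best


print(pick_derivative(400, 5000).name)  # 480p: fast link, small player
print(pick_derivative(720, 800).name)   # 360p: large player, slow link
```

The fallback to the smallest stream is a design choice of this sketch: a viewer on a very slow connection still gets something playable rather than nothing.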

The player’s “Share” feature provides a short snippet of code to directly embed videos from Wikimedia Commons in web pages and blog posts, as is the case here.

Sponsored by Kaltura and Google, developers Michael Dale and Jan Gerber are the main architects of the successful launch of the new player. With the support of the Wikimedia Foundation’s engineering team and Kaltura, they have gone through numerous cycles of development, review and testing to finally release the fruits of years of work.

Efforts to better integrate video content into Wikipedia and its sister sites date back to early 2008, when Kaltura and the Wikimedia Foundation announced their first collaborative video experiment. Since then, incremental improvements have been released, but the deployment of TimedMediaHandler is the most significant achievement to date.

Universal Language Selector now has Input Methods

The Language Engineering team at the Wikimedia Foundation works on a set of tasks every two weeks. This post is about the team’s accomplishments over the past two weeks.

Have you ever sat at a computer in a foreign country, and wondered how you were going to enter text in your language using a keyboard with a different alphabet?

“Input methods” are interfaces that allow users to enter text in a script different from the one used on their keyboard. On some Wikipedia versions (like wikis in Indic languages), such a tool has been available through the Narayam extension.

As part of Project Milkshake, this feature has recently been exported to a JavaScript library (a bundle of code, called jquery.ime) so that it could be reused by other web developers.

Another language-related tool, the Universal Language Selector (ULS), allows readers of Wikipedia and its sister sites to easily pick the language of their choice for the website’s interface.

Over the last two weeks, we’ve integrated the input methods’ functionality directly into the Universal Language Selector: it now comes with a large set of input tools that users can use to input text in languages written in non-Latin scripts.

The integration of the two tools makes the interface more consistent and usable when it comes to choosing languages in which to read (“display”) and to write (“input”) on the site: both settings are located in the same dialog of the Universal Language Selector.

When selecting a language in which to write, it’s possible to set an accompanying preferred input method for that language, if available. When input methods have been assigned to different writing languages, switching between languages in the menu will automatically change to the preferred input method for that language.
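One way to picture this behaviour is a small preference store that remembers an input method per writing language. The class `InputPreferences`, its method names and the input method identifiers below are invented for this sketch and do not mirror the ULS or jquery.ime code.

```python
class InputPreferences:
    """Remembers the preferred input method for each writing language."""

    def __init__(self):
        self._preferred = {}      # language code -> input method identifier
        self.language = None      # currently selected writing language
        self.input_method = None  # None means the keyboard's native input

    def set_preferred(self, language, input_method):
        """Store a preference; apply it at once if that language is active."""
        self._preferred[language] = input_method
        if language == self.language:
            self.input_method = input_method

    def switch_language(self, language):
        """Switch writing language and activate its preferred input method."""
        self.language = language
        self.input_method = self._preferred.get(language)
        return self.input_method


prefs = InputPreferences()
prefs.set_preferred("ml", "ml-transliteration")  # hypothetical identifiers
prefs.set_preferred("hi", "hi-inscript")

print(prefs.switch_language("ml"))  # ml-transliteration
print(prefs.switch_language("en"))  # None: no input method assigned
```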

Other language engineering news in brief:

  • The Language Engineering team will be in India during the second week of November to participate in the OpenSource Language Summit in Pune, and the Wikimedia DevCamp in Bangalore. For new volunteers who want to get started contributing to our tools, we’ve prepared a list of bugs that you can work on at these events with our support.
  • We’ve also worked on finalizing the development plan and features for Translate UX improvements, which were identified by user testing with volunteer translators to improve translation efficiency.
  • We’ve worked on how to get metrics on the impact of our tools through URL-based usage data gathering. Feedback is welcome.
  • We’ve fixed some bugs related to the ULS and gender support in MediaWiki and MediaWiki extensions.
  • The Narayam and Webfonts extensions were deployed to Wikimedia sites in Marathi; Narayam was also deployed to sites in Amharic.
  • An early stable version of ULS was deployed on Wikidata; this first use on a production site revealed a few bugs that were fixed. It will be updated to the latest stable version periodically.

Srikanth Lakshmanan
Internationalisation/Localisation Outreach / QA Engineer

Wikimedia engineering October 2012 report

Major news in October includes:

Note: As of last month, we’re proposing a shorter and simpler version of this report for less technically savvy readers.
