Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Features

Introducing Beta Features

The Beta Features preferences page.

We’re pleased to announce Beta Features, a way you can try out new features on Wikipedia and other Wikimedia sites before they are released for everyone.

Beta Features lets developers roll out new software in an environment where lots of users can use these features, then give feedback to help make them better.

You can think of it as a digital laboratory – where community members can preview upcoming changes and help designers and engineers make improvements based on their suggestions.

Help design Wikipedia’s next-generation discussion system

Discussions are the backbone of all Wikimedia projects. Whether it’s finding a reliable source, settling on spelling and punctuation conventions, or picking an article to feature on the main page, our community of volunteer editors makes countless decisions each day simply by talking to each other. However, the way that editors communicate today – using freeform wiki pages – is confusing and difficult for new users to grasp. Flow is the Wikimedia Foundation’s project to create discussion and collaboration software that improves the experience for all our users, letting them focus on creating and improving content instead of mastering the talk page format.

When comments and discussion first appeared on the Internet, they brought the promise of brilliant minds discussing the issues of the day in a thoughtful, courteous fashion. Instead, what we got was a lot of: “FIRST POST!” “Jake sucks,” “Kylla rulez”, and “aliens caused climate change!!!” The Internet world dealt with this problem in various ways: by locking down poster permissions, paying staff to moderate content, or even turning comments off entirely.

Wikipedia and its sister projects face some different challenges – while the content of the encyclopedia grows in size and quality through peer-to-peer discussion and collaboration, the fact that anyone can participate in this process is still not obvious to most people who use Wikipedia as a resource. We know that a small, homogeneous contributor pool leads to gaps in knowledge and biased content, as well as overworked and frustrated editors. There are countless potential contributors who could pitch in to help, but who are dissuaded from participating in content discussions because of intimidating software. But, like other online discussion spaces, we also need to balance openness with tools to keep discussions productive and healthy.

The Autonym Font for Language Names

When an article on Wikipedia is available in multiple languages, we see the list of those languages in a column on the side of the page. The language names in the list are written in the script that the language uses (also known as language autonym).

This also means that all the appropriate fonts are needed for the autonyms to be correctly displayed. For instance, an article like the one about the Nobel Prize is available in more than 125 languages and requires approximately 35 different fonts to display the names of all the languages in the sidebar.

Language Autonyms

Initially, this was handled by the native fonts available on the reader’s device. If a font was not present, the user would see square boxes (commonly referred to as tofu) instead of the name of a language. To work around this problem, not just for the language list, but for other sections in the content area as well, the Universal Language Selector (ULS) started to provide a set of webfonts that were loaded with the page.

While this ensured that more language names would be correctly displayed, the presence of so many fonts dramatically increased the weight of the pages, which therefore loaded much more slowly for users than before. To improve client-side performance, webfonts were set not to be used for the Interlanguage links in the sidebar anymore.

Removing webfonts from the Interlanguage links was the easy and immediate solution, but it also took us back to the suboptimal multilingual experience that we were trying to fix in the first place. Articles may be perfectly displayed thanks to webfonts, but if a link is not displayed in the language list, many users will not be able to discover that there is a version of the article in their language.

Autonyms were not needed just for Interlanguage links. They were also required for the Language Search and Selection window of the Universal Language Selector, which allows users to find their language if they are on a wiki displaying content in a script unfamiliar to them.

Missing font or “tofu”

As a solution, the Language Engineers came up with a trimmed-down font that only contains the characters required to display the names of the languages supported in MediaWiki. It has been named the Autonym font and will be used when only the autonyms are to be displayed on the page. At just over 50KB in size, it currently provides support for nearly 95% of the 400+ supported languages. The pending issues list identifies the problems with rendering and missing glyphs for some languages. If glyphs for your language are missing and you know of an openly-licensed font that can fill that void, please let us know so we can add it.
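The idea behind a trimmed-down font can be sketched in a few lines: collect only the characters that appear in the language names, since those are the only glyphs the font has to carry. This is a minimal sketch, not the Language Engineering team's actual build process; the function name `required_characters` and the three-entry sample table are assumptions made for illustration.

```python
def required_characters(autonyms):
    """Return the sorted list of characters needed to render every autonym."""
    chars = set()
    for name in autonyms.values():
        # Whitespace needs no glyph of its own, so skip it.
        chars.update(ch for ch in name if not ch.isspace())
    return sorted(chars)


# Hypothetical sample: a real table would cover every supported language.
SAMPLE_AUTONYMS = {
    "en": "English",
    "ru": "русский",
    "ja": "日本語",
}

if __name__ == "__main__":
    needed = required_characters(SAMPLE_AUTONYMS)
    print(len(needed), "characters needed:", "".join(needed))
```

Repeated characters are counted once, which is why a font built this way can stay small even when it covers hundreds of language names.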

The autonym font addresses a very specific use case. There have been requests to explore the possibility of extending the use of this font to similar language lists, like the ones found on Wikimedia Commons. Within MediaWiki, the font can be used easily through a CSS class named autonym.
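As a rough illustration of what using that class might look like, the sketch below builds the HTML for a short language list and marks each name with the autonym class. Only the class name comes from the description above; the helper `autonym_list_html`, the markup shape, and the sample data are assumptions.

```python
from html import escape


def autonym_list_html(autonyms):
    """Build an HTML list whose language names carry the `autonym` CSS class."""
    items = []
    for code, name in autonyms.items():
        # The `lang` attribute records which language the text is in;
        # the `autonym` class is the hook through which the font is applied.
        items.append(
            '<li><span class="autonym" lang="{}">{}</span></li>'.format(
                escape(code, quote=True), escape(name)
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    # Hypothetical two-entry list for demonstration.
    print(autonym_list_html({"ja": "日本語", "ru": "русский"}))
```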

The Autonym font has been released for free use with the SIL Open Font License, Version 1.1.

Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering, Wikimedia Foundation

Scientific multimedia files get a second life on Wikipedia

On Wikimedia projects, audio and video content has traditionally taken a backseat relative to text and static images (however, changes are underway). Conversely, more and more scholarly publications come with audio and video files, though these are — a legacy from the print era — typically relegated to the “supplementary material” rather than embedded next to the relevant text passages. And a rising number of these publications are Open Access, i.e. freely available under Creative Commons licenses that allow for the materials to be reused in other contexts.

Why not enrich thematically related Wikimedia pages with such multimedia files? That’s where the Open Access Media Importer (OAMI) comes in. It makes scientific video and audio clips accessible to the Wikimedia community and a broader public audience. The OAMI is an open-source program (or ‘bot’) that crawls PubMed Central — a full-text database of over 3 million biomedical research articles — and extracts multimedia files from those publications in the database that are available under Wikimedia-compatible licenses.
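The license check at the heart of that step can be sketched as a simple filter. This is a minimal illustration, not the OAMI's actual code: the record layout, the `COMPATIBLE_LICENSES` set, and the function name are assumptions. The set lists licenses that Wikimedia Commons accepts; non-commercial (NC) and no-derivatives (ND) variants are left out because Commons does not accept them.

```python
# Assumed license identifiers; a real crawler would need to normalize the
# many ways a license can be written in article metadata.
COMPATIBLE_LICENSES = {"CC0", "CC BY", "CC BY-SA"}


def reusable_media(records):
    """Keep only the media records whose license allows reuse on Wikimedia."""
    return [r for r in records if r.get("license") in COMPATIBLE_LICENSES]


# Hypothetical records standing in for supplementary files found by a crawl.
SAMPLE_RECORDS = [
    {"file": "video1.ogv", "license": "CC BY"},
    {"file": "audio1.oga", "license": "CC BY-NC"},
    {"file": "video2.ogv", "license": None},
]

if __name__ == "__main__":
    for record in reusable_media(SAMPLE_RECORDS):
        print(record["file"])
```

Records with a missing or unrecognized license are dropped rather than guessed at, which is the conservative choice for anything that will be re-published.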

Over 700 OAMI-contributed media files are currently used in Wikipedia and other Wikimedia projects. This X-ray video of a breathing American alligator, originally published by Claessens et al. (2009) in PLOS ONE, is currently used to illustrate the “Respiratory system” entries in the Bulgarian, Chinese, English, German, Russian, and Serbo-Croatian Wikipedias.

Such reuse-friendly terms are the key ingredient in making scholarly materials useful beyond the article in which they were originally published. However, OAMI aims to make these materials even more useful by making them accessible:

  • in places where people actually look for them (Wikimedia platforms are a prime example),
  • in one coherent format (in our case Ogg Vorbis/Theora, which isn’t encumbered by patent restrictions), and
  • in a way that allows for collaborative annotation with relevant metadata. This makes it a lot easier to browse and search the media files.

Notifications Launch on More Wikipedias

Notifications inform you of new activity that affects you on Wikipedia and let you take quick action.

We’re happy to announce the release of the Notifications feature on dozens of Wikipedias in many languages!

Notifications inform users of new activity that affects them, such as talk page messages or mentions of their names. The feature was developed this year by the Wikimedia Foundation’s Editor Engagement Team.

New languages

In recent weeks, we enabled Notifications on wikis in two dozen languages, including the Dutch, French, Japanese, Korean, Polish, Portuguese, Spanish, Swedish, Ukrainian and Vietnamese Wikipedias, to name but a few. In the coming weeks, we’ll be rolling out this engagement tool to many more sites, and we expect it to be enabled on most Wikimedia wikis by the end of 2013.

Community response has been very positive so far, across languages and regions. Users are responding particularly well to social features such as Mentions and Thanks (see below), which enable them to communicate more effectively than before.

For each release, we reached out to community members weeks in advance, inviting them to translate and discuss the tool with their peers. As a result, we have now formed productive relationships with volunteer groups in each project, and are very grateful for their generous support. We find this collaborative approach very effective and hope to expand on these partnerships for other product releases in the future.

New platforms

Notifications are now available on mobile devices as well. This will allow mobile users to stay up-to-date on events and activities that affect them on Wikipedia and other Wikimedia projects.

For this project, we were also glad to introduce HTML email to Wikipedia, to provide a more appealing user experience, with clear visual cues and less clutter than the plain text emails used until now.

We believe that supporting new platforms and formats like these is key to engaging millions of new users, who expect a modern notification experience across all their platforms.

Get introduced to Internationalization engineering through the MediaWiki Language Extension Bundle

The MediaWiki Language Extension Bundle (MLEB) is a collection of MediaWiki extensions for various internationalization features. These extensions and the Bundle are maintained by the Wikimedia Language Engineering team. Each month, a new version of the Bundle is released.

The MLEB gives webmasters who run sites with MediaWiki a convenient solution to install, manage and upgrade language tools. The monthly release cycle allows for adequate testing and compatibility across the recent stable versions of MediaWiki.

A plate depicting text in Sanskrit (Devanagari script) and Pali languages, from the Illustrirte Geschichte der Schrift by Johann Christoph Carl Faulmann

The extensions that form MLEB can be used to create a multilingual wiki:

  • UniversalLanguageSelector — allows users to configure their language preferences easily;
  • Translate — allows a MediaWiki page to be translated;
  • CLDR — is a data repository for language-specific locale data like date, time, currency etc. (used by the other extensions);
  • Babel — provides information about language proficiency on user pages;
  • LocalisationUpdate — updates MediaWiki’s multilingual user interface;
  • CleanChanges — shows RecentChanges in a way that reflects translations more clearly.

The Bundle can be downloaded as a tarball or from the Wikimedia Gerrit repository. Release announcements are generally made on the last Wednesday of the month, and details of the changes can be found in the Release Notes.

Before every release, the extensions are tested against the last two stable versions of MediaWiki on several browsers. Some extensions, such as UniversalLanguageSelector and Translate, need extensive testing due to their wide range of features. The tests are prepared as Given-When-Then scenarios, i.e. an action is checked for an expected outcome assuming certain conditions are met. Some of these tests are in the process of being automated using Selenium WebDriver and the remaining tests are run manually.
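To make the Given-When-Then shape concrete, here is a minimal sketch of one scenario written as a plain Python test. It is not taken from the team's test suite (which, per this post, uses Cucumber and Ruby for automation); the `LanguageSettings` class and its methods are hypothetical stand-ins for a feature under test.

```python
class LanguageSettings:
    """Hypothetical stand-in for a user's interface language preference."""

    def __init__(self, language="en"):
        self.language = language

    def select(self, language):
        self.language = language


def test_selecting_a_language_changes_the_interface_language():
    # Given a user whose interface language is English
    settings = LanguageSettings(language="en")

    # When the user selects Hindi
    settings.select("hi")

    # Then the interface language is Hindi
    assert settings.language == "hi"


if __name__ == "__main__":
    test_selecting_a_language_changes_the_interface_language()
    print("scenario passed")
```

Keeping the three phases visibly separate is the point of the format: the same scenario text can be followed by a person testing manually or executed by an automated runner.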

The automated tests currently run only on Mozilla Firefox. For the manual test runs, the Given-When-Then scenarios are replicated across several web browsers. These are mostly the Grade-A level supported browsers. Regressions or bugs are reported through Bugzilla. If time permits, they are also fixed before the monthly release, or otherwise scheduled to be fixed in the next one.

The MLEB release process offers several opportunities to participate in the development of internationalization tools. The testing workflow introduces participants to the features of the commonly used extensions. Finding and tracking bugs on Bugzilla familiarizes them with the bug lifecycle and also provides an opportunity to work closely with the developers while the bugs are being fixed. Writing a patch to fix a bug is the next step, and new participants are always encouraged to take it.

If you’d like to participate in testing, we now have a document that will help you get started with the manual tests. Alternatively, you could also help in writing the automated tests (using Cucumber and Ruby). The newest version of MLEB has been released and is now ready for download.

Runa Bhattacharjee
Outreach and QA coordinator, Language Engineering, Wikimedia Foundation

VipsScaler implementation by volunteer developer improves image handling on Wikimedia sites

A depiction of the Battle of Belmont, Second Boer War. This image is an example of image reduction in VipsScaler. The original file has a resolution of 52 megapixels

Loading and resizing large images within Wikimedia projects has become faster and more reliable with the rollout of VipsScaler, a wrapper around the free VIPS image processing software. VIPS is designed to use a small amount of memory when resizing images. This allows the wiki to create thumbnails of very large PNG files, which was previously not possible because of the large amounts of memory required. And while Wikimedia Foundation technical staff rolled it out, a volunteer wrote the code.

The most common type of image file on the internet is JPEG, but because its lossy compression degrades image quality with repeated editing, most non-photographic image files on Wikipedia and Wikimedia Commons are stored in PNG format, which uses lossless compression. Until VipsScaler, thumbnails of PNG files larger than 50 megapixels could not be created.

Volunteer Bryan Tong Minh was a student at Delft University of Technology in 2008 when he initially wrote a utility capable of downscaling PNG images without using huge amounts of memory. Active on Wikimedia Commons at the time, Tong Minh (User:Bryan) said he “was annoyed by the fact that large PNGs did not have thumbnails because the image scaler that we used, ImageMagick, could not efficiently scale large non-JPEG images.”

During the review period of the utility, he became aware of VIPS, which allows for memory-efficient scaling of image files — more than just PNGs. He then set out to implement an extension that would allow usage of VIPS with MediaWiki, which became the VipsScaler extension.

“For many years, PNG files over a certain size could not be displayed on Wikipedia. They could be downloaded, but gave an error when thumbnailed and, on their file page, made it appear that the file was corrupt,” said Adam Cuerden, whose restoration work on Wikipedia since 2007 accounts for four percent of all featured pictures on English Wikipedia. Cuerden said it wasn’t uncommon for PNG files to be marked for deletion because they could not be displayed.

Currently, VIPS scales PNG images from 35 to 140 megapixels, according to Commons contributor Brian Wolff. He points to the 72-megapixel image below of Abraham Lincoln, restored by Adam Cuerden, as an example of an image that previously could not have been rendered.
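A megapixel count is simply the pixel area divided by one million, so checking whether an image falls in the range quoted above takes one line of arithmetic. The sketch below is illustrative only: the function names are made up, and the 35 and 140 megapixel bounds are the figures quoted in this post, not values read from the VipsScaler configuration.

```python
# Bounds as quoted in the post, in megapixels (assumed, not read from config).
VIPS_MIN_MEGAPIXELS = 35
VIPS_MAX_MEGAPIXELS = 140


def megapixels(width, height):
    """Return the image area in megapixels (millions of pixels)."""
    return width * height / 1_000_000


def in_vips_range(width, height):
    """Report whether an image's area falls within the quoted VIPS range."""
    return VIPS_MIN_MEGAPIXELS <= megapixels(width, height) <= VIPS_MAX_MEGAPIXELS


if __name__ == "__main__":
    # Hypothetical dimensions chosen to give a 72-megapixel image.
    print(megapixels(9000, 8000), in_vips_range(9000, 8000))
```

A hypothetical 9,000 by 8,000 pixel image works out to 72 megapixels, which falls inside the quoted range.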

Language support at Wikimania 2013 in Hong Kong

With participants from more than 80 countries, Wikimania 2013 was a great opportunity for the Language Engineering team to meet with the users from very different language communities. We shared ideas about language tools with users from around the world, and the fact that Wikimania was held in Hong Kong this year was an opportunity to specifically discuss the support of our current and future tools for the Chinese language.

Extending language settings

During the developer days which preceded the conference, we discussed how the Universal Language Selector (ULS) could support languages with multiple variants and ordering methods:

  • Ordering and grouping. When ordering and grouping items in lists (such as pages in a category), different languages have different ordering rules. The problem arises when languages provide more than one way of ordering elements. In the case of Chinese, dozens of indexing schemes exist based on different criteria such as the number of strokes or the phonetic transcription to Latin (as is done for Pinyin).
  • Variant selection. Chinese comprises many regional language varieties. In order to offer users a local experience with minimal duplication, the Chinese Wikipedia allows editors to annotate variant differences in articles so that users can get the content adapted to their local variant.
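The ordering problem is easy to see with a small example: the same four characters sort differently depending on the scheme chosen. The sketch below is a toy illustration, not how MediaWiki or the ULS implements collation; the tiny hand-written table and the function names are assumptions, and real collation would rely on full locale data.

```python
# Hand-written sample data: character -> (stroke count, toneless pinyin).
SAMPLE = {
    "中": (4, "zhong"),
    "大": (3, "da"),
    "人": (2, "ren"),
    "一": (1, "yi"),
}


def sort_by_strokes(chars):
    """Order characters by their number of strokes."""
    return sorted(chars, key=lambda ch: SAMPLE[ch][0])


def sort_by_pinyin(chars):
    """Order characters by their Latin (Pinyin) transcription."""
    return sorted(chars, key=lambda ch: SAMPLE[ch][1])


if __name__ == "__main__":
    chars = list(SAMPLE)
    print("by strokes:", sort_by_strokes(chars))
    print("by pinyin: ", sort_by_pinyin(chars))
```

With this data, stroke order gives 一, 人, 大, 中, while Pinyin order gives 大, 人, 一, 中. Neither is wrong, which is why the choice has to be left to the user.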

Thanks to our conversations with Wikipedia editor and volunteer developer Liangent, we now better understand the context and the implications of those problems for the case of Chinese, which was key to informing our design process. By understanding the possible scenarios and the frequency of use of these features, we could better decide how prominently to present them to users. With this information, we extended the designs of the ULS to include both ordering and variant selection options, and could provide initial technical guidance on how to extend the modular ULS architecture to support the above features.

Extending the designs of the ULS to add language variant and sorting scheme selection (only when languages have more than one option for those).

 

Wikimedia projects support more than 300 languages with very different needs. As illustrated above, close collaboration with volunteers from the different language communities becomes essential to guarantee that all languages are properly supported. Please contact the Language Engineering team if you find that any particular aspect of your language is not properly supported in Wikimedia projects.

Notifications launch on mobile

Screenshot showing new notifications icon in corner

After many weeks of beta testing, the mobile development team is happy to announce this week’s release of notifications on the mobile site. Logged-in users can now receive notifications on the mobile versions of Wikimedia sites just like they can on the desktop sites. This will allow mobile users to stay up-to-date on events and activities that affect them on Wikipedia and other Wikimedia projects.

Currently supported notifications include:

  • Talk page messages: when a message is left on your user talk page;
  • Mentions: when your user name is mentioned on a talk page;
  • Page reviews: when a page you created is reviewed;
  • Page links: when a page you created is linked;
  • Edit reverts: when your edits are undone or rolled back;
  • Thanks: when someone thanks you for your edit;
  • User rights: when your user rights change;
  • Welcome: when you create a new account;
  • Getting started: easy ways for new users to start editing.

To try out notifications on mobile, log in to one of the mobile websites, such as en.m.wikipedia.org, and look for the icon in the top right corner of your screen (see screenshot). If you have new notifications, a red badge will be displayed on the icon. Clicking on the icon will take you to the notifications archive page. We are currently working on a new overlay interface for notifications on mobile that will allow you to read your new notifications without leaving the page you are on. Expect to see the new interface rolled out some time in the next few weeks.

Screenshot showing notifications

The release of notifications on mobile is part of a gradual process of bringing feature parity to the mobile websites. We want both readers and editors to feel comfortable using the mobile sites and able to accomplish most, if not all, of the same tasks they typically perform on the desktop sites. In the future, expect to see better support for talk pages, page history, and diff views, as well as further improvements to our editing and uploading interfaces on mobile. As always, we welcome your feedback and comments. Tell us what you think on Twitter @WikimediaMobile or on IRC in the #wikimedia-mobile channel.

Note that mobile notifications currently only work on sites that have the Echo extension enabled (en.wikipedia, fr.wikipedia, hu.wikipedia, pl.wikipedia, pt.wikipedia, sv.wikipedia, and Meta-wiki). Echo should be deployed to all language Wikipedias in the near future.

Ryan Kaldari
Software Engineer

Restoring the forgotten Javanese script through Wikimedia

There are several confusing and surprising things about the Javanese language. First, a lot of people confuse it with Japanese, or with Java, a programming language. Also, with over eighty million speakers, it is one of the ten most widely spoken languages in the world, yet it is not an official language in any country or territory.

Illuminated manuscript of Babad Tanah Jawi (History of the Javanese Land) from the 19th century.

Javanese is mainly spoken in Indonesia, on the island of Java, which gave its name to a popular variety of coffee. The only official language of that country is Indonesian, but Javanese is the main spoken language in its area. It is used in business, politics and literature. In fact, its literary tradition goes back to the tenth century, when an encyclopedia-like work titled Cantaka Parwa was written in it. Another Javanese encyclopedia was published in the nineteenth century, titled Bauwarna.

This tradition is being continued today by Wikipedians who speak that language: every day they strive to improve and enhance the Javanese Wikipedia, now having over forty thousand articles. One of them is Benny Lin. In addition to writing articles and explaining to people the Wikipedia mission, Benny’s special passion is making the Javanese language usable online not just in the more prevalent Latin alphabet but also in the ancient Javanese script.

This ancient script, also known as Carakan, was used for over a thousand years, and numerous books have been published in it. These days there’s little book publishing in it, though it is still used in some textbooks, in some Facebook groups and on public signs. Elsewhere the Latin alphabet is used more frequently. The younger generation is starting to forget the old script, and this rich heritage is becoming inaccessible. Benny hopes that transcribing classical literature for Wikisource and writing modern encyclopedic articles in this script will revive interest in it, help the Javanese people achieve greater understanding of their own culture, and make these largely unknown treasures of wisdom accessible to people of all languages and cultures.

Javanese Wikipedia article about Joko Widodo

Benny presented a talk about this at Wikimania in Hong Kong, the international gathering of Wikipedians. There he also worked with Santhosh Thottingal and me, developers from Wikimedia’s Language Engineering team, to improve the support for the Javanese script in Wikipedia. Thanks to this work, Wikipedias in all languages can now show text in the Javanese script, and readers don’t have to install any fonts on their computers, because the fonts are delivered using webfonts technologies. The exquisite Javanese script has many ligatures and other special features, which require the Graphite technology to display. As of this writing, the only web browser that supports it is Firefox, but Graphite is Free Software, and it may become supported in other browsers in the near future.

Benny also completed his work on Javanese typing tools for Wikipedia, so now the script can not only be read, but also written easily. This technology can even be used on other sites and not just Wikipedia, using the jquery.ime library.

He sees his work as part of a larger effort by many people who care about the script. Others design fonts, promote the script in different venues and research its literature. Benny saw that he could contribute by making the fonts and typing tools more accessible through Wikipedia, and he just did it.

Wikimedians believe that the sum of all knowledge must be freely shared by all humans, and this means that it must also be shared in all languages. Passionate volunteers like Benny are the people who make this happen.

Amir E. Aharoni
Software Engineer, Language Engineering team