Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

Technology

News and information from the Wikimedia Foundation’s Technology department.

New draft feature provides a gentler start for Wikipedia articles

For most of Wikipedia’s history, we encouraged editors to create new encyclopedia articles by publishing immediately. Just find a page that doesn’t exist, type in content, and after you hit save, it’s shared with the world. This helped Wikipedia grow to the millions of articles it has now, but the project has matured in many ways, and we need additional tools for creating great new encyclopedia articles.

Starting on the English-language Wikipedia, all users (registered or anonymous) now have the option to start drafts before publishing. A draft simply has “Draft:” before the title of the page you’re creating, like this example. Drafts do not appear in Wikipedia’s default search or in external search engines such as Google, though you may find them using the advanced search options.

Why we need drafts on Wikipedia

Wikipedia’s goal is to be the most comprehensive and reliable reference work in your language, so you might ask why we would encourage people to not publish their articles immediately so readers can enjoy them.

In small Wikipedias like Swahili or Estonian, you’d be right — we’ll probably encourage all authors to skip writing drafts. However, in larger Wikipedias where quality standards are very high, thousands of new articles are deleted (sometimes within just minutes) because they don’t meet essential requirements for what makes a good Wikipedia article.

Our most recent data from the English, Spanish, French, and Russian Wikipedias indicates that about 80% of the articles started by brand-new users are deleted. By creating a draft, authors will have more time and space to gradually work on a new topic, and can get constructive feedback from other editors. In fact, even advanced Wikipedia editors sometimes use sub-pages of their user profile (sometimes called “sandboxes”) as an unofficial draft space.

We should note that we don’t want drafts to prevent editors from following their current process for article creation. Wikipedia articles are all works in progress, even after publication, and this fact won’t change any time soon. We’re simply adding another option for people who want the time and space that drafts afford.

What’s next

This is a very early version of drafts on Wikipedia, and frankly it’s missing a great deal of functionality. In the future, we’ll be adding features to drafts that will make them more useful. We’re exploring different design concepts to make it easy to request and provide help during the draft process, better support the publication of drafts as articles (and moving them back to draft state if they need more work), and encourage collaboration between editors.

Design concepts for Search and editing of Drafts

If you’d like to help us in this effort, please sign up for a usability testing session. In these sessions, we’ll show you prototypes of new features and get your feedback. No prior experience with Wikipedia editing is required!

Pau Giner, User Experience Designer
Steven Walling, Product Manager

OpenDyslexic font now available on Polish Wikipedia

Screenshot of selecting the OpenDyslexic font

For those who suffer from dyslexia, the simple task of reading can become a monumental struggle. It can be hard for those who don’t suffer from dyslexia to understand exactly what it means to have it, and for this reason the condition can often go unaddressed. Fortunately there is hope in the form of the OpenDyslexic font.

With so much reading being done on computer screens, it is finally possible to help individuals with dyslexia. The OpenDyslexic font changes the shape of characters enough to make reading a lot easier for those who suffer from dyslexia.

Wikipedia already supported OpenDyslexic for many languages, but until recently not for Polish. At the CEE conference in Modra, Slovakia, we learned that Polish could be supported as well. The request to enable OpenDyslexic was quickly granted and it is now fully supported. We would like to celebrate this occasion with the larger dyslexic community, who we hope will benefit from this new feature on Polish Wikipedia.

Gerard Meijssen, Wikimedian

Language Engineering Events – Language Summit, Fall 2013

The Wikimedia Language Engineering team, along with Red Hat, organised the Fall edition of the Open Source Language Summit in Pune, India on November 18 and 19, 2013.

Members from the Language Engineering, Mobile, VisualEditor, and Design teams of the Wikimedia Foundation joined participants from Red Hat, Google, Adobe, Microsoft Research, Indic language projects, Open Source Projects (Fedora, Debian) and Wikipedians from various Indian languages. Google Summer of Code interns for Wikimedia Language projects were also present. The 2-day event was organised as work-sessions, focussed on fonts, input tools, content translation and language support on desktop, web and mobile platforms.

Participants at the Open Source Language Summit, Pune India

The Fontbook project, started during the Language Summit earlier this year, is slated to be extended to 8 more Indian languages. The project aims to create a technical specification for Indic fonts based upon the OpenType v1.6 specification. Pravin Satpute and Sneha Kore of Red Hat presented their work on the next version of the Lohit font family, which is based upon the same specification and uses Harfbuzz-ng. This effort is expected to complement the planned outcome of the Fontbook project.

The other font sessions included a walkthrough of the Autonym font created by Santhosh Thottingal, a Q&A session by Behdad Esfahbod about the state of Indic font rendering through Harfbuzz-ng, and a session to package webfonts for Debian and Fedora for native support. Learn more about the font sessions.

Improving the input tools for multilingual input on the VisualEditor was extensively discussed. David Chan walked through the event logger system built for capturing IME input events, which is being used as an automated IME testing framework available at http://tinyurl.com/imelog to build a library of similar events across IMEs, OSs and languages.

Santhosh Thottingal stepped through several tough use cases of handling multilingual input, to support the VisualEditor’s inherent need to provide non-native support for handling language content blocks within the contentEditable surface. Wikipedians from various Indic languages also provided their inputs. On-screen keyboards, mobile input methods like LiteratIM and predictive typing methods like ibus-typing-booster (available for Fedora) were also discussed. Read more about the input method sessions.

The Language Coverage Matrix Dashboard (LCMD), which displays language support status for all languages in Wikimedia projects, was showcased. The Fedora Internationalization team, which currently provides resources for fewer languages than the Wikimedia projects, will identify the gap using the LCMD data and assess the resources that can be leveraged for enhancing support on desktops. Dr. Kalika Bali from Microsoft Research Labs presented on leveraging content translation platforms for Indian languages and highlighted that, for Indic languages, machine translation (MT) could be improved significantly by using web-scale content like Wikipedia.

Learn more about the sessions, accomplishments and next steps for these projects from the Event Report.

Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering, Wikimedia Foundation

Wikimedia engineering report, November 2013

Major news in November includes:

Note: We’re also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.

Adding musical scores to Wikimedia

Sound and musical content have long trailed behind other subjects on Wikipedia, but that is beginning to change with a new musical scores extension for MediaWiki, the software running Wikipedia and thousands of other wikis. The Score extension was added to a MediaWiki deployment earlier this year and allows users to render musical scores as PNG images and transform them into audio and MIDI files.

Score utilizes the free music-engraving program LilyPond to produce musical notations and insert them into wiki code. This code is then passed on to a LilyPond renderer, which produces images that can be uploaded to Wikipedia articles. “This is somewhat similar to the way mathematical formulas are rendered in Wikipedia,” said Markus Glaser, a Wikimedian who helped develop the extension and gave a presentation on musical scores at Wikimania in 2012. Glaser said it made sense to use LilyPond because, in addition to being free and open source, “it’s text-based, can be easy, but possesses the complexities needed to fit the needs of advanced and professional notation.”

Over time, the hope is to expand on this extension and grow it into a viable resource, encouraging music teachers, music historians and the musicology community to use Score to share their knowledge.

“Studying music on the Internet is something that remains a bit confusing and fragmented. If you are after a musical performance, you can try and hunt one down on YouTube, Spotify or other similar sites,” said Chris Keating, Chair of Wikimedia UK and an amateur violinist, who explained how many of the necessary tools to analyze music still remain largely absent on the Internet. “If you’re after free sheet music, you will probably end up looking on IMSLP. And finally, if you want to read about something, say, music theory, you are likely to come to Wikipedia.”

After setup, users can embed simple LilyPond notation into wikitext using score tags.
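
As a rough illustration of what such an embed looks like, here is a minimal Python sketch that wraps a LilyPond snippet in score tags to produce the wikitext a user would type. The helper name `score_wikitext` and the sample melody are made up for this example; only the `<score>` tag itself comes from the extension.

```python
def score_wikitext(lilypond_notation):
    """Wrap LilyPond notation in <score> tags for use in wikitext.

    Hypothetical helper for illustration; it only builds a string and
    does not talk to MediaWiki or LilyPond.
    """
    return f"<score>{lilypond_notation}</score>"


# A short ascending phrase in LilyPond's relative pitch notation (sample data).
snippet = score_wikitext(r"\relative c' { c4 d e f | g2 g }")
print(snippet)
```

Pasting the printed line into a wiki page that has the Score extension enabled should render the phrase as notation.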

Introducing the Wikidata “Concept Cloud”

On Wikidata, a free knowledge base about the world that can be read and edited by humans and machines alike, all the Wikipedia articles on the same subject are bundled in a Wikidata item. These articles are written in different languages, and they all refer to other Wikipedia articles through wiki links. The articles those wiki links point to are represented by Wikidata items as well. When you aggregate all these items, you get all the concepts that are related to the original subject; together they make up a “concept cloud.”
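
The aggregation described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how the actual tool works: the item IDs and the `LINKS_BY_LANGUAGE` data are made up, and a real implementation would have to fetch the links from each language edition and map them to Wikidata items first.

```python
from collections import Counter

# Hypothetical sample data: for one subject, the Wikidata items that its
# article links to in each language edition (all item IDs are made up).
LINKS_BY_LANGUAGE = {
    "en": ["Q100", "Q200", "Q300"],
    "de": ["Q100", "Q300", "Q400"],
    "sw": ["Q100"],
}


def concept_cloud(links_by_language):
    """Count in how many language editions each linked item appears."""
    counts = Counter()
    for items in links_by_language.values():
        # Use a set so an item is counted at most once per language.
        counts.update(set(items))
    return counts


cloud = concept_cloud(LINKS_BY_LANGUAGE)
for item, editions in cloud.most_common():
    print(item, editions)
```

Items linked from many language editions end up at the top of the list, which is one plausible way to rank the concepts in a cloud.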

The beauty of a “concept cloud” is that, as all these subjects are related, they are likely to be more relevant when the subject of the “concept cloud” is in the news. It is assumed that what is considered “World News” will be of relevance to all the languages Wikidata supports. It is therefore likely that all the items in a “concept cloud” are more sought after in all these languages.

When you search for information, Wikidata provides more labels than Wikipedia provides articles. Even more powerful is the fact that, for Wikidata, every language is equal. For any language, Wikidata provides the same statements, the same links to Commons and the same visualization in the “Reasonator.”

Wikidata works best when labels exist for the concepts, the properties and the qualifiers that are used. As more labels for related concepts are available, the information becomes more complete. For best results, our challenge is to stimulate people to add more labels.

The new “Concept Cloud” tool presents you with the “concept cloud” for Wikidata items. You can select a language you know and see whether a label exists, and for existing labels you can check whether they are properly written. Labels should not be capitalized unless they are always capitalized, and they must use the standard spelling.

Every day we are going to present another item that is in the news. Some topics will return to the news, but we are confident that there is always something that can be done. As this is the first iteration of the “Concept cloud” tool created by Magnus Manske, we would welcome feedback. Do you like it? What can be improved?

Yes, there are other applications for a “concept cloud”… be creative and either implement them or let us hear about them.

Gerard Meijssen

OAuth now available on Wikimedia wikis

Over the past few months, engineers in the MediaWiki Core team at the Wikimedia Foundation have been developing Extension:OAuth. A preliminary version of the extension is now live on all Wikimedia wikis.

OAuth allows users to authorise third-party applications to take actions on their behalf without having to provide their Wikimedia account password to the application. OAuth also allows a user to revoke an application’s access at any time, adding an extra layer of security for users. By using OAuth, third-party applications can streamline their workflows by no longer needing to direct people to the wiki to perform actions such as creating accounts or unblocking users. For example, on the English Wikipedia, Snuggle, the Account Creation Tool, and the Unblock Request System have begun work on implementing OAuth so that users can use their tools more seamlessly.

This dialogue is presented to you when you are asked to authorise an application to access your account.

The list of actions that third parties can be authorised to do is extensive, and extra actions can be added in if there is demand for them. We hope that OAuth will empower third-party application developers to make even better applications to help Wikimedians do their work, and we look forward to seeing what applications are created.

If you need help or have any questions, feel free to visit Help:OAuth on MediaWiki.org. If your question is not answered on that page, please ask on the talk page and a member of the OAuth team will answer it for you. For technical details on the OAuth protocol, visit http://oauth.net.
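
To make the idea of acting on a user's behalf without their password more concrete, here is a minimal Python sketch of how an application might sign a request. It assumes the OAuth 1.0a flavour of the protocol with HMAC-SHA1 signatures as described in RFC 5849; the endpoint URL, keys, secrets, nonce and timestamp are all placeholders rather than real Wikimedia values, and an actual application would normally rely on an existing OAuth library instead.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    """Percent-encode per RFC 5849: only unreserved characters stay as-is."""
    return quote(str(value), safe="-._~")


def signature_base_string(method, url, params):
    """Build the string that gets signed: METHOD&URL&sorted-parameters."""
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(method, url, params, consumer_secret, token_secret=""):
    """Return the base64-encoded HMAC-SHA1 signature for a request."""
    # The key combines the application's secret and the per-user token secret;
    # the user's account password is never part of it.
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values for illustration only. A real nonce and timestamp
# must be freshly generated for every request.
params = {
    "oauth_consumer_key": "example-consumer-key",
    "oauth_token": "example-access-token",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1386000000",
    "oauth_nonce": "example-nonce",
    "oauth_version": "1.0",
}
signature = sign(
    "GET",
    "https://wiki.example.org/w/api.php",  # hypothetical endpoint
    params,
    consumer_secret="example-consumer-secret",
    token_secret="example-token-secret",
)
print(signature)
```

Because the signature depends on a token secret issued for one user and one application, revoking that token is enough to cut off the application's access.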

Dan Garry
Associate Product Manager for Platform, Wikimedia Foundation

Developing Distributedly, Part 2: Best Practices for Staying in Sync

Staying in sync on a globally distributed team spread across timezones takes a lot more than using the right tools!

In part 1, we discussed the various tools the distributed mobile web engineering team at the Wikimedia Foundation uses to stay synchronized. While the tools are critical to our success, it takes a lot more to ensure that we can successfully work together despite the geographic distances between us. Our development procedures and team norms are the glue that holds it all together.

As with the tools we discussed previously, the practices and norms I’ll discuss below are by no means unique to—or only useful for—distributed teams.

Rituals

When you can’t just walk across the office or poke your head over the cubicle wall to sync up with a teammate, regular, structured moments for real-time, intra-team communication become critical. The mobile web team is a scrum-inspired agile team. As such, we use regular stand-ups, planning meetings, showcases and retrospectives to have some real-time, focused conversation with one another. Because we hold these meetings at a regular cadence and consider them critical touch points for the entire team, we think of them as rituals rather than regular meetings.

The WMF Mobile Apps engineering team holding a stand-up meeting with remote participation.

The stand-ups in particular are excellent for synchronization. Unlike traditional Scrum, we do not hold stand-up meetings every single day; rather, we do ours on Monday, Wednesday and Friday. We use this time to let everyone know what we’ve been working on, make commitments about what we will be working on, alert the team if there’s anything blocking us from getting our work done and quickly triage any bugs that have been reported since we last met. While we can always look in Mingle (our project management tool, see part 1) to see who is working on what and when, these brief meetings make it easier to raise issues and communicate about where further collaboration between teammates may be valuable.

Often, conversations about blockers and problem areas start during the stand-up and continue between the interested parties after the meeting has concluded. The meeting is kept short, time-boxed at 15 minutes, so there is little overhead; the meeting stays focused and we communicate just enough to keep us all moving forward.

The other rituals provide a great way for us to stay in the loop, bond with one another and allow the team tremendous influence over the product and our process. While their primary purposes are not about day-to-day synchronization like the stand-ups, the other rituals are essential for reinforcing our self-organizing team. Particularly since we are distributed, these rituals are sacred, as they are the primary moments when we all know we can work together in real time.

Wikimedia Foundation is looking for a Vice President of Engineering

Developing and maintaining the code and infrastructure that enable the global Wikimedia volunteer community to contribute to Wikipedia and our other projects is at the heart of the Wikimedia Foundation’s work. For the past 2.5 years, I’ve led our combined Engineering and Product department. We’ve done lots of hiring and grown the department from roughly 35 to 100+ people during that time. We have many projects underway which we hope will dramatically improve the experience of our contributors and our readers, including VisualEditor, Flow (a new discussion system), and new reader and contribution features for mobile users.

About a year ago, I announced that we needed to start thinking about dividing the responsibilities of the VP of Engineering and VP of Product into two separate leadership roles (closely collaborating on a day-to-day basis), at which point I’d focus on the VP Product part of my current role. The Director-level roles referenced in that announcement now exist—we’ve since hired a Director of User Experience and a Director of Analytics. Now it’s time for us to search for a VP of Engineering to complete the change. We’re partnering in this search with Julie Locke from Vantage Partners, an executive search firm.

We’re looking for someone who shares some fundamental beliefs with us:

  • that working in partnership with a global community of open source developers, and in close dialog with our users, is the best way to achieve lasting and positive changes to our technology;
  • that teams do their best work when they’re inspired and empowered to do good work, not because they’re “managed” to do it;
  • that it’s the job of management to create the conditions for teams and individuals to succeed, by equipping them with resources, mentoring and supporting them in their adoption of effective processes for self-organization;
  • that highly iterative development (“release early, release often”) delivers the most value to our readers and our community;
  • that hiring for diversity—of geography, gender, culture, skills, etc.—leads to more successful and effective teams.

Ideally, you’ve put these beliefs into practice in the real world, in a context where you’ve delivered open source technology to users with short delivery/deployment cycles, where you’ve supported operation of a high traffic site reaching millions of users, and where you’ve held leadership responsibilities in service of multiple, diverse, interdependent teams. You’re passionate about open source, and above all, you’re excited by the Wikimedia vision: a world in which every single human being can freely share in the sum of all knowledge.

Wikimedia has great technical challenges ahead: continually modernizing our user experience, decoupling monolithic aspects of our architecture, and supporting the greatest innovations our community comes up with. We’re looking for a collaborative, brilliant and effective leader who can help us tackle these challenges. If that describes you, take a look at the full job description, and apply today.

Erik Moeller
Vice President of Engineering and Product Development, Wikimedia Foundation

Wikimedia engineering report, October 2013

Major news in October includes:

Note: We’re also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.
