Wikimedia blog

News from the Wikimedia Foundation and about the Wikimedia movement

VisualEditor

Inventing as we go: building a visual editor for MediaWiki

We at the Wikimedia Foundation, in conjunction with Wikia, are building the VisualEditor for Wikipedia and all other sites based on the MediaWiki software. This is taking us some time, and we often get asked what’s involved in this, just how hard it is, and why it’s taking us so long, so we thought we’d explain some of the intricacies in the most challenging software project the Wikimedia Foundation has ever worked on.

What are we trying to do?

We are creating software that will let users load, edit and save Wikipedia articles visually, bypassing the existing system that requires our users to learn “wikitext,” a complex markup code. Instead, the articles they’re editing will look the same as when they’re reading them, and the effects of any changes they make will be obvious before they press save, just like writing a document in a word processor.

Haven’t people done this before?

Yes and no. There are lots of visual text editors out there, and even a few open source ones that can edit Web pages quite well, but they are insufficient for our needs for a number of reasons.

One criterion that is hugely important to us is that our editor should work with lots of languages. This is not just a matter of supporting certain right-to-left languages, or a few based on ideograms, but being able to use any and all of the 290 languages we currently provide. We also want to be able to do so seamlessly in the same documents to support our multi-lingual communities. Some of the languages we are committed to working with have very little software support, and we are often one of the few sources of written material for them, or at least, one of the largest.

Another issue is that wikitext has grown over the past 12 years to have a large number of rich and complicated features that are not just “simplified ways of writing HTML.” Though we originally intended many of these to be used only occasionally, they have often evolved to be at the core of how MediaWiki pages are now written by many Wikimedia users and more widely.

These advanced features include content transclusion (pulling in a “live” copy of one page or part of it into another), templates (transclusion with parts defined at the second page rather than the source), and parser functions (pages that show different things depending on hundreds of potential options, like the day of the week or whether another page exists). Attempting to retrofit this into an existing editor would have been exceptionally difficult, and more work than starting from scratch.
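To give a feel for how these three features behave, here is a minimal Python sketch. The `{{:Page}}`, `{{Name|argument}}`, `{{{1}}}` and `{{#ifexist:}}` forms are real wikitext; everything else (the `PAGES` dictionary and the `expand` and `expand_call` helpers) is a hypothetical toy for illustration and is not how MediaWiki's parser is implemented.

```python
import re

# Hypothetical in-memory wiki: page title -> wikitext (illustration only).
PAGES = {
    "Motto": "Free knowledge for all.",
    "Template:Greeting": "Hello, {{{1}}}!",
}

# Matches an innermost {{...}} call, i.e. one with no braces inside it.
CALL = re.compile(r"\{\{([^{}]*)\}\}")


def expand_call(match, pages):
    """Expand one {{...}} call; leave it untouched if the page is missing."""
    parts = match.group(1).split("|")
    head = parts[0].strip()
    args = [part.strip() for part in parts[1:]]

    # Parser function: the output depends on whether another page exists.
    if head.startswith("#ifexist:"):
        title = head[len("#ifexist:"):].strip()
        then, otherwise = (args + ["", ""])[:2]
        return then if title in pages else otherwise

    # Plain transclusion ({{:Page}}) or a template call ({{Name|argument}}).
    title = head[1:] if head.startswith(":") else "Template:" + head
    body = pages.get(title)
    if body is None:
        return match.group(0)
    # Templates: fill positional parameters supplied by the calling page.
    for position, value in enumerate(args, start=1):
        body = body.replace("{{{%d}}}" % position, value)
    return body


def expand(text, pages=PAGES, max_passes=10):
    """Expand calls repeatedly until nothing changes (or a pass limit is hit)."""
    for _ in range(max_passes):
        expanded = CALL.sub(lambda m: expand_call(m, pages), text)
        if expanded == text:
            break
        text = expanded
    return text


print(expand("{{Greeting|World}} {{:Motto}} {{#ifexist: Missing | yes | no}}"))
```

Even in this toy, the text a reader sees depends on other pages and on which pages exist, which is why a visual editor cannot treat such constructs as ordinary formatted text.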

How VisualEditor and Parsoid work together (diagram of the module stack)

What is the parser?

Finally, we need to not just edit pages, but to read and save existing wiki pages in the old wikitext format, and continue to work with it in parallel to the new editor. We can’t throw away the huge amount of work our communities have done over the past 12 years, so we need to re-write the “translator.” This means making a two-way “parser” — a bit of software that converts wikitext into HTML and back again. Until now, we have only had a one-way parser; the second stage, converting back from what people want to write to wikitext, has had to be done by our editors in their heads.

This has required an entirely separate project, Parsoid, that can translate in both directions: from wikitext to Web pages and back again. This is not remotely an easy task; replacing a parser demands great care, as it’s hugely important that we don’t break anything. The old parser, and the wikitext “language” it interprets, just grew organically as people had ideas, and no one really designed it. There are nearly two billion versions of pages in Wikimedia’s wikis alone, and this lack of design means that there are a huge number of little rules we need Parsoid to follow to avoid “dirty diffs”: issues where the wikitext would be broken by people using the VisualEditor.

Explaining the HTML+RDFa content model used by Parsoid (diagram)

We use automated testing to “round-trip” 100,000 randomly chosen pages from the English Wikipedia (as a reasonable sample of wiki content in the Latin alphabet): taking the wikitext, converting it to HTML format and back to wikitext, and comparing the result. This helps us by identifying many issues to fix so that using Parsoid does not cause articles to break. Right now we get about 80 percent of these articles to round-trip without any differences at all (up from 65 percent in October), an additional 18 percent round-trip with only minor differences (whitespace, quote style etc.), and the remaining 2 percent of pages have differences that still need fixing (down from about 15 percent in October).
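The round-trip check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two converters here are deliberately tiny toys that handle bold text alone, standing in for the real wikitext-to-HTML and HTML-to-wikitext steps, and the "minor" category is approximated as whitespace-only differences. All function names are hypothetical.

```python
import re
from collections import Counter


def normalize(text):
    """Collapse whitespace so that whitespace-only changes compare equal."""
    return " ".join(text.split())


def classify_round_trip(wikitext, to_html, to_wikitext):
    """Convert wikitext to HTML and back, then grade the difference."""
    result = to_wikitext(to_html(wikitext))
    if result == wikitext:
        return "exact"
    if normalize(result) == normalize(wikitext):
        return "minor"
    return "different"


def toy_to_html(wikitext):
    """Toy converter (assumption): only '''bold''' is understood."""
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", wikitext)


def toy_to_wikitext(html):
    """Toy converter (assumption): always emits ''' and drops trailing space."""
    return re.sub(r"<b>(.+?)</b>", r"'''\1'''", html).rstrip()


def summarize(pages):
    """Count how many pages fall into each round-trip category."""
    return Counter(
        classify_round_trip(page, toy_to_html, toy_to_wikitext) for page in pages
    )


SAMPLE = [
    "'''Paris''' is a city.",     # survives unchanged
    "'''Paris''' is a city. \n",  # trailing whitespace is lost
    "<b>Paris</b> is a city.",    # author's <b> comes back as '''
]

print(summarize(SAMPLE))
```

The third sample shows how a dirty diff can arise: the page renders identically either way, but the toy serializer has forgotten which of two equivalent spellings the author used, so saving would change wikitext the editor never touched.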

Learn more

As you can see, making the visual editor that our users need is a large amount of work. No existing technology could do what we wanted, so we have had to work very hard to deliver this. Over the next few months, we look forward to showing off how editors will be able to use VisualEditor and Parsoid for a much better experience, free of learning any complicated codes, letting them focus on the content and not the tool they use to write it.

If you’re interested, we have a brief presentation about how Parsoid and VisualEditor work, and what they will look like on Wikipedia.

James Forrester, Product Manager, VisualEditor and Parsoid

Google Summer of Code students reach project milestones

This year, the MediaWiki community again participated in Google Summer of Code, in which we selected nine students to work on new features or specific improvements to the software. They were sponsored by Google and mentored by experienced developers, who helped them become part of the development community and guided their code development.

Congratulations to the eight students who have made it through the summer of 2012 (our seventh year participating in GSoC)! They all accomplished a great deal, and many of them are working to improve their projects to benefit the Wikimedia community even more.

Google Summer of Code 2012

Eight students passed MediaWiki’s GSoC program in 2012.

  • Ankur Anand worked on integrating Flickr upload and geolocation into UploadWizard. WMF engineer Ryan Kaldari mentored Ankur as they made it easier for Wikimedia contributors to contribute media files and metadata. Read his wrapup and anticipate the merge of his code into the main UploadWizard codebase.
  • Harry Burt worked on TranslateSvg (“Bringing the translation revolution to Wikimedia Commons”). When his work is complete and deployed, we will more easily be able to use a single picture or animation in different language wikis. See this image of the anatomy of a human kidney, for example; it has a description in eight languages, so it benefits multiple language Wikipedias (e.g., Spanish and Russian). Harry aims to allow contributors to localize the text embedded within vector files (SVGs), and you can watch a demo video, try out the test site, or just read Harry’s wrapup post. WMF engineer Max Semenik mentored this project.
  • Akshay Chugh worked on a convention/conference extension for MediaWiki. Wikimedia conferences like Wikimania often use MediaWiki to help organize their conferences, but it takes a lot of custom programming. Under the mentorship of volunteer developer Jure Kajzer, Akshay created the beta of an extension that a webmaster could install to provide conference-related features automatically. See his wrapup post.
  • Ashish Dubey worked on real-time collaboration in the upcoming VisualEditor (you may have seen “real-time collaborative editing” in tools like Etherpad and Google Docs). Ashish (with WMF engineer Trevor Parscal as mentor) has implemented a collaboration server and other features (see his wrapup post) and has achieved real-time “spectation,” in which readers can see an editor’s changes in real time. Wikimedia Foundation engineers plan to integrate Ashish’s work into VisualEditor around April to June 2013.

    Flow diagram for Ashish Dubey’s project: Ashish is working on the architecture that will support real-time collaboration.

  • Nischay Nahata optimized the performance of the Semantic MediaWiki extension. In wikis with unusually large amounts of content, Semantic MediaWiki experiences performance degradation. With the mentorship of head Semantic MediaWiki developer Markus Krötzsch (a volunteer) and Wikidata developer Jeroen De Dauw, Nischay found and fixed many of these issues.  This also reduces SMW’s energy consumption, making it greener. Nischay’s work will be in Semantic MediaWiki 1.8.0, which is currently in beta and due to be released soon. Wikimedia Labs uses Semantic MediaWiki and will benefit from the performance improvements.
  • Aaron Pramana worked on watchlist grouping and workflow improvements. Aaron wants to make it easier for wiki editors and readers to use watchlists, and to create and use groups of watched items to focus on or share. Aaron worked with volunteer developer Alex Emsenhuber. The back end of the system is done, but Aaron wants your input about the user interface. Folks on the English Wikipedia’s Village Pump have started discussing it.

    Proposal to redesign the MediaWiki watchlist: Arun Ganesh illustrated the watchlist redesign proposal with this mockup.

  • Robin Pepermans worked on Incubator improvements and language support, mentored by WMF engineer Niklas Laxström. If you’ve ever thought of using Wikimedia’s Incubator for new projects, it’s now easier to get started. Read Robin’s wrapup post for more.
  • Platonides worked on a desktop application for mass-uploading files to Wikimedia Commons. The application will eventually make it much easier for participants in upload campaigns like Wiki Loves Monuments to upload their photos (and it’ll work on Windows, Linux, and Mac OS). I mentored Platonides, who delivered a beta version.

As further progress happens, we’ll update our page about past GSoC students. Congratulations again to the students and their mentors. And thanks to volunteer Greg Varnum, who helped me administer this year’s GSoC, and to all the staffers and volunteers who helped students learn our ways.

Sumana Harihareswara, Engineering Community Manager

Help us shape Wikimedia’s prototype visual editor

Today, the Wikimedia Foundation launched a new prototype “visual editor” for Wikimedia. The visual editor is a new editing environment that won’t require everyone to learn our special markup language in order to contribute to our projects.

Right now, if you try to edit the English Wikipedia’s article about the Wikimedia Foundation, or the Latin Wiktionary’s entry for “futūrus” (about to be), you get a lot of confusing characters interspersed amongst the recognisable text. Though it’s possible to learn what these mean and use them powerfully, many of our editors, and especially new editors, want to contribute content, not learn technical formatting.

We identified the difficulty in learning wikitext as a key inhibitor to growing our editor community in the Wikimedia movement’s strategic plan. We want the process of learning how to edit to be trivial, so our volunteers, both new and experienced, can devote themselves to what they edit. That’s why we’re building the visual editor, so that contributing to a wiki is as easy and natural as other modern editing systems, and new editors are not dissuaded from making their changes.

You may remember a similar announcement in December 2011, when we revealed a developer prototype of our “visual editor,” but after a great deal of feedback, we’ve reworked it so that it’s more useful to our community of users.

We learned a lot from building our first prototype. It was great how many of you helped with feedback, bug reports and comments about how we were doing. In the months since then, based on your feedback and technical issues we encountered, we’ve overhauled the entire editor. We changed the technical design and how it works, rewriting its components so that we can better support more editors. We’ve also integrated it into the MediaWiki platform, so now it can load and edit wiki articles, and not just sit separately.

A screenshot of the new visual editor

To build this iteration of our open-source visual editor, we have been working with some of the team from Wikia, a collaborative publisher that operates the largest network of video game, entertainment and lifestyle wikis in the world. We both believe that this kind of tool should be built not just for the Wikimedia wiki projects, but for everyone using MediaWiki software, and when it’s done we look forward to including the visual editor “out of the box” for anyone setting up a wiki with our software.

Thanks to all this, our new prototype is now live on mediawiki.org. This is just a demonstration, and very far from a finished product — for example, we haven’t yet added image or table handling. It’s currently locked down to only work on a self-contained area of the wiki, so that it doesn’t encounter any unsupported content or break anything else. We intend to work on small pieces of the overall story, releasing a new version every two weeks or so, and adding features one-by-one until the editor is good enough to deploy for everyone (and release in MediaWiki’s core).

Over the next few weeks and months, we will be working with the community — you — to find bugs, to focus on what our priorities should be, and most importantly, to make sure that what we’re building is right for you and that it supports your “workflow.”

So please, try out the prototype, see our frequently-asked questions and tell us what you think.

– The Visual Editor Team: Trevor Parscal, Inez Korczyński, James Forrester, Roan Kattouw, Rob Moen, Subramanya Sastry, Brion Vibber, Gabriel Wicke, Christian Williams.

Help test the first visual editor developer prototype

The development of a Visual Editor is one of the Foundation’s top priorities for the upcoming year, as laid out by the 2011-2012 Annual Plan.  There is plenty of evidence that wiki-markup is a substantial barrier that prevents many people from contributing to Wikipedia and our other projects.  Formal user tests, direct feedback from new editors, and anecdotal evidence collected over the past several years have made the need for a visual editor clear.

Developing a web-based visual editor is an extremely complex task.  It is perhaps the most challenging technical project ever undertaken in the history of MediaWiki development.  Here are some of the characteristics that make this project unique:

  • We have to support editing in both the new way (via the Visual Editor) and the traditional way (via wiki markup).  This is important since it’s what our communities have used for more than 10 years: We can’t completely change the way they do their work overnight.  We need to, however, simultaneously support potential editors who are not comfortable with wiki markup.  So any editing system will need to be able to go back and forth between the Visual Editor and wiki markup with minimal, if any, disruption to the end user.  We will have to perform back-and-forth transformations without breaking things.  Anyone who has used an editor that has both “visual” and “html” modes should have a feeling for the challenges, but it’s even harder with wiki markup, because:
  • Wiki markup is enormously expressive, complex and complicated, and there’s a huge amount of content which uses every facet of this markup language. Wikipedia articles employ a rich set of layout features, including images, tables, citations, mathematical formulas, “infoboxes” and other dynamically loaded templates which preserve a consistent look and feel for certain information, and many other elements that enable a compelling and educational reader experience (see the article on Calculus as an example).  Supporting compatibility with the full breadth of these features is an enormous technical challenge.

Over the past several months, the engineering team at WMF has made a lot of progress in developing this visual editor. Today, we’d like to share the first prototype of a basic editing surface which supports the translation of what’s on the screen into wiki markup. The demo, which can’t yet save or edit documents, supports both basic formatting (e.g., bold, italics, section headings) and many of the required features that people take for granted (e.g., cut/paste and undo/redo). However, it’s still very fragile, and you may easily end up with an unusable document. In the best-case scenario, you can use it to generate valid wiki markup that you can copy and paste into an edit box on any MediaWiki wiki.
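As a rough illustration of what “translating what’s on the screen into wiki markup” involves for the basic formatting mentioned above, here is a small Python sketch. The wikitext it emits (`'''bold'''`, `''italic''`, `== Heading ==`) is standard syntax, but the `Run` and `Block` classes and the function names are hypothetical and are not the prototype’s actual data model.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """A stretch of text with uniform formatting (hypothetical model)."""
    text: str
    bold: bool = False
    italic: bool = False


@dataclass
class Block:
    """A paragraph (heading_level 0) or a section heading (level 2 and up)."""
    runs: list
    heading_level: int = 0


def run_to_wikitext(run):
    # Bold is three apostrophes, italic is two, bold italic is five.
    quotes = "'" * (3 * run.bold + 2 * run.italic)
    return f"{quotes}{run.text}{quotes}"


def block_to_wikitext(block):
    body = "".join(run_to_wikitext(run) for run in block.runs)
    if block.heading_level:
        marker = "=" * block.heading_level
        return f"{marker} {body} {marker}"
    return body


def document_to_wikitext(blocks):
    # Blank lines separate paragraphs and headings in wikitext.
    return "\n\n".join(block_to_wikitext(block) for block in blocks)


document = [
    Block([Run("History")], heading_level=2),
    Block([Run("The "), Run("wiki", bold=True), Run(" was "),
           Run("new", italic=True), Run(".")]),
]

print(document_to_wikitext(document))
```

Going in this direction is the easy half; the harder problem is doing the same for templates, tables and citations while keeping the result faithful to wikitext that already exists.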

This version of the Visual Editor should support most modern browsers, but was tested mostly on Firefox, Chrome and IE9. We support IE8 as well, but not IE7 (yet). The editor isn’t internationalized yet, but will be with the next release.

Try the visual editor sandbox

You can view the demo and see the wikitext translation by visiting the visual editor sandbox on mediawiki.org and playing around with any of the articles available for pre-loading.

Manipulation of an example document, showing the link editor.

Using the debugging tools in the top right, you can switch to side-by-side view of different content representations, including wikitext (icon with square brackets), which are dynamically updated as the text on the left changes.

We would love to get your input on our progress. Please leave us comments by clicking “Leave Feedback” in the upper right-hand corner of the demo, which will place your feedback on this page. Thoughts on which tasks this interface makes easier or harder compared to your current workflow would be particularly helpful. We’re very excited to share this progress and look forward to your feedback.

Where do we go from here? From here on, we will iteratively release features, bug-fixes, and updates. We’ll continue to make this tool useful for more real-world use cases, and tick off additional features: creating pages, saving them, editing existing pages or sections, adding/removing images, editing data in templates, editing tables... the list goes on.

Our goal is to enable real-world editing of a subset of content soon, but it’ll still be some time until we can serve all the needs of even a small wiki community, let alone Wikipedia’s. Currently we’re targeting June 2012 for first production use at scale, either on a smaller wiki or a section of a larger one. It’s the biggest and most important change to our user experience we’ve ever undertaken, and we look forward to your help in making it happen.

– The Visual Editor Team, Wikimedia Foundation
Trevor Parscal, Inez Korczyński, Neil Kandalgaonkar, Roan Kattouw, Brion Vibber, Gabriel Wicke