
Background: this happens in an article editor powered by TinyMCE, for an enterprise in-house CMS behind several large media sites.

HTML

<p>non-breaking-space: &nbsp; pound: &pound; copyright: &copy;</p>

JS

console.log($('p').html());
console.log(document.getElementsByTagName('p').item(0).innerHTML);

both return

non-breaking-space: &nbsp; pound: £ copyright: ©

when I'm expecting

non-breaking-space: &nbsp; pound: &pound; copyright: &copy;

Some entities get decoded back to raw characters (like pound and copyright), while others are preserved (non-breaking space). I need a way to get the original inner HTML with all entities preserved, not a version that has been processed by the browser. Is that possible?

This is for a TinyMCE plugin which processes input using jQuery and puts it back. The content is loaded from a database; the plugin only processes image tags and should not modify the text content at all. The automatic change of some entities back to the raw characters wouldn't be too much of a problem, but:

  • We cannot modify editorial's input, even if it were minor
  • We enforce that these must be entities before they save due to some browser compatibility issues on our sites
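Given the second constraint, one fallback is to re-encode whatever the browser decoded before saving. The following is a minimal sketch only: `reencodeEntities` and `NAMED_ENTITIES` are hypothetical names, and the map is an illustrative subset rather than a full entity table. It does not recover the original markup; a raw £ typed by an editor would also become an entity, which may or may not be acceptable under the first constraint.

```javascript
// Hypothetical helper: re-encode characters that innerHTML decoded.
// NAMED_ENTITIES is an illustrative subset, not a complete entity table.
const NAMED_ENTITIES = {
  '\u00A0': '&nbsp;',
  '\u00A3': '&pound;',
  '\u00A9': '&copy;',
};

function reencodeEntities(html) {
  // Match every non-ASCII code point; the `u` flag keeps surrogate pairs together.
  return html.replace(/[\u0080-\u{10FFFF}]/gu, (ch) =>
    // Prefer a named entity when one is listed, else fall back to a numeric one.
    NAMED_ENTITIES[ch] || '&#' + ch.codePointAt(0) + ';'
  );
}

// innerHTML-style output from the question: pound and copyright arrive as raw characters.
const serialized = 'non-breaking-space: &nbsp; pound: \u00A3 copyright: \u00A9';
const restored = reencodeEntities(serialized);
console.log(restored);
```

ASCII text, including entities that were already preserved such as `&nbsp;`, passes through untouched because the pattern only matches non-ASCII characters.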

I would use this answer (http://stackoverflow.com/a/4404544/830171), but I cannot, as my HTML code is within a textarea that the user needs to edit and that I need to run jQuery DOM manipulation on (via the plugin).

One way I can think of is to not use jQuery/DOM to process the image tags I need to change, but to use regex as a lot of TinyMCE plugins do. However, since I was shot down in "regex to pull all attributes out of all meta tags" for attempting any regex on HTML, I was hoping for a better way!
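For reference, a minimal sketch of that regex route, under the simplifying assumption that no attribute value contains a literal `>`. The names `mapImageTags` and `addClass` and the class `cms-image` are made up for illustration. Only the matched image tags are rewritten; everything else in the string, entities included, is never parsed and so stays byte-for-byte the same.

```javascript
// Hypothetical helper: rewrite each <img ...> tag in a raw HTML string,
// leaving every character outside the matched tags exactly as it was.
function mapImageTags(html, transform) {
  // Simplifying assumption: no attribute value contains a literal ">".
  return html.replace(/<img\b[^>]*>/gi, (tag) => transform(tag));
}

// Illustrative transform: add a class attribute (the class name is made up).
function addClass(tag) {
  return tag.replace(/^<img\b/i, '<img class="cms-image"');
}

const content = '<p>pound: &pound; <img src="a.png"> copyright: &copy;</p>';
const processed = mapImageTags(content, addClass);
console.log(processed);
```

This remains fragile in the usual ways regex on HTML is fragile (comments, unusual quoting), so it is probably only suitable when the image markup is generated in a predictable shape.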

A console.dir of an element with such text doesn't show any properties with the entities preserved. Even the debugger (in Chrome) shows all elements' HTML without entities preserved, so I guess you're out of luck. –  pimvdb Jan 16 '13 at 19:18

1 Answer

TinyMCE uses a contenteditable iframe to edit your content. That's the reason why `console.log($('p').html());` will get something else. Use this to get the pure editor content:

tinymce.get('your_editor_id').getBody().innerHTML
I wouldn't focus too much on the TinyMCE part of the question, but rather on how, in general, to get back the original HTML. Here is the same problem specific to the TinyMCE plugin: ed.onPostProcess.add(function(ed, o) { console.log(o.content); /* outputs &pound; */ console.log($('<tiny-mce-temp>' + o.content + '</tiny-mce-temp>').html()); /* outputs £ */ }); –  Christian Jan 16 '13 at 17:19
