We allow people to enter HTML on our wiki-like site, but only a limited subset, so that it cannot affect our styling or introduce malicious JavaScript. Is there a good server-side Java library to ensure that the entered code is valid?

We tried creating an XML Schema document to validate against. The only issue is that the libraries we used for validation gave back cryptic error messages. What I want is for the validation library to actually fix the problem (for example, if a style="" attribute was added to an element, remove it). If fixing it is not easy, the library should at least let me report a message to the user with the location of the error (an error code that I can map to a nice message is fine, probably even preferable).

  • HTML is tough. If you can, try to change the system to accept some wiki-like syntax instead (like here on SO).
    – Thilo
    Commented Dec 24, 2010 at 2:11
  • What system do you use? Is it Java, PHP, only XML, etc.?
    – Istao
    Commented Dec 24, 2010 at 8:39
  • I use Java, as the title and question indicate.
    – at.
    Commented Dec 24, 2010 at 21:53

1 Answer

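Try JSoup. I think this is what you're looking for. It can parse user-supplied HTML and clean it against a list of allowed tags and attributes, removing anything that is not on the list.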
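As a rough sketch of how that might look, assuming a recent jsoup release on the classpath (where the allow-list class is called `Safelist`; older releases named it `Whitelist`): the class name `WikiHtmlSanitizer` and the choice of `Safelist.basic()` as the policy are illustrative assumptions, not something from the question. `Jsoup.clean` drops disallowed elements and attributes such as style="", and `Jsoup.isValid` tells you whether the input already conforms. Note that neither call reports the location of the offending markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

// Hypothetical helper class; the name and the chosen policy are assumptions.
public class WikiHtmlSanitizer {

    // Safelist.basic() permits simple text formatting tags (b, i, p, a, ul, ...)
    // and does not permit style attributes or script elements.
    // A real site would likely tune this list to its own allowed subset.
    private static final Safelist ALLOWED = Safelist.basic();

    // Returns true only if the input uses nothing outside the allowed subset.
    public static boolean isAcceptable(String html) {
        return Jsoup.isValid(html, ALLOWED);
    }

    // Returns a copy of the input with disallowed tags and attributes removed.
    public static String sanitize(String html) {
        return Jsoup.clean(html, ALLOWED);
    }

    public static void main(String[] args) {
        String input =
            "<p style=\"color:red\">Hello <b>world</b><script>alert(1)</script></p>";
        System.out.println("Acceptable as entered: " + isAcceptable(input));
        System.out.println("Sanitized: " + sanitize(input));
    }
}
```

If you would rather reject bad input than silently fix it, you could call `isAcceptable` first and show the user the sanitized version as a suggested correction.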
