
I am fairly new to JavaScript and am trying to develop scripts for a Java-based application that uses JavaScript as its interface for processing and modifying XML project information inline. There is no browser involved.

I am using Rhino in a shell to mimic the application environment so that I can build and test the scripts needed to parse and modify the XML.

The goal is to be able to read in a template project XML that has a lot of optional processing parameters in it and remove entire sections of XML if that processing function is not needed. Additionally, I need to modify specific values in the XML, which I am able to do, as shown below.

Here is a stripped down XML project file (sample_proj.xml):

<?xml version="1.0" encoding="UTF-8" standalone="no"?>  
<PROFILE lastSavedByAppVersion="" type="project" version="1">  
 <OPTIONS processingmode="concurrent"/>
 <ENCODESESSION name="My_session">  
  <OPTIONS framesizemode="custom"/>  
  <PLUGINGROUP>  
   <PLUGIN duration="0" endOffset="0" name="Gamma.plugin" repeats="1" startOffset="0">  
    <PARAMGROUP event_id="0" keyframe="0">  
     <PARAM>  
      <NAME>Cb</NAME>  
      <VALUE>1.0</VALUE>  
     </PARAM>  
     <PARAM>  
      <NAME>Cr</NAME>  
      <VALUE>1.0</VALUE>  
     </PARAM>  
     <PARAM>  
      <NAME>Y</NAME>  
      <VALUE>1.0</VALUE>  
     </PARAM>  
    </PARAMGROUP>  
   </PLUGIN>  
   <PLUGIN duration="300" endOffset="0" name="Overlay.plugin" repeats="1" startOffset="0">  
    <PARAMGROUP event_id="0" keyframe="0">  
     <PARAM>  
      <NAME>Filename</NAME>  
      <VALUE></VALUE>  
     </PARAM>  
    </PARAMGROUP>  
   </PLUGIN>  
  </PLUGINGROUP>  
 </ENCODESESSION>  
 <EVENTTIMELINE dropframe="1" fps="24">  
  <EVENT id="0">  
   <FRAME>0</FRAME>  
   <DURATION>0</DURATION>  
  </EVENT>  
 </EVENTTIMELINE>  
  <SOURCE batchtype="cliplist" type="filesource">  
  <MEDIA name="File" type="video">  
   <FILENAME/>  
  </MEDIA>  
  <MEDIA name="File" type="audio">  
   <FILENAME/>  
  </MEDIA>  
  <clipListModel audioChannelMask="-1" audioFormat="AUTO" singleOutput="false" videoFormat="AUTO">  
   <clipList/>  
  </clipListModel> 
  <TIMECODECONFIGURATION>  
   <MODE>none</MODE>  
  </TIMECODECONFIGURATION>  
 </SOURCE>   
</PROFILE>  

I can use the following JavaScript code in a Rhino shell to read the file and then try to modify it:

importPackage(java.io)

var project = readFile("sample_proj.xml");

project = project.replace(/Gamma/g, "GammaRGB");
project = project.replace(/\s*&lt;PLUGIN\s+.*Overlay.*[\s\S]*?\/PLUGIN&gt;/img, "");
print(project);

The first project.replace works as expected and will replace "Gamma.plugin" with "GammaRGB.plugin".

The second regex, however, does not do anything, although the same regex in external JavaScript regex evaluators matches and removes the entire second <PLUGIN> section (Overlay.plugin). I am used to building Perl regular expressions, so the regex here is based on what I have been able to learn about JavaScript and multi-line matching.

I was hoping that I could parse and remove XML sections in pure JavaScript without having to load a separate XML parser. I always know the XML that will be passed in, so straight text-based parsing of the XML is preferred.

Thanks for any help,

Bill

the don't use regexes speech in 3...2... –  Mark Nov 9 '10 at 2:07

2 Answers

Using Rhino you can call out to Java code. (You probably already know this, as your code is clearly using the java.io package to read text from a file.)

May I suggest using (from JavaScript) a Java-based DOM parser (such as the one available in javax.xml.parsers) to manipulate the XML, rather than using regex? Doing advanced XML/HTML manipulation with regex is hard to do correctly, especially if your software will need to accept new, unknown inputs later on down the line.

Here's some Java code that might get you started on some equivalent JavaScript:

import javax.xml.parsers.*;

// Build a DOM parser and load the XML file into an in-memory document tree.
java.io.File file = new java.io.File("c:\\sample.xml");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
org.w3c.dom.Document doc = db.parse(file);

(Also see Parsing HTML The Cthulhu Way.)

Thank you. I will take a look at this method as well. As an aside, I have been able to parse the way I intended using escaped XML, but I am having some weird issues with data getting cut off when unescaping and writing out to a new file. –  billbaggy Nov 10 '10 at 3:55
    
Since Rhino is E4X-enabled, is it possible instead to read the XML file and convert it into a JavaScript XML object that can then be accessed directly using: –  billbaggy Nov 10 '10 at 21:08
    
var x = new XML(xmlfromfile) –  billbaggy Nov 10 '10 at 21:09
    
I can use the new XML methods in Rhino with XML declared within the JavaScript. Or is there possibly another method for reading in XML in Rhino that would keep the data as a native JavaScript string? I tried the String() conversion as well, but that didn't seem to work. The error I keep getting on the file-based XML is "js: uncaught JavaScript runtime exception: TypeError: The processing instruction target matching "[xX][mM][lL]" is not allowed." –  billbaggy Nov 10 '10 at 21:13
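On the E4X error in the comment above: as far as I know, E4X's XML constructor does not accept the <?xml ...?> declaration at the top of a document, which is what that TypeError is complaining about. A common workaround is to strip the declaration from the string before handing it to new XML(). The helper below is only a sketch; the name stripXmlDeclaration is made up for illustration, and it assumes the declaration is the first thing in the file.

```javascript
// Hypothetical helper: remove a leading XML declaration (and an optional
// byte order mark) so the remaining text starts at the root element.
// Assumes the declaration itself contains no "?" before its closing "?>".
function stripXmlDeclaration(xmlText) {
    return String(xmlText).replace(/^\uFEFF?\s*<\?xml\s[^?]*\?>\s*/, "");
}

// In a Rhino shell with E4X enabled, the cleaned text could then be passed
// to the XML constructor, for example:
//   var x = new XML(stripXmlDeclaration(readFile("sample_proj.xml")));
```

The String() call is there because values coming back from Java methods in Rhino can be java.lang.String objects rather than native JavaScript strings.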

The second regex might not be working because you are using &lt; instead of < and &gt; instead of >. Is the XML being escaped before being processed by the regex?

Also, [\s\S] means "match whitespace or non-whitespace", which is almost the same as . (dot). The difference is that . does not match line breaks in JavaScript, so [\s\S] is only needed if the match has to span multiple lines.
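To make the point about &lt; and &gt; concrete, here is a small sketch in plain JavaScript. The first function is the regex from the question with literal < and > in place of the entities; the second is a slightly stricter variant that matches on the name attribute. The function names and the shortened sample string are made up for illustration, and both versions assume that PLUGIN elements are never nested and that no attribute value contains ">".

```javascript
// The regex from the question, with literal angle brackets.
function removeOverlayLoose(xml) {
    return xml.replace(/\s*<PLUGIN\s+.*Overlay.*[\s\S]*?\/PLUGIN>/img, "");
}

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Remove every <PLUGIN ...>...</PLUGIN> element whose name attribute equals
// pluginName, together with its indentation and trailing line break.
function removePlugin(xml, pluginName) {
    var pattern = new RegExp(
        "[ \\t]*<PLUGIN\\b[^>]*\\bname=\"" + escapeRegExp(pluginName) +
        "\"[^>]*>[\\s\\S]*?</PLUGIN>[ \\t]*\\r?\\n?",
        "g"
    );
    return xml.replace(pattern, "");
}

// Shortened stand-in for sample_proj.xml.
var sample = [
    '<PLUGINGROUP>',
    ' <PLUGIN duration="0" name="Gamma.plugin">',
    '  <PARAM><NAME>Y</NAME><VALUE>1.0</VALUE></PARAM>',
    ' </PLUGIN>',
    ' <PLUGIN duration="300" name="Overlay.plugin">',
    '  <PARAM><NAME>Filename</NAME><VALUE></VALUE></PARAM>',
    ' </PLUGIN>',
    '</PLUGINGROUP>'
].join("\n");

var withoutOverlay = removePlugin(sample, "Overlay.plugin");
```

In the Rhino shell, print(withoutOverlay) would show the result. If the XML really has been HTML-escaped before it reaches the script, the entity version of the pattern is the one that applies instead.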

I was originally using <> but it did not work in some external regex evaluators, so I switched to &lt; and &gt; as they seemed to like that better. I was unaware of escape() until you mentioned it. I will give that a try. –  billbaggy Nov 9 '10 at 4:53
    
Yes, [\s\S] was to match line breaks as well as all characters. –  billbaggy Nov 9 '10 at 4:54
