Hi, I have written generic Java code that parses an XML input file without knowing its structure and outputs the values as comma-separated values (CSV). Say I have the following in my XML document:

<Employee>
    <Name>XYZ</Name>
    <Id>123</Id> 
    <Address>
         <Office_Address>office address here</Office_Address>
    </Address>
</Employee>

My Java code parses the above XML file into CSV files as:

Employee (File 1):  Name, Id
Address (File 2):  Office_Address

That is, for each nested element it outputs a new CSV file whose columns are that element's child nodes.

This works fine, but now the problem is: suppose the same XML file comes as:

<Employee>
    <Name>XYZ</Name>
    <Id>123</Id> 
    <Address/>
</Employee>

In this case, when my generic Java code processes this file, it outputs:

Employee (File 1) : Name, Id, Address

So instead of two output files I get one, and File 1 sometimes has 3 columns instead of 2. This happens because the Address element is sometimes present as nested and sometimes as flat. When it is nested, the Java code creates a new CSV file for it, but when it is not nested, it outputs just one file.

I can solve this problem by hard-coding the logic for this element, but I do not want to do that, as then there would be no point in having generic XML parsing code.

So my question is: is there any way to figure out that an element in XML files generated from the same source may sometimes be present as nested and sometimes as flat? Using an XSD, or any other way? I have researched this a lot but have not been able to figure anything out.

Thanks in advance; I am hoping to get a solution or a few good suggestions.

    
@AndrewThompson: It's just a dummy example I made to explain the problem I am facing. I did not think about this. Thanks for pointing it out, but let me know if you have any ideas to fix the original problem. –  user1188611 May 6 '13 at 16:32
    
You mention "XSD"; do you have an XSD for the XML? If so, then yes, you can solve the problem. If not, you will have a tough time solving this in the general sense. –  jtahlborn May 6 '13 at 16:56
    
Can you tell me how I can solve this problem if I had the XSD for the XML file? If you are only suggesting that I read the complete XML file once and somehow get access to its structure in my code, then my generic parsing code won't be generic: each time I process a new XML file I would need to change the code, which would leave it no longer generic. –  user1188611 May 6 '13 at 18:09
    
I did explain my comment in my answer below. –  jtahlborn May 6 '13 at 18:56
1  
XSD is a well-documented specification... –  jtahlborn May 6 '13 at 19:29

2 Answers

This happens because the Address element is sometimes present as nested and sometimes as flat.

That statement is not correct. Address is still nested under the Employee element. In the second case, it is just empty. If you can test for an "empty" element (an Address element with no children) in your generic code, then this issue can be solved.
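One way such an "empty" test might look with the JDK's built-in DOM API is sketched below. The class and method names (EmptyElementCheck, hasChildElements, isEmpty, parse) are made up for illustration and are not from the asker's code. Note that the check only says the element is empty in this document; on its own it cannot say whether the element would hold children or text when it is not empty.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, for illustration only.
public class EmptyElementCheck {

    /** Returns true if the element has at least one child element. */
    static boolean hasChildElements(Element element) {
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if the element has no child elements and no non-whitespace text. */
    static boolean isEmpty(Element element) {
        return !hasChildElements(element) && element.getTextContent().trim().isEmpty();
    }

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<Employee><Name>XYZ</Name><Id>123</Id><Address/></Employee>");
        Element address = (Element) doc.getElementsByTagName("Address").item(0);
        System.out.println(isEmpty(address)); // prints "true"
    }
}
```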

    
In the XML file there are various other elements that are empty most of the time or otherwise have a plain text value (unlike the Address element, which has its own child element when not empty). I agree it is still nested under Employee, but when Address is empty it has no children of its own, so in that context it is not nested. –  user1188611 May 6 '13 at 16:25
    
Also, if I test for an empty element (one that might have children versus one that might have a text value when not empty), how will I figure out whether this empty element should go in a new file or in the same file as its parent? Let me know if you understand what I am saying. –  user1188611 May 6 '13 at 16:29
1  
@user1188611 Post your code with an example. A JUnit test is even better. –  Pangea May 6 '13 at 16:52

If you have an XSD, then you could parse the XSD file and determine which elements support nested elements.
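As a rough sketch of the XSD idea, the schema can be read as ordinary XML and scanned for element declarations that contain further element declarations. The schema below is invented to match the Employee example, and the class name XsdContainerSketch is hypothetical. The sketch assumes every type is declared inline; a real schema may use named types, ref attributes, or imports, which this does not follow.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: handles inline complex types only.
public class XsdContainerSketch {
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    /** Names of declared elements that have xs:element declarations nested inside them. */
    static Set<String> containerElements(String xsd) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getElementsByTagNameNS
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));
        Set<String> containers = new TreeSet<>();
        NodeList declarations = doc.getElementsByTagNameNS(XS, "element");
        for (int i = 0; i < declarations.getLength(); i++) {
            Element declaration = (Element) declarations.item(i);
            // A nested xs:element means this element may hold child elements.
            boolean hasNested = declaration.getElementsByTagNameNS(XS, "element").getLength() > 0;
            if (hasNested && declaration.hasAttribute("name")) {
                containers.add(declaration.getAttribute("name"));
            }
        }
        return containers;
    }

    public static void main(String[] args) throws Exception {
        // Made-up schema for the Employee example; not supplied by the asker.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                + "<xs:element name='Employee'><xs:complexType><xs:sequence>"
                + "<xs:element name='Name' type='xs:string'/>"
                + "<xs:element name='Id' type='xs:int'/>"
                + "<xs:element name='Address'><xs:complexType><xs:sequence>"
                + "<xs:element name='Office_Address' type='xs:string' minOccurs='0'/>"
                + "</xs:sequence></xs:complexType></xs:element>"
                + "</xs:sequence></xs:complexType></xs:element>"
                + "</xs:schema>";
        System.out.println(containerElements(xsd)); // prints "[Address, Employee]"
    }
}
```

With such a set in hand, the generic code could send an empty Address to its own file because the schema says it can hold children, without any element-specific logic.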

If you don't have an XSD, then you basically have to parse the entire XML file once to determine all the possible nesting (i.e. you're basically inspecting the XML file to build your own XSD), then parse it again to actually output the final result based on the knowledge you gained from the first pass.
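The two passes might be sketched as follows, using DOM for brevity. The class and method names (TwoPassSketch, collectContainers, columnsOf, parseRoot) are hypothetical, and the second Employee record is made up so that both forms of Address appear in one input. The sketch keys on the tag name alone, which assumes the same name is never a container in one place and a plain value in another, and it only helps if the nested form shows up somewhere in the data scanned by the first pass.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch of the two-pass idea; element names are matched by tag name only.
public class TwoPassSketch {

    /** Pass 1: record the name of every element that has a child element somewhere. */
    static void collectContainers(Element element, Set<String> containers) {
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                containers.add(element.getTagName());
                collectContainers((Element) n, containers);
            }
        }
    }

    /** Pass 2: list the columns of one element, skipping children known to be containers. */
    static List<String> columnsOf(Element element, Set<String> containers) {
        List<String> columns = new ArrayList<>();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            String name = ((Element) n).getTagName();
            if (!containers.contains(name)) {
                columns.add(name); // plain value: stays in the parent's file
            }
        }
        return columns;
    }

    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees>"
                + "<Employee><Name>XYZ</Name><Id>123</Id>"
                + "<Address><Office_Address>office address here</Office_Address></Address></Employee>"
                + "<Employee><Name>ABC</Name><Id>456</Id><Address/></Employee>"
                + "</Employees>";
        Element root = parseRoot(xml);
        Set<String> containers = new TreeSet<>();
        collectContainers(root, containers);
        System.out.println(containers); // prints "[Address, Employee, Employees]"

        Element second = (Element) root.getElementsByTagName("Employee").item(1);
        System.out.println(columnsOf(second, containers)); // prints "[Name, Id]"
    }
}
```

Here the empty Address in the second record is kept out of the Employee columns because the first pass saw it with a child elsewhere.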
