
Suppose I have many large XML files to parse data from, each ranging in size from 10 MB to well over 300 MB.

I only need a small subset of the data, from a few specific elements.

For example, suppose the XML I want to parse has the following structure:

<Doc>
  <Big_Element1>
      ... LOTS of sub-elements ...
  </Big_Element1>
    .....
  <Small_Element1>
    <Sub_Element1_1 />
      ...
    <Sub_Element1_N />
  </Small_Element1>

   .....

  <Small_Element2>
    <Sub_Element2_1 />
      ...
    <Sub_Element2_N />
  </Small_Element2>

   .....
  <Big_ElementN>
      .......
  </Big_ElementN>
</Doc>

All I really need is the data from the Small_Elements. The Big_Elements are very large (each with many small sub-elements), so I'd like to avoid entering them at all if I don't have to.

I initially asked this question on Stack Overflow and think I understood the answer correctly, but I want to make sure I wrote this code in the best, most efficient way possible.

My solution came out to look as follows:

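Imports System.Xml   ' at the top of the file

Dim doc As XmlDocument
Dim xNd As XmlNode

' Drop whitespace and comment nodes so that the loop below only ever sees
' element boundaries; without this, indented XML reaches the Else branch.
Dim settings As New XmlReaderSettings With {.IgnoreWhitespace = True, .IgnoreComments = True}

Using reader As XmlReader = XmlReader.Create(uri, settings)
    ' Move to the root element (<Doc>), then on to its first child.
    reader.MoveToContent()
    reader.Read()

    Do While Not reader.EOF
        If reader.NodeType = XmlNodeType.Element Then

            ' ReadNode and Skip both leave the reader on the node that follows
            ' the current element, so no extra Read() is needed in this branch.
            Select Case UCase(reader.Name)
                Case "SMALL_ELEMENT1"
                    doc = New XmlDocument
                    xNd = doc.ReadNode(reader)
                    GetSmallElement1Data(xNd)

                Case "SMALL_ELEMENT2"
                    doc = New XmlDocument
                    xNd = doc.ReadNode(reader)
                    GetSmallElement2Data(xNd)

                Case Else
                    ' Jump past the whole element without materializing its children.
                    reader.Skip()
            End Select

        ElseIf reader.NodeType = XmlNodeType.EndElement Then
            ' Closing tag of the root element: nothing left to process.
            Exit Do

        Else
            ' We should never get here:
            Throw New NotSupportedException("XML Structure is not as was expected")
        End If
    Loop
End Using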

GetSmallElement1Data(xNd) and GetSmallElement2Data(xNd) are easy enough for me to deal with: the nodes they receive are small, so I use XPath within them to get the data I need.
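To illustrate the kind of XPath lookup meant here, the following is a minimal sketch rather than my actual helper. The function name GetChildElementNames and the inline sample XML are made up for illustration; the real GetSmallElement1Data would select whatever values are actually needed.

```vbnet
Imports System
Imports System.Xml

Module SmallElementSketch
    ' Hypothetical helper (not from the original code): collects the names of
    ' the direct child elements of a small node such as Small_Element1.
    Function GetChildElementNames(xNd As XmlNode) As String()
        ' "*" is a relative XPath that matches every child element of xNd.
        Dim children As XmlNodeList = xNd.SelectNodes("*")
        Dim names(children.Count - 1) As String
        For i As Integer = 0 To children.Count - 1
            names(i) = children(i).Name
        Next
        Return names
    End Function

    Sub Main()
        ' Sample data assumed for this sketch only.
        Dim doc As New XmlDocument()
        doc.LoadXml("<Small_Element1><Sub_Element1_1 /><Sub_Element1_2 /></Small_Element1>")

        For Each childName As String In GetChildElementNames(doc.DocumentElement)
            Console.WriteLine(childName)
        Next
    End Sub
End Module
```

The XPath here is relative to the node passed in, so the helper does not depend on where that node sat in the original large file.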

My concern is that I'm not at all sure I coded the reader loop correctly.

Also, I know this sample code is written in VB.NET, but I'm equally comfortable with C# and VB.NET solutions.


closed as off-topic by Heslacher, Malachi, vnp, Brythan, Jamal Oct 20 '14 at 21:13

This question appears to be off-topic. The users who voted to close gave these specific reasons:

  • "Questions must involve real code that you own or maintain. Pseudocode, hypothetical code, or stub code should be replaced by a concrete example. Questions seeking an explanation of someone else's code are also off-topic." – Brythan, Jamal
  • "Questions containing broken code or asking for advice about code not yet written are off-topic, as the code is not ready for review. After the question has been edited to contain working code, we will consider reopening it." – Heslacher, Malachi, vnp
If this question can be reworded to fit the rules in the help center, please edit the question.

    
