How would I read multiple XML files from an input stream in Java and write them as XML files?

I have this:

InputStream is = new GZIPInputStream(new FileInputStream(file));

Edit: "file" is a tar.gz archive, say xmls.tar.gz, that contains multiple XML files. When I convert the stream to a string using:

public static String convertStreamToString(java.io.InputStream is) {
        java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
        return s.hasNext() ? s.next() : "";
    }

I get all of the XML files chained together, along with file metadata. Printing the result with System.out.println gives (this is just the beginning of one file):

blah.xml    60      0      0        2300 12077203627  10436 0ustar     0      0 <?xml version="1.0"...

ANSWER:

This worked great for me, following Keith's suggestion to use Apache Commons Compress and Commons IO:

http://thinktibits.blogspot.com/2013/01/read-extract-tar-file-java-example.html

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.io.IOUtils;

public class UnTar {
        public static void main(String[] args) throws Exception {
                /* Open the TAR file. For a .tar.gz, wrap the FileInputStream in a java.util.zip.GZIPInputStream first. */
                try (InputStream fileIn = new FileInputStream(new File("tar_ball.tar"));
                     TarArchiveInputStream myTarFile = new TarArchiveInputStream(fileIn)) {
                        TarArchiveEntry entry;
                        /* Loop over every entry in the TAR file */
                        while ((entry = myTarFile.getNextTarEntry()) != null) {
                                /* Skip directory entries; only files are written out */
                                if (entry.isDirectory()) {
                                        continue;
                                }
                                /* Get the name of the file */
                                String individualFile = entry.getName();
                                /* Some SOP statements to check progress */
                                System.out.println("File Name in TAR File is: " + individualFile);
                                System.out.println("Size of the File is: " + entry.getSize());
                                /* Copy the entry to a physical file. IOUtils.copy reads until the end of the
                                   current entry; a single read() call may return fewer bytes than requested. */
                                try (OutputStream outputFile = new FileOutputStream(new File(individualFile))) {
                                        IOUtils.copy(myTarFile, outputFile);
                                }
                        }
                }
                /* try-with-resources closes both streams, even if an exception is thrown */
        }
}
An inputstream is connected to only one file at a time. Please elaborate more on your problem. – Santosh Jul 17 at 12:56
How are the 2 files separated in the stream? Is there a delimiter? – f1sh Jul 17 at 12:59

1 Answer

Accepted answer (2 votes)

After uncompressing (gzip) you still need to un-tar. The JDK has no built-in API for tar, but several third-party libraries are available. See this answer: How do I extract a tar file in Java?
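
To illustrate why the gunzipped stream in the question looks the way it does: a tar archive is a sequence of 512-byte header blocks (file name at offset 0, size as an octal string at offset 124, a type flag at offset 156), each followed by the file data padded to a multiple of 512 bytes. The "blah.xml 60 0 0 2300 ..." text printed in the question is one of those headers. Below is a rough, JDK-only sketch of walking that layout by hand. The class name MiniTarReader and its methods are made up for this example, and it assumes plain ustar-style entries with short names and files small enough to hold in memory; GNU long names, base-256 sizes and other extensions are not handled, so a real library is still the better choice in practice.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper for illustration only; not part of any library. */
public class MiniTarReader {
    private static final int BLOCK = 512;

    /** Reads every regular-file entry of an uncompressed tar stream into memory. */
    public static Map<String, byte[]> readEntries(InputStream in) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        DataInputStream data = new DataInputStream(in);
        byte[] header = new byte[BLOCK];
        while (true) {
            try {
                data.readFully(header);
            } catch (EOFException e) {
                break; // stream ended without the usual zero-filled trailer
            }
            if (header[0] == 0) {
                break; // an empty name marks the end of the archive
            }
            String name = field(header, 0, 100);
            String sizeField = field(header, 124, 12);
            long size = sizeField.isEmpty() ? 0 : Long.parseLong(sizeField, 8); // octal
            byte type = header[156];

            // Assumption: entries fit in an int-sized array.
            byte[] content = new byte[(int) size];
            data.readFully(content);

            // Data is padded with zero bytes up to the next 512-byte boundary.
            int padding = (int) ((BLOCK - size % BLOCK) % BLOCK);
            data.readFully(new byte[padding]);

            // Type '0' (or NUL in old archives) is a regular file; skip everything else.
            if (type == '0' || type == 0) {
                entries.put(name, content);
            }
        }
        return entries;
    }

    /** Extracts a NUL-terminated, space-padded text field from a header block. */
    private static String field(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.UTF_8).trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MiniTarReader <archive.tar.gz>");
            return;
        }
        // Gunzip first, then walk the tar layout inside.
        try (InputStream in = new GZIPInputStream(new FileInputStream(args[0]))) {
            for (Map.Entry<String, byte[]> e : readEntries(in).entrySet()) {
                System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
            }
        }
    }
}
```

Each value in the returned map is the raw content of one XML file, so it can be written out with a FileOutputStream or handed to an XML parser through a ByteArrayInputStream.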

Isn't my InputStream is = new GZIPInputStream(new FileInputStream(file)); code exactly what the answer in the question you linked to suggests? – Anonymous Jul 17 at 13:10
No, read the answers other than the accepted/first one. The first answer, and your GZIPInputStream, just give you one stream of bytes for all files in the tar. That is OK if you want to parse those bytes yourself to figure out where each component of the tar ends, etc. It is better to use a higher-level API that lets you loop over objects like "TarEntry" and get an input stream from each of those, representing each (in your case) XML file in the tar. The later answers show how to do this with code from various libraries. – Keith Jul 17 at 13:17
My mistake. I'll look into it. – Anonymous Jul 17 at 13:20
