
Why did Compress and Uncompress disappear? #131

Open
simonjohansson opened this issue Dec 18, 2018 · 7 comments · May be fixed by #200

Comments


@simonjohansson simonjohansson commented Dec 18, 2018

Howdy, and thanks for an awesome library!

We have a use case where we tar up a path and upload it to Google Cloud Storage, which takes a []byte as the object you wish to upload. Later on we might download the tar, which again gives us a []byte, and untar it to some path.

In v2.0.1 we have the super helpful

Compress(sourceDir string) (tar []byte, err error)
Uncompress(destinationDir string, tar []byte) (err error)

We noticed that we had issues with symlinks and that later versions of archiver fix this. But those helpful functions are now gone in favor of

Archive(sources []string, destination string) error
Unarchive(source, destination string) error

Is there a reason there are no longer methods to archive into a []byte and unarchive from a []byte?

Owner

@mholt mholt commented Dec 18, 2018

Hmm, I can't find those functions in the commit history -- do you have a link to them to refresh my memory?

In any case, I think they must have been removed because the use case for buffering entire archive files in memory as byte slices is limited. Streaming is much more recommended instead. (How would you upload large objects to the cloud? That API sounds terrible...) There are examples of how to do this in the godocs: https://godoc.org/github.com/mholt/archiver#example-Zip--StreamingWrite

Author

@simonjohansson simonjohansson commented Dec 19, 2018

@mholt sorry for the confusion. Those methods come from our wrapper interface. :D

The ones I meant were

type Archiver interface {
...
	// Write writes an archive to a Writer.
	Write(output io.Writer, sources []string) error
	// Read reads an archive from a Reader.
	Read(input io.Reader, destination string) error
}

https://github.com/mholt/archiver/blob/v2.1/targz.go#L53
https://github.com/mholt/archiver/blob/v2.1/targz.go#L79

I found the example you linked to, but it is very verbose compared to the old API, where all you needed to do was give Write a list of file paths.

Owner

@mholt mholt commented Dec 20, 2018

Hm, okay -- I'll tag this as a feature request; contributions welcome.


@hoster110 hoster110 commented Dec 27, 2018

My idea is to make full use of IO to speed up compression:

  1. Could multi-threading be supported?
  2. Could zip files be written in append mode, to cope with the memory pressure caused by large files?

@hoster110 hoster110 commented Dec 27, 2018

func Zip(srcFile string, destZip string, thNums string) error {
	SytemToInfo()
	// Listfunc, ListFile, wg, mx, Err, and Info are helpers defined elsewhere.
	err := filepath.Walk(srcFile, Listfunc)
	if err != nil {
		Err(err)
		return err
	}
	nums, err := strconv.ParseInt(thNums, 10, 32)
	if err != nil {
		Err(err)
		return err
	}

	batchsize := 2

	for index := 0; index < len(ListFile); index = index + batchsize {
		wg.Add(1)
		go func(i int) {
			// Each goroutine builds its own zip archive in memory...
			buf := new(bytes.Buffer)
			archive := zip.NewWriter(buf)

			for in := i; in < i+batchsize && in < len(ListFile); in++ {
				ff, err := archive.Create(ListFile[in])
				if err != nil {
					Err(err)
					continue
				}
				data, err := ioutil.ReadFile(ListFile[in])
				if err != nil {
					Err(err)
					continue
				}
				if _, err = ff.Write(data); err != nil {
					Err(err)
					continue
				}
			}
			if err := archive.Close(); err != nil {
				Err(err)
				return
			}

			// ...and appends that archive to the destination file under a mutex.
			mx.Lock()
			f, err := os.OpenFile(destZip, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0666)
			if err != nil {
				Err(err)
				os.Exit(-1)
			}
			Info(fmt.Sprintf("[%+v] [Start Writing to File!!!]", i))
			buf.WriteTo(f)
			Info(fmt.Sprintf("[%+v] [Writing to File The End!!!]", i))
			f.Close()
			mx.Unlock()
			wg.Done()
		}(index)

		if index%int(nums) == 0 {
			wg.Wait()
		}
	}

	wg.Wait()
	return nil
}

The file produced by this appending won't unzip, though. I am looking forward to your reply.

nmiyake added a commit to nmiyake/archiver that referenced this issue Nov 25, 2019
This commit adds a new ReaderUnarchive interface, which supports
performing an Unarchive operation from an io.Reader with a provided
size.

Fixes mholt#150
Addresses mholt#131
nmiyake added a commit to nmiyake/archiver that referenced this issue Nov 25, 2019
This commit adds a new WriterUnarchiver interface, which supports
performing an Archive operation to an io.Writer.

Fixes mholt#131 when paired with mholt#199
Contributor

@nmiyake nmiyake commented Nov 25, 2019

I've put up PRs #199 and #200 that would mostly do this.

Previously, there was:

type Archiver interface {
...
	// Write writes an archive to a Writer.
	Write(output io.Writer, sources []string) error
	// Read reads an archive from a Reader.
	Read(input io.Reader, destination string) error
}

Those PRs would introduce:

type ReaderUnarchiver interface {
	ReaderUnarchive(source io.Reader, size int64, destination string) error
}
type WriterArchiver interface {
	WriterArchive(sources []string, destination io.Writer) error
}

Besides the argument order being swapped and the names being different, the other primary difference is that the "Read" function takes a "size" parameter. This is not required for tar or rar archives, but is required for zip archives -- removing the "size" parameter would force the implementation to fully read the archive into memory first, which is not efficient.

However, consumers should be able to deal with this trivially with a wrapper interface -- if you know that you're not going to deal with zip archives, you can just pass "0" for the size parameter (and if you do need zip support, you can check for it and then do the work of reading the bytes into memory and getting the size yourself).

I think this approach (exposing the size parameter as part of the API) is better than having the library omit size and read the archive into memory for zip (which is what it did previously), since it allows more efficiency while still giving clients the flexibility to work around it with their own wrapper if needed.

Owner

@mholt mholt commented Jan 14, 2020

Thanks for all the work on this, I will get back to this sometime after Caddy 2 is in a more polished state.
