
I was installing Anaconda Python on Linux. For Linux, Anaconda provides a bash script, but the file is huge, almost 300 MB. I decided to see why, and opened it in a text editor.

About 95% of the file is machine language gibberish, like this:

ºîØôЕzÒA¶©h¶¥R•„&´ìÒUÓçß3{^eÑòà(|ÄÃk뎆ºîØôЕzÒA¶©h¶¥R•„&´ìÒUÓçß3{™½ö|q ŽÖm¶¥¡ôÚ­gú¡@óìÛkkº£C»Iš)à÷¾Û¸êw½æõîJN7í×p€A¡ÈzÞÝï8

The file isn't corrupt, as I can install Python.

Most of this is in the license file, so I'm wondering if it's Unicode for another language, but that wouldn't take 95% of the file, would it?

Can it be compiled code / machine language? Is it allowed to put machine code in bash files?

3  
It's probably a self-extracting binary. The actual script part will copy the gibberish part to one or more separate files, most likely doing some decompression along the way, then execute it. As long as the script exits before it reaches the non-interpretable part, this is completely fine. –  Graeme 23 hours ago
    
bash can handle embedded data in binary –  Skaperen 23 hours ago
4  
For future reference, machine language or machine code is only one kind of binary data. Your file definitely has binary data in it, but from what you've shown we don't know if it's machine code or not. –  zwol 17 hours ago

1 Answer

To expand on @Graeme's comment:

The downloaded script is a bash script with an embedded tarball. The script part first validates the tarball with md5sum, then unpacks the tar archive, which contains multiple .tar.bz2 archives. It then uses a custom function, extract_dist(), to unpack those archives. For example:

extract_dist python-2.7.10-0
extract_dist conda-3.14.1-py27_0
...

which extracts the files:

python-2.7.10-0.tar.bz2
conda-3.14.1-py27_0.tar.bz2
...
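The md5sum validation step can be illustrated with a small stand-in. This is only a sketch, not the installer's actual code: the file demo.sh, its contents, and the variable names are made up for illustration, and it assumes GNU md5sum is available.

```shell
#!/bin/sh
# Sketch of an md5 check on an embedded payload (hypothetical names,
# not taken from the Anaconda installer).
set -eu

workdir=$(mktemp -d)

# Stand-in file: two "script" lines followed by one "payload" line.
printf 'line 1 of script\nline 2 of script\npayload bytes\n' > "$workdir/demo.sh"

# In a real installer the expected checksum would be hard-coded in the
# script part; here it is computed from the known payload.
expected=$(printf 'payload bytes\n' | md5sum | cut -d' ' -f1)

# Hash everything from line 3 onward, i.e. the payload only.
actual=$(tail -n +3 "$workdir/demo.sh" | md5sum | cut -d' ' -f1)

if [ "$expected" = "$actual" ]; then
    result="payload OK"
else
    result="payload corrupt"
fi
echo "$result"
```

The point is that the checksum covers only the bytes after the script part, so the script can be verified against a value stored in its own header.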

For the 32-bit version the script part can be extracted by:

head -n 467 Anaconda-2.3.0-Linux-x86.sh

For the 64-bit version the script part can be extracted by:

head -n 466 Anaconda-2.3.0-Linux-x86_64.sh

As you can see, the script part ends with exit 0, which stops bash from processing the rest of the file.

The tarball is extracted by:

tail -n +469 $THIS_PATH | tar xf - --no-same-owner
tail -n +468 $THIS_PATH | tar xf - --no-same-owner

for 32-bit and 64-bit respectively.
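To see the whole mechanism in miniature, here is a sketch that builds and runs a tiny self-extracting script. All file names (stub.sh, payload.tar, installer.sh) are hypothetical, and the stub is far simpler than the real installer: it only shows the exit 0 and tail | tar pattern described above.

```shell
#!/bin/sh
# Minimal sketch of a self-extracting script (hypothetical names throughout).
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# 1. Build a small payload tarball.
mkdir payload
echo "hello from the payload" > payload/hello.txt
tar cf payload.tar payload

# 2. Write the script part. It must end with `exit 0` so the shell never
#    tries to interpret the binary data that follows.
cat > stub.sh <<'EOF'
#!/bin/sh
# Line 4 is the last line of the script part, so the payload starts at line 5.
tail -n +5 "$0" | tar xf -
exit 0
EOF

# 3. Concatenate script and tarball into one file, then run it
#    from an empty directory.
cat stub.sh payload.tar > installer.sh
mkdir out
cd out
sh ../installer.sh
cat payload/hello.txt
```

The offset passed to tail has to match the number of lines in the script part exactly, which is why the 32-bit and 64-bit installers use different numbers.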

You could, for example, run:

tail -n +469 Anaconda-2.3.0-Linux-x86.sh | tar -t

to list the files in the 32-bit archive.
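The line numbers above are specific to Anaconda 2.3.0. For another self-extracting script, one way to find where the script part ends is to search for the first line that is exactly exit 0. The sketch below uses a made-up stand-in file and assumes GNU grep (for -a and -m); it also assumes the script part has no earlier line consisting solely of exit 0, which may not hold for every installer.

```shell
#!/bin/sh
# Sketch: locate the end of the script part in a self-extracting file.
set -eu

workdir=$(mktemp -d)
installer="$workdir/demo-installer.sh"   # hypothetical stand-in file

# Build a stand-in: three script lines followed by non-script data.
printf '#!/bin/sh\necho "installing"\nexit 0\n' > "$installer"
printf '\000\001\002 binary payload\n' >> "$installer"

# -a: treat binary data as text, -n: print line numbers,
# -x: match whole lines only, -m 1: stop at the first match.
last=$(grep -a -n -x -m 1 'exit 0' "$installer" | cut -d: -f1)
echo "script part ends at line $last"

# Print only the script part.
head -n "$last" "$installer"
```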
