I'm trying to make use of an external library in my Python mapper script in an AWS Elastic MapReduce job.
However, my script doesn't seem to be able to find the modules in the cache. I archived the files into a tarball called helper_classes.tar and uploaded the tarball to an Amazon S3 bucket. When creating my MapReduce job on the console, I specified the argument as:

cacheArchive s3://folder1/folder2/helper_classes.tar#helper_classes

At the beginning of my Python mapper script, I included the following code to import the library:

import sys
sys.path.append('./helper_classes')
import geoip.database

When I run the MapReduce job, it fails with an ImportError: No module named geoip.database. (geoip is a folder in the top level of helper_classes.tar and database is the module I'm trying to import.)
Any ideas what I could be doing wrong?
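
One way to narrow this down is to have the mapper report what it actually sees under ./helper_classes before the import runs. The following is only a minimal sketch: the helper name describe_archive_dir is hypothetical, and it assumes Python 2 import rules, where a directory is importable as a package only if it contains an __init__.py. It logs whether the unpacked directory exists in the task's working directory and whether each subdirectory (such as geoip) looks like a package.

```python
import os
import sys


def describe_archive_dir(path):
    """Return human-readable lines describing the directory tree under `path`.

    Hypothetical diagnostic helper, not part of Hadoop or EMR.
    """
    lines = []
    if not os.path.isdir(path):
        # The archive was not unpacked (or not linked) where the script expects it.
        lines.append("missing: %s (cwd is %s)" % (path, os.getcwd()))
        return lines
    for root, dirs, files in os.walk(path):
        dirs.sort()  # walk subdirectories in a stable order
        has_init = "__init__.py" in files
        rel = os.path.relpath(root, path)
        lines.append("%s (package: %s, files: %d)" % (rel, has_init, len(files)))
    return lines


if __name__ == "__main__":
    # stdout is the mapper's output stream, so diagnostics go to stderr.
    for line in describe_archive_dir("./helper_classes"):
        sys.stderr.write(line + "\n")
```

If the output shows "missing", the archive is not being unpacked under the name given after the # in the cacheArchive argument. If geoip is listed with "package: False", the tarball probably lacks geoip/__init__.py, or the files sit one directory level deeper than expected.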
