MNIST Example fails with AttributeError: 'NoneType' object has no attribute 'read' if pillow>7.0.0 #670

jplehmann · 2020-04-02T23:16:23Z

Describe the bug
Trying to reproduce the MNIST demo.

To Reproduce
Steps to reproduce the behavior:

Follow instructions at: https://uber.github.io/ludwig/examples/#image-classification-mnist

I have a directory like so:

$ pwd
/Users/john/git/ludwig-sandbox/MNIST/mnist_png/mnist_png

$ ls
training
testing
mnist_dataset_training.csv
mnist_dataset_testing.csv
model_definition.yaml
results

time ludwig train   --data_train_csv mnist_dataset_training.csv   --data_test_csv  mnist_dataset_testing.csv   --model_definition_file model_definition.yaml

Traceback (most recent call last):
  File "/Users/john/.venv/ludwig/bin/ludwig", line 11, in <module>
    load_entry_point('ludwig==0.2.2.4', 'console_scripts', 'ludwig')()
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/cli.py", line 118, in main
    CLI()
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/cli.py", line 64, in __init__
    getattr(self, args.command)()
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/cli.py", line 74, in train
    train.cli(sys.argv[2:])
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/train.py", line 806, in cli
    full_train(**vars(args))
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/train.py", line 301, in full_train
    random_seed=random_seed
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/data/preprocessing.py", line 339, in preprocess_for_training
    random_seed=random_seed
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/data/preprocessing.py", line 528, in preprocess_for_training_by_type
    random_seed=random_seed
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/data/preprocessing.py", line 655, in _preprocess_csv_for_training
    random_seed=random_seed
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/data/preprocessing.py", line 96, in build_dataset_df
    global_preprocessing_parameters
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/data/preprocessing.py", line 171, in build_data
    preprocessing_parameters
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/features/image_feature.py", line 269, in add_feature_data
    preprocessing_parameters, first_image_path
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/ludwig/features/image_feature.py", line 179, in _finalize_preprocessing_parameters
    first_image = imread(first_image_path)
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/skimage/io/_io.py", line 62, in imread
    img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/skimage/io/manage_plugins.py", line 214, in call_plugin
    return func(*args, **kwargs)
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/skimage/io/_plugins/pil_plugin.py", line 37, in imread
    return pil_to_ndarray(im, dtype=dtype, img_num=img_num)
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/skimage/io/_plugins/pil_plugin.py", line 67, in pil_to_ndarray
    image.seek(i)
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/PIL/PngImagePlugin.py", line 748, in seek
    self._seek(f)
  File "/Users/john/.venv/ludwig/lib/python3.7/site-packages/PIL/PngImagePlugin.py", line 791, in _seek
    cid, pos, length = self.png.read()
AttributeError: 'NoneType' object has no attribute 'read'

Expected behavior
I expect it to run the experiment.

Environment (please complete the following information):

OS: OSX
Version: 10.15.4
Python version Python 3.7.3
Ludwig version ludwig==0.2.2.4

Additional context
I put a breakpoint in the pil plugin and inspected the path it's using, which seems entirely valid. I think the permissions on all the dirs are valid...

$ ll /Users/john/git/ludwig-sandbox/MNIST/mnist_png/mnist_png/training/0/16585.png 
-rw-r----- 1 john staff 315 Dec 10  2015 /Users/john/git/ludwig-sandbox/MNIST/mnist_png/mnist_png/training/0/16585.png

w4nderlust · 2020-04-03T05:58:40Z

I followed all the steps in the example again and it works fine for me.
You should look at the contents of the training and testing directories and of the two CSV files; most likely the paths do not match. Did you move the directories and files around after creating them?
In any case, I suggest restarting from the beginning: delete what you have so far and follow the instructions exactly, in order.
Let me know if you are able to solve the problem.
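
One way to run the path check suggested above is to walk each CSV and confirm that every listed image resolves to a file on disk. This is a minimal sketch, not part of Ludwig: the helper name `find_missing_images` is hypothetical, the column name `image_path` is assumed from the model definitions used in this thread, and relative paths are assumed to be relative to the CSV's own directory.

```python
import csv
import os


def find_missing_images(csv_path, column="image_path"):
    """Return the image paths listed in csv_path that do not exist on disk."""
    base_dir = os.path.dirname(os.path.abspath(csv_path))
    missing = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            path = row[column]
            # Assumption: relative paths are relative to the CSV's directory.
            full_path = path if os.path.isabs(path) else os.path.join(base_dir, path)
            if not os.path.isfile(full_path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    # File names taken from the directory listing in the original report.
    for name in ("mnist_dataset_training.csv", "mnist_dataset_testing.csv"):
        if os.path.exists(name):
            print(name, "missing files:", len(find_missing_images(name)))
```

An empty result for both CSVs rules out a path mismatch, which is what the reporter later confirms by inspecting the path in a debugger.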

gilga98 · 2020-04-03T07:25:46Z

@w4nderlust This issue is persistent on Windows and Linux (I tried Ubuntu). The exact same files seem to work without a glitch on Mac. I tried running the same for FashionMnist:

$ ludwig train --data_train_csv ./train_data.csv --data_test_csv ./test_data.csv --model_definition_file model-definition.yaml 


`ludwig_version: '0.2.2.4'
command: ('/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/bin/ludwig train '
 '--data_train_csv ./train_data.csv --data_test_csv ./test_data.csv '
 '--model_definition_file model-definition.yaml')
random_seed: 42
input_data_train: './train_data.csv'
input_data_test: './test_data.csv'
model_definition: {   'combiner': {'type': 'concat'},
    'input_features': [   {   'encoder': 'stacked_cnn',
                              'name': 'image_path',
                              'preprocessing': {},
                              'tied_weights': None,
                              'type': 'image'}],
    'output_features': [   {   'dependencies': [],
                               'loss': {   'class_similarities_temperature': 0,
                                           'class_weights': 1,
                                           'confidence_penalty': 0,
                                           'distortion': 1,
                                           'labels_smoothing': 0,
                                           'negative_samples': 0,
                                           'robust_lambda': 0,
                                           'sampler': None,
                                           'type': 'softmax_cross_entropy',
                                           'unique': False,
                                           'weight': 1},
                               'name': 'label',
                               'reduce_dependencies': 'sum',
                               'reduce_input': 'sum',
                               'top_k': 3,
                               'type': 'category'}],
    'preprocessing': {   'audio': {   'audio_feature': {'type': 'raw'},
                                      'audio_file_length_limit_in_s': 7.5,
                                      'in_memory': True,
                                      'missing_value_strategy': 'backfill',
                                      'norm': None,
                                      'padding_value': 0},
                         'bag': {   'fill_value': '',
                                    'lowercase': False,
                                    'missing_value_strategy': 'fill_with_const',
                                    'most_common': 10000,
                                    'tokenizer': 'space'},
                         'binary': {   'fill_value': 0,
                                       'missing_value_strategy': 'fill_with_const'},
                         'category': {   'fill_value': '<UNK>',
                                         'lowercase': False,
                                         'missing_value_strategy': 'fill_with_const',
                                         'most_common': 10000},
                         'date': {   'datetime_format': None,
                                     'fill_value': '',
                                     'missing_value_strategy': 'fill_with_const'},
                         'force_split': False,
                         'h3': {   'fill_value': 576495936675512319,
                                   'missing_value_strategy': 'fill_with_const'},
                         'height': 28,
                         'image': {   'in_memory': True,
                                      'missing_value_strategy': 'backfill',
                                      'num_processes': 1,
                                      'resize_method': 'interpolate',
                                      'scaling': 'pixel_normalization'},
                         'numerical': {   'fill_value': 0,
                                          'missing_value_strategy': 'fill_with_const',
                                          'normalization': None},
                         'sequence': {   'fill_value': '',
                                         'lowercase': False,
                                         'missing_value_strategy': 'fill_with_const',
                                         'most_common': 20000,
                                         'padding': 'right',
                                         'padding_symbol': '<PAD>',
                                         'sequence_length_limit': 256,
                                         'tokenizer': 'space',
                                         'unknown_symbol': '<UNK>',
                                         'vocab_file': None},
                         'set': {   'fill_value': '',
                                    'lowercase': False,
                                    'missing_value_strategy': 'fill_with_const',
                                    'most_common': 10000,
                                    'tokenizer': 'space'},
                         'split_probabilities': (0.7, 0.1, 0.2),
                         'stratify': None,
                         'text': {   'char_most_common': 70,
                                     'char_sequence_length_limit': 1024,
                                     'char_tokenizer': 'characters',
                                     'char_vocab_file': None,
                                     'fill_value': '',
                                     'lowercase': True,
                                     'missing_value_strategy': 'fill_with_const',
                                     'padding': 'right',
                                     'padding_symbol': '<PAD>',
                                     'unknown_symbol': '<UNK>',
                                     'word_most_common': 20000,
                                     'word_sequence_length_limit': 256,
                                     'word_tokenizer': 'space_punct',
                                     'word_vocab_file': None},
                         'timeseries': {   'fill_value': '',
                                           'missing_value_strategy': 'fill_with_const',
                                           'padding': 'right',
                                           'padding_value': 0,
                                           'timeseries_length_limit': 256,
                                           'tokenizer': 'space'},
                         'vector': {   'fill_value': '',
                                       'missing_value_strategy': 'fill_with_const'},
                         'width': 28},
    'training': {   'batch_size': 128,
                    'bucketing_field': None,
                    'decay': False,
                    'decay_rate': 0.96,
                    'decay_steps': 10000,
                    'dropout_rate': 0.2,
                    'early_stop': 5,
                    'epochs': 50,
                    'eval_batch_size': 0,
                    'gradient_clipping': None,
                    'increase_batch_size_on_plateau': 0,
                    'increase_batch_size_on_plateau_max': 512,
                    'increase_batch_size_on_plateau_patience': 5,
                    'increase_batch_size_on_plateau_rate': 2,
                    'learning_rate': 0.001,
                    'learning_rate_warmup_epochs': 1,
                    'optimizer': {   'beta1': 0.9,
                                     'beta2': 0.999,
                                     'epsilon': 1e-08,
                                     'type': 'adam'},
                    'reduce_learning_rate_on_plateau': 0,
                    'reduce_learning_rate_on_plateau_patience': 5,
                    'reduce_learning_rate_on_plateau_rate': 0.5,
                    'regularization_lambda': 0,
                    'regularizer': 'l2',
                    'staircase': False,
                    'validation_field': 'combined',
                    'validation_measure': 'loss'}}


Using training raw csv, no hdf5 and json file with the same name have been found
Building dataset (it may take a while)
Loading training csv...
done
Loading validation csv..
done
Loading test csv..
done
Concatenating csvs..
done
Traceback (most recent call last):
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/bin/ludwig", line 10, in <module>
    sys.exit(main())
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/cli.py", line 118, in main
    CLI()
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/cli.py", line 64, in __init__
    getattr(self, args.command)()
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/cli.py", line 74, in train
    train.cli(sys.argv[2:])
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/train.py", line 806, in cli
    full_train(**vars(args))
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/train.py", line 301, in full_train
    random_seed=random_seed
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/data/preprocessing.py", line 339, in preprocess_for_training
    random_seed=random_seed
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/data/preprocessing.py", line 528, in preprocess_for_training_by_type
    random_seed=random_seed
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/data/preprocessing.py", line 655, in _preprocess_csv_for_training
    random_seed=random_seed
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/data/preprocessing.py", line 96, in build_dataset_df
    global_preprocessing_parameters
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/data/preprocessing.py", line 171, in build_data
    preprocessing_parameters
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/features/image_feature.py", line 269, in add_feature_data
    preprocessing_parameters, first_image_path
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/ludwig/features/image_feature.py", line 179, in _finalize_preprocessing_parameters
    first_image = imread(first_image_path)
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/skimage/io/_io.py", line 62, in imread
    img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/skimage/io/manage_plugins.py", line 214, in call_plugin
    return func(*args, **kwargs)
  File "/media/v/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/skimage/io/_plugins/pil_plugin.py", line 37, in imread
    return pil_to_ndarray(im, dtype=dtype, img_num=img_num)
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/skimage/io/_plugins/pil_plugin.py", line 67, in pil_to_ndarray
    image.seek(i)
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 748, in seek
    self._seek(f)
  File "/media/jhondoe/New Volume/ludwig_playgorund/ubvenv/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 791, in _seek
    cid, pos, length = self.png.read()
AttributeError: 'NoneType' object has no attribute 'read'
`

model-definition.yaml

input_features:
    -
        name: image_path
        type: image
        encoder: stacked_cnn

output_features:
    -
        name: label
        type: category

preprocessing:
        height: 28
        width: 28

training:
    epochs: 50
    dropout_rate: 0.2

gilga98 · 2020-04-03T10:17:02Z

Update:
Model serving also fails in both of these environments:

curl --location --request POST '0.0.0.0:8000/predict' \
--header 'Content-Type: application/json' \
--form 'image_path=@/home/jhondoe/Desktop/165.png'


Error: 'NoneType' object has no attribute 'read'
ERROR:    Error: 'NoneType' object has no attribute 'read'
INFO:     <ip>:<port> - "POST /predict HTTP/1.1" 500 Internal Server Error

jplehmann · 2020-04-03T10:24:51Z

That's funny @gilga98, so you had no problems on OSX but had problems on the other platforms? That's the opposite of what I'm seeing.

I will try to start from scratch today.

jplehmann · 2020-04-03T15:01:20Z

I followed the instructions again, this time on a Debian box on GCE (9.12 (stretch)). This time I only installed ludwig with pip and didn't build it myself. The Ludwig version shows as ('0.2.2.4').

Same exception.

I did not do anything special to the directories. The CSVs and PNGs look good, and as I mentioned, I put a breakpoint and checked the path being used in the file objects, then ran ls to verify it was correct.

$ pip freeze | egrep -ie "pil|image|ludwig"
ludwig==0.2.2.4
Pillow==7.1.1
scikit-image==0.14.2

gilga98 · 2020-04-03T15:14:00Z

The issue is with the pillow version.
7.0.0 worked fine for me on all platforms:

pip install pillow==7.0.0

https://pillow.readthedocs.io/en/stable/releasenotes/7.1.1.html

This fix from pillow concerning the PNGs is breaking it
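
To check whether an environment falls in the suspect range, the installed Pillow version can be compared against 7.0.0. This is a minimal sketch under the only evidence this thread provides: 7.0.0 works and 7.1.1 fails, so anything newer than 7.0.0 is treated as possibly affected. The helper names `parse_version` and `pillow_may_be_affected` are hypothetical, and the lookup uses `importlib.metadata`, which requires Python 3.8 or later.

```python
def parse_version(version):
    """Turn '7.1.1' into (7, 1, 1); non-numeric suffixes are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def pillow_may_be_affected(version, last_known_good="7.0.0"):
    """True if version is newer than the last release reported to work here."""
    return parse_version(version) > parse_version(last_known_good)


if __name__ == "__main__":
    try:
        from importlib.metadata import version as installed_version  # Python 3.8+
        installed = installed_version("Pillow")
    except Exception:  # Pillow not installed, or importlib.metadata unavailable
        installed = None
    if installed is None:
        print("Pillow version could not be determined")
    elif pillow_may_be_affected(installed):
        print("Pillow", installed, "is newer than 7.0.0; try: pip install pillow==7.0.0")
    else:
        print("Pillow", installed, "is not newer than 7.0.0")
```

The comparison is deliberately conservative: it does not know which later Pillow releases, if any, resolve the problem.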

jplehmann · 2020-04-03T15:23:28Z

That's it -- I can confirm this works for me too. Many thanks @gilga98!

w4nderlust · 2020-04-04T03:51:04Z

Hmm, that's weird.
Glad it is solved. I will likely add a constraint for the pillow version in the requirements, and will also check whether, in the time since I introduced the constraint on scikit-image==0.14.2, the problems have been solved.
Will keep it open as a reminder.

w4nderlust added the waiting for answer label Apr 3, 2020

w4nderlust added dependencies and removed waiting for answer labels Apr 4, 2020

w4nderlust changed the title ~~MNIST Example fails with AttributeError: 'NoneType' object has no attribute 'read'~~ MNIST Example fails with AttributeError: 'NoneType' object has no attribute 'read' if pillow>7.0.0 Apr 4, 2020

w4nderlust added this to To do in Ludwig Development Apr 4, 2020
