The --n-workers setting shows odd behavior. #5

@koaning

I'm running a train command locally.

python -m spacy ray train --n-workers 4 ...

Here are the results from htop.

[image: htop screenshot]

My machine isn't running anything else, but it seems like 12 threads are spinning up. I'm also seeing a lot of ray::IDLE processes, so I'm wondering if something strange is happening. At the same time, the train logs don't show any learning happening.
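One thing that might be worth ruling out first (an assumption on my part, not a diagnosis): the 12 could simply match the number of logical CPUs on the machine rather than anything derived from --n-workers. A minimal sketch using only the standard library to print that number for comparison with htop:

```python
import os

# Number of logical CPUs the OS reports; may be None on unusual platforms.
logical_cpus = os.cpu_count()

# The value 4 is taken from the command above; it is only used for display here.
requested_workers = 4

print(f"logical CPUs: {logical_cpus}, requested workers: {requested_workers}")
```

If the printed CPU count is 12, the extra idle processes may be unrelated to the worker setting itself.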

(venv) ➜  proglang-project git:(main) ✗ python -m spacy project run train

=================================== train ===================================
Running command: /home/vincent/Development/proglang-project/venv/bin/python -m spacy ray train --n-workers 4 configs/config.cfg --output training/ --paths.train corpus/stack-overflow-labels-train.spacy --paths.dev corpus/stack-overflow-labels-train.spacy
ℹ Using CPU
2021-03-02 22:07:33,929 INFO resource_spec.py:231 -- Starting Ray with 17.87 GiB memory available for workers and up to 8.94 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2021-03-02 22:07:34,299 INFO services.py:1193 -- View the Ray dashboard at localhost:8265
(pid=83259) E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
(pid=83259) ---  ------  ------------  --------  ------  ------  ------  ------
(pid=83259)   0       0          0.00     44.50    0.00    0.00    0.00    0.00
(pid=83259)   0     200          0.00   9754.83    0.00    0.00    0.00    0.00
(pid=83259)   2     400          0.00  12038.00    0.00    0.00    0.00    0.00
(pid=83259)   3     600          0.00  14772.00    0.00    0.00    0.00    0.00
(pid=83259)   5     800          0.00  18170.17    0.00    0.00    0.00    0.00
(pid=83259)   7    1000          0.00  22302.00    0.00    0.00    0.00    0.00
(pid=83259)   9    1200          0.00  27441.83    0.00    0.00    0.00    0.00
(pid=83259)  16    1600          0.00  40972.67    0.00    0.00    0.00    0.00
(pid=83259)  21    1800          0.00  49929.84    0.00    0.00    0.00    0.00

Not much learning is happening: the NER loss keeps climbing while the tok2vec loss and every score stay at 0.00. Here's the same run without the Ray integration.

=================================== train ===================================
Running command: /home/vincent/Development/proglang-project/venv/bin/python -m spacy train configs/config.cfg --output training/ --paths.train corpus/stack-overflow-labels-train.spacy --paths.dev corpus/stack-overflow-labels-train.spacy
ℹ Using CPU

=========================== Initializing pipeline ===========================
Set up nlp object from config
Pipeline: ['tok2vec', 'ner']
Created vocabulary
Finished initializing nlp object
Initialized pipeline components: ['tok2vec', 'ner']
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.0
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     44.50    0.00    0.00    0.00    0.00
  0     200          9.67  15358.64    0.00    0.00    0.00    0.00
  2     400         45.78   3916.13   67.20   92.93   52.63    0.67
  3     600         70.91   1395.79   92.36   93.29   91.45    0.92
  5     800         93.66    831.71   93.79   94.11   93.47    0.94
  7    1000        108.12    759.16   95.15   95.38   94.91    0.95
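To make the contrast between the two logs concrete, here is a rough sketch that pulls the LOSS NER column out of the first few rows of each log and checks whether it rises at every step. The helper names (`ner_losses`, `strictly_increasing`) are made up for illustration; the rows are copied from the logs in this issue.

```python
import re

# First four rows of the Ray run, copied from the log in this issue.
RAY_LOG = """\
(pid=83259)   0       0          0.00     44.50    0.00    0.00    0.00    0.00
(pid=83259)   0     200          0.00   9754.83    0.00    0.00    0.00    0.00
(pid=83259)   2     400          0.00  12038.00    0.00    0.00    0.00    0.00
(pid=83259)   3     600          0.00  14772.00    0.00    0.00    0.00    0.00
"""

# First four rows of the run without Ray, copied from the log in this issue.
PLAIN_LOG = """\
  0       0          0.00     44.50    0.00    0.00    0.00    0.00
  0     200          9.67  15358.64    0.00    0.00    0.00    0.00
  2     400         45.78   3916.13   67.20   92.93   52.63    0.67
  3     600         70.91   1395.79   92.36   93.29   91.45    0.92
"""


def ner_losses(log):
    """Return the LOSS NER values (fourth column) from the data rows of a log."""
    losses = []
    for line in log.splitlines():
        # Drop the "(pid=...)" prefix that Ray adds to worker output.
        line = re.sub(r"^\(pid=\d+\)", "", line)
        fields = line.split()
        # Data rows have eight columns and start with the epoch number.
        if len(fields) == 8 and fields[0].isdigit():
            losses.append(float(fields[3]))
    return losses


def strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


ray_losses = ner_losses(RAY_LOG)
plain_losses = ner_losses(PLAIN_LOG)

print("ray loss always rising:  ", strictly_increasing(ray_losses))
print("plain loss always rising:", strictly_increasing(plain_losses))
```

On these rows the Ray run's loss rises at every logged step, while the plain run's loss drops after step 200.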

Something feels strange about the training loop with ray, so I figured I'd report it.

Here's my spaCy info.

============================== Info about spaCy ==============================

spaCy version    3.0.3                         
Location         /home/vincent/Development/proglang-project/venv/lib/python3.7/site-packages/spacy
Platform         Linux-5.8.0-7642-generic-x86_64-with-Pop-20.10-groovy
Python version   3.7.9                         
Pipelines  
