@eladn commented Jul 17, 2019

Main changes:

  • Move to TensorFlow 2.0.0-beta1.
  • New Keras model, implemented in keras_model.py. It uses the tf.keras module (not the standalone Keras package). To use the new Keras model, pass the --framework keras argument to code2vec.py. The default framework is still the old TensorFlow model (it is the one chosen when no additional arguments are given).
  • The Keras model supports TensorBoard output during training.
  • Move the TensorFlow model to tensorflow_model.py.
  • Implement model_base.py with the common, implementation-independent model methods; the models in tensorflow_model.py and keras_model.py both inherit from it.
  • Use tf.data.Dataset for the input pipeline and re-implement the reader in path_context_reader.py.
  • Adapt the TensorFlow model to work with the new reader.
  • Minor refactoring of the TensorFlow model.
  • Move configurations from common.py to config.py.
  • Add parameters and properties to config.py.
  • Refactor the printing format of the model evaluation results.
  • Use Python's logger for output instead of print() calls.
  • Add an option to separate the <OOV> and <PAD> special words. Use the SEPARATE_OOV_AND_PAD parameter in config.py to enable this option. The default is False (the previous behavior).
  • Add requirements.txt file.
  • Add Python type annotations to all parts of the code.
  • Update README.md accordingly.
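
To illustrate the SEPARATE_OOV_AND_PAD option described above, here is a minimal sketch of how a vocabulary might assign indices to the two special words. The helper names `build_word_to_index` and `lookup`, and the concrete index values, are hypothetical and chosen only for illustration; the actual vocabulary code in this PR may be organized differently.

```python
from typing import Dict, Iterable


def build_word_to_index(words: Iterable[str],
                        separate_oov_and_pad: bool = False) -> Dict[str, int]:
    """Hypothetical helper: assign indices to special words, then to vocabulary words."""
    if separate_oov_and_pad:
        # <PAD> and <OOV> each get their own index.
        word_to_index = {'<PAD>': 0, '<OOV>': 1}
    else:
        # Assumed previous behavior: both special words share a single index.
        word_to_index = {'<PAD>': 0, '<OOV>': 0}
    next_index = max(word_to_index.values()) + 1
    for word in words:
        if word not in word_to_index:
            word_to_index[word] = next_index
            next_index += 1
    return word_to_index


def lookup(word_to_index: Dict[str, int], word: str) -> int:
    """Hypothetical helper: unknown words fall back to the <OOV> index."""
    return word_to_index.get(word, word_to_index['<OOV>'])


joined = build_word_to_index(['foo', 'bar'])
separate = build_word_to_index(['foo', 'bar'], separate_oov_and_pad=True)
```

With the option off, padding and out-of-vocabulary words are indistinguishable to the model; with it on, regular words start one index later, so vocabularies built under the two settings are presumably not interchangeable.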

What is left to do:

  • Train the new Keras model for 8 iterations, upload it to S3, and add the link to the README.

This should not break the current behavior; it only adds new functionality. If the new parameters are not stated explicitly, the default behavior is unchanged.
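
As a rough sketch of how an optional --framework argument can leave the default behavior untouched, consider the following. Only `--framework keras` and the module names tensorflow_model.py and keras_model.py come from the description above; the value name `tensorflow`, and the helpers `build_parser` and `select_model_module`, are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical subset of the code2vec.py argument parser."""
    parser = argparse.ArgumentParser(description='illustrative argument subset')
    # Assumption: the default value is named 'tensorflow', so omitting the
    # argument selects the old TensorFlow model.
    parser.add_argument('--framework', choices=['tensorflow', 'keras'],
                        default='tensorflow')
    return parser


def select_model_module(framework: str) -> str:
    """Hypothetical dispatch from the framework name to the module that implements it."""
    return {'tensorflow': 'tensorflow_model', 'keras': 'keras_model'}[framework]


default_args = build_parser().parse_args([])
keras_args = build_parser().parse_args(['--framework', 'keras'])
```

Because the argument has a default, existing command lines that never mention --framework keep selecting the TensorFlow model.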

eladn added 30 commits March 13, 2019 16:30
…reader.dataset directly to model.fit() instead of using reader.dataset.iterator; rename "Model" ==> "Code2VecModel"; use common.SpecialDictWords in keras_model and reader; fix tensorflow.*python*.keras imports
…al model output; PathContextReader: use abstract ModelInputTensorsFormer to be inherited by the impl.
…layer to not use bias (now #trainable_params equals to orig tf model); use tf.train.AdamOptimizer() as optimizer instead keras "adam" - now training works on GPU
…lass; migrate lookup tables handling into `Vocab`
…maintain number of epochs for a model (recovered on load); model_base have no session object (keras model doesn't need it now); export compile keras model to a method; fix use of vocab size in embeddings; enhance save+load vocabularies; repeat eval reader; additional refactor
… model; impl `_get_vocab_embedding_as_np_array()` in keras model; separate embedding constants in `Config` (by vocab type)
eladn added 28 commits June 2, 2019 13:45
…00; use keras CB to perform logging and evaluate during training; move arg parsing to config; fix AttentionLayer so mask will be input; use logger in classes (instead of prints)
… If new params not stated explicitly the default behavior haven't changed.
@urialon urialon merged commit 01d1731 into tech-srl:master Jul 17, 2019
anki54 pushed a commit to anki54/code2vec that referenced this pull request May 31, 2020
Add tf.keras model implementation