Welcome to genomeNLP’s documentation!
Copyright (c) 2022 Tyrone Chen, Navya Tyagi, Sarthak Chauhan, Anton Y. Peleg, and Sonika Tyagi. Code in this repository is provided under an MIT license. This documentation is provided under a CC-BY-3.0 AU license.
Visit our lab website here. Contact Sonika Tyagi at sonika.tyagi@monash.edu.
Note
The main repository is on GitHub and is mirrored on GitLab. Please submit any issues to the main GitHub repository only.
- genomeNLP: Genome recoding for Machine Learning Usage incorporating genomicBERT
- genomeNLP: Case study of deep learning
- genomeNLP: Case study of DNA
  - 4. Setting up a biological dataset
  - 5. Format a dataset for input into genomeNLP
  - 6. Preparing a hyperparameter sweep
  - 7. Selecting optimal hyperparameters for training
  - 8. With the selected hyperparameters, train the full dataset
  - 9. Perform cross-validation
  - 10. Compare different models
  - 11. Obtain model interpretability scores
  - Citation
- genomeNLP: Case study of Protein
  - 4. Setting up a biological dataset
  - 5. Format a dataset for input into genomeNLP
  - 6. Preparing a hyperparameter sweep
  - 7. Selecting optimal hyperparameters for training
  - 8. With the selected hyperparameters, train the full dataset
  - 9. Perform cross-validation
  - 10. Compare different models
  - 11. Obtain model interpretability scores
  - Citation
- Create a token set from sequences
- Create a dataset object from sequences
- Create embeddings from a tokenised dataset
- Perform a hyperparameter sweep
- genomicBERT: Train a deep learning classifier
- Perform cross-validation
- Compare performance of different deep learning models
- Generate synthetic sequences for use in classification
- Get class attribution for deep learning models