@YaoYinYing YaoYinYing commented Aug 29, 2024

This is one of the PRs split out from #321.

Full PR roadmap

| id | purpose | # commits | Affected |
|----|---------|-----------|----------|
| 1 | Hydra-Omegaconf and Pip module | 6 | Code, Config, Doc |
| 2 | BFD supports and MSA parallelism fixes | 5 | Code, Config, Case |
| 3 | Small molecule inputs and covalent bonds | 4 | Code, Case, Doc |

Changelog

Added

  • Support for SDF file inputs for small molecules and additional file formats.
  • Expanded set of demo cases.
  • PTM Inputs (phosphorylations, isopeptides, metal bindings, disulfide bridges, glycosylations, etc.)

Fixed

  • Tool binaries are now located automatically on the PATH when their paths are not explicitly specified.
  • Resolved an issue with dummy assertions of ref_atom_name_chars and problematic atoms that caused errors.
  • Addressed the None-value in TemplateAtomMaskAllZerosError to prevent incorrect warning triggers.
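The PATH fallback mentioned in the first "Fixed" item can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name, argument names, and error message mirror the `resolve_bin_path` signature visible in the diff further down the thread, but the body, the `Optional` annotation, and the order of the checks are assumptions.

```python
import os
import shutil
from typing import Optional


def resolve_bin_path(cfg_path: Optional[str], default_binary_name: str) -> str:
    """Return a usable path to a tool binary (illustrative sketch only).

    Prefers the explicitly configured path; otherwise falls back to
    looking up `default_binary_name` on the PATH.
    """
    # 1. An explicitly configured, existing, executable file wins.
    if cfg_path and os.path.isfile(cfg_path) and os.access(cfg_path, os.X_OK):
        return cfg_path

    # 2. Otherwise search the PATH for the default binary name.
    found = shutil.which(default_binary_name)
    if found:
        return found

    # 3. Nothing usable was found: fail loudly.
    raise FileNotFoundError(
        f"Could not find a proper binary path for {default_binary_name}: {cfg_path}."
    )
```

With this shape, a config can leave a tool path empty and rely on whatever the active environment provides, while an explicit path still takes precedence.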

drop: buildin logger
fix: template try-except bugs
fix: output logs
refactor: deduplicate code
fix: hetatm input raise from smiles
fix: hetatm input raise and dialognoses
docs&cases: covalent bonds
feat: covalent bond
fix: covalent bond
add: use_3d opt for ligand

@alephreish

@YaoYinYing I had to bring back a couple of imports in order to make it work:

diff --git a/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py b/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py
index 603e8da..c994b0d 100644
--- a/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py
+++ b/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py
@@ -19,6 +19,7 @@ import os
 from pathlib import Path
 import pickle
 from typing import List, Mapping, Optional, Tuple
+import time
 import numpy as np
 import logging
 
@@ -28,7 +29,7 @@ from helixfold.data import pipeline_multimer
 from helixfold.data import pipeline_rna_multimer
 from helixfold.data import pipeline_conf_bonds, pipeline_token_feature, pipeline_hybrid
 from helixfold.data import label_utils
-
+from concurrent.futures import ProcessPoolExecutor, as_completed
 from helixfold.data.tools import utils
 from .preprocess import Entity, digit2alphabet
 
diff --git a/apps/protein_folding/helixfold3/inference.py b/apps/protein_folding/helixfold3/inference.py
index 429809b..9edc0a1 100644
--- a/apps/protein_folding/helixfold3/inference.py
+++ b/apps/protein_folding/helixfold3/inference.py
@@ -24,6 +24,7 @@ import pickle
 import pathlib
 import shutil
 import numpy as np
+import logging
 from helixfold.common import all_atom_pdb_save
 from helixfold.data.pipeline_conf_bonds import load_ccd_dict
 from helixfold.model import config, utils
@@ -116,7 +117,7 @@ def resolve_bin_path(cfg_path: str, default_binary_name: str)-> str:
     raise FileNotFoundError(f"Could not find a proper binary path for {default_binary_name}: {cfg_path}.")
 
 
-def get_msa_templates_pipeline(cfg: DictConfig) -> Dict:
+def get_msa_templates_pipeline(cfg) -> Dict:
     use_precomputed_msas = True  # Assuming this is a constant or should be set globally
 
     template_searcher = hmmsearch.Hmmsearch(
diff --git a/apps/protein_folding/helixfold3/utils/model.py b/apps/protein_folding/helixfold3/utils/model.py
index 4a5b2d6..2ba6337 100644
--- a/apps/protein_folding/helixfold3/utils/model.py
+++ b/apps/protein_folding/helixfold3/utils/model.py
@@ -17,6 +17,7 @@
 import numpy as np
 import paddle
 import paddle.nn as nn
+import logging
 import io
 from helixfold.model import modules_all_atom

Also a side note: leave_atom_flag is currently ignored, so the user has to modify ccd_preprocessed_etkdg.pkl.gz to remove the atoms that leave upon formation of the corresponding covalent bond.
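For anyone needing that workaround, here is a rough sketch of how such an edit could be scripted. Everything about the file layout is an assumption on my part: I assume the pickle holds a dict keyed by CCD code whose entries are dicts with plain per-atom lists, and the key holding atom names (`atom_key`) is a placeholder you would need to look up in the real file. Bond records are not touched here and would need the same treatment.

```python
import gzip
import pickle
from typing import Any, Dict, Iterable


def drop_atoms(entry: Dict[str, Any], atom_key: str, names: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of `entry` without the named atoms.

    ASSUMPTION: per-atom fields are plain lists with one item per atom.
    Any list whose length equals the atom count is filtered in step with
    the atom-name list; a bond list that happens to have the same length
    would be filtered wrongly, so check the real layout first.
    """
    atom_names = entry[atom_key]
    drop = set(names)
    keep = [i for i, name in enumerate(atom_names) if name not in drop]
    n_atoms = len(atom_names)

    out: Dict[str, Any] = {}
    for key, value in entry.items():
        if isinstance(value, list) and len(value) == n_atoms:
            out[key] = [value[i] for i in keep]  # per-atom field: filter
        else:
            out[key] = value  # anything else: copy unchanged
    return out


def patch_ccd_pickle(src: str, dst: str, ccd_code: str,
                     atom_key: str, names: Iterable[str]) -> None:
    """Write a copy of a gzip-compressed pickle with one entry's atoms removed."""
    with gzip.open(src, "rb") as fh:
        ccd = pickle.load(fh)  # ASSUMPTION: a dict keyed by CCD code
    ccd[ccd_code] = drop_atoms(ccd[ccd_code], atom_key, names)
    with gzip.open(dst, "wb") as fh:
        pickle.dump(ccd, fh)
```

Writing to a separate `dst` keeps the original `ccd_preprocessed_etkdg.pkl.gz` intact, so the change is easy to revert once `leave_atom_flag` is honored.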

@YaoYinYing
Author

@alephreish Hi Andrey, this PR is a cherry-pick (so it may contain some bugs) from a branch that is now definitely out of date. If you are looking for a full-featured branch for your project, please consider this fork.

alephreish commented Oct 30, 2024

@YaoYinYing I've seen the main branch in your fork. I personally like the current interface of helixfold: it's flexible enough, although switching to your new interface would not be a big deal. I have the feeling that this small PR does have a chance of being merged into PaddlePaddle/PaddleHelix:dev, since patch-hydra has diverged too much by now.
