aksell committed
Commit c663b1c · 1 Parent(s): a2cfd88

Remove docs from main page and list supported models

hexviz/pages/2_📄Documentation.py CHANGED
@@ -40,7 +40,13 @@ TODO: Add examples of attention patterns
 
 Read more about attention patterns in e.g. [Revealing the dark secrets of BERT](https://arxiv.org/abs/1908.08593).
 
-# FAQ
+## Protein language models in Hexviz
+Hexviz currently supports the following models:
+1. [ProtBERT](https://huggingface.co/Rostlab/prot_bert_bfd)
+2. [ZymCTRL](https://huggingface.co/nferruz/ZymCTRL)
+3. [TapeBert](https://github.com/songlab-cal/tape/blob/master/tape/models/modeling_bert.py) - a nickname coined in BERTology meets biology for the BERT-Base model pre-trained on Pfam in [TAPE](https://www.biorxiv.org/content/10.1101/676825v1). TapeBert is used extensively in BERTology meets biology.
+
+## FAQ
 1. I can't see any attention "bars" in the visualization, what is wrong? -> Lower the `minimum attention`.
 2. How are sequences I input folded? -> Using https://esmatlas.com/resources?action=fold
 """
hexviz/🧬Attention_Visualization.py CHANGED
@@ -124,7 +124,6 @@ attention_pairs, top_residues = get_attention_pairs(
     head=head,
     threshold=min_attn,
     model_type=selected_model.name,
-    ec_class=ec_class,
     top_n=n_highest_resis,
 )
 
@@ -197,28 +196,35 @@ def get_3dview(pdb):
 xyzview = get_3dview(pdb_id)
 showmol(xyzview, height=500, width=800)
 
-st.markdown(f"""
-Visualize attention weights from protein language models on protein structures.
-Currently attention weights for PDB: [{pdb_id}](https://www.rcsb.org/structure/{pdb_id}) from layer: {layer_one}, head: {head_one} above {min_attn} from {selected_model.name.value}
-are visualized as red bars. The {n_highest_resis} residues with the highest sum of attention are labeled.
-Visualize attention weights on protein structures for the protein language models TAPE-BERT, ZymCTRL and ProtBERT.
-Pick a PDB ID, layer and head to visualize attention.
-""", unsafe_allow_html=True)
+st.markdown(
+    f"""
+    Pick a PDB ID, layer and head to visualize attention from the selected protein language model ({selected_model.name.value}).
+    """,
+    unsafe_allow_html=True,
+)
 
 chain_dict = {f"{chain.id}": chain for chain in list(structure.get_chains())}
 data = []
-for att_weight, _ , chain, resi in top_residues:
+for att_weight, _, chain, resi in top_residues:
     res = chain_dict[chain][resi]
     el = (att_weight, f"{res.resname:3}{res.id[1]}")
     data.append(el)
 
-df = pd.DataFrame(data, columns=['Total attention (disregarding direction)', 'Residue'])
-st.markdown(f"The {n_highest_resis} residues with the highest attention sum are labeled in the visualization and listed below:")
+df = pd.DataFrame(data, columns=["Total attention (disregarding direction)", "Residue"])
+st.markdown(
+    f"The {n_highest_resis} residues with the highest attention sums are labeled in the visualization and listed here:"
+)
 st.table(df)
 
-st.markdown("""Clik in to the [Identify Interesting heads](#Identify-Interesting-heads) page to get an overview of attention
-patterns across all layers and heads
-to help you find heads with interesting attention patterns to study here.""")
+st.markdown(
+    """
+    ### Check out the other pages
+    [🗺️Identify Interesting heads](Identify_Interesting_Heads) gives a bird's-eye view of attention patterns for a model,
+    which can help you pick which specific attention heads to look at for your protein.
+
+    [📄Documentation](Documentation) has information on protein language models, attention analysis and Hexviz."""
+)
+
 """
 The attention visualization is inspired by [provis](https://github.com/salesforce/provis#provis-attention-visualizer).
-"""
+"""