ENZYME_PROMPT = """**You are a senior systems biologist.** Analyze the input information to predict the protein's EC number using structured reasoning. Crucially, implement a **self-correction mechanism** with these steps:
### Self-Correction Protocol
1. **Enzyme Verification**
   - Discard ANY information that contradicts the protein's enzymatic nature (catalytic activity).
   - Example: If a GO term implies a non-enzymatic function (e.g., a structural role), reject it immediately.
2. **Conflict Resolution (Majority Rule)**
   - Identify conflicts between:
     - Motif vs. Motif
     - GO term vs. GO term
     - Motif vs. GO term
   - **Resolution Principle**:
     - If one element (A) conflicts with ≥2 mutually consistent elements (B, C, D), discard A.
     - Preserve high-confidence information supported by multiple sources.
     - *Note*: Compatible functions (e.g., catalytic activity + cofactor binding) are NOT conflicts.
3. **Output Filtered Information**
   - Explicitly list retained and discarded items, with reasons, before the analysis.
### Final Output Requirement for EC Number
After completing the full biological analysis, you **must** conclude your entire response with a special section for automated parsing. This section must adhere to the following precise logic and format:
**Decision Logic:**
1. **Default to a Single EC Number:** Your primary goal is to predict the **single, most likely EC number** for the protein's primary catalytic activity.
2. **Handling Ambiguity:** If the evidence suggests a single function but points to several possible EC numbers (e.g., a family motif describes related but distinct activities), you must **commit to one choice**. Select the EC number that is most representative, most common, or best supported by the combined evidence. **Do not list multiple options out of uncertainty.**
3. **Exception for Bifunctionality:** You may only predict multiple EC numbers if there is **explicit and strong evidence that a single protein is bifunctional**, meaning it contains distinct domains that perform two or more separate catalytic reactions. This requires clear support, such as a motif description explicitly stating "bifunctional" or the presence of multiple, distinct top-level catalytic GO terms (e.g., both a kinase and a cyclase activity).
**Formatting Rules:**
1. The section must begin on a new line with the exact tag: `[EC_PREDICTION]`
2. **Single Prediction (Standard Case):** Follow the tag with a single space and the predicted EC number.
   * Example: `[EC_PREDICTION] 1.14.99.54`
3. **Bifunctional Prediction (Exceptional Case):** Follow the tag with a single space and the EC numbers, separated by commas with no spaces.
   * Example: `[EC_PREDICTION] 2.7.1.1,4.6.1.1`
4. Do not add any other text, explanation, or punctuation on this line.
"""
RELATION_SEMANTIC_PROMPT = """
Relation semantics:
• is_a: The is_a relation forms the basic structure of GO. If we say A is_a B, we mean that node A is a subtype of node B. For example, mitotic cell cycle is_a cell cycle, and lyase activity is_a catalytic activity.
• part_of: The part_of relation is used to represent part-whole relationships. part_of has a specific meaning in GO: a part_of relation is only added between B and A (B part_of A) if B is necessarily part of A, i.e. wherever B exists, it is as part of A, and the presence of B implies the presence of A. However, given the occurrence of A, we cannot say for certain that B exists.
• has part: The logical complement to the part_of relation is has part, which represents a part-whole relationship from the perspective of the parent. As with part_of, the GO relation has part is only used in cases where A always has B as a part, i.e. where A necessarily has part B. If A exists, B will always exist; however, if B exists, we cannot say for certain that A exists, i.e. all A have part B; some B part of A.
• ends during: X ends_during Y iff: (start(Y) before_or_simultaneous_with end(X)) AND (end(X) before_or_simultaneous_with end(Y))
• happens during: X happens_during Y iff: (start(Y) before_or_simultaneous_with start(X)) AND (end(X) before_or_simultaneous_with end(Y))
• negatively regulates: p negatively regulates q iff p regulates q, and p decreases the rate or magnitude of execution of q.
• occurs in: b occurs_in c =def b is a process and c is a material entity or immaterial entity & there exists a spatiotemporal region r and b occupies_spatiotemporal_region r & forall(t) if b exists_at t then c exists_at t & there exist spatial regions s and s’ where b spatially_projects_onto s at t & c occupies_spatial_region s’ at t & s is a proper_continuant_part_of s’ at t
• positively regulates: p positively regulates q iff p regulates q, and p increases the rate or magnitude of execution of q.
• regulates: A relation that describes a case in which one process directly affects the manifestation of another process or quality, i.e. the former regulates the latter. The target of the regulation may be another process, e.g. regulation of a pathway or an enzymatic reaction, or it may be a quality, such as cell size or pH. Analogously to part_of, this relation is used specifically to mean necessarily regulates: if both A and B are present, B always regulates A, but A may not always be regulated by B, i.e. all B regulate A; some A are regulated by B.
• subproperty of: Used to establish a hierarchy among properties, indicating that a more specific property inherits characteristics from a more general one.
• inverse of: Used to define the reverse direction of a relationship between the same pair of individuals.
"""
FUNCTION_PROMPT = """**You are a senior systems biologist.** Analyze the input information to answer the given question.
"""
LLM_SCORE_PROMPT = """As an expert biologist, you are assigned to check whether a paragraph is consistent with a set of facts. You will receive some facts and one paragraph. Score the paragraph from 0 to 100.
The score should be in the format {"score": score}
Here are the facts:
{{ground_truth}}
Here is the paragraph:
{{llm_answer}}
"""