Improve model card with pipeline tag and library name

This PR improves the model card by adding metadata that makes the model easier to discover and use. The `pipeline_tag` is set to `text-generation`, reflecting the model's text generation capabilities, and `library_name` is set to `transformers` because the provided examples use the Hugging Face Transformers library. The introduction is also rewritten with information from the paper abstract so that it describes the model's capabilities and training data more accurately.
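As a rough illustration of the metadata change, the sketch below parses the new front matter and checks that both keys are present. The `parse_front_matter` helper and the `README_HEAD` constant are hypothetical names introduced here; the parser handles only the flat subset of YAML used in this card (scalars and simple `- item` lists) and is not how the Hub itself reads metadata.

```python
# Hypothetical helper: a minimal parser for the flat front matter used in this
# model card. It is an illustration only, not a general YAML parser.
def parse_front_matter(text: str) -> dict:
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key that is currently collecting "- item" entries
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value      # scalar entry such as "license: mit"
                current_key = None
            else:
                meta[key] = []         # list entry; items follow on later lines
                current_key = key
    return meta


# Front matter as it appears on the new side of this diff.
README_HEAD = """---
base_model:
- deepseek-ai/deepseek-coder-7b-instruct-v1.5
language:
- en
license: mit
pipeline_tag: text-generation
library_name: transformers
---
"""

meta = parse_front_matter(README_HEAD)
missing = [key for key in ("pipeline_tag", "library_name") if key not in meta]
print("missing metadata keys:", missing)
```

Running the same check against the old front matter would report both keys as missing, which is the gap this PR closes.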

Files changed (1)

README.md +15 -23

@@ -1,9 +1,11 @@
 ---
-license: mit
-language:
-- en
 base_model:
 - deepseek-ai/deepseek-coder-7b-instruct-v1.5
 ---
 <p align="center">
@@ -12,19 +14,8 @@ base_model:
 ## Introduction
-We present a fine-tuned model for formal verification tasks. It is fine-tuned in five formal specification languages (Cog, Dafny, Lean4, ACSL, and TLA) on six formal-verification-related tasks:
-- **Requirement Analysis**: given requirements and description of the verification or modeling goals, decomposing the goal into detailed verification steps
-- **Proof/Model Generation**: given requirements and description of the verification or modeling goals, writing formal proofs or models that can be verified by verifier/model checker.
-- **Proof segment generation**
-- **Proof Completion**: complete the given incomplete proofs or models
-- **Proof Infilling**: filling in the middle of the given incomplete proofs or models
-- **Code 2 Proof**: (Currently only support for ACSL whose specification is in form of code annotations) given the code under verification, generate the proof with the specifications
 ## Application Scenario
@@ -57,9 +48,9 @@ You only need to return the TLA formal specification without explanation.
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
-  - The control state `octl[p]` is equal to `\"done\"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
-  - The control state `octl` is updated by setting the `p` index of `octl` to `\"rdy\"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
@@ -70,7 +61,8 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
-messages = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 text = tokenizer.apply_chat_template(
     messages, tokenize=False, add_generation_prompt=True
@@ -101,9 +93,9 @@ You only need to return the TLA formal specification without explanation.
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
-  - The control state `octl[p]` is equal to `\"done\"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
-  - The control state `octl` is updated by setting the `p` index of `octl` to `\"rdy\"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
@@ -123,7 +115,8 @@ llm = LLM(
 )
 # Prepare chat messages
-chat_message = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 # Inference
 responses = llm.chat(chat_message, greed_sampling, use_tqdm=True)
@@ -142,5 +135,4 @@ print(responses[0].outputs[0].text)
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2501.16207},
 }
-```

 ---
 base_model:
 - deepseek-ai/deepseek-coder-7b-instruct-v1.5
+language:
+- en
+license: mit
+pipeline_tag: text-generation
+library_name: transformers
 ---
 <p align="center">
 ## Introduction
+This model, presented in the paper [From Informal to Formal -- Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs](https://hf.co/papers/2501.16207), is a fine-tuned LLM for formal verification tasks.  Trained on 18k high-quality instruction-response pairs across five formal specification languages (Coq, Dafny, Lean4, ACSL, and TLA+), it excels at various sub-tasks including requirement analysis, proof/model generation, and code-to-proof translation (for ACSL).  Interestingly, fine-tuning on this formal data also enhances the model's mathematics, reasoning, and coding capabilities.
 ## Application Scenario
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
+  - The control state `octl[p]` is equal to `"done"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
+  - The control state `octl` is updated by setting the `p` index of `octl` to `"rdy"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
+messages = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 text = tokenizer.apply_chat_template(
     messages, tokenize=False, add_generation_prompt=True
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
+  - The control state `octl[p]` is equal to `"done"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
+  - The control state `octl` is updated by setting the `p` index of `octl` to `"rdy"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
 )
 # Prepare chat messages
+chat_message = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 # Inference
 responses = llm.chat(chat_message, greed_sampling, use_tqdm=True)
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2501.16207},
 }
+```