|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
base_model: |
|
- meta-llama/Llama-3.1-8B-instruct |
|
pipeline_tag: text-generation |
|
tags: |
|
- lora |
|
- adapter |
|
- writing |
|
- CoT |
|
--- |
|
# Merged-Llama-Adapters-317-320 |
|
|
|
A merged LoRA adapter combining four fine-tuned adapters (317-320) for the Llama-3.1-8B language model. |
|
|
|
## Model Details |
|
|
|
- Base Model: meta-llama/Llama-3.1-8B-instruct |
|
- Adaptation Method: Merged LoRA |
|
|
|
## Merger Configuration |
|
|
|
### Source Adapters |
|
|
|
All source adapters share the following configuration: |
|
- Rank (r): 16 |
|
- Alpha: 16 |
|
- Target Modules: |
|
- q_proj (Query projection) |
|
- k_proj (Key projection) |
|
- v_proj (Value projection) |
|
- o_proj (Output projection) |
|
- up_proj (Upsampling projection) |
|
- down_proj (Downsampling projection) |
|
- gate_proj (Gate projection) |
|
|
|
### Merger Details |
|
|
|
- Merger Method: Linear interpolation |
|
- Merger Weights: Equal weights (0.25) for each adapter |
|
- Combined Rank: 16 (maintained from source adapters) |
|
|
|
## Usage |
|
|
|
This merged adapter must be used with the base Llama-3.1-8B-instruct model. |
|
|
|
## Limitations and Biases |
|
|
|
- This merged adapter inherits limitations and biases from: |
|
- The base Llama-3.1-8B-instruct model |
|
- More baises from traning data as most of them were fiction work. |
|
- The merging process may result in: |
|
- Potential loss of specialized capabilities from individual adapters |
|
- Averaged behavior across different adapter specializations |
|
- Possible interference between adapter weights |
|
|
|
## Merging Process |
|
|
|
The adapters were merged using the following approach: |
|
1. Linear interpolation of adapter weights |
|
2. Equal weighting (0.25) applied to each source adapter |
|
3. Preservation of original LoRA rank and architecture |
|
|
|
### Method Used |
|
|
|
The adapters were merged using PEFT (Parameter-Efficient Fine-Tuning) library's weighted adapter combination feature. The process combines multiple LoRA adapters using linear interpolation with specified weights. |
|
|
|
|
|
### Key Parameters |
|
|
|
- `combination_type="ties"`: Uses the TIES (Task Interference Edge Selection) method for combining adapters |
|
- `density=0.2`: Controls the sparsity of the merged weights |
|
|
|
|
|
### Notes |
|
|
|
- The order of loading adapters may affect the final result |
|
- Equal weights were chosen to maintain balanced influence from each adapter |
|
- The merged adapter maintains the same architecture and rank as the original adapters |
|
- While this adapter merges multiple fine-tunes, each component was developed as part of independent research efforts to explore and language model capabilities as part of R&D process. |
|
|
|
|
|
## Datasets |
|
|
|
- Not yet released, but should be released after evaluation has completed. |
|
- Only 1k pairs example of revision task <input_text> + <style_guide> => <thinking> <-> </revised_text> |
|
|
|
### Use Cases |
|
|
|
- This merged adapter can be used for a wide range of tasks, including but not limited to: |
|
- Accessibility |
|
- Revision & Editing |
|
- instruction-following use with xml tags |
|
- Thinking & reasoning with xml tag of <thinking> and </thinking>, if being asked i the instructions. |
|
|
|
|
|
These Models not optimized for code, math, or other specialized tasks that need Perefence Optimization. |
|
|
|
## Why SFT Instead of RLHF/DPO? |
|
- RLHF and DPO approaches often lead to vocabulary limitations and overfitting due to their optimization objectives |
|
|
|
## License |
|
|
|
Licensed under Apache 2.0 License. |
|
|
|
This merged adapter is part of independent individual research work. While the code is open-source under the Apache 2.0 license, please note: |
|
|
|
- You are free to use, modify, and distribute this adapter following the Apache 2.0 license terms |
|
- This work is provided "as is" without warranties or conditions of any kind |
|
- This is an independent research project and not affiliated with any organization |
|
- Attribution is appreciated but not required |
|
- For full license details, see: https://www.apache.org/licenses/LICENSE-2.0 |