Allanatrix committed on
Commit b1b031a · verified · 1 Parent(s): bc75bfa

Update README.md

Files changed (1)
  1. README.md +107 -195
README.md CHANGED
@@ -1,195 +1,107 @@
1
- # Azure Sky Optimizer
2
-
3
- Azure Sky Optimizer is a hybrid optimizer for PyTorch, integrating Simulated Annealing (SA) with Adam to provide robust exploration and precise exploitation in non-convex optimization tasks. Designed for complex machine learning challenges, Azure Sky excels in domains requiring deep exploration of rugged loss landscapes, such as scientific machine learning, symbolic reasoning, and protein folding.
4
-
5
- Developed as part of an R&D initiative, Azure Sky combines structured stochastic exploration with gradient-based refinement, achieving stable convergence and strong generalization in multi-modal search spaces.
6
-
7
- ---
8
-
9
- ## Overview
10
-
11
- Conventional optimizers like Adam and AdamW often converge prematurely to sharp local minima, compromising generalization. Azure Sky leverages SA’s global search in early stages and Adam’s local convergence later, ensuring both deep exploration and precise convergence.
12
-
13
- ### Core Innovations
14
-
15
- - **Dynamic Temperature Scaling:** Adjusts SA temperature based on training progress for controlled exploration.
16
- - **Exploration-Exploitation Fusion:** Seamlessly transitions between SA and Adam using a sigmoid-based blending mechanism.
17
- - **Stability Enhancements:** Built-in gradient clipping and loss spike monitoring for robust training.
18
-
19
- ---
20
-
21
- ## Key Features
22
-
23
- - **Hybrid Optimization:** Combines SA’s global search with Adam’s local refinement.
24
- **Optimized Hyperparameters:** Tuned via Optuna (best trial validation loss: 0.0893 on the Two Moons dataset).
25
- - **Flexible Parameter Handling:** Supports parameter lists, named parameters, and parameter groups with group-specific learning rates.
26
- - **Production-Ready Stability:** Includes gradient clipping and loss spike detection.
27
- - **PyTorch Compatibility:** Fully integrated with PyTorch’s `optim` module.
28
-
29
- ---
30
-
31
- ## Installation
32
-
33
- Clone the repository and install using [uv](https://github.com/astral-sh/uv):
34
-
35
- ```bash
36
- git clone https://github.com/yourusername/azure-sky-optimizer.git
37
- cd azure-sky-optimizer
38
- uv pip install -e .
39
- ```
40
-
41
- **Requirements:**
42
- - Python >= 3.8
43
- - PyTorch >= 1.10.0
44
- - NumPy >= 1.20.0
45
-
46
- > **Note:** Ensure `uv` is installed. See [uv documentation](https://github.com/astral-sh/uv) for instructions.
47
-
48
- ---
49
-
50
- ## Usage Examples
51
-
52
- Azure Sky integrates seamlessly into PyTorch workflows. Below are usage examples for various parameter configurations.
53
-
54
- ### Basic Usage
55
-
56
- ```python
57
- import torch
58
- import torch.nn as nn
59
- from azure_optimizer import Azure
60
-
61
- model = nn.Linear(10, 2)
62
- criterion = nn.CrossEntropyLoss()
63
- optimizer = Azure(model.parameters())
64
-
65
- inputs = torch.randn(32, 10)
66
- targets = torch.randint(0, 2, (32,))
67
- optimizer.zero_grad()
68
- outputs = model(inputs)
69
- loss = criterion(outputs, targets)
70
- loss.backward()
71
- optimizer.step()
72
- ```
73
-
74
- ### Parameter Lists
75
-
76
- ```python
77
- var1 = torch.nn.Parameter(torch.randn(2, 2))
78
- var2 = torch.nn.Parameter(torch.randn(2, 2))
79
- optimizer = Azure([var1, var2])
80
- ```
81
-
82
- ### Parameter Groups with Custom Learning Rates
83
-
84
- ```python
85
- class SimpleModel(nn.Module):
86
- def __init__(self):
87
- super().__init__()
88
- self.base = nn.Linear(10, 5)
89
- self.classifier = nn.Linear(5, 2)
90
-
91
- def forward(self, x):
92
- x = torch.relu(self.base(x))
93
- return self.classifier(x)
94
-
95
- model = SimpleModel()
96
- optimizer = Azure([
97
- {'params': model.base.parameters(), 'lr': 1e-2},
98
- {'params': model.classifier.parameters()}
99
- ])
100
- ```
101
-
102
- For additional examples, see `azure_optimizer/usage_example.py`.
103
-
104
- ---
105
-
106
- ## Hyperparameters
107
-
108
- Default hyperparameters (from Optuna Trial 99, the best validation loss: 0.0893 on Two Moons):
109
-
110
- | Parameter | Value | Description |
111
- |-------------|-----------------------|------------------------------|
112
- | lr | 0.0007518383921113902 | Learning rate for Adam phase |
113
- | T0 | 2.2723218904585964 | Initial temperature for SA |
114
- | sigma | 0.17181058166567398 | Perturbation strength for SA |
115
- | SA_steps | 5 | Steps for SA phase |
116
- | sa_momentum | 0.6612913488540948 | Momentum for SA updates |
117
-
118
- ---
119
-
120
- ## Performance
121
-
122
- Evaluated on the Two Moons dataset (5000 samples, 20% noise):
123
-
124
- - **Best Validation Loss:** 0.0919
125
- - **Final Validation Accuracy:** 96.7%
126
- - **Epochs to Convergence:** 50
127
-
128
- Compared to:
129
- - **Adam:** loss 0.0927, acc 96.8%
130
- - **AdamW:** loss 0.0917, acc 97.1%
131
-
132
- Azure Sky prioritizes robust generalization over rapid convergence, making it ideal for pre-training and tasks requiring deep exploration.
133
-
134
- ---
135
-
136
- ## Contributing
137
-
138
- Contributions are welcome!
139
-
140
- 1. Fork the repository.
141
- 2. Create a feature branch: `git checkout -b feature/your-feature`
142
- 3. Commit your changes.
143
- 4. Push to your branch.
144
- 5. Open a pull request.
145
-
146
- Please follow PEP 8 standards. Tests are not yet implemented; contributions to add testing infrastructure are highly encouraged.
147
-
148
- ---
149
-
150
- ## License
151
-
152
- This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
153
-
154
- ---
155
-
156
- ## Citation
157
-
158
- If you use Azure Sky Optimizer in your research or engineering projects, please cite:
159
-
160
- ```
161
- [Allan]. (2025). Azure Sky Optimizer: A Hybrid Approach for Exploration and Exploitation. GitHub Repository.
162
- ```
163
-
164
- ---
165
-
166
- ## Project Status
167
-
168
- As of May 27, 2025, Azure Sky Optimizer is stable and production-ready.
169
-
170
- **Planned improvements:**
171
- - Testing on larger datasets (e.g., CIFAR-10, MNIST)
172
- - Ablation studies for hyperparameter impact
173
- - Integration with PyTorch Lightning
174
- - Adding a comprehensive test suite
175
-
176
- For questions or collaboration, please open an issue on GitHub.
177
-
178
- Kaggle Notebook: https://www.kaggle.com/code/allanwandia/non-convex-research
179
-
180
- Write-up (note: it contains outdated metrics): https://github.com/DarkStarStrix/CSE-Repo-of-Advanced-Computation-ML-and-Systems-Engineering/blob/main/Papers/Computer_Science/Optimization/Optimization_Algothrims_The_HimmelBlau_Function_Case_Study.pdf
181
-
182
- ---
183
-
184
- ## Repository Structure
185
-
186
- ```
187
- azure-sky-optimizer/
188
- ├── azure_optimizer/
189
- │ ├── __init__.py
190
- │ ├── azure.py # Updated Azure class
191
- │ ├── hooks.py
192
- │ └── usage_example.py # Usage demonstrations
193
- ├── README.md
194
- └── LICENSE
195
- ```
 
+ ---
+ title: Nexa R&D
+ emoji: 🔬
+ colorFrom: blue
+ colorTo: green
+ sdk: gradio
+ sdk_version: 4.44.0
+ app_file: App.py
+ pinned: false
+ license: apache-2.0
+ tags:
+ - optimization
+ - machine-learning
+ - research-tool
+ - gradio
+ - azure-sky
+ ---
+
+ # Nexa R&D
+
+ Nexa R&D is a visual research platform designed for researchers and industry professionals to compare and evaluate optimisers (e.g., AzureSky, Adam, SGD, AdamW, RMSprop) on analytical benchmark functions (e.g., Himmelblau, Ackley) and machine learning tasks (e.g., MNIST, CIFAR-10). It supports ablation studies, hyperparameter tuning, and side-by-side evaluations through an intuitive Gradio-based interface, optimised for deployment on Hugging Face Spaces.
+
+ ## Features
+
+ - **Modes**:
+   - **Benchmark Optimisation**: Visualise optimiser trajectories on loss surfaces of mathematical functions.
+   - **ML Task Training**: Train and compare optimisers on datasets like MNIST and CIFAR-10.
+ - **Optimisers**: AzureSky (hybrid SA + Adam), Adam, AdamW, SGD, RMSprop.
+ - **Ablation Suite**: Configure AzureSky’s Simulated Annealing (SA) with options to enable/disable SA, set initial temperature, and adjust cooling rate.
+ - **Interactive UI**: Gradio interface with plots, metrics tables, and JSON export for results (see the sketch below).
+ - **Metrics**:
+   - **Benchmark Mode**: Distance to global minimum, final loss, convergence rate.
+   - **ML Mode**: Train/validation accuracy, generalisation gap, final loss, best epoch.
+ - **Deployment**: Optimised for Hugging Face Spaces with optional GPU acceleration.
+
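+ A minimal sketch of how such a study-configuration UI could be wired in Gradio is shown below. It is illustrative only: component names, layout, and the `run_study` stub are assumptions, not the actual `App.py`.
+
+ ```python
+ import gradio as gr
+
+ def run_study(mode, optimisers, use_sa, initial_temp, cooling_rate):
+     # Placeholder: the real app dispatches to benchmark or ML-task training code here.
+     return {"mode": mode, "optimisers": optimisers,
+             "azure_sky_sa": {"enabled": use_sa, "T0": initial_temp, "cooling_rate": cooling_rate}}
+
+ with gr.Blocks(title="Nexa R&D") as demo:
+     with gr.Tab("Study Configuration"):
+         mode = gr.Radio(["Benchmark Optimisation", "ML Task Training"],
+                         value="Benchmark Optimisation", label="Mode")
+         optimisers = gr.CheckboxGroup(["AzureSky", "Adam", "AdamW", "SGD", "RMSprop"],
+                                       label="Optimisers")
+         with gr.Accordion("AzureSky Ablation Settings", open=False):
+             use_sa = gr.Checkbox(value=True, label="Enable simulated annealing")
+             initial_temp = gr.Slider(0.1, 10.0, value=1.0, label="Initial SA temperature")
+             cooling_rate = gr.Slider(0.5, 0.999, value=0.95, label="SA cooling rate")
+         run_btn = gr.Button("Run Study")
+     with gr.Tab("Results"):
+         results = gr.JSON(label="Detailed JSON metrics")
+     run_btn.click(run_study,
+                   inputs=[mode, optimisers, use_sa, initial_temp, cooling_rate],
+                   outputs=results)
+
+ demo.launch()
+ ```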
+
+ ## Usage
+
+ ### Configure a Study
+
+ 1. **Select a mode**: Choose "Benchmark Optimisation" or "ML Task Training" from the Study Configuration tab.
+ 2. **Select optimisers**: Pick one or more optimisers (e.g., AzureSky, Adam).
+ 3. **Configure parameters** (summarised in the sketch after this list):
+    - Benchmark mode: Select a function (e.g., Himmelblau) and dimensionality (default: 2).
+    - ML task mode: Select a dataset (e.g., MNIST), epochs (default: 10), batch size (default: 32), and learning rate (default: 0.001).
+ 4. **Ablation settings** (if AzureSky is selected):
+    - Enable/disable simulated annealing (default: enabled).
+    - Set the initial SA temperature (default: 1.0).
+    - Set the SA cooling rate (default: 0.95).
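+
+ For reference, the same study can be written down as a plain configuration dictionary. This is a hypothetical summary of the settings listed above, not an API exposed by the app (scripted access is listed under Future Enhancements).
+
+ ```python
+ # Hypothetical study configuration mirroring the UI defaults above.
+ study_config = {
+     "mode": "ML Task Training",
+     "optimisers": ["AzureSky", "Adam"],
+     "dataset": "MNIST",
+     "epochs": 10,
+     "batch_size": 32,
+     "learning_rate": 0.001,
+     "azure_sky_ablation": {
+         "use_simulated_annealing": True,   # default: enabled
+         "initial_temperature": 1.0,        # default: 1.0
+         "cooling_rate": 0.95,              # default: 0.95
+     },
+ }
+ ```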
+
+ ### Run a Study
+
+ 1. Click "Run Study" to execute the experiment.
+ 2. View results in the "Results" tab, including:
+    - plots (loss surfaces for benchmarks, or accuracy/loss curves for ML tasks),
+    - a metrics table summarising performance,
+    - detailed JSON metrics.
+
+ ### Export Results
+
+ Click "Export Results as JSON" to download a `results.json` file containing metrics, paths, and histories.
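+
+ The snippet below shows one way to consume the exported file in Python; the exact key layout inside `results.json` depends on the app version, so the loop is illustrative.
+
+ ```python
+ import json
+
+ # Load an exported study and print a per-optimiser summary.
+ with open("results.json") as f:
+     results = json.load(f)
+
+ for key, value in results.items():
+     print(key, value)
+ ```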
+
+ ## Ablation Suite
+
+ The ablation suite enables detailed analysis of the AzureSky optimiser’s components (an illustrative sketch follows the list):
+
+ - **Simulated Annealing (SA)**: Toggle SA on/off to assess its impact on optimisation.
+ - **Initial temperature**: Adjust the starting temperature for SA (higher values increase exploration).
+ - **Cooling rate**: Control the rate at which SA cools (values closer to 1 result in slower cooling, preserving exploration).
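+
+ For intuition, the toy snippet below shows how these two knobs interact in a generic simulated-annealing step. It is illustrative only and not AzureSky’s actual implementation.
+
+ ```python
+ import math, random
+
+ def sa_accept(delta_loss, temperature):
+     """Metropolis rule: always accept improvements; accept worse moves with prob exp(-ΔL/T)."""
+     return delta_loss <= 0 or random.random() < math.exp(-delta_loss / temperature)
+
+ T, cooling_rate = 1.0, 0.95          # the two ablation settings
+ for step in range(5):
+     p_worse = math.exp(-0.5 / T)     # chance of accepting a move that worsens the loss by 0.5
+     print(f"step {step}: T={T:.3f}  accept-worse prob={p_worse:.2f}  sampled={sa_accept(0.5, T)}")
+     T *= cooling_rate                # a rate closer to 1.0 cools more slowly, preserving exploration
+ ```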
+
+ To use it:
+
+ 1. Select AzureSky in the optimisers list.
+ 2. Open the "AzureSky Ablation Settings" accordion in the Gradio UI.
+ 3. Adjust the SA parameters and run the study to compare results with other optimisers or configurations.
+
+ ## Example
+
+ To compare AzureSky (with SA) and Adam on the Himmelblau function (a standalone sketch of this comparison follows the steps):
+
+ 1. Select "Benchmark Optimisation" in the Study Configuration tab.
+ 2. Choose "Himmelblau" and the optimisers "AzureSky" and "Adam".
+ 3. Set the dimensionality to 2.
+ 4. In the AzureSky Ablation Settings, enable SA, set the temperature to 1.0, and set the cooling rate to 0.95.
+ 5. Click "Run Study".
+ 6. View the 3D loss surface plot and metrics table in the Results tab.
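+
+ For reference, the sketch below runs the same kind of comparison outside the UI, using the Himmelblau function f(x, y) = (x² + y − 11)² + (x + y² − 7)². Stock PyTorch optimisers (Adam and RMSprop) stand in for the two candidates here; AzureSky itself is provided by the app and is not reimplemented in this sketch.
+
+ ```python
+ import torch
+
+ def himmelblau(p):
+     x, y = p[0], p[1]
+     return (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2
+
+ candidates = [
+     ("Adam", lambda q: torch.optim.Adam([q], lr=0.05)),
+     ("RMSprop", lambda q: torch.optim.RMSprop([q], lr=0.01)),
+ ]
+ for name, make_optimiser in candidates:
+     p = torch.nn.Parameter(torch.tensor([0.0, 0.0]))   # common starting point
+     optimiser = make_optimiser(p)
+     for _ in range(500):
+         optimiser.zero_grad()
+         loss = himmelblau(p)
+         loss.backward()
+         optimiser.step()
+     print(f"{name}: reached {p.detach().tolist()}, final loss {himmelblau(p).item():.4g}")
+ ```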
+
+ ## Testing Checklist
+
+ - **Optimisers**: Verify convergence on benchmark functions.
+ - **Benchmarks**: Confirm global minima and surface plots are accurate (see the check below).
+ - **ML tasks**: Check epoch stability and output formats.
+ - **UI**: Test mode switching, input validation, and result display.
+ - **Ablation**: Validate AzureSky behaviour with/without SA and with different temperature/cooling settings.
+ - **Export**: Ensure JSON exports include all metrics and results.
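+
+ A quick, self-contained check for the "Benchmarks" item: Himmelblau’s four global minima are known in closed form and should all evaluate to (approximately) zero.
+
+ ```python
+ def himmelblau(x, y):
+     return (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2
+
+ # The four known global minima of Himmelblau's function, each with f = 0.
+ minima = [(3.0, 2.0), (-2.805118, 3.131312), (-3.779310, -3.283186), (3.584428, -1.848126)]
+ for x, y in minima:
+     assert himmelblau(x, y) < 1e-6, (x, y)
+ print("All four Himmelblau minima verified.")
+ ```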
+
+ ## Future Enhancements
+
+ - Support for user-defined benchmark functions via file uploads.
+ - Additional ML datasets (e.g., Fashion-MNIST).
+ - API access for scripted experiments.
+ - Extended ablation options for other optimisers.
+
+ For issues or contributions, contact the maintainers.