from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import streamlit as st
import pandas as pd

# Model 1: climate fact-checking (claim/evidence pair -> verdict).
model1 = AutoModelForSequenceClassification.from_pretrained(
    "rexarski/bert-base-climate-fever-fixed"
)
tokenizer1 = AutoTokenizer.from_pretrained(
    "rexarski/bert-base-climate-fever-fixed"
)
label_mapping1 = ["SUPPORTS", "REFUTES", "NOT_ENOUGH_INFO"]

# Model 2: TCFD disclosure category classification.
model2 = AutoModelForSequenceClassification.from_pretrained(
    "rexarski/distilroberta-tcfd-disclosure"
)
tokenizer2 = AutoTokenizer.from_pretrained("distilroberta-base")
label_mapping2 = [
    "Governance a)",
    "Governance b)",
    "Metrics and Targets a)",
    "Metrics and Targets b)",
    "Metrics and Targets c)",
    "Risk Management a)",
    "Risk Management b)",
    "Risk Management c)",
    "Strategy a)",
    "Strategy b)",
    "Strategy c)",
]


def factcheck(text1, text2):
    """Return the predicted label for a claim (text1) and evidence (text2) pair."""
    # Encode the pair as one sequence, padded or truncated to 512 tokens.
    features = tokenizer1(
        [text1],
        [text2],
        padding="max_length",
        truncation=True,
        return_tensors="pt",
        max_length=512,
    )
    model1.eval()
    with torch.no_grad():
        scores = model1(**features).logits
        labels = [
            label_mapping1[score_max] for score_max in scores.argmax(dim=1)
        ]
    return labels[0]


def tcfd_classify(text):
    """Return the predicted TCFD disclosure category for a piece of text."""
    # Pad or truncate the input to 512 tokens before classification.
    features = tokenizer2(
        text,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
        max_length=512,
    )
    model2.eval()
    with torch.no_grad():
        scores = model2(**features).logits
        labels = [
            label_mapping2[score_max] for score_max in scores.argmax(dim=1)
        ]
    return labels[0]


# Preloaded fact-checking examples with their reference labels.
data1 = {
    "example": [
        "Example 1 (Sea ice has diminished much faster than scientists and climate models anticipated.)",
        "Example 2 (Climate Models Have Overestimated Global Warming)",
        "Example 3 (Climate skeptics argue temperature records have been adjusted in recent years to ...)",
        "Example 4 (Humans are too insignificant to affect global climate.)",
    ],
    "claim": [
        "Sea ice has diminished much faster than scientists and climate models anticipated.",
        "Climate Models Have Overestimated Global Warming",
        "Climate skeptics argue temperature records have been adjusted in recent years to make the past appear cooler and the present warmer, although the Carbon Brief showed that NOAA has actually made the past warmer, evening out the difference.",
        "Humans are too insignificant to affect global climate.",
    ],
    "evidence": [
        "Past models have underestimated the rate of Arctic shrinkage and underestimated the rate of precipitation increase.",
        """The 2017 United States-published National Climate Assessment notes that "climate models may still be underestimating or missing relevant feedback processes".""",
        """Reconstructions have consistently shown that the rise in the instrumental temperature record of the past 150 years is not matched in earlier centuries, and the name "hockey stick graph" was coined for figures showing a long-term decline followed by an abrupt rise in temperatures.""",
        "Human impact on the environment or anthropogenic impact on the environment includes changes to biophysical environments and ecosystems, biodiversity, and natural resources caused directly or indirectly by humans, including global warming, environmental degradation (such as ocean acidification), mass extinction and biodiversity loss, ecological crisis, and ecological collapse.",
    ],
    "label": ["SUPPORTS", "REFUTES", "NOT_ENOUGH_INFO", "REFUTES"],
}

# Preloaded TCFD disclosure examples with their reference labels.
data2 = {
    "example": [
        "Example 1 (As a global provider of transport and logistics services ...)",
        "Example 2 (There are no sentences in the provided excerpts that disclose Scope 1 and Scope 2)",
        "Example 3 (Our strategy needs to be resilient under a range of climate-related scenarios.)",
        "Example 4 (AXA created a Group-level Responsible Investment Committee ...)",
    ],
    "text": [
        "As a global provider of transport and logistics services, we are often called on for expert input and industry insights by government representatives.",
        "There are no sentences in the provided excerpts that disclose Scope 1 and Scope 2, and, if appropriate Scope 3 GHG emissions. The provided excerpts focus on other metrics and targets related to social impact investing, assets under management, and carbon footprint calculations.",
        """Our strategy needs to be resilient under a range of climate-related scenarios. This year we have undertaken climate-related scenario testing of a select group of customers in the thermal coal supply chain. We assessed these customers using two of the International Energy Agency’s scenarios; the ‘New Policies Scenario’ and the ‘450 Scenario’. Our reporting reflects the Financial Stability Board’s (FSB) Task Force on Climate-Related Disclosures (TCFD) recommendations. Using the FSB TCFD’s disclosure framework, we have begun discussions with some of our customers in emissions-intensive industries. The ESG Committee is responsible for reviewing and approving our climate change-related objectives, including goals and targets. The Board Risk Committee has formal responsibility for the overview of ANZ’s management of new and emerging risks, including climate change-related risks.""",
        "AXA created a Group-level Responsible Investment Committee (RIC), chaired by the Group Chief Investment Officer, and including representatives from AXA Asset Management entities, representatives of Corporate Responsibility (CR), Risk Management and Group Communication.",
    ],
    "label": [
        "Risk Management a)",
        "Metrics and Targets b)",
        "Strategy c)",
        "Governance b)",
    ],
}
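

def get_pred_emoji(str1, str2, mode="factcheck"):
    """Return an emoji comparing a reference label with a predicted label."""
    if mode == "factcheck":
        if str1 == str2:
            return "✅"
        else:
            return "❌"
    elif mode == "tcfd":
        if str1 == str2:
            return "✅"
        # Partial match: same category, different sub-item
        # (e.g. "Governance a)" vs "Governance b)").
        elif str1.split()[:-1] == str2.split()[:-1]:
            return "🔧"
        else:
            return "❌"

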
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

st.markdown(
    """
# climate-plus demo

This is a minimal example of two models we trained for the `climate-plus` project:

- [bert-base-climate-fever-fixed](https://huggingface.co/rexarski/bert-base-climate-fever-fixed)
- [distilroberta-tcfd-disclosure](https://huggingface.co/rexarski/distilroberta-tcfd-disclosure)

See the [GitHub repo](https://github.com/rexarski/climate-plus) for more details.
"""
)

st.markdown("## Factchecking")
factchecking_demo = st.radio(
    "What examples do you want to see?",
    ("Preloaded examples", "Custom examples"),
    # Explicit key keeps this widget distinct from the other radio,
    # which has the same label and options.
    key="factchecking_demo",
)
if factchecking_demo == "Preloaded examples":
    ex1_selected = st.selectbox(
        "Select a climate claim-evidence pair", df1["example"]
    )
    selected_row1 = df1[df1["example"] == ex1_selected]
    ex_claim = selected_row1["claim"].values[0]
    ex_evidence = selected_row1["evidence"].values[0]
    ex_label = selected_row1["label"].values[0]
    ex_pred = factcheck(ex_claim, ex_evidence)
    st.markdown(f"**Claim**: {ex_claim}")
    st.markdown(f"**Evidence**: {ex_evidence}")
    st.markdown(f"**Label**: {ex_label}")
    st.markdown(
        f'**Prediction**: {ex_pred} {get_pred_emoji(ex_label, ex_pred, mode="factcheck")}'
    )
else:
    st.markdown("Or enter your own claim and evidence below:")
    custom_claim = st.text_input(label="Enter your claim.")
    custom_evidence = st.text_input(label="Enter your evidence.")
    # Only run the model once both fields are filled in.
    if custom_claim != "" and custom_evidence != "":
        st.markdown(
            f"**Prediction**: {factcheck(custom_claim, custom_evidence)}"
        )

st.markdown("---")
st.markdown("## TCFD disclosure classification")
tcfd_demo = st.radio(
    "What examples do you want to see?",
    ("Preloaded examples", "Custom examples"),
    # Explicit key keeps this widget distinct from the other radio,
    # which has the same label and options.
    key="tcfd_demo",
)
if tcfd_demo == "Preloaded examples":
    ex2_selected = st.selectbox(
        "Select a TCFD disclosure example", df2["example"]
    )
    selected_row2 = df2[df2["example"] == ex2_selected]
    ex_text = selected_row2["text"].values[0]
    ex_label2 = selected_row2["label"].values[0]
    ex_pred2 = tcfd_classify(ex_text)
    st.markdown(f"**Text**: {ex_text}")
    st.markdown(f"**Label**: {ex_label2}")
    st.markdown(
        f'**Prediction**: {ex_pred2} {get_pred_emoji(ex_label2, ex_pred2, mode="tcfd")}'
    )
else:
    st.markdown(
        "Or enter your own sentence to see if it belongs to any specific TCFD disclosure category:"
    )
    custom_text = st.text_input(label="Enter your text.")
    # Only run the model once some text has been entered.
    if custom_text != "":
        st.markdown(f"**Prediction**: {tcfd_classify(custom_text)}")