SentenceTransformer based on intfloat/e5-large-v2

This is a sentence-transformers model finetuned from intfloat/e5-large-v2. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/e5-large-v2
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
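
Read top to bottom, the three modules encode tokens, average the token vectors (pooling_mode_mean_tokens is True), and scale the result to unit length (Normalize). The following is a minimal pure-Python sketch of the last two steps on toy 3-dimensional vectors rather than 1024-dimensional ones. The names `mean_pool` and `l2_normalize` are illustrative and are not part of the library.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy example: three token vectors, the last one is padding.
tokens = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)      # [2.0, 0.0, 2.0]
embedding = l2_normalize(pooled)      # unit-length vector
```

Because the final embeddings have unit length, cosine similarity between two of them reduces to a dot product.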

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("xplainlp/e5-large-v2-climatecheck")
# Run inference
sentences = [
    "query: Turns out, sea levels haven't been rising any faster in the last 120 years. #ClimateAction #Sustainability",
    'passage: Abstract. Alteration of natural environment in the wake of global warming is one of the most serious issues, which is being discussed across the world. Over the last 100 years, global sea level rose by 1.0–2.5 mm/y. Present estimates of future sea-level rise induced by climate change range from 28 to 98 cm for the year 2100. It has been estimated that a 1-m rise in sea-level could displace nearly 7 million people from their homes in India. The climate change and associated sea level rise is proclaimed to be a serious threat especially to the low lying coastal areas. Thus, study of long term effects on an estuarine region not only gives opportunity for identifying the vulnerable areas but also gives a clue to the periods where the sea level rise was significant and verifies climate change impact on sea level rise. Multi-temporal remote sensing data and GIS tools are often used to study the pattern of erosion/ accretion in an area and to predict the future coast lines. The present study has been carried out in the Indian Sundarbans area. Major land cover/ land use classes has been delineated and change analysis of the land cover/ land use feature was performed using multi-temporal satellite images (Landsat MSS, TM, ETM+) from 1973 to 2010. Multivariate GIS based analysis was carried out to depict vulnerability and its trend, spatially. Digital Shoreline change analysis also was attempted for two islands, namely, Ghoramara and Sagar Islands using the past 40 years of satellite data and validated with 2012 Resourcesat-2 LISS III data.',
    'passage: With the limited fossil fuel resources and aggravating energy crisis, coupled with the concern about the climate change caused by greenhouse gases, many people hope that renewable fuels will be developed as an alternative to fossil fuels, with special attention being paid to bioethanol.1,2 Compared to fossil fuels, biofuels emit less ozone, benzene, carbon dioxide and other harmful pollutants. For a long time, bioethanol has been raising world-wide attention and many researchers are searching for alternative biomass sources for the production of bioethanol, such as corn,4 wood,5 sugarcane6 switch grass,7 rice straw,8 corn straw9 and wheat straw.10 Today, about 30 % of the corn currently grown is used for ethanol production, and more corn is needed to meet the increasing demand for bioethanol.11 The higher amounts of corn turned to biofuel production could have devastating effects on food supply around the world and cause conflicts in the food vs. fuel dilemma. Ethanol production from lignocellulose is a promising alternative but the current technologies for lignocellulose fermentation have to overcome the cost of the complex processes needed to release simple sugars from recalcitrant polysaccharides.12 With limited land area, pretreatment technical difficulties and low conversion rate, much more needs to be done in bioethanol production from lignocellulose. And the increasing need for energy consumption is expected to continue as the world’s population is expected to increase. In order to meet the expected increasing demand for bioethanol, there is a need to find alternative biomass sources, particularly those that do not rely on using large amounts of agricultural land. Marine algae are attractive renewable energy resources due to their abundance, high photosynthetic efficiency and production rate. '
    'Algae contain a low concentration of lignin and sugars can be easily released by simple operations such as milling or crushing, so seaweeds are proposed as one of the most promising biomass materials for ethanol production.13 Marine algae are classified into three groups by their colors: green, brown, and red. Brown algae, as the second most abundant marine biomass, have several key features of an ideal feedstock for biofuel production. They do not require arable land, fertilizer, or fresh water, they are of The Isolation and Performance Studies of an Alginate Degrading and Ethanol Producing Strain',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
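
The example inputs carry "query: " and "passage: " prefixes, as do the training samples listed further down, so new inputs presumably need the same markers. The sketch below shows one way to add the prefix and to rank passages from one row of a similarity matrix. The helper names `with_prefix` and `rank_passages` are hypothetical, and the scores are made-up stand-ins so the snippet runs without downloading the model.

```python
def with_prefix(text, role):
    """Prepend the 'query: ' or 'passage: ' marker used in the examples above."""
    if role not in ("query", "passage"):
        raise ValueError("role must be 'query' or 'passage'")
    return f"{role}: {text}"

def rank_passages(similarity_row, passages):
    """Return (passage, score) pairs sorted from most to least similar."""
    pairs = zip(passages, similarity_row)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Made-up scores standing in for one row of model.similarity(query_emb, passage_embs).
passages = ["sea level abstract", "bioethanol abstract"]
scores = [0.82, 0.41]
ranking = rank_passages(scores, passages)

claim = with_prefix("Sea levels are rising.", "query")
```

In real use, `scores` would be the row of `model.similarity(...)` that belongs to the query embedding.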

Evaluation

Metrics

Triplet

  • Datasets: claims-abstracts-dev and claims-abstracts-test
  • Evaluated with TripletEvaluator
Metric            claims-abstracts-dev   claims-abstracts-test
cosine_accuracy   0.9706                 0.9706
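
As far as the metric name suggests, cosine_accuracy is the share of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative under cosine similarity. The following pure-Python sketch computes that on toy 2-dimensional vectors. It illustrates the idea and is not the TripletEvaluator implementation, and the function names are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)

# Toy embeddings: the first triplet is ranked correctly, the second is not.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),
]
toy_accuracy = cosine_accuracy(toy_triplets)

# 33 correct triplets out of 34 rounds to the reported score.
reported = round(33 / 34, 4)
```

The reported 0.9706 is consistent with 33 of 34 triplets being ranked correctly, which matches the 34-sample evaluation set described under Training Details.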

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,110 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    details
    anchor    string  min: 9 tokens, mean: 28.04 tokens, max: 57 tokens
    positive  string  min: 43 tokens, mean: 320.62 tokens, max: 512 tokens
    negative  list    size: 3 elements
  • Samples:
    anchor positive negative
    query: It's not just Australia's problem. Saving the Great Barrier Reef requires a worldwide agreement to reduce emissions. We're all in this together. #ClimateChange #GlobalEffort passage: Abstract The Great Barrier Reef World Heritage Area, Australia, covers over 348,000 km 2 of tropical marine ecosystems of global significance. In July 2015, the World Heritage Committee called attention to the cumulative impacts of climate change, poor water quality, and coastal development on the region's outstanding universal value, but stopped short of inscribing the Great Barrier Reef on the List of World Heritage in Danger. Restoring the region's values is hindered by an environmental decision‐making process that fails to incorporate cumulative impacts, including the climate change impacts of greenhouse gas emissions sourced from one of Australia's largest exports, thermal coal. We identify policy and processes that enable a more comprehensive consideration of the cumulative effects of coal mining by environmental decision‐makers. Implementing cumulative impact assessment requires a collaborative and transparent program of planning and monitoring independent of Government... ["passage: Human development has ushered in an era of converging crises: climate change, ecological destruction, disease, pollution, and socioeconomic inequality. This review synthesizes the breadth of these interwoven emergencies and underscores the urgent need for comprehensive, integrated action. Propelled by imperialism, extractive capitalism, and a surging population, we are speeding past Earth's material limits, destroying critical ecosystems, and triggering irreversible changes in biophysical systems that underpin the Holocene climatic stability which fostered human civilization. The consequences of these actions are disproportionately borne by vulnerable populations, further entrenching global inequities. 
Marine and terrestrial biomes face critical tipping points, while escalating challenges to food and water access foreshadow a bleak outlook for global security. Against this backdrop of Earth at risk, we call for a global response centered on urgent decarbonization, fostering reciprocity with nature, and implementing regenerative practices in natural resource management. We call for the elimination of detrimental subsidies, promotion of equitable human development, and transformative financial support for lower income nations. A critical paradigm shift must occur that replaces exploitative, wealth-oriented capitalism with an economic model that prioritizes sustainability, resilience, and justice. We advocate a global cultural shift that elevates kinship with nature and communal well-being, underpinned by the recognition of Earth's finite resources and the interconnectedness of its inhabitants. The imperative is clear: to navigate away from this precipice, we must collectively harness political will, economic resources, and societal values to steer toward a future where human progress does not come at the cost of ecological integrity and social equity.", "passage: The natural attributes of Australia's Great Barrier Reef, a UNESCO world heritage site listed for its natural beauty and biological diversity, are rapidly declining due to major threats from diffuse water pollution and climate change. The environmental, social, political and legal conditions that have enabled or blocked successful management of diffuse water pollution are analyzed. We find that the management approach has transitioned towards resilience-focused adaptive management of impacts from outside the marine park. Despite key enablers of adaptive governance, deep-seated political ideology is a major barrier to transformational adaptive governance to improve reef water quality.", 'passage: Abstract Climate change is the most significant threat to the Great Barrier Reef (GBR). 
While Australians express appreciation and concern for the GBR, it is not clear whether they connect climate‐related action with reef conservation. An online survey of 4,285 Australians asked “…what types of actions could people like you do that would be helpful for the GBR?” Only 4.1% mentioned a specific action related to mitigating climate change; another 3.8% mentioned climate change but no specific action. The most common responses related to reducing plastic pollution (25.6%). These findings demonstrate that most Australians have poor capacity to identify individual climate‐related actions as helpful for reef protection, and that generic calls to action—such as “protect the reef”—are unlikely to elicit climate‐related actions. As such, reef conservation initiatives must explicitly promote actions—in the home and in society—that reduce emissions and support the transition to a low carbon society.']
    query: Fossil fuels likely will supply much of the world’s energy needs for decades to come passage: In 2013, renewable energy accounted for only 8.9% of global commercial primary energy use, with fossil fuels supplying nearly all the rest. A number of official forecasts project such global energy growing by 50% or more by mid-century, and continuing to rise thereafter, in parallel with continued global economic growth. All energy sources of the future must meet three criteria: reserves or annual technical capacity must be adequate to meet projected demand; their climate change effects must be minimal; finally, they must be able to be widely deployed in the limited time available for climate mitigation. It is argued here that existing future energy scenarios generally fail to meet all three criteria. Most scenarios assume that adequate fossil/nuclear reserves are available, and that technical fixes can overcome greenhouse gas emissions from fossil fuels. The few scenarios projecting that renewables will supply most of the world's energy by mid-century assume unrealistic techn... ["passage: Future Energy will allow us to make reasonable, logical and correct decisions on our future energy as a result of two of the most serious problems that the civilized world has to face; the looming shortage of oil (which supplies most of our transport fuel) and the alarming rise in atmospheric carbon dioxide over the past 50 years (resulting from the burning of oil, gas and coal and the loss of forests) that threatens to change the world's climate through global warming. Future Energy focuses on all the types of energy available to us, taking into account a future involving a reduction in oil and gas production and the rapidly increasing amount of carbon dioxide in our atmosphere. It is unique in the genre of books of similar title in that each chapter has been written by a scientist or engineer who is an expert in his or her field. The book is divided into four sections: . 
Traditional Fossil Fuel and Nuclear Energy . Renewable Energy . Potentially Important New Types of Energy . New Aspects to Future Energy Usage Each chapter highlights the basic theory and implementation, scope, problems and costs associated with a particular type of energy. The traditional fuels are included because they will be with us for decades to come - but, we hope, in a cleaner form. The renewable energy types includes wind power, wave power, tidal energy, two forms of solar energy, bio-mass, hydroelectricity, geothermal and the hydrogen economy. Potentially important new types of energy include: pebble bed nuclear reactors, nuclear fusion, methane hydrates and recent developments in fuel cells and batteries. - Written by experts in the key future energy disciplines from around the globe - Details of all possible forms of energy that are and will be available globally in the next two decades - Puts each type of available energy into perspective with realistic, future options", 'passage: Nowadays, oil is the source of the vast majority of fuels used for transport, heating and of the hydrocarbons used in petrochemical industry. However, oil is a fossil fuel and some experts predict that its reserves will exhaust approximately in 20–30 years. Moreover, it seems that the demand of fossil fuels will increase at rates that can be estimated from “World Energy Outlook” elaborated by the International Energy Agency (IEA, 2007). Apart from all these data, there are evidences that the climate of the planet is changing due to the global warning. The temperature of the earth is increasing and the ice of the poles is beginning to melt; all these changes are attributed to the Greenhouse Effect. 
Besides, it has been estimated that 82 % of the anthropogenic CO2 emissions are due to fossil fuel combustion so it is clear that alternative energy sources are needed (see Figure 1).', 'passage: Nowadays, oil is the source of the vast majority of fuels used for transport, heating and of the hydrocarbons used in petrochemical industry. However, oil is a fossil fuel and some experts predict that its reserves will exhaust approximately in 20–30 years. Moreover, it seems that the demand of fossil fuels will increase at rates that can be estimated from “World Energy Outlook” elaborated by the International Energy Agency (IEA, 2007). Apart from all these data, there are evidences that the climate of the planet is changing due to the global warning. The temperature of the earth is increasing and the ice of the poles is beginning to melt; all these changes are attributed to the Greenhouse Effect. Besides, it has been estimated that 82 % of the anthropogenic CO2 emissions are due to fossil fuel combustion so it is clear that alternative energy sources are needed (see Figure 1).']
    query: Did you know that carbon dioxide only hangs around in the atmosphere for about 5 years? 🤔 passage: Of the carbon dioxide that we emit, a substantial fraction remains in the atmosphere for thousands of years. Combined with the slow response of the climate system, this results in the global temperature increase resulting from CO2 being nearly proportional to the total emitted amount of CO2 since preindustrial times. This has a number of simple but far-reaching consequences that raise important questions for climate change mitigation, policy and ethics. Even if anthropogenic emissions of CO2 were stopped, most of the realized climate change would persist for centuries and thus be irreversible on human timescales, yet standard economic thinking largely discounts these long-term intergenerational effects. Countries and generations to first order contribute to both past and future climate change in proportion to their total emissions. A global temperature target implies a CO2 "budget" or "quota", a finite amount of CO2 that society is allowed to emit to stay below the target. Dis... ['passage: Significance Climate change is one of the greatest challenges of our times. Human activities, like fossil-fuel burning, result in emissions of radiation-modifying substances that have a detectable, either warming or cooling, influence on our climate. Some, like soot (black carbon), are very short lived, whereas others, like carbon dioxide (CO2), are very persistent and remain in the atmosphere for centuries to millennia. Importantly, these substances are often emitted by common sources. As climate policy is looking at options to limit emissions of all these substances, understanding their linkages becomes extremely important. Our study disentangles these linkages and therewith helps to avoid crucial misconceptions: Measures reducing short-lived climate forcers are complementary to CO2 mitigation, but neglecting linkages leads to overestimating their climate benefits. 
Anthropogenic global warming is driven by emissions of a wide variety of radiative forcers ranging from very short-lived climate forcers (SLCFs), like black carbon, to very long-lived, like CO2. These species are often released from common sources and are therefore intricately linked. However, for reasons of simplification, this CO2–SLCF linkage was often disregarded in long-term projections of earlier studies. Here we explicitly account for CO2–SLCF linkages and show that the short- and long-term climate effects of many SLCF measures consistently become smaller in scenarios that keep warming to below 2 °C relative to preindustrial levels. Although long-term mitigation of methane and hydrofluorocarbons are integral parts of 2 °C scenarios, early action on these species mainly influences near-term temperatures and brings small benefits for limiting maximum warming relative to comparable reductions taking place later. Furthermore, we find that maximum 21st-century warming in 2 °C-consistent scenarios is largely unaffected by additional black-carbon-related measures because key emission sources are already phased-out through CO2 mitigation. Our study demonstrates the importance of coherently considering CO2–SLCF coevolutions. Failing to do so leads to strongly and consistently overestimating the effect of SLCF measures in climate stabilization scenarios. Our results reinforce that SLCF measures are to be considered complementary rather than a substitute for early and stringent CO2 mitigation. Near-term SLCF measures do not allow for more time for CO2 mitigation. We disentangle and resolve the distinct benefits across different species and therewith facilitate an integrated strategy for mitigating both short and long-term climate change.', 'passage: Mark Twain once quipped that everyone talks about the weather but no one does anything about it. 
With interest in global climate change on the rise, researchers in the fossil-energy sector are feeling the heat to provide new technology to permit continued use of fossil fuels but with reduced emissions of so-called greenhouse gases. Three important greenhouse gases, carbon dioxide, methane, and nitrous oxide, are released to the atmosphere in the course of recovering and combusting fossil fuels. Their importance for trapping radiation, called forcing, is in the order given. In this report, we briefly review how greenhouse gases cause forcing and why this has a warming effect on the Earth`s atmosphere. Then we discuss programs underway at FETC that are aimed at reducing emissions of methane and carbon dioxide.', "passage: Abstract The atmospheric residence time of carbon dioxide is hundreds of years, many orders of magnitude longer than that of common air pollution, which is typically hours to a few days. However, randomly selected respondents in a mail survey in Allegheny County, PA ( N = 119) and in a national survey conducted with MTurk ( N = 1,013) judged the two to be identical (in decades), considerably overestimating the residence time of air pollution and drastically underestimating that of carbon dioxide. Moreover, while many respondents believed that action is needed today to avoid climate change (regardless of cause), roughly a quarter held the view that if climate change is real and serious, we will be able to stop it in the future when it happens, just as we did with common air pollution. In addition to assessing respondents’ understanding of how long carbon dioxide and common air pollution stay in the atmosphere, we also explored the extent to which people correctly identified causes of climate change and how their beliefs affect support for action. With climate change at the forefront of politics and mainstream media, informing discussions of policy is increasingly important. 
Confusion about the causes and consequences of climate change, and especially about carbon dioxide's long atmospheric residence time, could have profound implications for sustained support of policies to achieve reductions in carbon dioxide emissions and other greenhouse gases."]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
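
With these parameters, the loss scores each anchor against every candidate passage in the batch using cosine similarity multiplied by the scale of 20.0, then applies cross-entropy with the anchor's own positive as the correct class. The sketch below shows that computation in pure Python for anchors and positives only. It leaves out the extra hard negatives from the negative column, which as far as I understand would simply add more candidates, and the function names are illustrative.

```python
import math

def cos_sim(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over anchors; positives[i] is the target for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, cand) for cand in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its own anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each positive matches the other anchor
loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
```

The loss is close to zero when every anchor is nearest to its own positive and grows towards the scale value when the pairing is reversed.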
    

Evaluation Dataset

Unnamed Dataset

  • Size: 34 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 34 samples:
    column    type    details
    anchor    string  min: 12 tokens, mean: 30.44 tokens, max: 44 tokens
    positive  string  min: 93 tokens, mean: 361.65 tokens, max: 512 tokens
    negative  string  min: 179 tokens, mean: 380.56 tokens, max: 512 tokens
  • Samples:
    anchor positive negative
    query: While CO2 gets a lot of attention, it's actually water vapor that plays the biggest role in trapping heat in our atmosphere #ClimateAction #ClimateAwareness passage: Carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O) are the greenhouse gases largely responsible for anthropogenic climate change. Natural plant and microbial metabolic processes play a major role in the global atmospheric budget of each. We have been studying ecosystem-atmosphere trace gas exchange at a sub-boreal forest in the northeastern United States for over two decades. Historically our emphasis was on turbulent fluxes of CO2 and water vapor. In 2012 we embarked on an expanded campaign to also measure CH4 and N2O. Here we present continuous tower-based measurements of the ecosystem-atmosphere exchange of CO2 and CH4, recorded over the period 2012-2018 and reported at a 30-minute time step. Additionally, we describe a five-year (2012-2016) dataset of chamber-based measurements of soil fluxes of CO2, CH4, and N2O (2013-2016 only), conducted each year from May to November. These data can be used for process studies, for biogeochemical and land surface model valida... passage: Human activities are releasing gigatonnes of carbon to the Earth's atmosphere annually. Direct consequences of cumulative post-industrial emissions include increasing global temperature, perturbed regional weather patterns, rising sea levels, acidifying oceans, changed nutrient loads and altered ocean circulation. These and other physical consequences are affecting marine biological processes from genes to ecosystems, over scales from rock pools to ocean basins, impacting ecosystem services and threatening human food security. The rates of physical change are unprecedented in some cases. Biological change is likely to be commensurately quick, although the resistance and resilience of organisms and ecosystems is highly variable. 
Biological changes founded in physiological response manifest as species range-changes, invasions and extinctions, and ecosystem regime shifts. Given the essential roles that oceans play in planetary function and provision of human sustenance, the grand...
    query: Temps in the Arctic haven't risen above -20°C since December, which is impacting the region's ecosystem. #Arctic #ClimateChange passage: Abstract December through February 2015–2016 defines the warmest winter season over the Arctic in the observational record. Positive 2 m temperature anomalies were focused over regions of reduced sea ice cover in the Kara and Barents Seas and southwestern Alaska. A third region is found over the ice‐covered central Arctic Ocean. The period is marked by a strong synoptic pattern which produced melting temperatures in close proximity to the North Pole in late December and anomalous high pressure near the Taymyr Peninsula. Atmospheric teleconnections from the Atlantic contributed to warming over Eurasian high‐latitude land surfaces, and El Niño‐related teleconnections explain warming over southwestern Alaska and British Columbia, while warm anomalies over the central Arctic are associated with physical processes including the presence of enhanced atmospheric water vapor and an increased downwelling longwave radiative flux. Preconditioning of sea ice conditions by warm temperature... passage: The chemical weathering of Ca-Mg silicate rocks is the principal process whereby CO2 is removed from the atmosphere on long or multimillion-year time scales. At present vascular land plants exert an important influence on weathering by recycling water and by bringing about the secretion of acids and the buildup of high levels of CO2 in soils. Before the spread of vascular plants on the continents during the mid-Paleozoic, chemical weathering must have been achieved by higher levels of atmospheric CO2 and/or by the intercession of primitive land biota such as lichens and cyanobacteria. 
From examination of several pre-vascular weathering scenarios, it is concluded that in order to prohibit unreasonably high levels of atmospheric CO2, weathering during the Precambrian and early Paleozoic must have taken place under essentially closed-system abiotic conditions (linear feedback) or via strong regulation by primitive biota. Preliminary examination of modern weathering by lichens ove...
    query: According to a recent study, there's no link between CO2 emissions and temperature changes in our lifetime. #ClimateChange #Science passage: Recent years have witnessed a growing recognition of the link between emissions of carbon dioxide (CO{sub 2}) and changes in the global climate. of all anthropogenic activities, energy production and use generate the single largest portion of these greenhouse gases. Although developing countries currently account for a small share of global carbon emissions, their contribution is increasing rapidly. Due to the rapid expansion of energy demand in these nations, the developing world's share in global modern energy use rose from 16 to 27 percent between 1970 and 1990. If the growth rates observed over the past 20 years persist, energy demand in developing nations will surpass that in the countries of the Organization for Economic Cooperation and Development (OECD) early in the 21st century. The study seeks to examine the forces that galvanize the growth of energy use and carbon emissions, to assess the likely future levels of energy and CO{sub 2} in selected developing nations an... passage: Abstract. The emphasis for informing policy makers on future sea-level rise has been on projections by the end of the 21st century. However, due to the long lifetime of atmospheric CO2, the thermal inertia of the climate system and the slow equilibration of the ice sheets, global sea level will continue to rise on a multi-millennial timescale even when anthropogenic CO2 emissions cease completely during the coming decades to centuries. Here we present global sea-level change projections due to the melting of land ice combined with steric sea effects during the next 10 000 years calculated in a fully interactive way with the Earth system model of intermediate complexity LOVECLIMv1.3. 
The greenhouse forcing is based on the Extended Concentration Pathways defined until 2300 CE with no carbon dioxide emissions thereafter, equivalent to a cumulative CO2 release of between 460 and 5300 GtC. We performed one additional experiment for the highest-forcing scenario with the inclusion of...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
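
Together with num_train_epochs: 3 and lr_scheduler_type: linear from the full list, these settings account for the step counts that appear in the training logs. The following sketch works through the arithmetic. It assumes a single device, no gradient accumulation, and that the final partial batch is kept. `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and linear decay shape, not the library's scheduler itself.

```python
import math

TRAIN_SAMPLES = 1110      # training set size reported above
BATCH_SIZE = 16           # per_device_train_batch_size
EPOCHS = 3                # num_train_epochs
BASE_LR = 2e-05           # learning_rate
WARMUP_RATIO = 0.1        # warmup_ratio

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)   # 70
total_steps = steps_per_epoch * EPOCHS                     # 210
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)       # 21

def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = total_steps - step
    return BASE_LR * max(0.0, remaining / max(1, total_steps - warmup_steps))

peak = linear_lr(warmup_steps)
```

Under these assumptions one epoch is 70 steps and the full run is 210 steps, matching the epoch boundaries in the logs, and the learning rate peaks at 2e-05 after 21 warmup steps.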

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss  claims-abstracts-dev_cosine_accuracy  claims-abstracts-test_cosine_accuracy
-1      -1    -              -                0.9706                                -
1.0     70    -              0.5560           1.0                                   -
1.4286  100   0.8357         -                -                                     -
2.0     140   -              0.5118           0.9412                                -
2.8571  200   0.1838         -                -                                     -
3.0     210   -              0.4425           0.9706                                -
-1      -1    -              -                -                                     0.9706
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 335M params (Safetensors, tensor type F32)

Model tree for xplainlp/e5-large-v2-climatecheck
