| !!! note | |
| To run this notebook in JupyterLab, load [`examples/ex2_0.ipynb`](https://github.com/DerwenAI/textgraphs/blob/main/examples/ex2_0.ipynb) | |
| # bootstrap the _lemma graph_ with RDF triples | |
| This notebook shows how to bootstrap definitions in a _lemma graph_ by loading RDF triples, e.g., to declare synonyms. | |
| ## environment | |
| ```python | |
| from icecream import ic | |
| from pyinstrument import Profiler | |
| import pyvis | |
| import textgraphs | |
| ``` | |
| ```python | |
| %load_ext watermark | |
| ``` | |
| ```python | |
| %watermark | |
| ``` | |
| Last updated: 2024-01-16T17:35:59.608787-08:00 | |
| Python implementation: CPython | |
| Python version : 3.10.11 | |
| IPython version : 8.20.0 | |
| Compiler : Clang 13.0.0 (clang-1300.0.29.30) | |
| OS : Darwin | |
| Release : 21.6.0 | |
| Machine : x86_64 | |
| Processor : i386 | |
| CPU cores : 8 | |
| Architecture: 64bit | |
| ```python | |
| %watermark --iversions | |
| ``` | |
| pyvis : 0.3.2 | |
| textgraphs: 0.5.0 | |
| sys : 3.10.11 (v3.10.11:7d4cc5aa85, Apr 4 2023, 19:05:19) [Clang 13.0.0 (clang-1300.0.29.30)] | |
| ## load the bootstrap definitions | |
| Define the bootstrap RDF triples in N3/Turtle format. Here the entity `Werner` is treated as a synonym for `Werner Herzog` by linking the two with the [`skos:broader`](https://www.w3.org/TR/skos-reference/#semantic-relations) relation. Keep in mind that `Werner` on its own may also refer to other people named Werner. | |
| ```python | |
| TTL_STR: str = """ | |
| @base <https://github.com/DerwenAI/textgraphs/ns/> . | |
| @prefix dbo: <http://dbpedia.org/ontology/> . | |
| @prefix skos: <http://www.w3.org/2004/02/skos/core#> . | |
| <entity/werner_PROPN> a dbo:Person ; | |
| skos:prefLabel "Werner"@en . | |
| <entity/werner_PROPN_herzog_PROPN> a dbo:Person ; | |
| skos:prefLabel "Werner Herzog"@en. | |
| dbo:Person skos:definition "People, including fictional"@en ; | |
| skos:prefLabel "person"@en . | |
| <entity/werner_PROPN_herzog_PROPN> skos:broader <entity/werner_PROPN> . | |
| """ | |
| ``` | |
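The entity IRIs in the Turtle above are relative, so they resolve against the `@base` directive. As a minimal sketch of that resolution, using only the Python standard library rather than an RDF parser, the following shows the absolute IRIs that the two entities should end up with. The names `BASE_IRI`, `REL_IRIS`, and `resolve` are illustrative and not part of `textgraphs`.

```python
from urllib.parse import urljoin

# base IRI copied from the `@base` directive in the Turtle above
BASE_IRI: str = "https://github.com/DerwenAI/textgraphs/ns/"

# relative IRIs copied from the Turtle above
REL_IRIS: list[str] = [
    "entity/werner_PROPN",
    "entity/werner_PROPN_herzog_PROPN",
]


def resolve(rel_iri: str, base: str = BASE_IRI) -> str:
    """Resolve a relative IRI reference against a base IRI (RFC 3986 rules)."""
    return urljoin(base, rel_iri)


# hypothetical helper output: one absolute IRI per entity
abs_iris: list[str] = [resolve(rel) for rel in REL_IRIS]

for iri in abs_iris:
    print(iri)
```

Because the base IRI ends with a slash, each relative path is simply appended to it. Whether `textgraphs` stores these absolute forms internally is not shown in this notebook.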
| provide the source text | |
| ```python | |
| SRC_TEXT: str = """ | |
| Werner Herzog is a remarkable filmmaker and an intellectual originally from Germany, the son of Dietrich Herzog. | |
| After the war, Werner fled to America to become famous. | |
| """ | |
| ``` | |
| set up the statistical stack profiling | |
| ```python | |
| profiler: Profiler = Profiler() | |
| profiler.start() | |
| ``` | |
| set up the `TextGraphs` pipeline | |
| ```python | |
| tg: textgraphs.TextGraphs = textgraphs.TextGraphs( | |
| factory = textgraphs.PipelineFactory( | |
| kg = textgraphs.KGWikiMedia( | |
| spotlight_api = textgraphs.DBPEDIA_SPOTLIGHT_API, | |
| dbpedia_search_api = textgraphs.DBPEDIA_SEARCH_API, | |
| dbpedia_sparql_api = textgraphs.DBPEDIA_SPARQL_API, | |
| wikidata_api = textgraphs.WIKIDATA_API, | |
| min_alias = textgraphs.DBPEDIA_MIN_ALIAS, | |
| min_similarity = textgraphs.DBPEDIA_MIN_SIM, | |
| ), | |
| ), | |
| ) | |
| ``` | |
| load the bootstrap definitions | |
| ```python | |
| tg.load_bootstrap_ttl( | |
| TTL_STR, | |
| debug = False, | |
| ) | |
| ``` | |
| parse the input text | |
| ```python | |
| pipe: textgraphs.Pipeline = tg.create_pipeline( | |
| SRC_TEXT.strip(), | |
| ) | |
| tg.collect_graph_elements( | |
| pipe, | |
| debug = False, | |
| ) | |
| tg.construct_lemma_graph( | |
| debug = False, | |
| ) | |
| ``` | |
| ## visualize the lemma graph | |
| ```python | |
| render: textgraphs.RenderPyVis = tg.create_render() | |
| pv_graph: pyvis.network.Network = render.render_lemma_graph( | |
| debug = False, | |
| ) | |
| ``` | |
| initialize the layout parameters | |
| ```python | |
| pv_graph.force_atlas_2based( | |
| gravity = -38, | |
| central_gravity = 0.01, | |
| spring_length = 231, | |
| spring_strength = 0.7, | |
| damping = 0.8, | |
| overlap = 0, | |
| ) | |
| pv_graph.show_buttons(filter_ = [ "physics" ]) | |
| pv_graph.toggle_physics(True) | |
| ``` | |
| ```python | |
| pv_graph.prep_notebook() | |
| pv_graph.show("tmp.fig04.html") | |
| ``` | |
| tmp.fig04.html | |
|  | |
| Notice how the `Werner` and `Werner Herzog` nodes are now linked. The synonym from the bootstrap definitions above provides a way to connect more portions of the _lemma graph_ than in the `ex0_0` demo, which uses the same input text. | |
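To illustrate why one extra synonym edge matters, here is a small sketch that counts connected components in a toy graph before and after adding a single link. The node names and edges are hypothetical simplifications chosen for illustration. They are not the actual output of `textgraphs`, and `count_components` is not part of its API.

```python
from collections import defaultdict

# hypothetical, simplified nodes and edges (NOT the real lemma graph)
NODES: list[str] = ["werner herzog", "filmmaker", "germany", "werner", "america"]

BASE_EDGES: list[tuple[str, str]] = [
    ("werner herzog", "filmmaker"),
    ("werner herzog", "germany"),
    ("werner", "america"),
]

# the one edge contributed by the `skos:broader` bootstrap definition
BOOTSTRAP_EDGE: tuple[str, str] = ("werner herzog", "werner")


def count_components(nodes: list[str], edges: list[tuple[str, str]]) -> int:
    """Count connected components of an undirected graph via depth-first search."""
    adj: dict[str, set[str]] = defaultdict(set)

    for src, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)

    seen: set[str] = set()
    count: int = 0

    for node in nodes:
        if node in seen:
            continue

        # each unvisited node starts a new component
        count += 1
        stack: list[str] = [node]

        while stack:
            cur = stack.pop()

            if cur not in seen:
                seen.add(cur)
                stack.extend(adj[cur] - seen)

    return count


before: int = count_components(NODES, BASE_EDGES)
after: int = count_components(NODES, BASE_EDGES + [BOOTSTRAP_EDGE])

print(f"components without bootstrap edge: {before}")
print(f"components with bootstrap edge: {after}")
```

In this toy case the two separate clusters merge into one once the bootstrap edge is present, which is the same effect visible in the rendered graph above.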
| ## statistical stack profile instrumentation | |
| ```python | |
| profiler.stop() | |
| ``` | |
| <pyinstrument.session.Session at 0x1522e2110> | |
| ```python | |
| profiler.print() | |
| ``` | |
| _ ._ __/__ _ _ _ _ _/_ Recorded: 17:35:59 Samples: 2846 | |
| /_//_/// /_\ / //_// / //_'/ // Duration: 4.111 CPU time: 3.294 | |
| / _/ v4.6.1 | |
| Program: /Users/paco/src/textgraphs/venv/lib/python3.10/site-packages/ipykernel_launcher.py -f /Users/paco/Library/Jupyter/runtime/kernel-4365d4ba-2d4d-4d4b-83e2-eb5ef8abfe26.json | |
| 4.111 IPythonKernel.dispatch_shell ipykernel/kernelbase.py:378 | |
| └─ 4.075 IPythonKernel.execute_request ipykernel/kernelbase.py:721 | |
| [9 frames hidden] ipykernel, IPython | |
| 3.995 ZMQInteractiveShell.run_ast_nodes IPython/core/interactiveshell.py:3394 | |
| ├─ 3.250 <module> ../ipykernel_4433/1372904243.py:1 | |
| │ └─ 3.248 PipelineFactory.__init__ textgraphs/pipe.py:434 | |
| │ └─ 3.232 load spacy/__init__.py:27 | |
| │ [98 frames hidden] spacy, en_core_web_sm, catalogue, imp... | |
| │ 0.496 tokenizer_factory spacy/language.py:110 | |
| │ └─ 0.108 _validate_special_case spacy/tokenizer.pyx:573 | |
| │ 0.439 <lambda> spacy/language.py:2170 | |
| │ └─ 0.085 _validate_special_case spacy/tokenizer.pyx:573 | |
| ├─ 0.672 <module> ../ipykernel_4433/3257668275.py:1 | |
| │ └─ 0.669 TextGraphs.create_pipeline textgraphs/doc.py:103 | |
| │ └─ 0.669 PipelineFactory.create_pipeline textgraphs/pipe.py:508 | |
| │ └─ 0.669 Pipeline.__init__ textgraphs/pipe.py:216 | |
| │ └─ 0.669 English.__call__ spacy/language.py:1016 | |
| │ [31 frames hidden] spacy, spacy_dbpedia_spotlight, reque... | |
| └─ 0.055 <module> ../ipykernel_4433/72966960.py:1 | |
| └─ 0.046 Network.prep_notebook pyvis/network.py:552 | |
| [5 frames hidden] pyvis, jinja2 | |
| ## outro | |
| _\[ more parts are in progress, getting added to this demo \]_ | |