# -*- coding: utf-8 -*-
"""mahabharata-chatbot.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/14PZqWZU0cD12pm4gVs2BT4VpWlr1CHKx

# Graph Powered NLP Workshop Python Notebook

### Installing necessary drivers - Google Generative AI, Neo4j and Gradio.

**Instruction:** Don't forget to restart the runtime after running the below cell
"""

### pip install google-generativeai
### pip install -q neo4j
### pip install -q gradio

"""### Import necessary libraries from installed packages/drivers - PaLM, base64, json, gradio and GraphDatabase"""

import google.generativeai as genai
import base64
import json
import gradio as gr
from neo4j import GraphDatabase

"""### Add PaLM2 API Key from Google MakerSuite

**Instruction:** Replace "API_KEY" with the value of the API Key copied from MakerSuite as mentioned in Step #6 of Part #2 of [Step-by-Step Guide](https://github.com/sidagarwal04/graph-powered-nlp-workshop/blob/main/step-by-step-guide.md#part-2-create-google-makersuite-account-train--test-prompt-in-google-makersuite-and-get-google-palm-2-api-key). Don't forget to enclose the key in double quotes (" ")
"""

# Placeholder: replace "API_KEY" with your own key; never commit a real key
genai.configure(api_key = "API_KEY")
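
"""### Optional: read the API key from an environment variable

A minimal sketch, assuming you would rather not paste the key into the notebook. The variable name `GOOGLE_API_KEY` and the helper `load_api_key()` are illustrative choices, not part of the workshop guide. The value it returns could then be passed to `genai.configure(api_key=...)`.

```python
import os


def load_api_key(var_name="GOOGLE_API_KEY"):
    # Hypothetical helper: look up the key in the environment and fail
    # loudly if it is missing, rather than sending an empty key to the API.
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```
"""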

"""### Include the generated prompt from MakerSuite.

**Instruction:** Remove the initial part of installing drivers and configuring the API key as it has already been done in previous steps. Also, put the entire code as a function with output to be returned instead of printing it.
"""

def get_answer(input):

  # Set up the model
  generation_config = {
    "temperature": 0.9,
    "top_p": 1,
    "top_k": 1,
    "max_output_tokens": 2048,
  }

  safety_settings = [
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
  ]

  model = genai.GenerativeModel(model_name="gemini-1.0-pro",
                                generation_config=generation_config,
                                safety_settings=safety_settings)

  prompt_parts = [
    "You are an expert in converting English questions to Neo4j Cypher Graph code! The Graph has just 1 Node Label - Person! The Person node has the following properties: name, gender, type, dynasty, count, health, marital_status, nickname, number_of_children and title. The Neo4j Graph has the following Relationship types: DAUGHTER_OF, FATHER_OF, HUSBAND_OF, KILLED, MOTHER_OF, SON_OF and WIFE_OF. In this graph, if a FATHER_OF or MOTHER_OF relationship exists between two nodes, that means the starting node is a parent of the destination node. So if I am asking who the parents of a node named \"Karna\" are, the graph should search for FATHER_OF and/or MOTHER_OF relationships ending at Karna and generate a cypher query and hence a response accordingly. Similarly, if a HUSBAND_OF or WIFE_OF relationship exists between two nodes, that means the starting node and the destination node are married to each other. So if I am asking who Kunti is married to, the graph should search for a WIFE_OF relationship starting from Kunti or a HUSBAND_OF relationship ending at Kunti and generate a cypher query and hence a response accordingly.\n\nDo not include ```, \\n or other verbose text in the output. Also, do not include single quotes in the output. Just give the final cypher statement in the output.",
    "input: Who is the husband of Kunti?",
    "output: MATCH (n:Person)-[:HUSBAND_OF]->(m:Person) WHERE m.name = \"Kunti\" RETURN n.name",
    "input: Who are the parents of Karna?",
    "output: MATCH (karna:Person {name: \"Karna\"}), (p:Person)-[:FATHER_OF|MOTHER_OF]->(karna) RETURN p.name",
    "input: Who is Kunti married to?",
    "output: MATCH (n:Person), (m:Person {name: \"Kunti\"}) WHERE (n)-[:HUSBAND_OF]->(m) OR (m)-[:WIFE_OF]->(n) RETURN n.name",
    f"input:{input}",
  ]

  response = model.generate_content(prompt_parts)

  return response.text
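
"""### Sketch: building the few-shot prompt from a list of examples

The prompt above alternates `input:` and `output:` strings by hand. As a minimal sketch, assuming you want to add more examples later, the same list could be assembled from question/Cypher pairs. The helper `build_prompt_parts()` and its arguments are hypothetical names introduced here for illustration.

```python
def build_prompt_parts(instructions, examples, question):
    # Start with the instruction text, then add one "input:"/"output:"
    # pair per example, and finish with the question left unanswered.
    parts = [instructions]
    for example_question, example_cypher in examples:
        parts.append(f"input: {example_question}")
        parts.append(f"output: {example_cypher}")
    parts.append(f"input: {question}")
    return parts


examples = [
    ("Who are the parents of Karna?",
     'MATCH (p:Person)-[:FATHER_OF|MOTHER_OF]->(:Person {name: "Karna"}) RETURN p.name'),
]
parts = build_prompt_parts("Convert questions to Cypher.", examples, "Who killed Ghatotakach?")
```
"""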

"""### Testing the output of get_answer() function with a test input"""

get_answer("Who killed Ghatotakach?")

"""### Initialize GraphDatabase driver

**Instruction:** Replace URI, username and password before running the cell with the values from the txt file downloaded when creating Neo4j AuraDB instance in Step #4 of Part #1 of [Step By Step Guide](https://github.com/sidagarwal04/graph-powered-nlp-workshop/blob/main/step-by-step-guide.md#part-1-create-and-load-a-neo4j-instance) of this workshop.
"""

# Placeholders: replace with the URI, username and password from your AuraDB credentials file
driver = GraphDatabase.driver("NEO4J_URI",
                              auth=("NEO4J_USERNAME",
                                    "NEO4J_PASSWORD"))

"""### Import required library for processing regular expressions"""

import re

"""### Function to clean the output query from get_answer() function by removing slash n's (\n) and substituting it with a space if it exists. Also, extract the string after RETURN expression in the output cypher query and utilize as a separate key to be used for printing the output in chatbot in later steps"""

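def extract_query_and_return_key(input_query_result):
    # Collapse runs of spaces and newlines into a single space
    slash_n_pattern = r'[ \n]+'
    # Capture everything after RETURN; this is the key of each result record
    ret_pattern = r'RETURN\s+(.*)'
    replacement = ' '

    cleaned_query = re.sub(slash_n_pattern, replacement, input_query_result).strip()
    # Default to an empty key so the return value is always defined,
    # even when the query is empty or has no RETURN clause
    extracted_string = ""
    match = re.search(ret_pattern, cleaned_query)
    if match:
        # Drop a trailing semicolon so the key matches the record key from Neo4j
        extracted_string = match.group(1).rstrip('; ')
    return cleaned_query, extracted_string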
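
r"""### Sketch: what the two regular expressions do

A small standalone illustration of the patterns used above, run on a made-up model response (the sample string is an assumption, not real model output). The first pattern collapses newlines and repeated spaces; the second captures the text after `RETURN`, which later serves as the key for reading each result record.

```python
import re

sample = 'MATCH (n:Person)-[:KILLED]->(m:Person)\nWHERE m.name = "Ghatotakach"\nRETURN n.name'

# Collapse every run of spaces/newlines into one space
cleaned = re.sub(r'[ \n]+', ' ', sample)

# Capture everything that follows RETURN
return_key = re.search(r'RETURN\s+(.*)', cleaned).group(1)
```

Here `cleaned` is a single-line query and `return_key` is `n.name`.
"""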

"""### Testing the extract_query_and_return_key() function with a test input in natural language"""

extract_query_and_return_key(get_answer("Who killed Ghatotakach?"))

"""### format_names_with_ampersand() to return results as a comma-separated string of values with last value having '&' (ampersand/and) in case the output is a list of values."""

def format_names_with_ampersand(names):
    if len(names) == 0:
        return ""
    elif len(names) == 1:
        return names[0]
    else:
        formatted_names = ", ".join(names[:-1]) + " & " + names[-1]
        return formatted_names

"""### Testing format_names_with_ampersand() with sample input having list of values"""

format_names_with_ampersand(["Karna"])

"""### run_cypher_on_neo4j() to pass the output query from get_answer() to the Neo4j Database. If the output list has more than one value, format_names_with_ampersand() formats it further; if it has exactly one value, that value is returned as is. In case the output list is empty, an empty string is returned"""

def run_cypher_on_neo4j(inp_query, inp_key):
    out_list = []
    with driver.session() as session:
        result = session.run(inp_query)
        for record in result:
            out_list.append(record[inp_key])
    # The driver is shared across calls, so it is not closed here;
    # closing it after the first query would break later chatbot questions.
    if len(out_list) > 1:
        return format_names_with_ampersand(out_list)
    elif len(out_list) == 1:
        return out_list[0]
    else:
        return ""
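
r"""### Sketch: checking that a generated query is read-only before running it

Since the Cypher comes from a language model, it may be worth refusing anything that could modify the graph. The helper below is a rough sketch under that assumption: `is_read_only_cypher()` and its keyword list are illustrative, not part of the workshop code, and a keyword scan is not a complete safeguard (a read-only database user is a stronger one).

```python
import re

# Clauses that write to the graph or change its schema (illustrative list)
WRITE_KEYWORDS = ("CREATE", "MERGE", "DELETE", "DETACH", "SET", "REMOVE", "DROP", "LOAD")


def is_read_only_cypher(query):
    # Match the keywords as whole words, case-insensitively, so that a
    # property such as n.dataset does not trigger a false positive on SET.
    pattern = r'\b(' + '|'.join(WRITE_KEYWORDS) + r')\b'
    return re.search(pattern, query, flags=re.IGNORECASE) is None
```

For example, `is_read_only_cypher('MATCH (n:Person) RETURN n.name')` returns `True`, while a query containing `DETACH DELETE` returns `False`.
"""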

"""### Additional generate_and_exec_cypher() to parse and format the output of get_answer() and pass it to run_cypher_on_neo4j()"""

def generate_and_exec_cypher(input_query):
    gen_query, gen_key = extract_query_and_return_key(get_answer(input_query))
    return run_cypher_on_neo4j(gen_query, gen_key)

generate_and_exec_cypher("Who killed Ghatotakach?")

"""### chatbot() to initialize the chatbot and pass the output of generate_and_exec_cypher() to be displayed in the chatbot"""

def chatbot(input, history=None):
    # Avoid a shared mutable default: start a fresh history when none is passed
    history = history or []
    output = str(generate_and_exec_cypher(input))
    history.append((input, output))
    return history, history

"""### Initializing Gradio interface to run the chatbot.

**Instruction:** Run the chatbot at the localhost URL generated after running the cell and play around with input and output in natural language while fetching the results from the Neo4j Database, using the PaLM 2 API to convert the input text into Cypher code.
"""

gr.Interface(fn = chatbot,
             inputs = ["text",'state'],
             outputs = ["chatbot",'state']).launch(debug = True, share=True)