Spaces: kardosdrur/topic-arena-demo

kardosdrur committed on Feb 18

Commit 08620e1

1 Parent(s): 33f044a

Added corpus

Files changed (3)

Dockerfile CHANGED

@@ -32,6 +32,6 @@ RUN git clone https://github.com/x-tabdeveloping/topicwizard
 WORKDIR $HOME/app/topicwizard
 RUN git checkout topic-arena
 RUN cp $HOME/app/main.py $HOME/app/topicwizard/main.py
-RUN mkdir data
+RUN cp $HOME/app/corpus.txt $HOME/app/topicwizard/corpus.txt
 EXPOSE 7860
 CMD gunicorn --timeout 0 -b 0.0.0.0:7860 --workers=2 --threads=4 --worker-class=gthread main:server

corpus.txt ADDED

The diff for this file is too large to render. See raw diff

main.py CHANGED

@@ -29,14 +29,8 @@ def create_app(blueprint):
     return app
-print("Fetching data")
-newsgroups = fetch_20newsgroups(
-    data_home="data",
-    subset="all",
-    remove=("headers", "footers", "quotes"),
-    categories=["alt.atheism", "sci.space"],
-)
-corpus = newsgroups.data
+with open("corpus.txt") as in_file:
+    corpus = in_file.read().split("\n")
 print("Calculating embeddings")
 encoder = SentenceTransformer("sentence-transformers/static-retrieval-mrl-en-v1")
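
Review note on the new loading code in main.py: `in_file.read().split("\n")` yields an empty string for a trailing newline or any blank line in corpus.txt, and those empty strings would then be passed to the encoder as documents. It also relies on the platform default encoding. A minimal sketch of a more defensive loader follows, assuming corpus.txt holds one document per line and is UTF-8 encoded (both are assumptions, since the file's diff is not rendered). `load_corpus` is a hypothetical helper name and is not part of this commit.

```python
from pathlib import Path


def load_corpus(path="corpus.txt"):
    """Read one document per line, skipping blank lines.

    Hypothetical helper, not part of the commit. Assumes the file is
    UTF-8 encoded and stores exactly one document per line.
    """
    text = Path(path).read_text(encoding="utf-8")
    # splitlines() does not produce a trailing empty entry for a final
    # newline; the filter drops any remaining blank lines.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

With this in place, the two added lines in main.py could become `corpus = load_corpus("corpus.txt")`. Whether stripping whitespace from each document is desirable depends on how the corpus was exported.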