Commit · f35697f
1 Parent(s): abd7248
added first page & resume parser
- .venv/bin/Activate.ps1 +247 -0
- .venv/bin/activate +70 -0
- .venv/bin/activate.csh +27 -0
- .venv/bin/activate.fish +69 -0
- .venv/bin/chardetect +10 -0
- .venv/bin/docx2txt +10 -0
- .venv/bin/dumppdf.py +480 -0
- .venv/bin/f2py +10 -0
- .venv/bin/flask +10 -0
- .venv/bin/jsonschema +10 -0
- .venv/bin/markdown-it +10 -0
- .venv/bin/nltk +10 -0
- .venv/bin/normalizer +10 -0
- .venv/bin/numpy-config +10 -0
- .venv/bin/pdf2txt.py +323 -0
- .venv/bin/pip +10 -0
- .venv/bin/pip3 +10 -0
- .venv/bin/pip3.12 +10 -0
- .venv/bin/pygmentize +10 -0
- .venv/bin/pymupdf +10 -0
- .venv/bin/pyresparser +10 -0
- .venv/bin/python +1 -0
- .venv/bin/python3 +1 -0
- .venv/bin/python3.12 +1 -0
- .venv/bin/spacy +10 -0
- .venv/bin/tqdm +10 -0
- .venv/bin/typer +10 -0
- .venv/bin/weasel +10 -0
- .venv/pyvenv.cfg +5 -0
- backend/model/resume-parser/resume_to_features.py +10 -0
- backend/templates/index.html +615 -0
- data/resumes/Hussein El Saadi - CV.pdf +0 -0
- data/resumes/Mohamad Moallem(CV-2024).pdf +0 -0
- requirements.txt +4 -1
.venv/bin/Activate.ps1
ADDED
@@ -0,0 +1,247 @@
+<#
+.Synopsis
+Activate a Python virtual environment for the current PowerShell session.
+
+.Description
+Pushes the python executable for a virtual environment to the front of the
+$Env:PATH environment variable and sets the prompt to signify that you are
+in a Python virtual environment. Makes use of the command line switches as
+well as the `pyvenv.cfg` file values present in the virtual environment.
+
+.Parameter VenvDir
+Path to the directory that contains the virtual environment to activate. The
+default value for this is the parent of the directory that the Activate.ps1
+script is located within.
+
+.Parameter Prompt
+The prompt prefix to display when this virtual environment is activated. By
+default, this prompt is the name of the virtual environment folder (VenvDir)
+surrounded by parentheses and followed by a single space (ie. '(.venv) ').
+
+.Example
+Activate.ps1
+Activates the Python virtual environment that contains the Activate.ps1 script.
+
+.Example
+Activate.ps1 -Verbose
+Activates the Python virtual environment that contains the Activate.ps1 script,
+and shows extra information about the activation as it executes.
+
+.Example
+Activate.ps1 -VenvDir C:\Users\MyUser\Common\.venv
+Activates the Python virtual environment located in the specified location.
+
+.Example
+Activate.ps1 -Prompt "MyPython"
+Activates the Python virtual environment that contains the Activate.ps1 script,
+and prefixes the current prompt with the specified string (surrounded in
+parentheses) while the virtual environment is active.
+
+.Notes
+On Windows, it may be required to enable this Activate.ps1 script by setting the
+execution policy for the user. You can do this by issuing the following PowerShell
+command:
+
+    PS C:\> Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
+
+For more information on Execution Policies:
+https://go.microsoft.com/fwlink/?LinkID=135170
+
+#>
+Param(
+    [Parameter(Mandatory = $false)]
+    [String]
+    $VenvDir,
+    [Parameter(Mandatory = $false)]
+    [String]
+    $Prompt
+)
+
+<# Function declarations --------------------------------------------------- #>
+
+<#
+.Synopsis
+Remove all shell session elements added by the Activate script, including the
+addition of the virtual environment's Python executable from the beginning of
+the PATH variable.
+
+.Parameter NonDestructive
+If present, do not remove this function from the global namespace for the
+session.
+
+#>
+function global:deactivate ([switch]$NonDestructive) {
+    # Revert to original values
+
+    # The prior prompt:
+    if (Test-Path -Path Function:_OLD_VIRTUAL_PROMPT) {
+        Copy-Item -Path Function:_OLD_VIRTUAL_PROMPT -Destination Function:prompt
+        Remove-Item -Path Function:_OLD_VIRTUAL_PROMPT
+    }
+
+    # The prior PYTHONHOME:
+    if (Test-Path -Path Env:_OLD_VIRTUAL_PYTHONHOME) {
+        Copy-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME -Destination Env:PYTHONHOME
+        Remove-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME
+    }
+
+    # The prior PATH:
+    if (Test-Path -Path Env:_OLD_VIRTUAL_PATH) {
+        Copy-Item -Path Env:_OLD_VIRTUAL_PATH -Destination Env:PATH
+        Remove-Item -Path Env:_OLD_VIRTUAL_PATH
+    }
+
+    # Just remove the VIRTUAL_ENV altogether:
+    if (Test-Path -Path Env:VIRTUAL_ENV) {
+        Remove-Item -Path env:VIRTUAL_ENV
+    }
+
+    # Just remove VIRTUAL_ENV_PROMPT altogether.
+    if (Test-Path -Path Env:VIRTUAL_ENV_PROMPT) {
+        Remove-Item -Path env:VIRTUAL_ENV_PROMPT
+    }
+
+    # Just remove the _PYTHON_VENV_PROMPT_PREFIX altogether:
+    if (Get-Variable -Name "_PYTHON_VENV_PROMPT_PREFIX" -ErrorAction SilentlyContinue) {
+        Remove-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Scope Global -Force
+    }
+
+    # Leave deactivate function in the global namespace if requested:
+    if (-not $NonDestructive) {
+        Remove-Item -Path function:deactivate
+    }
+}
+
+<#
+.Description
+Get-PyVenvConfig parses the values from the pyvenv.cfg file located in the
+given folder, and returns them in a map.
+
+For each line in the pyvenv.cfg file, if that line can be parsed into exactly
+two strings separated by `=` (with any amount of whitespace surrounding the =)
+then it is considered a `key = value` line. The left hand string is the key,
+the right hand is the value.
+
+If the value starts with a `'` or a `"` then the first and last character is
+stripped from the value before being captured.
+
+.Parameter ConfigDir
+Path to the directory that contains the `pyvenv.cfg` file.
+#>
+function Get-PyVenvConfig(
+    [String]
+    $ConfigDir
+) {
+    Write-Verbose "Given ConfigDir=$ConfigDir, obtain values in pyvenv.cfg"
+
+    # Ensure the file exists, and issue a warning if it doesn't (but still allow the function to continue).
+    $pyvenvConfigPath = Join-Path -Resolve -Path $ConfigDir -ChildPath 'pyvenv.cfg' -ErrorAction Continue
+
+    # An empty map will be returned if no config file is found.
+    $pyvenvConfig = @{ }
+
+    if ($pyvenvConfigPath) {
+
+        Write-Verbose "File exists, parse `key = value` lines"
+        $pyvenvConfigContent = Get-Content -Path $pyvenvConfigPath
+
+        $pyvenvConfigContent | ForEach-Object {
+            $keyval = $PSItem -split "\s*=\s*", 2
+            if ($keyval[0] -and $keyval[1]) {
+                $val = $keyval[1]
+
+                # Remove extraneous quotations around a string value.
+                if ("'""".Contains($val.Substring(0, 1))) {
+                    $val = $val.Substring(1, $val.Length - 2)
+                }
+
+                $pyvenvConfig[$keyval[0]] = $val
+                Write-Verbose "Adding Key: '$($keyval[0])'='$val'"
+            }
+        }
+    }
+    return $pyvenvConfig
+}
+
+
+<# Begin Activate script --------------------------------------------------- #>
+
+# Determine the containing directory of this script
+$VenvExecPath = Split-Path -Parent $MyInvocation.MyCommand.Definition
+$VenvExecDir = Get-Item -Path $VenvExecPath
+
+Write-Verbose "Activation script is located in path: '$VenvExecPath'"
+Write-Verbose "VenvExecDir Fullname: '$($VenvExecDir.FullName)"
+Write-Verbose "VenvExecDir Name: '$($VenvExecDir.Name)"
+
+# Set values required in priority: CmdLine, ConfigFile, Default
+# First, get the location of the virtual environment, it might not be
+# VenvExecDir if specified on the command line.
+if ($VenvDir) {
+    Write-Verbose "VenvDir given as parameter, using '$VenvDir' to determine values"
+}
+else {
+    Write-Verbose "VenvDir not given as a parameter, using parent directory name as VenvDir."
+    $VenvDir = $VenvExecDir.Parent.FullName.TrimEnd("\\/")
+    Write-Verbose "VenvDir=$VenvDir"
+}
+
+# Next, read the `pyvenv.cfg` file to determine any required value such
+# as `prompt`.
+$pyvenvCfg = Get-PyVenvConfig -ConfigDir $VenvDir
+
+# Next, set the prompt from the command line, or the config file, or
+# just use the name of the virtual environment folder.
+if ($Prompt) {
+    Write-Verbose "Prompt specified as argument, using '$Prompt'"
+}
+else {
+    Write-Verbose "Prompt not specified as argument to script, checking pyvenv.cfg value"
+    if ($pyvenvCfg -and $pyvenvCfg['prompt']) {
+        Write-Verbose " Setting based on value in pyvenv.cfg='$($pyvenvCfg['prompt'])'"
+        $Prompt = $pyvenvCfg['prompt'];
+    }
+    else {
+        Write-Verbose " Setting prompt based on parent's directory's name. (Is the directory name passed to venv module when creating the virtual environment)"
+        Write-Verbose " Got leaf-name of $VenvDir='$(Split-Path -Path $venvDir -Leaf)'"
+        $Prompt = Split-Path -Path $venvDir -Leaf
+    }
+}
+
+Write-Verbose "Prompt = '$Prompt'"
+Write-Verbose "VenvDir='$VenvDir'"
+
+# Deactivate any currently active virtual environment, but leave the
+# deactivate function in place.
+deactivate -nondestructive
+
+# Now set the environment variable VIRTUAL_ENV, used by many tools to determine
+# that there is an activated venv.
+$env:VIRTUAL_ENV = $VenvDir
+
+if (-not $Env:VIRTUAL_ENV_DISABLE_PROMPT) {
+
+    Write-Verbose "Setting prompt to '$Prompt'"
+
+    # Set the prompt to include the env name
+    # Make sure _OLD_VIRTUAL_PROMPT is global
+    function global:_OLD_VIRTUAL_PROMPT { "" }
+    Copy-Item -Path function:prompt -Destination function:_OLD_VIRTUAL_PROMPT
+    New-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Description "Python virtual environment prompt prefix" -Scope Global -Option ReadOnly -Visibility Public -Value $Prompt
+
+    function global:prompt {
+        Write-Host -NoNewline -ForegroundColor Green "($_PYTHON_VENV_PROMPT_PREFIX) "
+        _OLD_VIRTUAL_PROMPT
+    }
+    $env:VIRTUAL_ENV_PROMPT = $Prompt
+}
+
+# Clear PYTHONHOME
+if (Test-Path -Path Env:PYTHONHOME) {
+    Copy-Item -Path Env:PYTHONHOME -Destination Env:_OLD_VIRTUAL_PYTHONHOME
+    Remove-Item -Path Env:PYTHONHOME
+}
+
+# Add the venv to the PATH
+Copy-Item -Path Env:PATH -Destination Env:_OLD_VIRTUAL_PATH
+$Env:PATH = "$VenvExecDir$([System.IO.Path]::PathSeparator)$Env:PATH"
.venv/bin/activate
ADDED
@@ -0,0 +1,70 @@
+# This file must be used with "source bin/activate" *from bash*
+# You cannot run it directly
+
+deactivate () {
+    # reset old environment variables
+    if [ -n "${_OLD_VIRTUAL_PATH:-}" ] ; then
+        PATH="${_OLD_VIRTUAL_PATH:-}"
+        export PATH
+        unset _OLD_VIRTUAL_PATH
+    fi
+    if [ -n "${_OLD_VIRTUAL_PYTHONHOME:-}" ] ; then
+        PYTHONHOME="${_OLD_VIRTUAL_PYTHONHOME:-}"
+        export PYTHONHOME
+        unset _OLD_VIRTUAL_PYTHONHOME
+    fi
+
+    # Call hash to forget past commands. Without forgetting
+    # past commands the $PATH changes we made may not be respected
+    hash -r 2> /dev/null
+
+    if [ -n "${_OLD_VIRTUAL_PS1:-}" ] ; then
+        PS1="${_OLD_VIRTUAL_PS1:-}"
+        export PS1
+        unset _OLD_VIRTUAL_PS1
+    fi
+
+    unset VIRTUAL_ENV
+    unset VIRTUAL_ENV_PROMPT
+    if [ ! "${1:-}" = "nondestructive" ] ; then
+    # Self destruct!
+        unset -f deactivate
+    fi
+}
+
+# unset irrelevant variables
+deactivate nondestructive
+
+# on Windows, a path can contain colons and backslashes and has to be converted:
+if [ "${OSTYPE:-}" = "cygwin" ] || [ "${OSTYPE:-}" = "msys" ] ; then
+    # transform D:\path\to\venv to /d/path/to/venv on MSYS
+    # and to /cygdrive/d/path/to/venv on Cygwin
+    export VIRTUAL_ENV=$(cygpath "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv")
+else
+    # use the path as-is
+    export VIRTUAL_ENV="/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv"
+fi
+
+_OLD_VIRTUAL_PATH="$PATH"
+PATH="$VIRTUAL_ENV/bin:$PATH"
+export PATH
+
+# unset PYTHONHOME if set
+# this will fail if PYTHONHOME is set to the empty string (which is bad anyway)
+# could use `if (set -u; : $PYTHONHOME) ;` in bash
+if [ -n "${PYTHONHOME:-}" ] ; then
+    _OLD_VIRTUAL_PYTHONHOME="${PYTHONHOME:-}"
+    unset PYTHONHOME
+fi
+
+if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT:-}" ] ; then
+    _OLD_VIRTUAL_PS1="${PS1:-}"
+    PS1="(.venv) ${PS1:-}"
+    export PS1
+    VIRTUAL_ENV_PROMPT="(.venv) "
+    export VIRTUAL_ENV_PROMPT
+fi
+
+# Call hash to forget past commands. Without forgetting
+# past commands the $PATH changes we made may not be respected
+hash -r 2> /dev/null
.venv/bin/activate.csh
ADDED
@@ -0,0 +1,27 @@
+# This file must be used with "source bin/activate.csh" *from csh*.
+# You cannot run it directly.
+
+# Created by Davide Di Blasi <[email protected]>.
+# Ported to Python 3.3 venv by Andrew Svetlov <[email protected]>
+
+alias deactivate 'test $?_OLD_VIRTUAL_PATH != 0 && setenv PATH "$_OLD_VIRTUAL_PATH" && unset _OLD_VIRTUAL_PATH; rehash; test $?_OLD_VIRTUAL_PROMPT != 0 && set prompt="$_OLD_VIRTUAL_PROMPT" && unset _OLD_VIRTUAL_PROMPT; unsetenv VIRTUAL_ENV; unsetenv VIRTUAL_ENV_PROMPT; test "\!:*" != "nondestructive" && unalias deactivate'
+
+# Unset irrelevant variables.
+deactivate nondestructive
+
+setenv VIRTUAL_ENV "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv"
+
+set _OLD_VIRTUAL_PATH="$PATH"
+setenv PATH "$VIRTUAL_ENV/bin:$PATH"
+
+
+set _OLD_VIRTUAL_PROMPT="$prompt"
+
+if (! "$?VIRTUAL_ENV_DISABLE_PROMPT") then
+    set prompt = "(.venv) $prompt"
+    setenv VIRTUAL_ENV_PROMPT "(.venv) "
+endif
+
+alias pydoc python -m pydoc
+
+rehash
.venv/bin/activate.fish
ADDED
@@ -0,0 +1,69 @@
+# This file must be used with "source <venv>/bin/activate.fish" *from fish*
+# (https://fishshell.com/). You cannot run it directly.
+
+function deactivate -d "Exit virtual environment and return to normal shell environment"
+    # reset old environment variables
+    if test -n "$_OLD_VIRTUAL_PATH"
+        set -gx PATH $_OLD_VIRTUAL_PATH
+        set -e _OLD_VIRTUAL_PATH
+    end
+    if test -n "$_OLD_VIRTUAL_PYTHONHOME"
+        set -gx PYTHONHOME $_OLD_VIRTUAL_PYTHONHOME
+        set -e _OLD_VIRTUAL_PYTHONHOME
+    end
+
+    if test -n "$_OLD_FISH_PROMPT_OVERRIDE"
+        set -e _OLD_FISH_PROMPT_OVERRIDE
+        # prevents error when using nested fish instances (Issue #93858)
+        if functions -q _old_fish_prompt
+            functions -e fish_prompt
+            functions -c _old_fish_prompt fish_prompt
+            functions -e _old_fish_prompt
+        end
+    end
+
+    set -e VIRTUAL_ENV
+    set -e VIRTUAL_ENV_PROMPT
+    if test "$argv[1]" != "nondestructive"
+        # Self-destruct!
+        functions -e deactivate
+    end
+end
+
+# Unset irrelevant variables.
+deactivate nondestructive
+
+set -gx VIRTUAL_ENV "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv"
+
+set -gx _OLD_VIRTUAL_PATH $PATH
+set -gx PATH "$VIRTUAL_ENV/bin" $PATH
+
+# Unset PYTHONHOME if set.
+if set -q PYTHONHOME
+    set -gx _OLD_VIRTUAL_PYTHONHOME $PYTHONHOME
+    set -e PYTHONHOME
+end
+
+if test -z "$VIRTUAL_ENV_DISABLE_PROMPT"
+    # fish uses a function instead of an env var to generate the prompt.
+
+    # Save the current fish_prompt function as the function _old_fish_prompt.
+    functions -c fish_prompt _old_fish_prompt
+
+    # With the original prompt function renamed, we can override with our own.
+    function fish_prompt
+        # Save the return status of the last command.
+        set -l old_status $status
+
+        # Output the venv prompt; color taken from the blue of the Python logo.
+        printf "%s%s%s" (set_color 4B8BBE) "(.venv) " (set_color normal)
+
+        # Restore the return status of the previous command.
+        echo "exit $old_status" | .
+        # Output the original/"old" prompt.
+        _old_fish_prompt
+    end
+
+    set -gx _OLD_FISH_PROMPT_OVERRIDE "$VIRTUAL_ENV"
+    set -gx VIRTUAL_ENV_PROMPT "(.venv) "
+end
.venv/bin/chardetect
ADDED
@@ -0,0 +1,10 @@
+#!/bin/sh
+'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
+' '''
+# -*- coding: utf-8 -*-
+import re
+import sys
+from chardet.cli.chardetect import main
+if __name__ == '__main__':
+    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
+    sys.exit(main())
.venv/bin/docx2txt
ADDED
@@ -0,0 +1,10 @@
+#!/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12
+
+import docx2txt
+
+if __name__ == '__main__':
+    import sys
+    args = docx2txt.process_args()
+    text = docx2txt.process(args.docx, args.img_dir)
+    output = getattr(sys.stdout, 'buffer', sys.stdout)
+    output.write(text.encode('utf-8'))
.venv/bin/dumppdf.py
ADDED
@@ -0,0 +1,480 @@
+#!/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12
+"""Extract pdf structure in XML format"""
+
+import logging
+import os.path
+import re
+import sys
+from argparse import ArgumentParser
+from typing import Any, Container, Dict, Iterable, List, Optional, TextIO, Union, cast
+
+import pdfminer
+from pdfminer.pdfdocument import PDFDocument, PDFNoOutlines, PDFXRefFallback
+from pdfminer.pdfexceptions import (
+    PDFIOError,
+    PDFObjectNotFound,
+    PDFTypeError,
+    PDFValueError,
+)
+from pdfminer.pdfpage import PDFPage
+from pdfminer.pdfparser import PDFParser
+from pdfminer.pdftypes import PDFObjRef, PDFStream, resolve1, stream_value
+from pdfminer.psparser import LIT, PSKeyword, PSLiteral
+from pdfminer.utils import isnumber
+
+logging.basicConfig()
+logger = logging.getLogger(__name__)
+
+ESC_PAT = re.compile(r'[\000-\037&<>()"\042\047\134\177-\377]')
+
+
+def escape(s: Union[str, bytes]) -> str:
+    if isinstance(s, bytes):
+        us = str(s, "latin-1")
+    else:
+        us = s
+    return ESC_PAT.sub(lambda m: "&#%d;" % ord(m.group(0)), us)
+
+
+def dumpxml(out: TextIO, obj: object, codec: Optional[str] = None) -> None:
+    if obj is None:
+        out.write("<null />")
+        return
+
+    if isinstance(obj, dict):
+        out.write('<dict size="%d">\n' % len(obj))
+        for k, v in obj.items():
+            out.write("<key>%s</key>\n" % k)
+            out.write("<value>")
+            dumpxml(out, v)
+            out.write("</value>\n")
+        out.write("</dict>")
+        return
+
+    if isinstance(obj, list):
+        out.write('<list size="%d">\n' % len(obj))
+        for v in obj:
+            dumpxml(out, v)
+            out.write("\n")
+        out.write("</list>")
+        return
+
+    if isinstance(obj, (str, bytes)):
+        out.write('<string size="%d">%s</string>' % (len(obj), escape(obj)))
+        return
+
+    if isinstance(obj, PDFStream):
+        if codec == "raw":
+            # Bug: writing bytes to text I/O. This will raise TypeError.
+            out.write(obj.get_rawdata())  # type: ignore [arg-type]
+        elif codec == "binary":
+            # Bug: writing bytes to text I/O. This will raise TypeError.
+            out.write(obj.get_data())  # type: ignore [arg-type]
+        else:
+            out.write("<stream>\n<props>\n")
+            dumpxml(out, obj.attrs)
+            out.write("\n</props>\n")
+            if codec == "text":
+                data = obj.get_data()
+                out.write('<data size="%d">%s</data>\n' % (len(data), escape(data)))
+            out.write("</stream>")
+        return
+
+    if isinstance(obj, PDFObjRef):
+        out.write('<ref id="%d" />' % obj.objid)
+        return
+
+    if isinstance(obj, PSKeyword):
+        # Likely bug: obj.name is bytes, not str
+        out.write("<keyword>%s</keyword>" % obj.name)  # type: ignore [str-bytes-safe]
+        return
+
+    if isinstance(obj, PSLiteral):
+        # Likely bug: obj.name may be bytes, not str
+        out.write("<literal>%s</literal>" % obj.name)  # type: ignore [str-bytes-safe]
+        return
+
+    if isnumber(obj):
+        out.write("<number>%s</number>" % obj)
+        return
+
+    raise PDFTypeError(obj)
+
+
+def dumptrailers(
+    out: TextIO,
+    doc: PDFDocument,
+    show_fallback_xref: bool = False,
+) -> None:
+    for xref in doc.xrefs:
+        if not isinstance(xref, PDFXRefFallback) or show_fallback_xref:
+            out.write("<trailer>\n")
+            dumpxml(out, xref.get_trailer())
+            out.write("\n</trailer>\n\n")
+    no_xrefs = all(isinstance(xref, PDFXRefFallback) for xref in doc.xrefs)
+    if no_xrefs and not show_fallback_xref:
+        msg = (
+            "This PDF does not have an xref. Use --show-fallback-xref if "
+            "you want to display the content of a fallback xref that "
+            "contains all objects."
+        )
+        logger.warning(msg)
+
+
+def dumpallobjs(
+    out: TextIO,
+    doc: PDFDocument,
+    codec: Optional[str] = None,
+    show_fallback_xref: bool = False,
+) -> None:
+    visited = set()
+    out.write("<pdf>")
+    for xref in doc.xrefs:
+        for objid in xref.get_objids():
+            if objid in visited:
+                continue
+            visited.add(objid)
+            try:
+                obj = doc.getobj(objid)
+                if obj is None:
+                    continue
+                out.write('<object id="%d">\n' % objid)
+                dumpxml(out, obj, codec=codec)
+                out.write("\n</object>\n\n")
+            except PDFObjectNotFound as e:
+                print("not found: %r" % e)
+    dumptrailers(out, doc, show_fallback_xref)
+    out.write("</pdf>")
+
+
+def dumpoutline(
+    outfp: TextIO,
+    fname: str,
+    objids: Any,
+    pagenos: Container[int],
+    password: str = "",
+    dumpall: bool = False,
+    codec: Optional[str] = None,
+    extractdir: Optional[str] = None,
+) -> None:
+    fp = open(fname, "rb")
+    parser = PDFParser(fp)
+    doc = PDFDocument(parser, password)
+    pages = {
+        page.pageid: pageno
+        for (pageno, page) in enumerate(PDFPage.create_pages(doc), 1)
+    }
+
+    def resolve_dest(dest: object) -> Any:
+        if isinstance(dest, (str, bytes)):
+            dest = resolve1(doc.get_dest(dest))
+        elif isinstance(dest, PSLiteral):
+            dest = resolve1(doc.get_dest(dest.name))
+        if isinstance(dest, dict):
+            dest = dest["D"]
+        if isinstance(dest, PDFObjRef):
+            dest = dest.resolve()
+        return dest
+
+    try:
+        outlines = doc.get_outlines()
+        outfp.write("<outlines>\n")
+        for level, title, dest, a, se in outlines:
+            pageno = None
+            if dest:
+                dest = resolve_dest(dest)
+                pageno = pages[dest[0].objid]
+            elif a:
+                action = a
+                if isinstance(action, dict):
+                    subtype = action.get("S")
+                    if subtype and repr(subtype) == "/'GoTo'" and action.get("D"):
+                        dest = resolve_dest(action["D"])
+                        pageno = pages[dest[0].objid]
+            s = escape(title)
+            outfp.write(f'<outline level="{level!r}" title="{s}">\n')
+            if dest is not None:
+                outfp.write("<dest>")
+                dumpxml(outfp, dest)
+                outfp.write("</dest>\n")
+            if pageno is not None:
+                outfp.write("<pageno>%r</pageno>\n" % pageno)
+            outfp.write("</outline>\n")
+        outfp.write("</outlines>\n")
+    except PDFNoOutlines:
+        pass
+    parser.close()
+    fp.close()
+
+
+LITERAL_FILESPEC = LIT("Filespec")
+LITERAL_EMBEDDEDFILE = LIT("EmbeddedFile")
+
+
+def extractembedded(fname: str, password: str, extractdir: str) -> None:
+    def extract1(objid: int, obj: Dict[str, Any]) -> None:
+        filename = os.path.basename(obj.get("UF") or cast(bytes, obj.get("F")).decode())
+        fileref = obj["EF"].get("UF") or obj["EF"].get("F")
+        fileobj = doc.getobj(fileref.objid)
+        if not isinstance(fileobj, PDFStream):
+            error_msg = (
+                "unable to process PDF: reference for %r is not a "
+                "PDFStream" % filename
+            )
+            raise PDFValueError(error_msg)
+        if fileobj.get("Type") is not LITERAL_EMBEDDEDFILE:
+            raise PDFValueError(
+                "unable to process PDF: reference for %r "
+                "is not an EmbeddedFile" % (filename),
+            )
+        path = os.path.join(extractdir, "%.6d-%s" % (objid, filename))
+        if os.path.exists(path):
+            raise PDFIOError("file exists: %r" % path)
+        print("extracting: %r" % path)
+        os.makedirs(os.path.dirname(path), exist_ok=True)
+        out = open(path, "wb")
+        out.write(fileobj.get_data())
+        out.close()
+
+    with open(fname, "rb") as fp:
+        parser = PDFParser(fp)
+        doc = PDFDocument(parser, password)
+        extracted_objids = set()
+        for xref in doc.xrefs:
+            for objid in xref.get_objids():
+                obj = doc.getobj(objid)
+                if (
+                    objid not in extracted_objids
+                    and isinstance(obj, dict)
+                    and obj.get("Type") is LITERAL_FILESPEC
+                ):
+                    extracted_objids.add(objid)
+                    extract1(objid, obj)
+
+
+def dumppdf(
+    outfp: TextIO,
+    fname: str,
+    objids: Iterable[int],
+    pagenos: Container[int],
+    password: str = "",
+    dumpall: bool = False,
+    codec: Optional[str] = None,
+    extractdir: Optional[str] = None,
+    show_fallback_xref: bool = False,
+) -> None:
+    fp = open(fname, "rb")
+    parser = PDFParser(fp)
+    doc = PDFDocument(parser, password)
+    if objids:
+        for objid in objids:
+            obj = doc.getobj(objid)
+            dumpxml(outfp, obj, codec=codec)
+    if pagenos:
+        for pageno, page in enumerate(PDFPage.create_pages(doc)):
+            if pageno in pagenos:
+                if codec:
+                    for obj in page.contents:
+                        obj = stream_value(obj)
+                        dumpxml(outfp, obj, codec=codec)
+                else:
+                    dumpxml(outfp, page.attrs)
+    if dumpall:
+        dumpallobjs(outfp, doc, codec, show_fallback_xref)
+    if (not objids) and (not pagenos) and (not dumpall):
+        dumptrailers(outfp, doc, show_fallback_xref)
+    fp.close()
+    if codec not in ("raw", "binary"):
+        outfp.write("\n")
+
+
+def create_parser() -> ArgumentParser:
+    parser = ArgumentParser(description=__doc__, add_help=True)
+    parser.add_argument(
+        "files",
+        type=str,
+        default=None,
+        nargs="+",
+        help="One or more paths to PDF files.",
+    )
+
+    parser.add_argument(
+        "--version",
+        "-v",
+        action="version",
+        version=f"pdfminer.six v{pdfminer.__version__}",
+    )
+    parser.add_argument(
+        "--debug",
+        "-d",
+        default=False,
+        action="store_true",
+        help="Use debug logging level.",
+    )
+    procedure_parser = parser.add_mutually_exclusive_group()
+    procedure_parser.add_argument(
+        "--extract-toc",
+        "-T",
+        default=False,
+        action="store_true",
+        help="Extract structure of outline",
+    )
+    procedure_parser.add_argument(
+        "--extract-embedded",
+        "-E",
+        type=str,
|
326 |
+
help="Extract embedded files",
|
327 |
+
)
|
328 |
+
|
329 |
+
parse_params = parser.add_argument_group(
|
330 |
+
"Parser",
|
331 |
+
description="Used during PDF parsing",
|
332 |
+
)
|
333 |
+
parse_params.add_argument(
|
334 |
+
"--page-numbers",
|
335 |
+
type=int,
|
336 |
+
default=None,
|
337 |
+
nargs="+",
|
338 |
+
help="A space-separated list of page numbers to parse.",
|
339 |
+
)
|
340 |
+
parse_params.add_argument(
|
341 |
+
"--pagenos",
|
342 |
+
"-p",
|
343 |
+
type=str,
|
344 |
+
help="A comma-separated list of page numbers to parse. Included for "
|
345 |
+
"legacy applications, use --page-numbers for more idiomatic "
|
346 |
+
"argument entry.",
|
347 |
+
)
|
348 |
+
parse_params.add_argument(
|
349 |
+
"--objects",
|
350 |
+
"-i",
|
351 |
+
type=str,
|
352 |
+
help="Comma separated list of object numbers to extract",
|
353 |
+
)
|
354 |
+
parse_params.add_argument(
|
355 |
+
"--all",
|
356 |
+
"-a",
|
357 |
+
default=False,
|
358 |
+
action="store_true",
|
359 |
+
help="If the structure of all objects should be extracted",
|
360 |
+
)
|
361 |
+
parse_params.add_argument(
|
362 |
+
"--show-fallback-xref",
|
363 |
+
action="store_true",
|
364 |
+
help="Additionally show the fallback xref. Use this if the PDF "
|
365 |
+
"has zero or only invalid xref's. This setting is ignored if "
|
366 |
+
"--extract-toc or --extract-embedded is used.",
|
367 |
+
)
|
368 |
+
parse_params.add_argument(
|
369 |
+
"--password",
|
370 |
+
"-P",
|
371 |
+
type=str,
|
372 |
+
default="",
|
373 |
+
help="The password to use for decrypting PDF file.",
|
374 |
+
)
|
375 |
+
|
376 |
+
output_params = parser.add_argument_group(
|
377 |
+
"Output",
|
378 |
+
description="Used during output generation.",
|
379 |
+
)
|
380 |
+
output_params.add_argument(
|
381 |
+
"--outfile",
|
382 |
+
"-o",
|
383 |
+
type=str,
|
384 |
+
default="-",
|
385 |
+
help='Path to file where output is written. Or "-" (default) to '
|
386 |
+
"write to stdout.",
|
387 |
+
)
|
388 |
+
codec_parser = output_params.add_mutually_exclusive_group()
|
389 |
+
codec_parser.add_argument(
|
390 |
+
"--raw-stream",
|
391 |
+
"-r",
|
392 |
+
default=False,
|
393 |
+
action="store_true",
|
394 |
+
help="Write stream objects without encoding",
|
395 |
+
)
|
396 |
+
codec_parser.add_argument(
|
397 |
+
"--binary-stream",
|
398 |
+
"-b",
|
399 |
+
default=False,
|
400 |
+
action="store_true",
|
401 |
+
help="Write stream objects with binary encoding",
|
402 |
+
)
|
403 |
+
codec_parser.add_argument(
|
404 |
+
"--text-stream",
|
405 |
+
"-t",
|
406 |
+
default=False,
|
407 |
+
action="store_true",
|
408 |
+
help="Write stream objects as plain text",
|
409 |
+
)
|
410 |
+
|
411 |
+
return parser
|
412 |
+
|
413 |
+
|
414 |
+
def main(argv: Optional[List[str]] = None) -> None:
|
415 |
+
parser = create_parser()
|
416 |
+
args = parser.parse_args(args=argv)
|
417 |
+
|
418 |
+
if args.debug:
|
419 |
+
logging.getLogger().setLevel(logging.DEBUG)
|
420 |
+
|
421 |
+
if args.outfile == "-":
|
422 |
+
outfp = sys.stdout
|
423 |
+
else:
|
424 |
+
outfp = open(args.outfile, "w")
|
425 |
+
|
426 |
+
if args.objects:
|
427 |
+
objids = [int(x) for x in args.objects.split(",")]
|
428 |
+
else:
|
429 |
+
objids = []
|
430 |
+
|
431 |
+
if args.page_numbers:
|
432 |
+
pagenos = {x - 1 for x in args.page_numbers}
|
433 |
+
elif args.pagenos:
|
434 |
+
pagenos = {int(x) - 1 for x in args.pagenos.split(",")}
|
435 |
+
else:
|
436 |
+
pagenos = set()
|
437 |
+
|
438 |
+
password = args.password
|
439 |
+
|
440 |
+
if args.raw_stream:
|
441 |
+
codec: Optional[str] = "raw"
|
442 |
+
elif args.binary_stream:
|
443 |
+
codec = "binary"
|
444 |
+
elif args.text_stream:
|
445 |
+
codec = "text"
|
446 |
+
else:
|
447 |
+
codec = None
|
448 |
+
|
449 |
+
for fname in args.files:
|
450 |
+
if args.extract_toc:
|
451 |
+
dumpoutline(
|
452 |
+
outfp,
|
453 |
+
fname,
|
454 |
+
objids,
|
455 |
+
pagenos,
|
456 |
+
password=password,
|
457 |
+
dumpall=args.all,
|
458 |
+
codec=codec,
|
459 |
+
extractdir=None,
|
460 |
+
)
|
461 |
+
elif args.extract_embedded:
|
462 |
+
extractembedded(fname, password=password, extractdir=args.extract_embedded)
|
463 |
+
else:
|
464 |
+
dumppdf(
|
465 |
+
outfp,
|
466 |
+
fname,
|
467 |
+
objids,
|
468 |
+
pagenos,
|
469 |
+
password=password,
|
470 |
+
dumpall=args.all,
|
471 |
+
codec=codec,
|
472 |
+
extractdir=None,
|
473 |
+
show_fallback_xref=args.show_fallback_xref,
|
474 |
+
)
|
475 |
+
|
476 |
+
outfp.close()
|
477 |
+
|
478 |
+
|
479 |
+
if __name__ == "__main__":
|
480 |
+
main()
|
.venv/bin/f2py
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from numpy.f2py.f2py2e import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/flask
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from flask.cli import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/jsonschema
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from jsonschema.cli import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/markdown-it
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from markdown_it.cli.parse import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/nltk
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from nltk.cli import cli
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(cli())
|
.venv/bin/normalizer
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from charset_normalizer import cli
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(cli.cli_detect())
|
.venv/bin/numpy-config
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from numpy._configtool import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/pdf2txt.py
ADDED
@@ -0,0 +1,323 @@
|
1 |
+
#!/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12
|
2 |
+
"""A command line tool for extracting text and images from PDF and
|
3 |
+
output it to plain text, html, xml or tags.
|
4 |
+
"""
|
5 |
+
|
6 |
+
import argparse
|
7 |
+
import logging
|
8 |
+
import sys
|
9 |
+
from typing import Any, Container, Iterable, List, Optional
|
10 |
+
|
11 |
+
import pdfminer.high_level
|
12 |
+
from pdfminer.layout import LAParams
|
13 |
+
from pdfminer.pdfexceptions import PDFValueError
|
14 |
+
from pdfminer.utils import AnyIO
|
15 |
+
|
16 |
+
logging.basicConfig()
|
17 |
+
|
18 |
+
OUTPUT_TYPES = ((".htm", "html"), (".html", "html"), (".xml", "xml"), (".tag", "tag"))
|
19 |
+
|
20 |
+
|
21 |
+
def float_or_disabled(x: str) -> Optional[float]:
|
22 |
+
if x.lower().strip() == "disabled":
|
23 |
+
return None
|
24 |
+
try:
|
25 |
+
return float(x)
|
26 |
+
except ValueError:
|
27 |
+
raise argparse.ArgumentTypeError(f"invalid float value: {x}")
|
28 |
+
|
29 |
+
|
30 |
+
def extract_text(
|
31 |
+
files: Iterable[str] = [],
|
32 |
+
outfile: str = "-",
|
33 |
+
laparams: Optional[LAParams] = None,
|
34 |
+
output_type: str = "text",
|
35 |
+
codec: str = "utf-8",
|
36 |
+
strip_control: bool = False,
|
37 |
+
maxpages: int = 0,
|
38 |
+
page_numbers: Optional[Container[int]] = None,
|
39 |
+
password: str = "",
|
40 |
+
scale: float = 1.0,
|
41 |
+
rotation: int = 0,
|
42 |
+
layoutmode: str = "normal",
|
43 |
+
output_dir: Optional[str] = None,
|
44 |
+
debug: bool = False,
|
45 |
+
disable_caching: bool = False,
|
46 |
+
**kwargs: Any,
|
47 |
+
) -> AnyIO:
|
48 |
+
if not files:
|
49 |
+
raise PDFValueError("Must provide files to work upon!")
|
50 |
+
|
51 |
+
if output_type == "text" and outfile != "-":
|
52 |
+
for override, alttype in OUTPUT_TYPES:
|
53 |
+
if outfile.endswith(override):
|
54 |
+
output_type = alttype
|
55 |
+
|
56 |
+
if outfile == "-":
|
57 |
+
outfp: AnyIO = sys.stdout
|
58 |
+
if sys.stdout.encoding is not None:
|
59 |
+
codec = "utf-8"
|
60 |
+
else:
|
61 |
+
outfp = open(outfile, "wb")
|
62 |
+
|
63 |
+
for fname in files:
|
64 |
+
with open(fname, "rb") as fp:
|
65 |
+
pdfminer.high_level.extract_text_to_fp(fp, **locals())
|
66 |
+
return outfp
|
67 |
+
|
68 |
+
|
69 |
+
def create_parser() -> argparse.ArgumentParser:
|
70 |
+
parser = argparse.ArgumentParser(description=__doc__, add_help=True)
|
71 |
+
parser.add_argument(
|
72 |
+
"files",
|
73 |
+
type=str,
|
74 |
+
default=None,
|
75 |
+
nargs="+",
|
76 |
+
help="One or more paths to PDF files.",
|
77 |
+
)
|
78 |
+
|
79 |
+
parser.add_argument(
|
80 |
+
"--version",
|
81 |
+
"-v",
|
82 |
+
action="version",
|
83 |
+
version=f"pdfminer.six v{pdfminer.__version__}",
|
84 |
+
)
|
85 |
+
parser.add_argument(
|
86 |
+
"--debug",
|
87 |
+
"-d",
|
88 |
+
default=False,
|
89 |
+
action="store_true",
|
90 |
+
help="Use debug logging level.",
|
91 |
+
)
|
92 |
+
parser.add_argument(
|
93 |
+
"--disable-caching",
|
94 |
+
"-C",
|
95 |
+
default=False,
|
96 |
+
action="store_true",
|
97 |
+
help="If caching of resources, such as fonts, should be disabled.",
|
98 |
+
)
|
99 |
+
|
100 |
+
parse_params = parser.add_argument_group(
|
101 |
+
"Parser",
|
102 |
+
description="Used during PDF parsing",
|
103 |
+
)
|
104 |
+
parse_params.add_argument(
|
105 |
+
"--page-numbers",
|
106 |
+
type=int,
|
107 |
+
default=None,
|
108 |
+
nargs="+",
|
109 |
+
help="A space-separated list of page numbers to parse.",
|
110 |
+
)
|
111 |
+
parse_params.add_argument(
|
112 |
+
"--pagenos",
|
113 |
+
"-p",
|
114 |
+
type=str,
|
115 |
+
help="A comma-separated list of page numbers to parse. "
|
116 |
+
"Included for legacy applications, use --page-numbers "
|
117 |
+
"for more idiomatic argument entry.",
|
118 |
+
)
|
119 |
+
parse_params.add_argument(
|
120 |
+
"--maxpages",
|
121 |
+
"-m",
|
122 |
+
type=int,
|
123 |
+
default=0,
|
124 |
+
help="The maximum number of pages to parse.",
|
125 |
+
)
|
126 |
+
parse_params.add_argument(
|
127 |
+
"--password",
|
128 |
+
"-P",
|
129 |
+
type=str,
|
130 |
+
default="",
|
131 |
+
help="The password to use for decrypting PDF file.",
|
132 |
+
)
|
133 |
+
parse_params.add_argument(
|
134 |
+
"--rotation",
|
135 |
+
"-R",
|
136 |
+
default=0,
|
137 |
+
type=int,
|
138 |
+
help="The number of degrees to rotate the PDF "
|
139 |
+
"before other types of processing.",
|
140 |
+
)
|
141 |
+
|
142 |
+
la_params = LAParams() # will be used for defaults
|
143 |
+
la_param_group = parser.add_argument_group(
|
144 |
+
"Layout analysis",
|
145 |
+
description="Used during layout analysis.",
|
146 |
+
)
|
147 |
+
la_param_group.add_argument(
|
148 |
+
"--no-laparams",
|
149 |
+
"-n",
|
150 |
+
default=False,
|
151 |
+
action="store_true",
|
152 |
+
help="If layout analysis parameters should be ignored.",
|
153 |
+
)
|
154 |
+
la_param_group.add_argument(
|
155 |
+
"--detect-vertical",
|
156 |
+
"-V",
|
157 |
+
default=la_params.detect_vertical,
|
158 |
+
action="store_true",
|
159 |
+
help="If vertical text should be considered during layout analysis",
|
160 |
+
)
|
161 |
+
la_param_group.add_argument(
|
162 |
+
"--line-overlap",
|
163 |
+
type=float,
|
164 |
+
default=la_params.line_overlap,
|
165 |
+
help="If two characters have more overlap than this they "
|
166 |
+
"are considered to be on the same line. The overlap is specified "
|
167 |
+
"relative to the minimum height of both characters.",
|
168 |
+
)
|
169 |
+
la_param_group.add_argument(
|
170 |
+
"--char-margin",
|
171 |
+
"-M",
|
172 |
+
type=float,
|
173 |
+
default=la_params.char_margin,
|
174 |
+
help="If two characters are closer together than this margin they "
|
175 |
+
"are considered to be part of the same line. The margin is "
|
176 |
+
"specified relative to the width of the character.",
|
177 |
+
)
|
178 |
+
la_param_group.add_argument(
|
179 |
+
"--word-margin",
|
180 |
+
"-W",
|
181 |
+
type=float,
|
182 |
+
default=la_params.word_margin,
|
183 |
+
help="If two characters on the same line are further apart than this "
|
184 |
+
"margin then they are considered to be two separate words, and "
|
185 |
+
"an intermediate space will be added for readability. The margin "
|
186 |
+
"is specified relative to the width of the character.",
|
187 |
+
)
|
188 |
+
la_param_group.add_argument(
|
189 |
+
"--line-margin",
|
190 |
+
"-L",
|
191 |
+
type=float,
|
192 |
+
default=la_params.line_margin,
|
193 |
+
help="If two lines are close together they are considered to "
|
194 |
+
"be part of the same paragraph. The margin is specified "
|
195 |
+
"relative to the height of a line.",
|
196 |
+
)
|
197 |
+
la_param_group.add_argument(
|
198 |
+
"--boxes-flow",
|
199 |
+
"-F",
|
200 |
+
type=float_or_disabled,
|
201 |
+
default=la_params.boxes_flow,
|
202 |
+
help="Specifies how much a horizontal and vertical position of a "
|
203 |
+
"text matters when determining the order of lines. The value "
|
204 |
+
"should be within the range of -1.0 (only horizontal position "
|
205 |
+
"matters) to +1.0 (only vertical position matters). You can also "
|
206 |
+
"pass `disabled` to disable advanced layout analysis, and "
|
207 |
+
"instead return text based on the position of the bottom left "
|
208 |
+
"corner of the text box.",
|
209 |
+
)
|
210 |
+
la_param_group.add_argument(
|
211 |
+
"--all-texts",
|
212 |
+
"-A",
|
213 |
+
default=la_params.all_texts,
|
214 |
+
action="store_true",
|
215 |
+
help="If layout analysis should be performed on text in figures.",
|
216 |
+
)
|
217 |
+
|
218 |
+
output_params = parser.add_argument_group(
|
219 |
+
"Output",
|
220 |
+
description="Used during output generation.",
|
221 |
+
)
|
222 |
+
output_params.add_argument(
|
223 |
+
"--outfile",
|
224 |
+
"-o",
|
225 |
+
type=str,
|
226 |
+
default="-",
|
227 |
+
help="Path to file where output is written. "
|
228 |
+
'Or "-" (default) to write to stdout.',
|
229 |
+
)
|
230 |
+
output_params.add_argument(
|
231 |
+
"--output_type",
|
232 |
+
"-t",
|
233 |
+
type=str,
|
234 |
+
default="text",
|
235 |
+
help="Type of output to generate {text,html,xml,tag}.",
|
236 |
+
)
|
237 |
+
output_params.add_argument(
|
238 |
+
"--codec",
|
239 |
+
"-c",
|
240 |
+
type=str,
|
241 |
+
default="utf-8",
|
242 |
+
help="Text encoding to use in output file.",
|
243 |
+
)
|
244 |
+
output_params.add_argument(
|
245 |
+
"--output-dir",
|
246 |
+
"-O",
|
247 |
+
default=None,
|
248 |
+
help="The output directory to put extracted images in. If not given, "
|
249 |
+
"images are not extracted.",
|
250 |
+
)
|
251 |
+
output_params.add_argument(
|
252 |
+
"--layoutmode",
|
253 |
+
"-Y",
|
254 |
+
default="normal",
|
255 |
+
type=str,
|
256 |
+
help="Type of layout to use when generating html "
|
257 |
+
"{normal,exact,loose}. If normal, each line is"
|
258 |
+
" positioned separately in the html. If exact"
|
259 |
+
", each character is positioned separately in"
|
260 |
+
" the html. If loose, same result as normal "
|
261 |
+
"but with an additional newline after each "
|
262 |
+
"text line. Only used when output_type is html.",
|
263 |
+
)
|
264 |
+
output_params.add_argument(
|
265 |
+
"--scale",
|
266 |
+
"-s",
|
267 |
+
type=float,
|
268 |
+
default=1.0,
|
269 |
+
help="The amount of zoom to use when generating html file. "
|
270 |
+
"Only used when output_type is html.",
|
271 |
+
)
|
272 |
+
output_params.add_argument(
|
273 |
+
"--strip-control",
|
274 |
+
"-S",
|
275 |
+
default=False,
|
276 |
+
action="store_true",
|
277 |
+
help="Remove control statement from text. "
|
278 |
+
"Only used when output_type is xml.",
|
279 |
+
)
|
280 |
+
|
281 |
+
return parser
|
282 |
+
|
283 |
+
|
284 |
+
def parse_args(args: Optional[List[str]]) -> argparse.Namespace:
|
285 |
+
parsed_args = create_parser().parse_args(args=args)
|
286 |
+
|
287 |
+
# Propagate parsed layout parameters to LAParams object
|
288 |
+
if parsed_args.no_laparams:
|
289 |
+
parsed_args.laparams = None
|
290 |
+
else:
|
291 |
+
parsed_args.laparams = LAParams(
|
292 |
+
line_overlap=parsed_args.line_overlap,
|
293 |
+
char_margin=parsed_args.char_margin,
|
294 |
+
line_margin=parsed_args.line_margin,
|
295 |
+
word_margin=parsed_args.word_margin,
|
296 |
+
boxes_flow=parsed_args.boxes_flow,
|
297 |
+
detect_vertical=parsed_args.detect_vertical,
|
298 |
+
all_texts=parsed_args.all_texts,
|
299 |
+
)
|
300 |
+
|
301 |
+
if parsed_args.page_numbers:
|
302 |
+
parsed_args.page_numbers = {x - 1 for x in parsed_args.page_numbers}
|
303 |
+
|
304 |
+
if parsed_args.pagenos:
|
305 |
+
parsed_args.page_numbers = {int(x) - 1 for x in parsed_args.pagenos.split(",")}
|
306 |
+
|
307 |
+
if parsed_args.output_type == "text" and parsed_args.outfile != "-":
|
308 |
+
for override, alttype in OUTPUT_TYPES:
|
309 |
+
if parsed_args.outfile.endswith(override):
|
310 |
+
parsed_args.output_type = alttype
|
311 |
+
|
312 |
+
return parsed_args
|
313 |
+
|
314 |
+
|
315 |
+
def main(args: Optional[List[str]] = None) -> int:
|
316 |
+
parsed_args = parse_args(args)
|
317 |
+
outfp = extract_text(**vars(parsed_args))
|
318 |
+
outfp.close()
|
319 |
+
return 0
|
320 |
+
|
321 |
+
|
322 |
+
if __name__ == "__main__":
|
323 |
+
sys.exit(main())
|
.venv/bin/pip
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from pip._internal.cli.main import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/pip3
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from pip._internal.cli.main import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/pip3.12
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from pip._internal.cli.main import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/pygmentize
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from pygments.cmdline import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/pymupdf
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from pymupdf.__main__ import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/pyresparser
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from pyresparser.command_line import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/python
ADDED
@@ -0,0 +1 @@
|
1 |
+
python3.12
|
.venv/bin/python3
ADDED
@@ -0,0 +1 @@
|
1 |
+
python3.12
|
.venv/bin/python3.12
ADDED
@@ -0,0 +1 @@
|
1 |
+
/Library/Frameworks/Python.framework/Versions/3.12/bin/python3.12
|
.venv/bin/spacy
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from spacy.cli import setup_cli
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(setup_cli())
|
.venv/bin/tqdm
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from tqdm.cli import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/typer
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from typer.cli import main
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(main())
|
.venv/bin/weasel
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
#!/bin/sh
|
2 |
+
'''exec' "/Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv/bin/python3.12" "$0" "$@"
|
3 |
+
' '''
|
4 |
+
# -*- coding: utf-8 -*-
|
5 |
+
import re
|
6 |
+
import sys
|
7 |
+
from weasel.cli import app
|
8 |
+
if __name__ == '__main__':
|
9 |
+
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
|
10 |
+
sys.exit(app())
|
.venv/pyvenv.cfg
ADDED
@@ -0,0 +1,5 @@
|
1 |
+
home = /Library/Frameworks/Python.framework/Versions/3.12/bin
|
2 |
+
include-system-site-packages = false
|
3 |
+
version = 3.12.6
|
4 |
+
executable = /Library/Frameworks/Python.framework/Versions/3.12/bin/python3.12
|
5 |
+
command = /Library/Frameworks/Python.framework/Versions/3.12/bin/python3 -m venv /Users/husseinelsaadi/Documents/Data Science USAL/Spring 24-25/FYP - Codingo/Codingo/.venv
|
backend/model/resume-parser/resume_to_features.py
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
import os
|
2 |
+
from pyresparser import ResumeParser
|
3 |
+
|
4 |
+
# Build absolute path to the resume file
|
5 |
+
current_dir = os.path.dirname(os.path.abspath(__file__))
|
6 |
+
resume_path = os.path.join(current_dir, '../../../data/resumes/Hussein El Saadi - CV.pdf')
|
7 |
+
|
8 |
+
# Parse and print the extracted data
|
9 |
+
data = ResumeParser(resume_path).get_extracted_data()
|
10 |
+
print(data)
|
backend/templates/index.html
ADDED
@@ -0,0 +1,615 @@
|
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Codingo - AI-Powered Recruitment Platform</title>
+    <style>
+        :root {
+            --primary: #4361ee;
+            --secondary: #3a0ca3;
+            --accent: #4cc9f0;
+            --light: #f8f9fa;
+            --dark: #212529;
+            --success: #2ecc71;
+            --warning: #f39c12;
+            --danger: #e74c3c;
+        }
+
+        * {
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
+        }
+
+        body {
+            background-color: var(--light);
+            color: var(--dark);
+            line-height: 1.6;
+            perspective: 1000px;
+            overflow-x: hidden;
+        }
+
+        header {
+            background: linear-gradient(135deg, var(--primary), var(--secondary));
+            color: white;
+            padding: 1rem;
+            position: sticky;
+            top: 0;
+            z-index: 100;
+            box-shadow: 0 4px 15px rgba(0, 0, 0, 0.2);
+        }
+
+        .container {
+            width: 100%;
+            max-width: 1200px;
+            margin: 0 auto;
+            padding: 0 1rem;
+        }
+
+        .nav-container {
+            display: flex;
+            justify-content: space-between;
+            align-items: center;
+        }
+
+        .logo {
+            display: flex;
+            align-items: center;
+            font-size: 2rem;
+            font-weight: bold;
+            color: white;
+            text-decoration: none;
+            transition: transform 0.3s ease;
+        }
+
+        .logo:hover {
+            transform: scale(1.05);
+        }
+
+        .logo-part1 {
+            color: white;
+        }
+
+        .logo-part2 {
+            color: #FF8C42;
+            font-style: italic;
+            text-shadow: 0px 0px 10px rgba(255, 140, 66, 0.7);
+            animation: glow 2s ease-in-out infinite alternate;
+        }
+
+        @keyframes glow {
+            from {
+                text-shadow: 0 0 5px rgba(255, 140, 66, 0.7);
+            }
+            to {
+                text-shadow: 0 0 15px rgba(255, 140, 66, 1), 0 0 20px rgba(255, 140, 66, 0.8);
+            }
+        }
+
+        .logo-part3 {
+            color: var(--accent);
+        }
+
+        .login-buttons {
+            display: flex;
+            gap: 1rem;
+        }
+
+        .btn {
+            padding: 0.5rem 1.5rem;
+            border-radius: 5px;
+            border: none;
+            cursor: pointer;
+            font-weight: 500;
+            transition: all 0.3s ease;
+            text-decoration: none;
+            position: relative;
+            overflow: hidden;
+            z-index: 1;
+        }
+
+        .btn::before {
+            content: '';
+            position: absolute;
+            top: 0;
+            left: 0;
+            width: 0%;
+            height: 100%;
+            background-color: rgba(255, 255, 255, 0.1);
+            transition: all 0.3s ease;
+            z-index: -1;
+        }
+
+        .btn:hover::before {
+            width: 100%;
+        }
+
+        .btn-primary {
+            background-color: var(--accent);
+            color: var(--dark);
+            box-shadow: 0 4px 8px rgba(76, 201, 240, 0.3);
+        }
+
+        .btn-outline {
+            background-color: transparent;
+            border: 2px solid white;
+            color: white;
+        }
+
+        .btn:hover {
+            transform: translateY(-2px);
+            box-shadow: 0 5px 15px rgba(0, 0, 0, 0.2);
+        }
+
+        .hero {
+            background: linear-gradient(rgba(67, 97, 238, 0.8), rgba(58, 12, 163, 0.9)), url("/api/placeholder/1200/600") no-repeat center center/cover;
+            color: white;
+            text-align: center;
+            padding: 5rem 1rem;
+            position: relative;
+            overflow: hidden;
+        }
+
+        .hero::before {
+            content: '';
+            position: absolute;
+            top: -50%;
+            left: -50%;
+            width: 200%;
+            height: 200%;
+            background: radial-gradient(circle, rgba(255, 255, 255, 0.1) 0%, rgba(255, 255, 255, 0) 60%);
+            animation: rotate 30s linear infinite;
+            z-index: 1;
+        }
+
+        @keyframes rotate {
+            from {
+                transform: rotate(0deg);
+            }
+            to {
+                transform: rotate(360deg);
+            }
+        }
+
+        .hero-content {
+            display: flex;
+            flex-direction: column;
+            align-items: center;
+            transform-style: preserve-3d;
+            transition: transform 0.3s ease;
+            position: relative;
+            z-index: 2;
+        }
+
+        .hero h1 {
+            font-size: 3rem;
+            margin-bottom: 1.5rem;
+            font-weight: 700;
+            text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.3);
+            transform: translateZ(20px);
+            animation: fadeIn 1s ease-out;
+        }
+
+        @keyframes fadeIn {
+            from {
+                opacity: 0;
+                transform: translateY(20px) translateZ(20px);
+            }
+            to {
+                opacity: 1;
+                transform: translateY(0) translateZ(20px);
+            }
+        }
+
+        .hero p {
+            font-size: 1.2rem;
+            max-width: 800px;
+            margin: 0 auto 2rem;
+            transform: translateZ(10px);
+            animation: fadeIn 1s ease-out 0.3s both;
+        }
+
+        .luna-avatar-container {
+            position: relative;
+            width: 250px;
+            height: 250px;
+            margin-bottom: 2rem;
+            perspective: 1000px;
+        }
+
+        .luna-avatar {
+            width: 100%;
+            height: 100%;
+            border-radius: 50%;
+            border: 4px solid var(--accent);
+            box-shadow: 0 10px 30px rgba(0, 0, 0, 0.3), 0 0 20px rgba(76, 201, 240, 0.5);
+            overflow: hidden;
+            animation: float 4s ease-in-out infinite;
+            position: relative;
+            z-index: 2;
+            background: linear-gradient(135deg, rgba(76, 201, 240, 0.2), rgba(58, 12, 163, 0.2));
+            display: flex;
+            justify-content: center;
+            align-items: center;
+        }
+
+        .luna-glow {
+            position: absolute;
+            width: 100%;
+            height: 100%;
+            border-radius: 50%;
+            background: radial-gradient(circle, rgba(76, 201, 240, 0.8) 0%, rgba(76, 201, 240, 0) 70%);
+            filter: blur(15px);
+            opacity: 0.7;
+            z-index: 1;
+            animation: pulse 4s ease-in-out infinite alternate;
+        }
+
+        @keyframes pulse {
+            0% { transform: scale(0.9); opacity: 0.5; }
+            100% { transform: scale(1.1); opacity: 0.7; }
+        }
+
+        @keyframes float {
+            0% { transform: translateY(0px) rotateY(0deg); }
+            50% { transform: translateY(-15px) rotateY(5deg); }
+            100% { transform: translateY(0px) rotateY(0deg); }
+        }
+
+        .luna-avatar img {
+            width: 100%;
+            height: 100%;
+            object-fit: cover;
+            object-position: center -10px;
+            top: 0;
+            left: 0;
+        }
+
+
+        .hero-buttons {
+            display: flex;
+            justify-content: center;
+            gap: 1.5rem;
+            flex-wrap: wrap;
+            transform: translateZ(15px);
+            animation: fadeIn 1s ease-out 0.6s both;
+        }
+
+        .features {
+            padding: 5rem 1rem;
+            background-color: white;
+            position: relative;
+            z-index: 1;
+        }
+
+        .section-title {
+            text-align: center;
+            margin-bottom: 3rem;
+            transform-style: preserve-3d;
+            transition: transform 0.3s ease;
+        }
+
+        .section-title h2 {
+            font-size: 2.5rem;
+            color: var(--primary);
+            margin-bottom: 1rem;
+            position: relative;
+            display: inline-block;
+        }
+
+        .section-title h2::after {
+            content: '';
+            position: absolute;
+            bottom: -10px;
+            left: 50%;
+            transform: translateX(-50%);
+            width: 80px;
+            height: 3px;
+            background: linear-gradient(to right, var(--primary), var(--accent));
+        }
+
+        .section-title p {
+            max-width: 600px;
+            margin: 0 auto;
+            color: #666;
+        }
+
+        .features-grid {
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(300px, 1fr));
+            gap: 2rem;
+        }
+
+        .feature-card {
+            background-color: white;
+            border-radius: 10px;
+            overflow: hidden;
+            box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05);
+            transition: all 0.5s ease;
+            display: flex;
+            flex-direction: column;
+            height: 100%;
+            transform-style: preserve-3d;
+            perspective: 1000px;
+        }
+
+        .feature-card:hover {
+            transform: translateY(-10px) rotateX(5deg);
+            box-shadow: 0 15px 30px rgba(0, 0, 0, 0.1);
+        }
+
+        .feature-icon {
+            background: linear-gradient(135deg, rgba(67, 97, 238, 0.1), rgba(76, 201, 240, 0.1));
+            padding: 2rem;
+            display: flex;
+            justify-content: center;
+            align-items: center;
+            font-size: 2.5rem;
+            color: var(--primary);
+            transition: all 0.3s ease;
+        }
+
+        .feature-card:hover .feature-icon {
+            transform: translateZ(20px);
+            color: var(--accent);
+        }
+
+        .feature-content {
+            padding: 1.5rem;
+            flex-grow: 1;
+            transform: translateZ(0);
+            transition: transform 0.3s ease;
+        }
+
+        .feature-card:hover .feature-content {
+            transform: translateZ(10px);
+        }
+
+        .feature-content h3 {
+            font-size: 1.5rem;
+            margin-bottom: 1rem;
+            color: var(--dark);
+            position: relative;
+            display: inline-block;
+        }
+
+        .feature-content h3::after {
+            content: '';
+            position: absolute;
+            bottom: -5px;
+            left: 0;
+            width: 40px;
+            height: 2px;
+            background: var(--primary);
+            transition: width 0.3s ease;
+        }
+
+        .feature-card:hover .feature-content h3::after {
+            width: 100%;
+        }
+
+        .cta {
+            background: linear-gradient(135deg, var(--secondary), var(--primary));
+            color: white;
+            padding: 5rem 1rem;
+            text-align: center;
+            position: relative;
+            overflow: hidden;
+        }
+
+        .cta::before {
+            content: '';
+            position: absolute;
+            top: -50%;
+            left: -50%;
+            width: 200%;
+            height: 200%;
+            background: radial-gradient(circle, rgba(255, 255, 255, 0.1) 0%, rgba(255, 255, 255, 0) 60%);
+            animation: rotate 20s linear infinite;
+            z-index: 1;
+        }
+
+        .cta .container {
+            position: relative;
+            z-index: 2;
+        }
+
+        .cta h2 {
+            font-size: 2.5rem;
+            margin-bottom: 1.5rem;
+            text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.3);
+        }
+
+        .cta p {
+            max-width: 600px;
+            margin: 0 auto 2rem;
+            font-size: 1.2rem;
+        }
+
+        footer {
+            background-color: var(--dark);
+            color: white;
+            padding: 3rem 1rem;
+        }
+
+        .footer-grid {
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
+            gap: 2rem;
+        }
+
+        .footer-col h3 {
+            font-size: 1.2rem;
+            margin-bottom: 1.5rem;
+            color: var(--accent);
+            position: relative;
+            display: inline-block;
+        }
+
+        .footer-col h3::after {
+            content: '';
+            position: absolute;
+            bottom: -8px;
+            left: 0;
+            width: 30px;
+            height: 2px;
+            background: var(--accent);
+        }
+
+        .social-links {
+            display: flex;
+            gap: 1rem;
+            margin-top: 1rem;
+        }
+
+        .social-links a {
+            color: white;
+            background-color: rgba(255, 255, 255, 0.1);
+            width: 40px;
+            height: 40px;
+            border-radius: 50%;
+            display: flex;
+            justify-content: center;
+            align-items: center;
+            transition: all 0.3s ease;
+        }
+
+        .social-links a:hover {
+            background-color: var(--accent);
+            color: var(--dark);
+            transform: translateY(-3px);
+        }
+
+        .copyright {
+            margin-top: 2rem;
+            border-top: 1px solid rgba(255, 255, 255, 0.1);
+            padding-top: 1.5rem;
+            text-align: center;
+            color: #aaa;
+        }
+
+        @media screen and (max-width: 768px) {
+            .nav-container {
+                flex-direction: row;
+                justify-content: space-between;
+            }
+
+            .luna-avatar-container {
+                width: 180px;
+                height: 180px;
+            }
+
+            .hero h1 {
+                font-size: 2rem;
+            }
+
+            .hero p {
+                font-size: 1rem;
+            }
+        }
+    </style>
+</head>
+<body>
+    <header>
+        <div class="container nav-container">
+            <a href="#" class="logo">
+                <span class="logo-part1">Cod</span><span class="logo-part2">in</span><span class="logo-part3">go</span>
+            </a>
+            <div class="login-buttons">
+                <a href="#" class="btn btn-outline">Log In</a>
+                <a href="#" class="btn btn-primary">Sign Up</a>
+            </div>
+        </div>
+    </header>
+
+    <section class="hero">
+        <div class="container">
+            <div class="hero-content">
+                <div class="luna-avatar-container">
+                    <div class="luna-glow"></div>
+                    <div class="luna-avatar">
+                        <img src="../static/images/LUNA.png" alt="LUNA AI Assistant">
+                    </div>
+                </div>
+                <h1>Meet LUNA, Your AI Recruitment Assistant</h1>
+                <p>Revolutionize your hiring process with AI-powered candidate screening, automated interviews, and
+                    intelligent skill matching to find your perfect technical talent.</p>
+                <div class="hero-buttons">
+                    <a href="#" class="btn btn-primary">Get Started</a>
+                    <a href="#" class="btn btn-outline">Watch Demo</a>
+                </div>
+            </div>
+        </div>
+    </section>
+
+    <section class="features" id="features">
+        <div class="container">
+            <div class="section-title">
+                <h2>Platform Features</h2>
+                <p>Codingo streamlines every step of your technical hiring process with cutting-edge AI technology</p>
+            </div>
+            <div class="features-grid">
+                <div class="feature-card">
+                    <div class="feature-icon">
+                        <span>🤖</span>
+                    </div>
+                    <div class="feature-content">
+                        <h3>AI CV Analysis</h3>
+                        <p>LUNA automatically analyzes resumes, matching candidates to job requirements with precision and
+                            eliminating unconscious bias.</p>
+                    </div>
+                </div>
+                <div class="feature-card">
+                    <div class="feature-icon">
+                        <span>🎯</span>
+                    </div>
+                    <div class="feature-content">
+                        <h3>Smart Shortlisting</h3>
+                        <p>Our 70% skill match threshold ensures you only interview candidates who meet your technical
+                            requirements.</p>
+                    </div>
+                </div>
+                <div class="feature-card">
+                    <div class="feature-icon">
+                        <span>🖥️</span>
+                    </div>
+                    <div class="feature-content">
+                        <h3>AI-Led Interviews</h3>
+                        <p>Structured interview sessions assess soft skills, technical knowledge, and coding abilities with
+                            real-time monitoring.</p>
+                    </div>
+                </div>
+            </div>
+        </div>
+    </section>
+
+    <section class="cta">
+        <div class="container">
+            <h2>Ready to Transform Your Technical Hiring?</h2>
+            <p>Join hundreds of companies finding their perfect tech talent faster and more efficiently with Codingo.</p>
+            <a href="#" class="btn btn-primary">Start Free Trial</a>
+        </div>
+    </section>
+
+    <footer>
+        <div class="container">
+            <div class="footer-grid">
+                <div class="footer-col">
+                    <h3>Codingo</h3>
+                    <p>AI-powered recruitment platform that revolutionizes how companies hire technical talent.</p>
+                    <div class="social-links">
+                        <a href="#"><span>f</span></a>
+                        <a href="#"><span>t</span></a>
+                        <a href="#"><span>in</span></a>
+                    </div>
+                </div>
+            </div>
+            <div class="copyright">
+                <p>© 2025 Codingo. All rights reserved.</p>
+            </div>
+        </div>
+    </footer>
+</body>
+</html>
data/resumes/Hussein El Saadi - CV.pdf
ADDED
Binary file (72.9 kB).

data/resumes/Mohamad Moallem(CV-2024).pdf
ADDED
Binary file (58.9 kB).

requirements.txt
CHANGED
@@ -3,4 +3,7 @@ scikit-learn
 pandas
 joblib
 PyMuPDF
-python-docx
+python-docx
+spacy>=3.0.0
+nltk
+pyresparser