|
## What's the point of this?
|
|
|
LaTeX is the de facto standard markup language for typesetting equations in academic papers.
It is extremely feature-rich and flexible, but also very verbose.
That makes it great for typesetting complex equations, but inconvenient for quick note-taking on the fly.
|
|
|
For example, here's a short equation from [this page](https://en.wikipedia.org/wiki/Quantum_electrodynamics) on Wikipedia about quantum electrodynamics,
and the corresponding LaTeX code:
|
|
|
 |
|
|
|
```
{\displaystyle {\mathcal {L}}={\bar {\psi }}(i\gamma ^{\mu }D_{\mu }-m)\psi -{\frac {1}{4}}F_{\mu \nu }F^{\mu \nu },}
```
|
|
|
|
|
This demo is a first step toward solving that problem.
Eventually, you'll be able to take a quick screenshot of an equation from a paper,
and a program built with this model will generate the corresponding LaTeX source code
so that you can copy and paste it straight into your personal notes.
No more endless googling of obscure LaTeX syntax!
|
|
|
## How does it work?
|
|
|
Because this problem involves looking at an image and generating valid LaTeX code,
the model needs to combine techniques from both computer vision (CV) and natural language processing (NLP).
Other projects tackle the same problem with some very interesting architectures.
These generally involve an "encoder" that looks at the image and extracts and encodes the information about the equation,
and a "decoder" that takes that information and translates it into what is hopefully both valid and accurate LaTeX code.
|
|
|
Examples: |
|
... |
|
|
|
I chose to tackle this problem with transfer learning.
The biggest reason is limited compute:
I don't have unlimited access to GPU hours and wanted training to be reasonably fast, on the order of a couple of hours.
There are other benefits to this approach too.
For example, the architecture is already proven to be robust across various applications, so less time is spent on trial and error.
|
|
|
I chose TrOCR, an OCR machine learning model trained by Microsoft on SROIE data to produce text from receipts.
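To make the starting point concrete, here is a minimal sketch of running a pretrained TrOCR checkpoint with the Hugging Face `transformers` library. It assumes the `microsoft/trocr-base-printed` checkpoint (a public TrOCR checkpoint fine-tuned on SROIE) and that `transformers`, `torch`, and `Pillow` are installed. The function name `image_to_text` and its parameters are made up for illustration; this is not the code behind this demo.

```python
from pathlib import Path

# Assumption: a public TrOCR checkpoint fine-tuned on SROIE receipts.
DEFAULT_CHECKPOINT = "microsoft/trocr-base-printed"


def image_to_text(image_path, checkpoint=DEFAULT_CHECKPOINT, max_new_tokens=64):
    """Run a pretrained TrOCR checkpoint on one image and return the decoded text.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # The image encoder expects RGB input; the processor resizes and normalizes it.
    image = Image.open(Path(image_path)).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # The text decoder generates token ids conditioned on the encoded image.
    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


if __name__ == "__main__":
    import sys

    # Usage: python trocr_sketch.py path/to/image.png
    if len(sys.argv) > 1:
        print(image_to_text(sys.argv[1]))
```

Out of the box, this checkpoint produces plain receipt-style text rather than LaTeX. The transfer-learning idea is to keep the same encoder/decoder architecture and fine-tune it on pairs of equation images and their LaTeX source.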
|
|
|
<p style='text-align: center'>Made by Young Ho Shin</p>
<p style='text-align: center'>
<a href = "mailto: [email protected]">Email</a> |
<a href='https://www.github.com/yhshin11'>GitHub</a> |
<a href='https://www.linkedin.com/in/young-ho-shin-3995051b9/'>LinkedIn</a>
</p>