# UIED - UI element detection, detecting UI elements from UI screenshots or drawings
This project is still ongoing and this repo may be updated irregularly. I developed a web app for UIED at http://uied.online
## Related Publications:
[1. UIED: a hybrid tool for GUI element detection](https://dl.acm.org/doi/10.1145/3368089.3417940)

[2. Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination?](https://arxiv.org/abs/2008.05132)
>The repo has been **upgraded with Google OCR** for GUI text detection. To use the original version from our paper (which uses [EAST](https://github.com/argman/EAST) as the text detector), check the release [v2.3](https://github.com/MulongXie/UIED/releases/tag/v2.3) and download the pre-trained model from [this link](https://drive.google.com/drive/folders/1MK0Om7Lx0wRXGDfNcyj21B0FL1T461v5?usp=sharing).
## What is it?
UI Element Detection (UIED) is an old-fashioned computer vision (CV) based approach for detecting elements in graphical user interfaces.
The input to UIED can be various kinds of UI image, such as a mobile app or web page screenshot, a UI design drawn in Photoshop or Sketch, or even a hand-drawn UI design. The approach detects and classifies text and graphic UI elements, and exports the detection result as a JSON file for future applications.
UIED comprises two parts, one to detect UI text and one to detect graphic elements such as buttons, images and input bars.
* For text, it leverages [Google OCR](https://cloud.google.com/vision/docs/ocr) to perform detection.
* For graphical elements, it uses old-fashioned CV approaches to locate the elements and a CNN classifier to classify them.
> UIED is highly customizable: you can replace either part with an approach of your choice (e.g. another text detection approach). Unlike a black-box end-to-end deep learning approach, you can easily revise the non-text detection and merging algorithms (partially or entirely) to fit your task.
 | |
## How to use?
### Dependency
* **Python 3.5**
* **OpenCV 3.4.2**
* **Pandas**
<!-- * **Tensorflow 1.10.0**
* **Keras 2.2.4**
* **Sklearn 0.22.2** -->
### Installation
<!-- Install the mentioned dependencies, and download two pre-trained models from [this link](https://drive.google.com/drive/folders/1MK0Om7Lx0wRXGDfNcyj21B0FL1T461v5?usp=sharing) for EAST text detection and GUI element classification. -->
<!-- Change ``CNN_PATH`` and ``EAST_PATH`` in *config/CONFIG.py* to your locations. -->
The new version of UIED, equipped with Google OCR, is easy to deploy and needs no pre-trained model. Simply download the repo and install the dependencies.
> Please replace the Google OCR key at `detect_text/ocr.py line 28` with your own (apply on the [Google website](https://cloud.google.com/vision)).
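
If you would rather not hard-code your key in the source file, one common alternative is to read it from an environment variable. The sketch below is only a suggestion and is not how the repo currently works; the variable name `GOOGLE_OCR_KEY` and the helper `load_ocr_key` are hypothetical.

```python
import os


def load_ocr_key(env_var="GOOGLE_OCR_KEY"):
    """Read the OCR API key from an environment variable.

    Both the helper and the default variable name are assumptions made
    for this example; they are not part of the repo.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "No OCR key found; set the %s environment variable." % env_var
        )
    return key
```

You could then assign the result of `load_ocr_key()` wherever the key string is currently written in `detect_text/ocr.py`.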
### Usage
To test your own image(s):
* To test a single image, change *input_path_img* in ``run_single.py`` to your input image; the results will be output to *output_root*.
* To test multiple images, change *input_img_root* in ``run_batch.py`` to your input directory; the results will be output to *output_root*.
* To adjust the parameters live, use ``run_testing.py``.
> Note: The best set of parameters varies for different types of GUI image (Mobile App, Web, PC). I highly recommend first playing with ``run_testing.py`` to pick a good set of parameters for your data.
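
If you want to check what a batch run would cover before starting it, a small sketch like the following pairs every image under an input directory with a result path under an output directory. It is a sketch only: the helper `plan_batch`, the accepted suffixes and the mirrored ``.json`` layout are assumptions for illustration, and ``run_batch.py`` may organize its output differently.

```python
import tempfile
from pathlib import Path

# Assumed set of image types; adjust to your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def plan_batch(input_img_root, output_root):
    """Pair each image under input_img_root with a .json path that mirrors
    its relative location under output_root (hypothetical layout)."""
    in_root = Path(input_img_root)
    out_root = Path(output_root)
    jobs = []
    for img in sorted(in_root.rglob("*")):
        if img.is_file() and img.suffix.lower() in IMAGE_SUFFIXES:
            rel = img.relative_to(in_root).with_suffix(".json")
            jobs.append((img, out_root / rel))
    return jobs


# Demo on a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "input"
    (root / "web").mkdir(parents=True)
    (root / "a.png").touch()
    (root / "web" / "b.JPG").touch()
    (root / "notes.txt").touch()  # skipped: not an image
    out_dir = Path(tmp) / "output"
    jobs = plan_batch(root, out_dir)
    planned = [out.relative_to(out_dir).as_posix() for _, out in jobs]
```

Grouping inputs by GUI type (Mobile App, Web, PC) in separate subdirectories makes it easier to run each group with its own parameter set.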
## Folder structure
``cnn/``
* Used to train classifier for graphic UI elements
* Set path of the CNN classification model

``config/``
* Set data paths
* Set parameters for graphic elements detection

``data/``
* Input UI images and output detection results

``detect_compo/``
* Non-text GUI component detection

``detect_text/``
* GUI text detection using Google OCR

``detect_merge/``
* Merge the detection results of non-text and text GUI elements

The major detection algorithms are in ``detect_compo/``, ``detect_text/`` and ``detect_merge/``.
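
To illustrate the kind of work a merging step does, here is a minimal sketch that combines text boxes with non-text component boxes and drops components that are almost entirely covered by text (these are often text fragments picked up a second time). This is not the rule set implemented in ``detect_merge/``; the function names, the `max_text_overlap` threshold and the class labels `"Text"` and `"Compo"` are assumptions for this example. Boxes are `(col_min, row_min, col_max, row_max)` tuples.

```python
def box_area(box):
    """Area of a (col_min, row_min, col_max, row_max) box."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def intersection_area(a, b):
    """Area of the overlap between two boxes (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


def merge_elements(compos, texts, max_text_overlap=0.8):
    """Keep all text boxes, and keep a non-text component only if less than
    max_text_overlap of its area is covered by text (hypothetical rule).

    Coverage is summed per text box, so it assumes text boxes do not
    overlap each other.
    """
    merged = [{"class": "Text", "box": t} for t in texts]
    for c in compos:
        area = box_area(c)
        covered = sum(intersection_area(c, t) for t in texts)
        if area > 0 and covered / area >= max_text_overlap:
            continue  # mostly text pixels: treat as a duplicate of the text
        merged.append({"class": "Compo", "box": c})
    return merged


compos = [(0, 0, 100, 40), (10, 50, 60, 70)]   # a button, and a text fragment
texts = [(10, 51, 60, 69)]                      # OCR box over the fragment
merged = merge_elements(compos, texts)
```

In this toy input the second component is 90% covered by the text box, so only the text box and the first component survive. A component that merely contains a label, such as a button with text inside, has low coverage and is kept.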
## Demo
GUI element detection result for a web screenshot
![UI Components detection result](data/demo/demo.png)