# Process dataset
We use the Talking-Head-1K-Hour (TalkingHead-1KH) dataset as the example.

## Download and crop the talking person video clips
- Please follow the steps in [https://github.com/tcwang0509/TalkingHead-1KH](https://github.com/tcwang0509/TalkingHead-1KH)
- Put all extracted video clips in a directory like `/home/xxx/TH1KH_512/video_raw/*.mp4`

## Resample and resize video clips to 512x512 resolution and 25 FPS
- You can use the example code in `data_gen/utils/process_video/resample_video_to_25fps_resize_to_512.py`
- It will generate the processed video clips in `/home/xxx/TH1KH_512/video/*.mp4`
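For orientation, the resampling step can be sketched as one `ffmpeg` call per clip. This is a minimal sketch and not necessarily what `resample_video_to_25fps_resize_to_512.py` does. The helper names `build_resample_cmd` and `plan_resample` are hypothetical. The sketch assumes `ffmpeg` is installed and that the cropped clips are already square, so scaling to 512x512 does not distort them.

```python
from pathlib import Path


def build_resample_cmd(src: Path, dst: Path, fps: int = 25, size: int = 512) -> list[str]:
    """Return an ffmpeg argv that resamples `src` to `fps` and resizes it to size x size."""
    return [
        "ffmpeg", "-y",              # overwrite the output file if it already exists
        "-i", str(src),
        "-vf", f"fps={fps},scale={size}:{size}",
        str(dst),
    ]


def plan_resample(root: Path) -> list[list[str]]:
    """Build one command per clip in <root>/video_raw, writing to <root>/video."""
    raw_dir, out_dir = root / "video_raw", root / "video"
    return [build_resample_cmd(p, out_dir / p.name) for p in sorted(raw_dir.glob("*.mp4"))]
```

Each returned command can be passed to `subprocess.run(cmd, check=True)` after creating the `video` directory. Building the commands separately from running them makes it easy to inspect the plan first.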

## Extract segment images
- You can use the example code in `data_gen/utils/process_video/extract_segment_imgs.py`
- It will generate the segment images in `/home/xxx/TH1KH_512/{gt_imgs, head_imgs, inpaint_torso_imgs, com_imgs}/*`

## Extract 2D facial landmarks
- You can use the example code in `data_gen/utils/process_video/extract_lm2d.py`
- It will generate the 2D landmarks in `/home/xxx/TH1KH_512/lms_2d/*_lms_2d.npy`

## Extract 3DMM coefficients
- You can use the example code in `data_gen/utils/process_video/fit_3dmm_landmark.py`
- It will generate the 3DMM coefficients in `/home/xxx/TH1KH_512/coeff_fit_mp/*_coeff_fit_mp.npy`

## Extract audio features
- You can use the example code in `data_gen/utils/process_audio/extract_mel_f0.py`
- It will generate the raw WAV files in `/home/xxx/TH1KH_512/audio/*.wav` and the mel and F0 features in `/home/xxx/TH1KH_512/mel_f0/*_mel_f0.npy`
- You can use the example code in `data_gen/utils/process_audio/extract_hubert.py`
- It will generate the HuBERT features in `/home/xxx/TH1KH_512/hubert/*_hubert.npy`
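The per-clip naming patterns listed so far make it possible to check which clips still lack an output. The following is a minimal sketch with hypothetical helper names (`expected_outputs`, `find_missing`). It assumes that every output file is named after the stem of the corresponding clip in `video/`, and that the WAV and mel/F0 files follow the same `<clip>` pattern as the other outputs. Neither assumption is confirmed by the scripts themselves.

```python
from pathlib import Path

# Sub-directory -> filename suffix for each per-clip output named in this document.
# Assumption: the part before the suffix is the clip's file stem.
PER_CLIP_OUTPUTS = {
    "lms_2d": "_lms_2d.npy",
    "coeff_fit_mp": "_coeff_fit_mp.npy",
    "audio": ".wav",
    "mel_f0": "_mel_f0.npy",
    "hubert": "_hubert.npy",
}


def expected_outputs(root: Path, clip: str) -> dict[str, Path]:
    """Return the expected output path in each sub-directory for one clip."""
    return {sub: root / sub / f"{clip}{suffix}" for sub, suffix in PER_CLIP_OUTPUTS.items()}


def find_missing(root: Path) -> dict[str, list[str]]:
    """Map each clip in <root>/video to the output sub-directories it still lacks."""
    missing = {}
    for video in sorted((root / "video").glob("*.mp4")):
        absent = [
            sub
            for sub, path in expected_outputs(root, video.stem).items()
            if not path.exists()
        ]
        if absent:
            missing[video.stem] = absent
    return missing
```

For example, `find_missing(Path("/home/xxx/TH1KH_512"))` returns an empty dict once every clip has all five outputs.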

## Binarize the dataset
- You can use the example code in `data_gen/runs/binarizer_th1kh.py`
- The binarized dataset will be written to `data/binary/th1kh`
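As a quick sanity check, the scripts referenced in this document can be listed in the order they appear and verified against a local checkout. This is only a sketch: `missing_scripts` is a hypothetical helper, and it checks that the files exist without running them, because their command-line arguments are not described here.

```python
from pathlib import Path

# Scripts referenced in this document, in the order the steps are listed.
PIPELINE_SCRIPTS = [
    "data_gen/utils/process_video/resample_video_to_25fps_resize_to_512.py",
    "data_gen/utils/process_video/extract_segment_imgs.py",
    "data_gen/utils/process_video/extract_lm2d.py",
    "data_gen/utils/process_video/fit_3dmm_landmark.py",
    "data_gen/utils/process_audio/extract_mel_f0.py",
    "data_gen/utils/process_audio/extract_hubert.py",
    "data_gen/runs/binarizer_th1kh.py",
]


def missing_scripts(repo_root: Path) -> list[str]:
    """Return the referenced scripts that are not present under `repo_root`."""
    return [s for s in PIPELINE_SCRIPTS if not (repo_root / s).is_file()]
```

Calling `missing_scripts(Path("."))` from the repository root should return an empty list when the checkout is complete.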