This repo is the official implementation of "Bilateral Reference for High-Resolution Dichotomous Image Segmentation" (CAAI AIR 2024).
> [!NOTE]
> We need more GPU resources (2024-04-08) to push forward the performance of BiRefNet, especially toward general use and higher-resolution images. If you are willing to cooperate, please contact me at zhengpeng0108@gmail.com.
- Sep 23, 2025: We upgraded the attention implementation in the Swin transformer to the official SDPA in PyTorch. The exact same models now have lower memory cost and potential acceleration (once attn_mask becomes adaptable for flash_attn in the future) for both training and inference.
- Jun 30, 2025: We accelerated refine_foreground by 8x (~80 ms now on a 5090) with the GPU implementation of fast-fg-est (thanks to @lucasgblu) and upgrades to the pre-/post-processing of arrays there.
- May 15, 2025: We released a tutorial video (screen recording) on BiRefNet fine-tuning on both my YouTube and Bilibili channels.
- Mar 31, 2025: We released BiRefNet_dynamic for general use. It was trained on images in a dynamic resolution range from 256x256 to 2304x2304 and shows great, robust performance on images of any resolution! Thanks again to Freepik for their kind GPU support.
- Feb 12, 2025: We released BiRefNet_HR-matting for general matting use. It was trained on 2048x2048 images and shows great matting performance on higher-resolution images! Thanks again to Freepik for their kind GPU support.
- Feb 1, 2025: We released BiRefNet_HR for general use. It was trained on 2048x2048 images and shows great performance on higher-resolution images! Thanks to Freepik for offering H200x4 GPUs for this huge training (~3 weeks).
- Jan 6, 2025: We validated FP16 inference with ~0 decrease in performance and better efficiency: the standard BiRefNet runs at 17 FPS at resolution 1024x1024 with 3.45 GB of GPU memory on a single RTX 4090. Check more details in the model efficiency part of the model zoo section.
- Dec 5, 2024: Fixed the torch.compile bug in the latest PyTorch versions (2.5.1) and the slow iteration in FP16 training with accelerate (now set as default).
- Nov 28, 2024: Congrats to the students at Nankai University who employed BiRefNet in their project and won the provincial gold medal and national bronze medal at the China International College Students' Innovation Competition 2024.
- Oct 26, 2024: We added a guideline for fine-tuning on custom data with existing weights.
- Oct 6, 2024: We uploaded the BiRefNet-matting model for general trimap-free matting use.
- Sep 24, 2024: We uploaded the BiRefNet_lite-2K model, which takes inputs at a much higher resolution (2560x1440). We also added a notebook for inference on videos.
- Sep 7, 2024: Thanks to Freepik for supporting me with GPUs for more extensive experiments, especially on BiRefNet for 2K inference!
- Aug 30, 2024: We uploaded notebooks in tutorials to run inference and ONNX conversion locally.
- Aug 23, 2024: Our BiRefNet is now officially published online in the CAAI AIR journal. And thanks for the press release.
- Aug 19, 2024: We uploaded the ONNX model files for all weights to the GitHub release and GDrive folder. Check out the ONNX conversion part of the model zoo for more details.
- Jul 30, 2024: Thanks to @not-lain for his kind efforts in adding BiRefNet to the official huggingface.js repo.
- Jul 28, 2024: We released the Colab demo for box-guided segmentation.
- Jul 15, 2024: We deployed BiRefNet on Hugging Face Models so users can easily load it in one line of code.
- Jun 21, 2024: We released and uploaded the Chinese version of our original paper to my GDrive.
- May 28, 2024: We set up a model zoo with well-trained BiRefNet weights in different sizes and for different tasks, including general use, matting segmentation, DIS, HRSOD, COD, etc.
- May 7, 2024: We also released the Colab demo for multi-image inference. Many thanks to @rishabh063 for his support on it.
- Apr 9, 2024: Thanks to Features and Labels Inc. for deploying a cool online BiRefNet inference API and providing me with strong GPU resources for 4 months of more extensive experiments!
- Mar 7, 2024: We released the BiRefNet codes, the well-trained weights for all tasks in the original papers, and all related stuff in my GDrive folder. Meanwhile, we also deployed BiRefNet on Hugging Face Spaces for easier online use and released the Colab demo for inference and evaluation.
- Jan 7, 2024: We released our paper on arXiv.

Load BiRefNet in one line of code with Hugging Face transformers:

```python
from transformers import AutoModelForImageSegmentation

birefnet = AutoModelForImageSegmentation.from_pretrained('zhengpeng7/BiRefNet', trust_remote_code=True)
```
You can access the inference API service of BiRefNet on FAL or click the Deploy button on our HF model page to set up your own deployment.
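The loaded model expects a normalized, fixed-size input and returns a logit map that is resized back and thresholded into a mask. A minimal sketch of that pre-/post-processing in NumPy, assuming ImageNet normalization statistics and a 1024x1024 input (the exact transform may differ per checkpoint, and a real pipeline would use bilinear resizing):

```python
import numpy as np

# ImageNet normalization statistics, commonly used with Swin backbones (assumption).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image: np.ndarray, size: int = 1024) -> np.ndarray:
    """HWC uint8 image -> 1xCxHxW float32 tensor, resized and normalized."""
    h, w, _ = image.shape
    # Nearest-neighbor resize to size x size (placeholder for bilinear interpolation).
    rows = (np.arange(size) * h // size).clip(0, h - 1)
    cols = (np.arange(size) * w // size).clip(0, w - 1)
    resized = image[rows][:, cols].astype(np.float32) / 255.0
    normalized = (resized - MEAN) / STD
    return normalized.transpose(2, 0, 1)[None]  # NCHW

def postprocess(logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """1x1xHxW logit map -> HxW binary mask via sigmoid + threshold."""
    probs = 1.0 / (1.0 + np.exp(-logits[0, 0]))
    return (probs > threshold).astype(np.uint8)

# Demo with synthetic data in place of a real image and real model output.
img = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)
x = preprocess(img)                                     # shape (1, 3, 1024, 1024)
mask = postprocess(np.random.randn(1, 1, 1024, 1024))   # shape (1024, 1024)
```

In practice, `x` would be fed to the model (as a torch tensor) and the model's last output map fed to `postprocess`.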
For more general use of BiRefNet, I extended the original academic model into more general-purpose versions for better real-life application.
Datasets and backbones are best downloaded from their official pages, but you can also download the packaged ones: DIS, HRSOD, COD, Backbones.
Find the performance (almost all metrics) of all models in the `exp-TASK_SETTINGS` folders in [stuff].
We found there've been some 3rd party applications based on our BiRefNet. Many thanks for their contribution to the community!
Choose the one you like and try it with clicks instead of code:
Applications:

https://github.com/user-attachments/assets/6cce7ca7-7817-4406-b6c4-6d4e8c414ed4


Thanks dimitribarbot/sd-webui-birefnet: this project adds a BiRefNet section to the Extras tab of the original Stable Diffusion WebUI.
Thanks fal.ai/birefnet: this project on fal.ai serves BiRefNet online with more useful options in both the UI and the API for calling the model.
Thanks ZHO-ZHO-ZHO/ComfyUI-BiRefNet-ZHO: this project further improves the UI for BiRefNet in ComfyUI, especially for video data.
https://github.com/ZhengPeng7/BiRefNet/assets/25921713/3a1c7ab2-9847-4dac-8935-43a2d3cd2671
Thanks viperyl/ComfyUI-BiRefNet: this project packs BiRefNet as ComfyUI nodes and makes this SOTA model easier to use for everyone.
Thanks Rishabh for offering a demo for easier multi-image inference on Colab.
Model Extensions

Thanks nusu-github/BiRefNet-Burn: this project re-implemented BiRefNet in Burn, a new deep-learning framework in Rust.
Thanks Acly/BiRefNet-GGUF: this project converted BiRefNet to the GGUF format for lightweight inference on consumer hardware with vision.cpp in C++.
```cpp
// Load an image, run BiRefNet from a GGUF file, and save the predicted mask
// (vision.cpp API, as shown in the BiRefNet-GGUF project).
image_data image = image_load("input.png");
backend_device device = backend_init();
birefnet_model model = birefnet_load_model("BiRefNet-F16.gguf", device);
image_data mask = birefnet_compute(model, image);
image_save(mask, "mask.png");
```
Thanks briaai/RMBG-2.0: this project trained BiRefNet with their high-quality private data, which brings improvement on the DIS task. Note that their weights are for non-commercial use only and do not handle transparency, since they were trained in the DIS task setting, which focuses only on predicting binary masks.
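The binary-mask limitation matters for compositing: a hard mask cannot represent partial transparency at soft edges (hair, glass), while a soft alpha matte can. A minimal NumPy illustration with synthetic pixel values (all values here are made up for the example):

```python
import numpy as np

fg = np.full((2, 2, 3), 200.0)   # synthetic foreground color
bg = np.full((2, 2, 3), 50.0)    # synthetic background color

# Soft alpha matte: 0.5 at a boundary pixel blends fg and bg smoothly.
alpha = np.array([[1.0, 0.5], [0.5, 0.0]])[..., None]
soft = alpha * fg + (1 - alpha) * bg       # boundary pixels blend to 125.0

# Binary mask from the DIS-style setting: thresholding destroys the blend.
hard = (alpha > 0.5).astype(np.float32)
binary = hard * fg + (1 - hard) * bg       # boundary pixels snap to 50.0
```

This is why a matting-trained model (soft alpha output) and a DIS-trained model (binary mask output) suit different downstream uses.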

More Visual Comparisons
```shell
# PyTorch>=2.5.0 (I try to keep everything as up to date as possible) is used for faster training (~40%) with compilation.
conda create -n birefnet python=3.11 -y && conda activate birefnet
pip install -r requirements.txt
```
Download the combined training / test sets I have organized from DIS--COD--HRSOD, the single official ones in the single_ones folder, or their official pages. You can also find the same ones on my BaiduDisk: DIS--COD--HRSOD.
Download backbone weights from my Google Drive folder or their official pages.
```shell
# Train & Test & Evaluation
./train_test.sh RUN_NAME GPU_NUMBERS_FOR_TRAINING GPU_NUMBERS_FOR_TEST
# Example: ./train_test.sh tmp-proj 0,1,2,3,4,5,6,7 0

# See train.sh / test.sh for training only / test-evaluation only.
# After the evaluation, run `gen_best_ep.py` to select the best ckpt by a specific metric (choose from Sm, wFm, HCE (DIS only)).
```
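Conceptually, `gen_best_ep.py` picks the checkpoint whose validation score is best on your chosen metric, where Sm and wFm are higher-is-better and HCE (human correction efforts, DIS only) is lower-is-better. A simplified sketch of that selection logic, with hypothetical per-epoch scores (this is not the script itself):

```python
# Hypothetical per-epoch validation scores keyed by metric name.
scores = {
    240: {"Sm": 0.921, "wFm": 0.902, "HCE": 1012.0},
    244: {"Sm": 0.927, "wFm": 0.910, "HCE": 987.0},
    248: {"Sm": 0.925, "wFm": 0.911, "HCE": 995.0},
}

def best_epoch(scores: dict, metric: str = "Sm") -> int:
    """Return the epoch with the best value of `metric`.

    HCE counts required human corrections, so lower is better;
    Sm and wFm are quality scores, so higher is better.
    """
    lower_is_better = metric == "HCE"
    key = lambda ep: scores[ep][metric]
    return min(scores, key=key) if lower_is_better else max(scores, key=key)

print(best_epoch(scores, "Sm"))   # -> 244
print(best_epoch(scores, "HCE"))  # -> 244
```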
A video of the tutorial on BiRefNet fine-tuning has been released on my YouTube channel ⬇️
If you have some custom data, fine-tuning on it tends to bring improvement.
- Data organization: put your data under `${data_root_dir}/TASK_NAME/DATASET_NAME`, e.g., `${data_root_dir}/DIS5K/DIS-TR` and `${data_root_dir}/General/TR-HRSOD`, where `im` and `gt` are both in each dataset folder.
- Replace `'General'` (with single quotes) throughout the whole project with your custom task name, as the VS Code screenshot below shows.
- In `config.py`, set:
  - `sys_home_dir`: path to the root folder containing codes / datasets / weights / ... -- the project folder, data folder, and backbone weights folder are `${sys_home_dir}/codes/dis/BiRefNet`, `${sys_home_dir}/datasets/dis/General`, and `${sys_home_dir}/weights/cv/swin_xxx`, respectively.
  - `testsets`: your validation set.
  - `training_set`: your training set.
  - `lambdas_pix_last`: adapt the weights of the different losses if you want, especially for the difference between segmentation (a classification task) and matting (a regression task).
- To load existing weights, use the `resume` argument in `train.py`. Attention: training continues from the epoch the weights file name indicates (e.g., 244 in `BiRefNet-general-epoch_244.pth`), not from 1. So, if you want to fine-tune for 50 more epochs, specify the number of epochs as 294. \#Epochs, \#last epochs for validation, and the validation step are set in `train.sh`.
- Download `BiRefNet-{TASK}-{EPOCH}.pth` from [stuff] and the release page of this repo. Info on the corresponding weights (predicted_maps / performance / training_log) can also be found in folders like `exp-BiRefNet-{TASK_SETTINGS}` in the same directory.
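Since training resumes from the epoch encoded in the weights file name, a small helper can compute the total epoch count to pass for a given number of additional epochs. A hedged sketch (the filename pattern follows the example above; `resume_target_epochs` is a hypothetical helper, not part of the repo):

```python
import re

def resume_target_epochs(weights_name: str, extra_epochs: int) -> int:
    """Parse the epoch from a name like 'BiRefNet-general-epoch_244.pth'
    and return the total epoch count to set for `extra_epochs` more epochs."""
    match = re.search(r"epoch_(\d+)", weights_name)
    if match is None:
        raise ValueError(f"no epoch number found in {weights_name!r}")
    return int(match.group(1)) + extra_epochs

print(resume_target_epochs("BiRefNet-general-epoch_244.pth", 50))  # -> 294
```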
You can also download the weights from the release of this repo.
The results might differ slightly from those in the original paper; you can find them in the `eval_results-BiRefNet-{TASK_SETTINGS}` folder inside each `exp-xx`, and we will update them in the following days. Because the original setup (A100-80G x 8) is very expensive and out of reach for many people (including myself...), I re-trained BiRefNet on a single A100-40G and achieved performance on the same level (even better). This means you can train the model directly on a single GPU with 36.5 GB+ of memory. BTW, 5.5 GB of GPU memory is needed for inference at 1024x1024. (I personally paid a lot to rent an A100-40G to re-train BiRefNet on the three tasks... T_T. Hope it can help you.)
But if you have more and more powerful GPUs, you can set GPU IDs and increase the batch size in config.py to accelerate the training. We have made all these kinds of things adaptive in scripts to seamlessly switch between single-card training and multi-card training. Enjoy it :)
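When scaling from one card to many, the effective batch size is the per-GPU batch size times the number of GPUs, and a common heuristic is to scale the learning rate linearly with it. This is a generic rule of thumb, not necessarily what `config.py` does:

```python
def scaled_lr(base_lr: float, base_batch: int,
              per_gpu_batch: int, num_gpus: int) -> float:
    """Linear LR scaling rule: grow the learning rate in proportion
    to the effective batch size (a common heuristic, not BiRefNet-specific)."""
    effective_batch = per_gpu_batch * num_gpus
    return base_lr * effective_batch / base_batch

# Moving from 1 GPU at batch 4 to 8 GPUs at batch 4 per GPU:
print(scaled_lr(1e-4, base_batch=4, per_gpu_batch=4, num_gpus=8))  # -> 0.0008
```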
This project was originally built for DIS only. But update after update, it grew larger and larger, with many functions embedded together. Finally, you can use it for any binary image segmentation task, such as DIS/COD/SOD, medical image segmentation, anomaly segmentation, etc. You can easily enable/disable the features below (usually in config.py):
Many thanks to the companies / institutes below.
```bibtex
@article{zheng2024birefnet,
  title={Bilateral Reference for High-Resolution Dichotomous Image Segmentation},
  author={Zheng, Peng and Gao, Dehong and Fan, Deng-Ping and Liu, Li and Laaksonen, Jorma and Ouyang, Wanli and Sebe, Nicu},
  journal={CAAI Artificial Intelligence Research},
  volume={3},
  pages={9150038},
  year={2024}
}
```
For any questions, discussions, or even complaints, feel free to open an issue here (recommended), send me an e-mail (zhengpeng0108@gmail.com), or book a meeting with me: calendly.com/zhengpeng0108/30min. You can also join the Discord group (https://discord.gg/d9NN5sgFrq) if you want to talk a lot publicly.