Advisor Profile

Xiaoyan Jiang, Associate Professor and master's supervisor, received her Ph.D. in computer science from Friedrich Schiller University Jena, Germany.

Xiaoyan Jiang has been an Associate Professor at Shanghai University of Engineering Science since January 2020. She is a Visiting Scholar at the Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, the Netherlands. She received her Ph.D. in computer science from Friedrich Schiller University Jena, Germany, in 2015. She has published more than 60 papers in the fields of computer vision and artificial intelligence, and has served as an Associate Editor of Applied Intelligence (IF: 3.9) since 2024.


Courses: Computer Vision, Machine Learning, Digital Image Processing, Object-Oriented Programming in C++
Links: Google Scholar | GitHub | Visiting Scholar Homepage, Leiden University
E-mail: xiaoyan.jiang@sues.edu.cn


Her research interests are computer vision and deep learning, with applications in video surveillance, computer-aided medical analysis, industrial inspection, scene understanding, and intelligent transportation. She has published more than 60 papers in computer vision and artificial intelligence, over 50 of them SCI/EI-indexed, in venues including IEEE Trans. SMC, IEEE Trans. ITS, Pattern Recognition, Knowledge-Based Systems (KBS), SPIC, ICIP, ICONIP, and ICME. She reviews for multiple top international conferences and journals, served on the program committees of ICPCSEE 2019 and IEA/AIE 2023, gave keynote talks at the CiSE 2023, IWITC 2021, and ICFTIC 2019 international conferences, and serves as an Associate Editor of Applied Intelligence. Her work has been funded by German DAAD and China Scholarship Council (CSC) scholarships. She is a member of a collective awarded the university's May Fourth Youth Medal, a core member of the Intelligent Visual Perception and Information Processing innovation team (fourth round of innovation teams of Changning District, Shanghai), and a recognized outstanding teacher of her college. She has led or participated in many projects, including an NSFC Young Scientists Fund project, a civil aviation key project, an NSFC General Program project, Shanghai Municipal Education Commission projects, a key project of the Shanghai Science and Technology Commission, and projects with Shanghai Aircraft Manufacturing Co., Ltd. She has filed eight invention patent applications and five utility model patents, and holds multiple software copyrights.


She currently leads the Multi-dimensional Artificial Intelligence research team in the School of Electronic and Electrical Engineering. Alongside its research, the team actively pushes AI technology toward industrialization, carrying out industry-university-research collaborations with companies across sectors under a "5G + AI" model. It has produced results in intelligent transportation, 3D scene modeling, video surveillance, defect inspection, and smart healthcare, and has deployed them in real-world settings. The team's research covers several computer vision topics: multi-object tracking, domain-adaptive person re-identification, semantic segmentation, and visual SLAM. Projects led by the team have been applied to intelligent transportation, video surveillance, paint defect inspection on the surfaces of large passenger aircraft, industrial defect detection, detection of lymph node metastasis in gastric cancer, nystagmus-based disease diagnosis, and cardiac cycle analysis.


The team puts student development first: it builds a solid grounding in the key knowledge and theory spanning classical vision algorithms, deep learning, and large models, and uses real-world problems to train students to think independently and to identify and solve problems. The aim is to spark a lasting inner drive for lifelong learning, so that the team grows and develops together with its members. If you hold yourself to high standards, are curious about science, and are willing to work hard to solve problems, this team is a good fit. You are welcome to join!


Xiaoyan Jiang, Zhi Zhou, Hailing Wang, Guozhong Wang, and Zhijun Fang, 2025

Team paper "TexLiverNet: Leveraging Medical Knowledge and Spatial-Frequency Perception for Enhanced Liver Tumor Segmentation" has been accepted by the 2025 IEEE International Symposium on Biomedical Imaging (ISBI), a leading IEEE venue in medical imaging. Congratulations!

Abstract:
Integrating textual data with imaging in liver tumor segmentation is essential for enhancing diagnostic accuracy. However, current multi-modal medical datasets offer only general text annotations, lacking lesion-specific details critical for extracting nuanced features, especially for fine-grained segmentation of tumor boundaries and small lesions. To address these limitations, we developed datasets with lesion-specific text annotations for liver tumors and introduced the TexLiverNet model. TexLiverNet employs an agent-based cross-attention module that integrates text features efficiently with visual features, significantly reducing computational costs. Additionally, enhanced spatial and adaptive frequency domain perception is proposed to precisely delineate lesion boundaries, reduce background interference, and recover fine details in small lesions. Comprehensive evaluations on public and private datasets demonstrate that TexLiverNet achieves superior performance compared to current state-of-the-art methods.
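The agent-based cross-attention idea can be pictured with a short sketch: a small set of learnable agent tokens first summarizes the text features, and the visual tokens then attend to the agents instead of to every text token, reducing the quadratic cross-attention cost. Below is a minimal PyTorch sketch under our own assumptions; the module name, dimensions, and two-stage attention layout are illustrative, not TexLiverNet's actual implementation:

```python
import torch
import torch.nn as nn

class AgentCrossAttention(nn.Module):
    """Hypothetical agent-mediated fusion: O(N_v*A + A*N_t) instead of O(N_v*N_t)."""
    def __init__(self, dim: int, num_agents: int = 16, num_heads: int = 8):
        super().__init__()
        # A small pool of learnable agent tokens that mediates text-visual fusion
        self.agents = nn.Parameter(torch.randn(num_agents, dim) * 0.02)
        self.text_to_agent = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.agent_to_visual = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, visual: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # visual: (B, N_v, dim) image tokens; text: (B, N_t, dim) description tokens
        agents = self.agents.unsqueeze(0).expand(visual.size(0), -1, -1)
        agents, _ = self.text_to_agent(agents, text, text)       # agents summarize the text
        fused, _ = self.agent_to_visual(visual, agents, agents)  # visual tokens read the agents
        return self.norm(visual + fused)                         # residual fusion

# Example: fuse 1024 visual tokens with a 64-token lesion description.
# out = AgentCrossAttention(256)(torch.randn(2, 1024, 256), torch.randn(2, 64, 256))
```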

Download: [preprint]

Keywords: Liver Tumor Segmentation, Medical Text Annotation, Spatial-Frequency Perception

Xiaoyan Jiang, Hang Yang, Kaiying Zhu, Xihe Qiu, Shibo Zhao, and Sifan Zhou, 2024

Team paper "PTQ4RIS: Post-Training Quantization for Referring Image Segmentation" has been accepted by the 2025 IEEE International Conference on Robotics and Automation (ICRA). Congratulations!

Abstract:
Referring Image Segmentation (RIS) aims to segment the object referred to by a given sentence in an image by understanding both visual and linguistic information. However, existing RIS methods tend to pursue top-performing models while disregarding practical deployment on resource-limited edge devices. This oversight poses a significant challenge for on-device RIS inference. To this end, we propose an effective and efficient post-training quantization framework termed PTQ4RIS. Specifically, we first conduct an in-depth analysis of the root causes of performance degradation in RIS model quantization and propose dual-region quantization (DRQ) and reorder-based outlier-retained quantization (RORQ) to address the quantization difficulties in the visual and text encoders. Extensive experiments on three benchmarks with different bit settings (from 8 to 4 bits) demonstrate its superior performance. Importantly, ours is the first PTQ method specifically designed for the RIS task, highlighting the feasibility of PTQ in RIS applications.
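For readers unfamiliar with post-training quantization, the baseline machinery that methods like PTQ4RIS refine is uniform affine quantization of pre-trained weights and activations, with no retraining. The sketch below shows only that generic baseline, not the paper's dual-region or reorder-based schemes, which are more involved; the function names are our own:

```python
import torch

def quantize_tensor(x: torch.Tensor, num_bits: int = 8):
    """Uniform affine quantization: map floats onto the integer grid [0, 2^b - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = (qmin - x.min() / scale).round().clamp(qmin, qmax)
    q = (x / scale + zero_point).round().clamp(qmin, qmax)
    return q, scale, zero_point

def dequantize_tensor(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return (q - zero_point) * scale

# Example: quantize a random weight matrix to 4 bits and measure the error.
w = torch.randn(128, 128)
q, s, z = quantize_tensor(w, num_bits=4)
print(f"4-bit reconstruction MSE: {(w - dequantize_tensor(q, s, z)).pow(2).mean().item():.6f}")
```

Outliers stretch the min-max range and waste quantization levels on rare values, which is exactly why the paper treats outlier-heavy encoder activations separately.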

Download: [preprint]

Keywords: PTQ, Referring Image Segmentation

Xiaoyan Jiang, Xinlong Wan, Kaiying Zhu, Xihe Qiu, and Zhijun Fang, Distribution-aware Noisy-label Crack Segmentation, 2024

Team paper "Distribution-aware Noisy-label Crack Segmentation" has been submitted to the IEEE International Conference on Robotics and Automation (ICRA) and is under review.

Abstract:
Road crack segmentation is critical for robotic systems tasked with the inspection, maintenance, and monitoring of road infrastructures. Existing deep learning-based methods for crack segmentation are typically trained on specific datasets, which can lead to significant performance degradation when applied to unseen real-world scenarios. To address this, we introduce the SAM-Adapter, which incorporates the general knowledge of the Segment Anything Model (SAM) into crack segmentation, demonstrating enhanced performance and generalization capabilities. However, the effectiveness of the SAM-Adapter is constrained by noisy labels within small-scale training sets, including omissions and mislabeling of cracks. In this paper, we present an innovative joint learning framework that utilizes distribution-aware domain-specific semantic knowledge to guide the discriminative learning process of the SAM-Adapter. To our knowledge, this is the first approach that effectively minimizes the adverse effects of noisy labels on the supervised learning of the SAM-Adapter. Our experimental results on two public pavement crack segmentation datasets confirm that our method significantly outperforms existing state-of-the-art techniques. Furthermore, evaluations on the completely unseen CFD dataset demonstrate the high cross-domain generalization capability of our model, underscoring its potential for practical applications in crack segmentation.
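One simple way to picture distribution-aware guidance under label noise is to down-weight the loss at pixels where a domain-specific reference model strongly disagrees with the given (possibly noisy) mask. The sketch below is a hedged illustration of that general idea only, not the paper's actual joint learning framework; the function name and the weighting scheme are assumptions:

```python
import torch
import torch.nn.functional as F

def noise_aware_bce(logits, labels, ref_probs, temperature: float = 5.0):
    """
    logits:    (B, 1, H, W) raw predictions of the model being trained
    labels:    (B, 1, H, W) binary crack masks, possibly containing noise
    ref_probs: (B, 1, H, W) probabilities from a frozen domain-specific reference model
    """
    # Per-pixel agreement between the reference model and the given label, in [0, 1]
    agreement = 1.0 - (ref_probs - labels).abs()
    # Sharpen: pixels the reference model contradicts receive near-zero weight
    weights = torch.sigmoid(temperature * (agreement - 0.5))
    loss = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    return (weights * loss).sum() / weights.sum().clamp(min=1e-8)
```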

Download: [preprint]

Keywords: SAM-Adapter, Crack Detection

Xiaoyan Jiang, Licheng Jiang, Anjie Wang, Kaiying Zhu, and Yongbin Gao, CrackSegDiff: Diffusion Probability Model-based Multi-modal Crack Segmentation, 2024

Team paper "CrackSegDiff: Diffusion Probability Model-based Multi-modal Crack Segmentation" has been submitted to the IEEE International Conference on Robotics and Automation (ICRA) and is under review.

Abstract:
Integrating grayscale and depth data in road inspection robots could enhance the accuracy, reliability, and comprehensiveness of road condition assessments, leading to improved maintenance strategies and safer infrastructure. However, these data sources are often compromised by significant background noise from the pavement. Recent advancements in Diffusion Probabilistic Models (DPM) have demonstrated remarkable success in image segmentation tasks, showcasing potent denoising capabilities, as evidenced in studies like SegDiff [1]. Despite these advancements, current DPM-based segmentors do not fully capitalize on the potential of original image data. In this paper, we propose a novel DPM-based approach for crack segmentation, named CrackSegDiff, which uniquely fuses grayscale and range/depth images. This method enhances the reverse diffusion process by intensifying the interaction between local feature extraction via DPM and global feature extraction. Unlike traditional methods that utilize Transformers for global features, our approach employs Vm-unet [2] to efficiently capture long-range information of the original data. The integration of features is further refined through two innovative modules: the Channel Fusion Module (CFM) and the Shallow Feature Compensation Module (SFCM). Our experimental evaluation on the three-class crack image segmentation tasks within the FIND dataset demonstrates that CrackSegDiff outperforms state-of-the-art methods, particularly excelling in the detection of shallow cracks. Code is available at https://github.com/sky-visionX/CrackSegDiff.
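As a rough illustration of multi-modal channel fusion, the sketch below concatenates grayscale and depth feature maps and reweights the result with a squeeze-and-excitation style gate. It is only one plausible reading of a "Channel Fusion Module"; the name and structure here are assumptions, and the authors' real implementation lives in the linked repository:

```python
import torch
import torch.nn as nn

class ChannelFusion(nn.Module):
    """Hypothetical two-modality fusion: concatenate, project, then gate per channel."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # Squeeze-and-excitation style gate over the fused channels
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, gray_feat: torch.Tensor, depth_feat: torch.Tensor):
        # gray_feat, depth_feat: (B, C, H, W) feature maps from the two modalities
        fused = self.proj(torch.cat([gray_feat, depth_feat], dim=1))
        return fused * self.gate(fused)

# Example: fuse 64-channel grayscale and depth features at 32x32 resolution.
# out = ChannelFusion(64)(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```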

Download: [preprint]

Keywords: Channel Fusion Module, Diffusion Probabilistic Models, Shallow Feature Compensation Module