Distilling Holistic Knowledge with Graph Neural Networks

Abstract

Knowledge Distillation (KD) aims at transferring knowledge from a larger, well-optimized teacher network to a smaller, learnable student network. Existing KD methods have mainly considered two types of knowledge: individual knowledge and relational knowledge. However, these two types of knowledge are usually modeled independently, and the inherent correlations between them are largely ignored. For sufficient student learning, it is critical to integrate individual and relational knowledge while preserving their inherent correlation. In this paper, we propose to distill a novel holistic knowledge based on an attributed graph constructed among instances. The holistic knowledge is represented as a unified graph-based embedding obtained by aggregating individual knowledge from relational neighborhood samples with graph neural networks; the student network is then trained by distilling this holistic knowledge in a contrastive manner. Extensive experiments and ablation studies on benchmark datasets demonstrate the effectiveness of the proposed method. The code is available at https://github.com/wyc-ruiker/HKD.
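As a rough illustration of the pipeline the abstract describes, the following is a minimal PyTorch sketch: a kNN graph is built among the instances in a batch, one GNN aggregation step fuses each instance's features with those of its graph neighbors, and an InfoNCE-style contrastive loss aligns the student's holistic embeddings with the teacher's. All names here (`knn_adjacency`, `holistic_embedding`, `contrastive_kd_loss`), the single-layer mean aggregation, and the hyperparameters are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn.functional as F

def knn_adjacency(features, k=4):
    """Build a row-normalized kNN graph over a batch from cosine similarity."""
    z = F.normalize(features, dim=1)
    sim = z @ z.t()
    sim.fill_diagonal_(float("-inf"))            # exclude self when picking neighbors
    idx = sim.topk(k, dim=1).indices             # k nearest neighbors per instance
    adj = torch.zeros_like(sim).scatter_(1, idx, 1.0)
    adj = ((adj + adj.t()) > 0).float()          # symmetrize the graph
    adj = adj + torch.eye(adj.size(0))           # self-loops keep individual knowledge
    return adj / adj.sum(dim=1, keepdim=True)    # row-normalize for mean aggregation

def holistic_embedding(features, adj, proj):
    """One GNN aggregation step: fuse each instance's own features
    (individual knowledge) with its neighbors' (relational knowledge)."""
    return F.relu(proj(adj @ features))

def contrastive_kd_loss(student_emb, teacher_emb, temperature=0.1):
    """InfoNCE-style loss: the student's embedding of each instance should
    match the teacher's embedding of that instance and repel the rest."""
    s = F.normalize(student_emb, dim=1)
    t = F.normalize(teacher_emb, dim=1)
    logits = s @ t.t() / temperature
    targets = torch.arange(s.size(0))            # positive pairs on the diagonal
    return F.cross_entropy(logits, targets)

# Usage sketch: build the graph from teacher features, aggregate both
# networks' features over the same graph, and distill contrastively.
batch, d_teacher, d_student, d_common = 32, 512, 128, 64
teacher_feat = torch.randn(batch, d_teacher)     # frozen teacher activations
student_feat = torch.randn(batch, d_student)     # learnable student activations
proj_t = torch.nn.Linear(d_teacher, d_common)
proj_s = torch.nn.Linear(d_student, d_common)

adj = knn_adjacency(teacher_feat)
loss = contrastive_kd_loss(
    holistic_embedding(student_feat, adj, proj_s),
    holistic_embedding(teacher_feat, adj, proj_t),
)
loss.backward()
```

Building the graph from teacher features and sharing it across both networks reflects the idea that the teacher's well-optimized representation defines the instance neighborhood; the exact graph construction and GNN depth are design choices of the paper that this sketch does not attempt to reproduce.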

Publication
In IEEE/CVF International Conference on Computer Vision (ICCV)
Defang Chen (Postdoctoral Researcher)
Jiawei Chen (Researcher)
Can Wang (Professor)