Not All Operations Contribute Equally: Hierarchical Operation-Adaptive Predictor for Neural Architecture Search
Graph-based predictors have recently shown promising results on neural architecture search (NAS). Despite their efficiency, current graph-based predictors treat all operations equally, resulting in biased topological knowledge of cell architectures. Intuitively, not all operations are equally significant during forward propagation when information from several operations is aggregated into another operation. To address this issue, we propose a Hierarchical Operation-adaptive Predictor (HOP) for NAS. HOP contains an operation-adaptive attention module (OAM) that captures the diverse knowledge between operations by learning the relative significance of operations in cell architectures during aggregation over iterations. In addition, a cell-hierarchical gated module (CGM) further refines and enriches the obtained topological knowledge of cell architectures by integrating cell information from each iteration of OAM. Experimental comparisons with state-of-the-art predictors demonstrate the capability of the proposed HOP. Specifically, using only 0.1% of the training data, HOP improves Kendall's Tau by 3.45% and N@5 by 20 places on NASBench-101; using only 1% of the training data, HOP improves Kendall's Tau by 2.12% and N@5 by 18 places on NASBench-201.
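The two ranking metrics reported above can be illustrated with a minimal sketch. It assumes that Kendall's Tau is computed without tie correction over predicted and ground-truth accuracies, and that N@K denotes the best ground-truth rank among the K architectures with the highest predicted scores (so lower is better, and an improvement is measured in "places"). The function names `kendall_tau` and `n_at_k` and the toy accuracies are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(pred, true):
    """Kendall's Tau (tau-a, no tie correction) between two score lists."""
    concordant = discordant = 0
    for i, j in combinations(range(len(pred)), 2):
        # A pair is concordant when both lists order it the same way.
        sign = (pred[i] - pred[j]) * (true[i] - true[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(pred) * (len(pred) - 1) // 2
    return (concordant - discordant) / n_pairs


def n_at_k(pred, true, k):
    """Best true rank (1 = best) among the k top-predicted architectures.

    Assumed definition of N@K; the abstract itself does not define it.
    """
    # Rank architectures by ground-truth accuracy, highest first.
    by_true = sorted(range(len(true)), key=lambda i: true[i], reverse=True)
    true_rank = {idx: rank + 1 for rank, idx in enumerate(by_true)}
    # Select the k architectures the predictor scores highest.
    top_k = sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)[:k]
    return min(true_rank[i] for i in top_k)


# Toy data (hypothetical): five architectures.
true_acc = [0.91, 0.94, 0.89, 0.93, 0.90]
pred_acc = [0.60, 0.70, 0.40, 0.80, 0.50]

tau = kendall_tau(pred_acc, true_acc)
best_rank_top1 = n_at_k(pred_acc, true_acc, k=1)
best_rank_top2 = n_at_k(pred_acc, true_acc, k=2)
```

In this toy case the predictor swaps only the two best architectures, so one of the ten pairs is discordant and the top-1 selection lands on the second-best architecture, while the top-2 selection already contains the best one.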