Cyberspace Administration of China: The security of synthetic data should be evaluated when it is used for model training and key capability optimization

08/02/2026

The Cyberspace Administration of China has issued a notice soliciting opinions on the Interim Measures for the Management of Artificial Intelligence Avatar Interaction Services. The draft measures state that when providers carry out data processing activities such as pre-training and optimization training, they should strengthen the management of training data and comply with the following requirements: use datasets that conform to core socialist values and reflect excellent traditional Chinese culture; clean and annotate training data to improve its transparency and reliability and to guard against data poisoning and data tampering; increase the diversity of training data and improve the safety of model-generated content through methods such as negative sampling and adversarial training; evaluate the security of synthetic data when using it for model training and key capability optimization; strengthen routine inspection of training data and regularly iterate and upgrade the data to continuously optimize the performance of products and services; and ensure that the sources of training data are lawful and traceable, taking necessary measures to safeguard data security and prevent data leakage.