Microsoft Corporation (MSFT.US) has teamed up with HarperCollins, a subsidiary of News Corporation Class B (NWS.US), to train AI models on a large collection of book data.
Microsoft has reached an agreement with HarperCollins, a subsidiary of News Corporation Class B, to use its extensive collection of non-fiction books to train its artificial intelligence model.
According to sources, Microsoft Corporation (MSFT.US) has reached an agreement with HarperCollins Publishers, a subsidiary of News Corporation Class B (NWS.US), to use the publisher's extensive non-fiction catalog to train its artificial intelligence models and improve their quality and performance. The collaboration is limited to using selected older titles for model training and does not involve creating new books. Authors have the right to choose whether to participate.
Specifically, Microsoft Corporation hopes to incorporate HarperCollins books into its yet-to-be-announced artificial intelligence model to expand high-quality text sources, improve the accuracy of the model, and enhance its capability to provide professional knowledge. Although Microsoft Corporation declined to comment, HarperCollins has confirmed the agreement, stating that it will "allow limited use of selected non-fiction old books for training artificial intelligence models."
Furthermore, HarperCollins emphasized that the scope of the agreement is limited, with clear restrictions on model output that respect authors' rights, and that authors can choose whether to participate.
HarperCollins stated that part of its role is to present authors with opportunities for their consideration while protecting the underlying value of their works and the shared revenue and royalties. The publisher said that this agreement, with its limited scope and clear limits on model output that respect authors' rights, achieves that goal.
It is understood that tech companies have been seeking more high-quality text sources to train artificial intelligence models, and companies like Microsoft Corporation are no exception. By obtaining licenses to use a range of data from social media websites to news articles, they aim to make their programs more accurate and better at answering questions or providing specialized knowledge on specific topics.
It is worth mentioning that News Corporation had previously signed an agreement with OpenAI allowing the use of content from its various publications. Microsoft Corporation has also collaborated with several publishers on artificial intelligence projects.
Additionally, earlier this year, Alphabet Inc.'s Google reached a $60 million agreement with Reddit, allowing the search giant to use content from a large number of subreddits to train its AI models.
However, some publishers have objected to the unauthorized use of their content by artificial intelligence companies and have filed lawsuits. For instance, The New York Times Company sued OpenAI and Microsoft Corporation for copyright infringement.
In conclusion, the agreement between Microsoft Corporation and HarperCollins marks another step for tech companies seeking high-quality text sources to train artificial intelligence models. However, respecting authors' rights while using these resources remains a challenge that publishers and tech companies need to address together.