
Doomsday deferred

Artificial Intelligence might not be the end of the media, but it will impose costs

Sevanti Ninan Published 17.02.25, 06:59 AM


In the two years since OpenAI released ChatGPT in November 2022, the implications of Artificial Intelligence for online search and the media industry have continued to evolve, though not as drastically as had been predicted.

In 2023, it seemed that online search had begun transitioning from merely serving up links to news articles to providing direct, AI-generated answers to queries. Google and Microsoft both announced revamps of their search engines to include this kind of augmentation. Alarmed analysis followed: the data that large language models are trained on are usually at least a year old, so how current were those direct answers likely to be, particularly when it came to news?


More consequentially, what would this do to the online revenue on which publishers depend? Since 2020, publishers’ associations in different countries have been lobbying for legislation to make platforms pay in a transparent way for the news from which they garner search revenue. If search engines were to depend on direct answers supplied by their own AI, they would not need to abide by the online news laws that some governments had enacted.

Two years down the line, the revenue question has very different dimensions: those of copyright and of licensing content to AI companies that scrape the web to train their large language models. While governments were earlier passing online news laws to help news publishers earn revenue from search giants, the government in the United Kingdom, for example, is now proposing the opposite. It is currently holding hearings on proposals to introduce an ‘opt-out’ copyright regime for AI companies. This would permit tech businesses to scrape publishers’ and creatives’ content from the web unless those rights-holders explicitly forbid it. The move is intended to help AI companies gather content to train large language models so that they can get off the ground more quickly and grow the UK’s AI industry. It has been strongly opposed by individual publishers attending the hearings as well as by the UK’s Professional Publishers Association.

As for publishers earning revenue from licensing content to AI companies, there is not enough evidence yet that permitting and tracking such use yields worthwhile revenue.

The advent of AI prompted the appearance of more search start-ups, including some experimenting with a new hybrid of AI and traditional search labelled retrieval-augmented generation. Search tools first identify the pages with the most relevant material, natural language processing is used to ‘read’ them, and the text is then fed into a large language model, which composes a response to the search query. Two years down the line, though, links to published news are still driving search.
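
As a rough illustration of how such a pipeline fits together, here is a minimal sketch in Python. The tiny corpus, the word-overlap retriever, and the placeholder generate() call are all hypothetical stand-ins, not any particular product's method; a production system would use a real search index and an actual language model.

```python
# A toy illustration of retrieval-augmented generation (RAG).
# The corpus, retrieve(), and generate() below are hypothetical stand-ins.

def retrieve(query, documents, k=2):
    """Rank documents by naive word overlap with the query (stand-in for search)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def generate(prompt):
    """Placeholder for a call to a real large language model."""
    return f"[model answer grounded in a prompt of {len(prompt)} characters]"

def answer(query, documents):
    # Step 1: search identifies the pages with the most relevant material.
    passages = retrieve(query, documents)
    # Step 2: the retrieved text is placed in the prompt alongside the question.
    prompt = ("Answer using only these passages:\n"
              + "\n".join(passages)
              + f"\n\nQuestion: {query}")
    # Step 3: the model composes a textual response grounded in those passages.
    return generate(prompt)

if __name__ == "__main__":
    corpus = [
        "The central bank raised interest rates by 25 basis points on Tuesday.",
        "A new film festival opened in the capital this weekend.",
    ]
    print(answer("What did the central bank do to interest rates?", corpus))
```

The point of the design is that the model answers from freshly retrieved pages rather than from its training data alone, which is what makes the hybrid attractive for news queries.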

But Google Cloud would have us believe that AI agents are taking away some of search’s dependence on news sources. It has a much sunnier take to offer on its success with AI search after the release of Gemini 2.0. At the end of December 2024, it published what it calls real-world gen AI use cases of its augmentation of traditional search with AI agents: “What makes AI agents unique is that they can take actions to achieve specific goals, whether that’s guiding a shopper to the perfect pair of shoes, helping an employee looking for the right health benefits, or supporting nursing staff with smoother patient hand-offs during shift changes.”

But none of this suggests that AI agents can serve up reliable news. Last week, the BBC published research on how accurately AI assistants answer questions about the news. Four prominent, publicly available AI assistants, including ChatGPT and Gemini, were tested over a month on news from the BBC’s website. It was found that 51% of the answers the bots generated had significant issues of some form, such as introducing factual errors or attributing quotes to BBC articles in which they did not actually appear. Following those results, the BBC’s CEO for news and current affairs asked, “We live in troubled times, and how long will it be before an AI-distorted headline causes significant real world harm?”

The other dimension of the relationship between AI and news has been the question of making chatbots more reliable by introducing citations into their functioning. The outcome is a mixed bag, with some chatbots hallucinating URLs, a polite term for citing fake or broken links, and others, such as DeepSeek, actually providing attributions. Nieman Lab found that the Chinese chatbot often credited news articles by including the headline, bylined author, or publication date directly in the text of its response. Of course, it also served up broken links.

For the media industry, AI has rapidly become an innovation that you can neither fully depend on nor do without. Using AI in a reliable way in the newsroom to augment a range of functions is a hugely expensive proposition that a few well-resourced publishers are investing in. If you want to be sure of the data your AI assistants are using, you build your own large language models and AI tools for newsroom functions, and train them on reliable datasets. Well-heeled organisations such as Bloomberg and the Financial Times are investing in building or acquiring AI tools and then putting in substantial research effort to test their performance, as the BBC did.

Bloomberg developed BloombergGPT early on, a large language model specifically trained on a wide range of financial data to support a diverse set of natural language processing tasks within the financial industry. It said then that the complexity and the unique terminology of the financial domain warrant a domain-specific model.

Leading publishers are also looking to create or acquire curated datasets. AI models require high-quality data inputs in order to learn, and quality inputs yield better results. The Financial Times, for instance, offers its content as a dataset. The paper has also run, in partnership with the Google News Initiative, an AI launchpad programme for news organisations from a range of countries that had the technical capability and financial potential. Publishers debate whether it is better to build their own large language models and writing or transcription tools or to buy them. Large news organisations now have AI editors to help make these decisions.

The scenario for the media industry, then, is far from a doomsday one. But ramping up dependable AI use does not come cheap.

Sevanti Ninan is a media commentator. She also publishes the labour newsletter, Worker Web.
