Data Intelligence Strategies in Publishing

Posted on: September 22, 2022

Author: Chitra Swaminathan, Associate Vice President, Operations at Straive

Publishers Must Leverage Data for Greater Revenues

For decades, publishers operated in a world where a seemingly unending stream of new authors submitted ever-increasing numbers of articles for consideration. Today, however, even the most established and valued titles face heightened competition from other journals, and self-publishing has emerged as a viable alternative. Publishers therefore need to make carefully considered, data-driven decisions by analyzing the wealth of data they possess and turning it into actionable insights.

In a publishing landscape where content is king, data is the new power behind the throne. Publishers are using their data more than ever. Some have also begun developing their own online platforms and launching data-sharing services to give academic and research communities greater access to their content. In an increasingly complex digital age, the most successful publishers leverage their data to understand their market position, their customers, and the positioning of their journals.

Data-driven understanding will enable new business models, increasing sales and minimizing financial risks.

Accessing Hidden Insights in Research and Publishing Data

Publishers have access to vast amounts of data that must be processed and refined at scale to gain insights and drive business transformations. Various data points exist: for example, submission and publication records, usage data, sales information, and researcher satisfaction scores (as both authors and reviewers), as well as bibliometric and citation databases such as PubMed, Scopus, and Google Scholar.

Weighing these data points against the desires and needs of a journal’s target audience, determined through qualitative feedback from the editorial board and the wider community, and tracking citation trends against competitor titles can unearth emerging research topics, key funders, and leading research institutions, among other insights.

Research topics evolve, funder and institutional mandates change, and researchers need different things from journals over time. When capacity allows, annual reviews of both the data and the strategies built on them allow plans to be adjusted in line with the research ecosystem’s needs.

It therefore helps publishers to first identify their data needs and then mobilize internal systems to collect the necessary intelligence, such as publication metadata, trending topics, and signals for finding the right readers.

Exhibit 1: Data Source Universe

Source: Straive.

A critical challenge with publishers’ internal data is that such information is often derived from free text entered by authors into submission systems. Without the Open Researcher and Contributor ID (ORCID), an open, persistent digital identifier for researchers, it can be challenging to identify which unique researchers are engaged with a journal.

Moreover, connecting those researchers to an institution becomes nearly impossible. Robust submission workflows, such as requiring authors and coauthors to include their ORCID during manuscript submission, can help overcome these limitations.
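For illustration, the minimal sketch below (with hypothetical field names rather than any specific submission system’s schema) shows how keying on ORCID collapses name variants that would otherwise look like distinct researchers:

```python
# Hypothetical submission records exported from a submission system.
# Field names (name, orcid, institution) are illustrative only.
submissions = [
    {"name": "J. Smith", "orcid": "0000-0002-1825-0097", "institution": "Univ. of Example"},
    {"name": "Jane Smith", "orcid": "0000-0002-1825-0097", "institution": "University of Example"},
    {"name": "A. Kumar", "orcid": None, "institution": "Sample Institute"},
]

# Counting unique researchers by free-text name over-counts:
# "J. Smith" and "Jane Smith" look like two people.
unique_by_name = {rec["name"] for rec in submissions}

# Keying on ORCID collapses the variants; records without an ORCID
# fall back to the name string and remain ambiguous.
unique_by_orcid = {rec["orcid"] or rec["name"] for rec in submissions}

print(len(unique_by_name))   # 3 -- inflated
print(len(unique_by_orcid))  # 2 -- ORCID resolves the duplicate
```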

Mapping authors onto institutions offers a heat map of research activity within a journal’s community. Furthermore, research institutions can be mapped onto subscribers, creating a complete picture of a journal’s reach.
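A simplified sketch of that mapping, assuming affiliations have already been normalized to institution names and that a subscriber list is available (both illustrative here), might look like this:

```python
from collections import Counter

# Illustrative, pre-normalized affiliation data: one entry per authorship.
authorships = [
    {"article_id": "A1", "institution": "University of Example"},
    {"article_id": "A2", "institution": "University of Example"},
    {"article_id": "A3", "institution": "Sample Institute"},
]
subscribers = {"University of Example"}  # hypothetical subscriber list

# Count authorships per institution -- the raw material for a heat map.
activity = Counter(a["institution"] for a in authorships)

# Combine activity with subscription status to gauge the journal's reach.
for institution, count in activity.most_common():
    status = "subscriber" if institution in subscribers else "non-subscriber"
    print(f"{institution}: {count} authorships ({status})")
```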

A Data-Led Strategy Unlocks New Revenue Streams for Publishers

New business models have taken root in publishing. Recently, the open access (OA) model has gained considerable significance. But any sales approach, whether OA or traditional subscription, needs to deliver the potential for growth in a business context. For a publisher, this means understanding institutional affiliations, market trends, supply chains, author surveys, and more.

When negotiating a “transformative deal,” a publisher must first determine which unique articles can be associated with each institution that may accept such a deal to promote OA. Negotiations frequently need at least three years of publication records.

Institutions often want to know the total number of unique articles for which one of their researchers is a corresponding author, as well as the total number of unique articles for which one of their researchers is a noncorresponding author. Authors typically include their affiliations as plain text fields within a manuscript document or the submission form.
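The count itself is a straightforward aggregation once affiliations are structured. The sketch below uses illustrative records and field names, not any particular system’s schema:

```python
from collections import defaultdict

# Illustrative article records; in practice these would come from
# three or more years of publication data.
articles = [
    {"id": "A1", "authors": [
        {"institution": "University of Example", "corresponding": True},
        {"institution": "Sample Institute", "corresponding": False},
    ]},
    {"id": "A2", "authors": [
        {"institution": "Sample Institute", "corresponding": True},
        {"institution": "University of Example", "corresponding": False},
    ]},
]

corresponding = defaultdict(set)      # institution -> unique article ids
noncorresponding = defaultdict(set)

for article in articles:
    for author in article["authors"]:
        bucket = corresponding if author["corresponding"] else noncorresponding
        bucket[author["institution"]].add(article["id"])

# Sets keep the counts unique even if an institution has several
# authors on the same article.
institutions = {a["institution"] for art in articles for a in art["authors"]}
for inst in institutions:
    print(inst,
          "corresponding:", len(corresponding[inst]),
          "noncorresponding:", len(noncorresponding[inst]))
```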

In many cases, publishers have their collected data stored in siloed systems, limiting the potential for analysis or reuse. Before moving these data out of silos, publishers first need to identify which data points are important, which systems collect and store them, and which can be used to answer operational or strategic questions.

Then a publisher can start tracking how these vital data points move between systems and departments, identifying and removing inefficiencies, such as rekeying instead of exporting data between systems.

Advantages of Metadata Management

Smart content discovery depends extensively on well-crafted metadata, enabling researchers to scan expansive databases and subscribe to content they find helpful. A publisher’s discoverability strategy should introduce new pieces of metadata to its published content.

Contextual enrichment using controlled vocabularies or taxonomies, either in addition to or in place of author-provided keywords, has been a feature of scholarly publishing for some time. Newer machine learning (ML) technologies offer publishers opportunities to go further, automatically identifying and tagging articles for topics at different levels of granularity.
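As a deliberately simplified, dependency-free illustration of concept tagging against a controlled vocabulary (a production workflow would typically rely on a trained ML classifier rather than literal term matching, and on a far larger taxonomy):

```python
# Tiny illustrative taxonomy: concept -> trigger terms. Real controlled
# vocabularies (e.g., MeSH or a publisher's own taxonomy) are far larger.
taxonomy = {
    "machine learning": ["neural network", "deep learning", "classifier"],
    "open access": ["open access", "article processing charge"],
    "bibliometrics": ["citation analysis", "h-index"],
}

def tag_article(abstract: str) -> list[str]:
    """Return taxonomy concepts whose trigger terms appear in the abstract."""
    text = abstract.lower()
    return [concept for concept, terms in taxonomy.items()
            if any(term in text for term in terms)]

abstract = ("We apply a deep learning classifier to citation analysis "
            "of open access journals.")
print(tag_article(abstract))
# ['machine learning', 'open access', 'bibliometrics']
```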

Along with using tags to create informative content collections, editorial teams can also offer users a new way of navigating content based on concepts or topics rather than titles, authors’ names, or journals. Moreover, metadata can improve the discovery of accessible publications and their features, enhancing a journal’s visibility and boosting awareness.

Artificial intelligence and machine learning (AI/ML)-led automation can effectively read metadata and source essential details such as a journal’s International Standard Serial Number (ISSN), publication history dates, authors’ names, keywords, and more. Automation simplifies metadata production, allowing it to move smoothly between the peer-review system, the production service, and the journal hosting platform, and saving editorial teams the time they would otherwise spend reentering the same information in multiple places.
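As an illustration of the kind of extraction involved, the sketch below parses a simplified, JATS-inspired metadata snippet defined inline; real journal XML is richer and governed by a formal schema:

```python
import xml.etree.ElementTree as ET

# Simplified, JATS-inspired metadata defined inline for illustration.
xml = """
<article>
  <journal-meta><issn>1234-5678</issn></journal-meta>
  <article-meta>
    <pub-date><year>2022</year><month>09</month></pub-date>
    <contrib-group>
      <contrib><name>Jane Smith</name></contrib>
      <contrib><name>A. Kumar</name></contrib>
    </contrib-group>
    <kwd-group><kwd>open access</kwd><kwd>metadata</kwd></kwd-group>
  </article-meta>
</article>
"""

root = ET.fromstring(xml)
record = {
    "issn": root.findtext("./journal-meta/issn"),
    "published": "-".join(filter(None, [
        root.findtext("./article-meta/pub-date/year"),
        root.findtext("./article-meta/pub-date/month"),
    ])),
    "authors": [c.text for c in root.findall("./article-meta/contrib-group/contrib/name")],
    "keywords": [k.text for k in root.findall("./article-meta/kwd-group/kwd")],
}
print(record)
# One structured record like this can then flow between peer review,
# production, and hosting systems without manual rekeying.
```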

As publishers seek to retain submissions in their publishing ecosystems, the need for a quick and flawless transfer of review information, submission files, and metadata between publications has become increasingly evident.

Exhibit 2: Institutional Engagement Matrix

Source: Straive.

For publishers, a strategic use of metadata includes extracting useful information about organizations that have funded research in their disciplinary fields. Critically, they can analyze an organization’s attitudes toward and policies regarding OA, which is useful when considering whether to launch new OA journals or pursue an OA transformation or a flip. Trending research areas can surface from analytics regarding active funders.
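A minimal sketch of such funder analytics, using illustrative article records with funder and topic fields, could be as simple as counting co-occurrences:

```python
from collections import Counter

# Illustrative article metadata with funder acknowledgements and topics.
articles = [
    {"id": "A1", "funders": ["Wellcome Trust"], "topics": ["genomics"]},
    {"id": "A2", "funders": ["Wellcome Trust", "NIH"], "topics": ["genomics", "public health"]},
    {"id": "A3", "funders": ["NIH"], "topics": ["public health"]},
]

# Which funders are most active in the journal's field?
funder_activity = Counter(f for a in articles for f in a["funders"])

# Which topics do the most active funders support?
topics_by_funder = {}
for a in articles:
    for f in a["funders"]:
        topics_by_funder.setdefault(f, Counter()).update(a["topics"])

print(funder_activity.most_common())
print({f: c.most_common(1) for f, c in topics_by_funder.items()})
```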

Data Is the Master Key

Creating and implementing a data strategy is not an end in itself. Using clean and structured data to provide value to the business and customers is the goal. Many publishers are under revenue pressure. With a data pipeline that keeps the central data repository frequently updated, a publisher’s sales or customer service teams can offer an institution regular, automatically generated reports on new publications from their authors, including their impact or reach. Data holds all the cards!
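For example, a report generator over a central repository might be sketched as follows, with illustrative records and field names standing in for a real, continuously refreshed data pipeline:

```python
from datetime import date

# Illustrative central repository records; a real pipeline would refresh
# these continuously from submission, production, and usage systems.
repository = [
    {"title": "Article A", "institution": "University of Example",
     "published": date(2022, 8, 15), "downloads": 420, "citations": 3},
    {"title": "Article B", "institution": "Sample Institute",
     "published": date(2022, 9, 1), "downloads": 120, "citations": 0},
]

def monthly_report(institution: str, since: date) -> str:
    """Summarize recent publications and their reach for one institution."""
    recent = [r for r in repository
              if r["institution"] == institution and r["published"] >= since]
    lines = [f"New publications for {institution} since {since}:"]
    for r in recent:
        lines.append(f"- {r['title']}: {r['downloads']} downloads, "
                     f"{r['citations']} citations")
    return "\n".join(lines)

print(monthly_report("University of Example", date(2022, 8, 1)))
```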
