
uspto patent assignment dataset

USPTO OCE Patent Assignment Data

location: https://www.uspto.gov/ip-policy/economic-research/research-datasets/patent-assignment-dataset

contributors: Alan C. Marco, Stuart J.H. Graham, Amanda F. Myers, Paul A. D'Agostino, Kirsten Apple

tags: patents, claims, assignment

timeframe : 1970-2020

terms of use : USPTO’s online databases are not designed or intended to be a source for bulk downloads of USPTO data when accessed through the website’s interfaces. Individuals, companies, IP addresses, or blocks of IP addresses who, in effect, deny or decrease service by generating unusually high numbers of database accesses (searches, pages, or hits), whether generated manually or in an automated fashion, may be denied access to USPTO servers without notice. Bulk data products may be separately obtained from the USPTO, either for free or at the cost of dissemination. For details, see information on Electronic Bulk Data Products: https://www.uspto.gov/learning-and-resources/electronic-bulk-data-products

related publications : http://ssrn.com/abstract=2636461

description : The USPTO allows parties to record assignments of patents and patent applications to, as much as possible, maintain a complete history of claimed interests in a patent. The USPTO also permits recording of other documents that affect title (such as certificates of name change and mergers of businesses) or are relevant to patent ownership (such as licensing agreements, security interests, mortgages, and liens). The 2020 update to the Patent Assignment Dataset contains detailed information on 8.97 million patent assignments and other transactions recorded at the USPTO since 1970 and involving roughly 15.1 million patents and patent applications. It is derived from the recording of patent transfers by parties with the USPTO.

last edit : Fri, 01 Dec 2023 18:14:47 GMT


USPTO Datasets

Protecting inventors and entrepreneurs fuels innovation and creativity, driving advances that can benefit society. As the federal agency that grants patents and registers trademarks, we hold a treasure trove of data. Now we’re giving it to you, faster and easier than before.

  • Trademark daily XML file (TDXF) assignments
  • Patent assignment daily XML (front file)
  • Patent assignment economics data (Stata (.dta) and MS Excel (.csv))
  • Trademark assignment economics data (Stata (.dta) and MS Excel (.csv))
  • Patent assignment annual XML (backfile)
  • Trademark annual XML assignments


The USPTO Patent Assignment Dataset: Descriptions and Analysis

  • January 2015
  • SSRN Electronic Journal

Stuart J. H. Graham at Georgia Institute of Technology

  • Georgia Institute of Technology


Patent Transactions in the Marketplace: Lessons from the USPTO Patent Assignment Dataset

Georgia Tech Scheller College of Business Research Paper No. 29

56 Pages Posted: 28 Nov 2015 Last revised: 21 May 2016

Stuart J.H. Graham

Georgia Institute of Technology - Scheller College of Business

Alan C. Marco

Georgia Institute of Technology - School of Public Policy

Amanda Myers

United States Patent and Trademark Office (USPTO)


Date Written: November 1, 2015

While records of the assignments (transactions) affecting US patents and patent applications have been maintained by the US Patent & Trademark Office (USPTO) for over 40 years, few researchers have used them. To help remedy this deficiency, the USPTO Office of Chief Economist is releasing research-ready data files. This paper describes the contents of the USPTO Patent Assignment Dataset, a database covering roughly 6 million assignments and other transactions recorded during 1970-2014 and affecting about 10 million US patents or patent applications published 1930-2014. Records include information on transferred patent and application numbers, the dates a transaction was executed by the parties and subsequently recorded at the USPTO, the assignor(s) and assignee(s), and the “nature of conveyance” (for instance, whether the transaction was an assignment, merger, security agreement, or license). This paper provides a comprehensive description and presents stylized facts to facilitate better understanding and motivate future research. Although the paper describes limitations inherent in the data, their release nevertheless offers researchers many novel avenues for conducting original research, particularly those related to the study of innovation, the markets for technology, and the financial collateralization of intellectual property and intangible assets.

Keywords: Intellectual Property, Patents, Markets for Technology, Innovation, Licensing, Finance

JEL Classification: O3, L2, G1, G2, G3


Stuart J.H. Graham (Contact Author)

Georgia Institute of Technology - Scheller College of Business ( email )

800 West Peachtree St. NW Atlanta, GA 30308 United States 404-385-0953 (Phone) 404-894-6030 (Fax)

HOME PAGE: https://www.scheller.gatech.edu/graham

Georgia Institute of Technology - School of Public Policy ( email )

685 Cherry St. Atlanta, GA 30332-0345 United States

United States Patent and Trademark Office (USPTO) ( email )

Alexandria VA 22313-1451 United States


United States Patent and Trademark Office - An Agency of the Department of Commerce

Updated Patent Datasets now available

The United States Patent and Trademark Office’s (USPTO) Office of the Chief Economist released 2020 updates for two research datasets, the Patent Assignment Dataset and the Patent Examination Research Dataset (PatEx).

The Patent Assignment Dataset now contains detailed information on 8.97 million patent assignments and other transactions recorded at the USPTO since 1970—involving roughly 15.1 million patents and patent applications.

The Patent Examination Research Dataset (PatEx) now contains detailed information on over 16.5 million United States patent and Patent Cooperation Treaty (PCT) applications filed with the USPTO through April 2021. The dataset includes information on patent application characteristics, examination and continuation histories, and more.

For more information, visit the research datasets webpage on the USPTO website.


Dataset Summary

The Harvard USPTO Dataset (HUPD) is a large-scale, well-structured, and multi-purpose corpus of English-language utility patent applications filed to the United States Patent and Trademark Office (USPTO) between January 2004 and December 2018.


Dataset Structure

Each patent application is defined by a distinct JSON file, named after its application number, and includes information about the application and publication numbers, title, decision status, filing and publication dates, primary and secondary classification codes, inventor(s), examiner, attorney, abstract, claims, background, summary, and full description of the proposed invention, among other fields. There are also supplementary variables, such as the small-entity indicator (which denotes whether the applicant is considered to be a small entity by the USPTO) and the foreign-filing indicator (which denotes whether the application was originally filed in a foreign country). In total, there are 34 data fields for each application. A full list of data fields used in the dataset is listed in the next section.
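As a minimal sketch of the one-file-per-application layout described above, the following reads a single JSON file into a dictionary. The key names (`application_number`, `title`, `decision`) and the sample record are hypothetical, invented only for illustration; the actual field names are those given in the dataset's own field list.

```python
import json
import tempfile
from pathlib import Path


def load_application(path):
    """Read one application JSON file and return its fields as a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical record: keys and values are illustrative assumptions only.
sample = {
    "application_number": "00000000",
    "title": "Example title",
    "decision": "PENDING",
}

with tempfile.TemporaryDirectory() as tmp:
    # Each file is named after its application number, as described above.
    path = Path(tmp) / f"{sample['application_number']}.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    record = load_application(path)

print(record["title"])
```

Keeping each application in its own file means a single record can be inspected without loading the rest of the corpus.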

Source Data

HUPD synthesizes multiple data sources from the USPTO: While the full patent application texts were obtained from the USPTO Bulk Data Storage System (Patent Application Data/XML Versions 4.0, 4.1, 4.2, 4.3, 4.4 ICE, as well as Version 1.5) as XML files, the bibliographic filing metadata were obtained from the USPTO Patent Examination Research Dataset (in February, 2021).

A major feature of HUPD is its structure, which allows it to demonstrate the evolution of concepts over time. As we illustrate in the paper, the criteria for patent acceptance evolve over time at different rates, depending on category. We believe this is an important feature of the dataset, not only because of the social scientific questions it raises, but also because it facilitates research on models that can accommodate concept shift in a real-world setting.

Examples and Statistics


Three pages of the pre-grant version of an example patent document ( Method and Apparatus for Initiating a Transaction on a Mobile Device [Publication No: 2014-0207675 A1]). The highlighted sections show a subset of the 34 data fields that we include in the Harvard USPTO Patent dataset.


IPC distribution of accepted patent applications from 2011 to 2016 at the IPC subclass level. There are 637 IPC subclass labels in HUPD, of which the most common 20 codes make up half of the distribution. G06F- Electric Digital Data Processing is the largest IPC subclass, accounting for 10.4% of applications.

Licensing Information

HUPD is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

If your research makes use of our dataset, models, or findings, please consider citing our paper.


Open Access

Peer-reviewed

Research Article

Identify the digitalization technology opportunity of low-carbon energy technologies: Using the patent data and collaborative filtering

Roles Conceptualization, Methodology, Writing – original draft, Writing – review & editing

Affiliation School of Intellectual Property, Nanjing University of Science and Technology, Nanjing, Jiangsu, China

Roles Writing – original draft, Writing – review & editing


Affiliation School of Economics, Zhejiang University of Finance and Economics, Hangzhou, Zhejiang, China


  • Jie Liu, 

PLOS

  • Published: September 3, 2024
  • https://doi.org/10.1371/journal.pone.0309420

The digitalization of low-carbon energy technologies (LCET) provides important technical support for the transition to a greener energy system. Digitalization addresses the phenomenon of the growing application of information and communications technologies (ICT) across the economy, which is regarded as the technology convergence between ICT and other technologies. Scholars have revealed the signs that LCET and ICT are becoming increasingly interlinked, which raises the challenges for predicting and identifying the technology opportunities for innovations in the converged technology area. To address the challenges, this paper proposes a collaborative filtering approach to identify the digitalization technology opportunity of low-carbon energy technologies using patent classification and patent citation information. We applied the proposed collaborative filtering approach using a large LCET patent dataset derived from the United States Patent and Trademark Office (USPTO). The results indicate that the proposed method can effectively identify digitalization technology opportunities of LCET, and the current LCET digitalization technology opportunities identified based on this approach are mainly concentrated in the Energy storage field. The advantages of the proposed approach are that its underlying data are more readily available and its technical complexity is relatively lower, and thus, more replicable for other technology fields.

Citation: Liu J, Cai W (2024) Identify the digitalization technology opportunity of low-carbon energy technologies: Using the patent data and collaborative filtering. PLoS ONE 19(9): e0309420. https://doi.org/10.1371/journal.pone.0309420

Editor: Xingwei Li, Sichuan Agricultural University, CHINA

Received: October 8, 2023; Accepted: August 13, 2024; Published: September 3, 2024

Copyright: © 2024 Liu, Cai. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: This research was supported by the Fundamental Research Funds for the Central Universities (Grant No. 30924010411), the General Project of Zhejiang Provincial Department of Education (Grant No. 1T099323061), the Planning Project of Hangzhou Philosophy and Social Science (Grant No. Z23JC040). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The pervasiveness and integration of digital technologies into the economy and society have profoundly impacted social life and emerged as a crucial driver for high-quality economic development. Digitalization addresses the phenomenon of the growing application of information and communications technologies (ICT) across various sectors of the economy [ 1 ]. In the energy system, rapid digitalization, such as the development of smart grids and the Energy Internet, provides critical technical support for the transition to a greener economy [ 2 , 3 ]. As the time window for limiting global warming to a manageable level is closing [ 4 ], and considering the need for cross-domain integration, energy digitalization–the technology convergence between energy technologies, especially Low-Carbon Energy Technologies (LCET), and ICT–has attracted broad attention [ 1 , 3 , 5 ]. Technology convergence refers to the phenomenon where technology domains overlap, and it has been viewed as a significant driver of technology change along with the development of increasingly complex products [ 6 , 7 ].

Following and identifying the research and development (R&D) directions for LCET digitalization is of great strategic importance for both firms and policy-makers. For firms in LCET and ICT industries, accurately capturing the technology convergence trajectory is crucial for adapting to the changing competitive landscape [ 7 ]. For policy-makers, it is essential to strategically plan innovation policy instruments to accelerate technology convergence, which can impact a country’s competitiveness in the technology markets [ 5 ], as well as accelerate the decarbonization of the energy system [ 1 ]. Although the convergence of ICT with different sectors, such as broadcasting, entertainment, and biotechnology, has been the subject of numerous studies using patent data [ 8 , 9 ], the study of ICT convergence with LCETs using patent data has received little attention, with few exceptions, which show the signs of technology convergence between LCET, such as solar PV, wind, and energy storage technologies and ICT [ 3 , 5 ]. Given that LCET and ICT are becoming increasingly interlinked, previous studies fail to provide specific and practical technology opportunities for LCET digitalization. It is still difficult to make decisions on the R&D direction of digitalization, and thus, it raises challenges for identifying digitalization technology opportunities in the converged technology fields [ 5 ].

To address these challenges, this paper presents an approach to identifying digitalization technology opportunities in LCET from the perspective of technology convergence, using an adapted collaborative filtering method that incorporates patent classification and patent citation data. The contribution of this paper is twofold. First, given the challenges of identifying and capturing the opportunity window of technology change arising from the digital transformation of LCET, this paper supplements existing research. Second, at the methodology level, the adapted collaborative filtering method proposed in this paper offers low technical complexity and novel recommendations. Compared to text-mining-based methods, which may rely on researchers’ subjective judgment, this method is more repeatable. In addition, while collaborative filtering has been applied to firm-level technology opportunity identification, this paper extends it to the industry level and thus broadens the application scope of the method.

Specifically, we first empirically validate the effectiveness of the proposed collaborative filtering approach in identifying historical digitalization technology opportunities, based on LCET patents applied for in the period 2011–2015. Subsequently, leveraging LCET patents applied for in the period 2016–2020, we examine current LCET digitalization technology opportunities. Our findings reveal that the LCET digitalization technology opportunities identified by our method are predominantly concentrated in the field of Energy storage, accounting for over 50% of the identified LCET CPC codes. Policy implications can be derived from these results.

The rest of the paper is organized as follows. The “Literature review” section reviews the related literature, and the “Methodology” section details the proposed method. The section “Empirical analysis: the LCET case” identifies digitalization technology opportunities in LCET domains, the “Discussion” section discusses the results, and the “Conclusions” section concludes.

Literature review

The digitalization of LCET

The challenge of mitigating human-induced climate change has led to significantly increased efforts to stimulate eco-innovations, i.e., innovations that contribute to reducing environmental burdens [ 10 , 11 ]. Along with the development and pervasiveness of digital technologies, scholars have reached a wide consensus that eco-innovations are linked to technical change in the ICT domain [ 11 , 12 ]. Digitalization describes the growing application of ICT across the economy [ 1 ]. As a notable example, digitalization in the energy system is having profound impacts on both energy demand and supply, which could improve energy efficiency in the whole energy sector [ 1 ]. In this paper, we focus on the digitalization of one specific kind of eco-innovative technological solution that has been regarded as key to the transition to a sustainable economy [ 10 , 13 , 14 ], namely low-carbon energy technologies (LCET): technologies aimed at reducing greenhouse gas emissions, energy consumption, and environmental impacts, and at contributing to the redesign of the global energy system [ 15 , 16 ].

The rapid digitalization in the energy sector, particularly the LCET domains, such as renewable energy production and energy storage domains, provides a promising pathway toward a sustainable energy system–one characterized by higher resilience and flexibility [ 5 , 17 , 18 ]. Along with this, the significance of emerging digital technologies, such as blockchain [ 19 ], energy big data, and cloud computing [ 20 ] has been recognized. Meanwhile, scholars have provided empirical evidence that justifies energy digitalization for environmental sustainability. For example, Shi et al. [ 21 ] find that energy digitalization exhibits a statistically significant ability to enhance regional carbon productivity in China.

However, the digitalization processes of LCET are not always linear. For example, Kangas et al. [ 5 ] proposed that the immature nature of solar PV technology overshadowed its digitalization development. This overshadowed digitalization trend is expected to continue, since there is considerable potential to improve energy conversion efficiency and cost efficiency in basic material technologies. Meanwhile, the depth of digitalization may not be equal across different parts of a field [ 5 ]. In this regard, to foster comparative advantages in the information era, LCET firms need to follow and predict ICT developments and identify opportunities for digitalization development, for which the underlying theory is built on the more general technology convergence literature [ 5 ].

Technology convergence and monitoring LCET digitalization using patent data

Technology convergence has long been recognized as an important driver of technology change [ 6 , 7 ], which denotes the overlap between hitherto separate technology domains [ 22 ]. The concept of technology convergence naturally matches the digitalization dynamics well. According to the definition of digitalization, i.e., the growing application of ICT across the economy, the LCET digitalization processes could be regarded as the convergence between ICT and LCET technologies [ 3 , 5 ].

Following the main strand of convergence studies, this study monitors the LCET digitalization dynamics using patent data [ 22 ]. Patent data, which is regarded as the important carrier of technology innovation outputs, has been employed in technology evolution and convergence analysis in several previous studies [ 7 , 10 , 23 – 26 ]. Patent co-classification analysis is the most common patent-based technology convergence measurement method [ 5 ]. Patent co-classification refers to different patent classification codes being assigned to a single patent document, which denotes that the invention holds the technical features of different technology fields indicated by different patent classification codes. The increasing co-classifications of previously separated patent classification codes imply technology convergence [ 5 ]. Similarly, technology convergence can also be identified by the rise of patent citations between different technology domains [ 7 , 9 , 27 , 28 ].
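As a rough illustration of patent co-classification counting, the sketch below tallies how often each pair of classification codes appears on the same patent document. The patent identifiers and code assignments are hypothetical toy data, and the function name is an assumption made for this example rather than anything taken from the cited studies.

```python
from collections import Counter
from itertools import combinations


def co_classification_counts(patent_codes):
    """Count how often each unordered pair of codes appears on one patent."""
    counts = Counter()
    for codes in patent_codes.values():
        # Sort the de-duplicated codes so each pair has one canonical order.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: patent id -> classification codes on that patent.
patents = {
    "P1": ["H02J", "G06Q"],
    "P2": ["H02J", "G06Q", "H01M"],
    "P3": ["H01M"],
}

counts = co_classification_counts(patents)
print(counts[("G06Q", "H02J")])  # 2
```

Tracking such pair counts per application year would show whether two previously separate codes are being co-assigned more often over time, which is the signal of convergence described above.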

Note that, compared to co-classification, patent citation measurement is thought to be more appropriate for describing the stretching process between different domains than the actual technology convergence event that signifies the creation of a hybrid new technology [ 7 ]. Following this argument, in this paper we identify the digitalization process, represented as ICT convergence, based on patent citation data to capture the boundary-blurring process between technology domains. In addition, Caviggioli [ 7 ] proposed that cross-citations can work as a predictive factor of the co-classification event. We therefore posit that identifying digitalization technology opportunities based on patent citation information is more forward-looking.

In terms of the application of patent-based technology convergence analysis in the theme of "digitalization", although the research on ICT convergence based on patent data has received attention for a long time, few studies focus on the convergence between ICT and LCET. To our knowledge, only a few exceptions have analyzed the ICT convergence trend of solar PV, wind, and energy storage fields using patent co-classification data [ 3 , 5 ]. Given the rapid development in basic technologies, as well as the shadowed digitalization processes, these available patent-based LCET digitalization studies suggest the importance of identifying LCET digitalization opportunities.

Collaborative filtering and its application in technology opportunity identification

Collaborative filtering is one of the most widely used recommendation methods. It aims to recommend items that are suitable for a target user based on information about the user’s preferences and historical purchasing data [ 29 ]. The first automated collaborative filtering system, known as GroupLens [ 30 ], was designed to recommend news articles to target users. Its logic is rooted in the assumption that if a particular group of users has had consistent preferences for news in the past, their preferences will remain consistent in the future. GroupLens collects user preferences through ratings, i.e., users rate the articles they have read (ratings are integers from 1 to 5, with higher scores indicating greater user preference for the article). The system then calculates the similarity of preferences among users and selects a group of users with high similarity to the target user to predict the target user’s preference for new articles.
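The rating-based logic described above can be sketched as a small user-based recommender. This is a minimal illustration under stated assumptions, not the GroupLens implementation: it assumes Pearson correlation over co-rated items as the similarity measure and a similarity-weighted average of neighbours' mean-centred ratings as the prediction; the users, articles, and ratings are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two users over the items both have rated."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(ratings, target, item):
    """Predict the target's rating of an item from similar users' ratings."""
    base = sum(ratings[target].values()) / len(ratings[target])
    num = den = 0.0
    for user, user_ratings in ratings.items():
        if user == target or item not in user_ratings:
            continue
        weight = pearson(ratings[target], user_ratings)
        if weight <= 0:
            continue  # keep only users with similar preferences
        user_mean = sum(user_ratings.values()) / len(user_ratings)
        num += weight * (user_ratings[item] - user_mean)
        den += weight
    return base + num / den if den else base


# Hypothetical ratings on a 1-5 scale: user -> {article: rating}.
ratings = {
    "ann": {"a1": 5, "a2": 1, "a3": 4},
    "bob": {"a1": 5, "a2": 1, "a3": 4, "a4": 5},
    "cat": {"a1": 1, "a2": 5, "a3": 2, "a4": 1},
}

prediction = predict(ratings, "ann", "a4")
print(round(prediction, 2))
```

In this toy data, bob's past ratings match ann's while cat's run opposite, so only bob contributes and the predicted rating for the unseen article lies above ann's own average.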

Compared to other recommendation systems, collaborative filtering has several advantages. First, collaborative filtering does not require understanding the item itself, as it does not depend on the item information. Second, the collaborative filtering technique can recommend unexpected items because this technique is based on other users’ historical data [ 29 , 31 , 32 ]. Collaborative filtering is known for its simplicity and effectiveness [ 29 ], and has been applied in many studies, such as facilitating knowledge collaboration between developers [ 33 ] and identifying new R&D ideas [ 34 ].

Technology opportunities are a set of opportunities with the possibility of technological progress [ 35 ]. Identifying technology opportunities has a profound impact on industries’ and firms’ innovation [ 36 , 37 ]. Technology opportunity discovery (TOD) refers to discovering and selecting the best opportunities for the industry or firm from a large amount of data [ 29 , 36 ]. It can supplement the subjective ideas of traditional researchers and engineers, ultimately enhancing innovation efficiency [ 36 ]. Collaborative filtering recommendations have been utilized to identify technology opportunities. Park et al. [ 29 ] developed a firm-level technology opportunity identification method based on patent classification and collaborative filtering, the effectiveness of which has been verified in empirical analysis. In this paper, based on the method of Park et al. [ 29 ], we construct an adapted collaborative filtering method for identifying industry-level digitalization technology opportunities.

Methodology

This paper proposes a methodology for identifying technology opportunities with a high potential for integrating digital technology solutions, based on the industry’s current technological knowledge base. Following the prior work of Park et al. [ 29 ], this paper uses a set of patent classification codes to represent the technological knowledge base of the focal domain. Specifically, Cooperative Patent Classification (CPC) codes are employed to denote knowledge elements. The method then recommends potential classification codes using a collaborative filtering technique. With its simple and automatic implementation process, the methodology is highly replicable in other technology domains. It consists of three major steps: (1) Collecting knowledge elements, (2) Representing potential technology opportunities, and (3) Identifying technology opportunities. The proposed implementation process is illustrated in Fig 1 .

https://doi.org/10.1371/journal.pone.0309420.g001

Collecting knowledge elements

A knowledge element refers to a self-standing embodiment of a core concept in a distinct scientific or engineering principle within a certain technology field [ 36 , 38 ]. The CPC codes of patents in the target technology field (TTF) are used to denote the knowledge elements in that field and are referred to as CPC TTF in the remainder of this paper. Based on the logic of collaborative filtering, then, it is necessary to calculate the similarity between knowledge elements. In this paper, since the convergence process is measured using patent citation information, we propose the citation-based measurement of similarity between CPC TTF to capture the logical consistency.

Specifically, let N1 be the number of TTF patents and L the number of unique CPC TTF . The binary N1 × L patent-CPC matrix A is defined by A_il = 1 if TTF patent i contains CPC TTF l , and 0 otherwise. Similarly, let K be the number of unique CPC codes (CPC REF ) assigned to the N2 patents cited by TTF patents. The binary N2 × K patent-CPC matrix B is defined by B_jk = 1 if the cited patent j contains CPC REF k , and 0 otherwise. The matrices A and B are coupled via citation relationships, represented as a binary N1 × N2 citation matrix M , where M_ij = 1 if TTF patent i cites patent j .

The l-th row of the matrix A^T M gives the number of citations from CPC TTF l to each cited patent j . In the same way, the l-th row of the matrix O = A^T M B gives the number of citations from CPC TTF l in TTF patents to each CPC REF k in the cited patents.
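The matrix products above can be checked on a toy example. The following is a minimal NumPy sketch, not the authors' implementation; the matrices are made-up illustrative data (3 TTF patents, 2 CPC TTF codes, 4 cited patents, 3 CPC REF codes).

```python
import numpy as np

# Hypothetical toy data, for illustration only.
# A: N1 x L, A[i, l] = 1 if TTF patent i carries CPC_TTF l.
A = np.array([[1, 0],
              [1, 1],
              [0, 1]])

# B: N2 x K, B[j, k] = 1 if cited patent j carries CPC_REF k.
B = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])

# M: N1 x N2, M[i, j] = 1 if TTF patent i cites patent j.
M = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 1]])

# L x N2: citations from each CPC_TTF to each cited patent.
ATM = A.T @ M

# L x K: citations from each CPC_TTF to each CPC_REF.
O = ATM @ B

print(ATM)
print(O)
```

At the scale reported later in the paper (thousands of CPC main groups and tens of thousands of patents), these matrices are very sparse, so a sparse matrix representation would likely be the practical choice.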

Representing potential technology opportunities

Identifying technology opportunities

Empirical analysis: the LCET case

Data source.

The patent dataset used in this analysis was retrieved from the PatentsView platform ( https://patentsview.org/download/data-download-tables ) in June 2024 and covers the patents granted by the United States Patent and Trademark Office (USPTO) since 1976. The application year is used as the indicator of time for each invention, because the application date is closer to the invention’s actual creation time and therefore reflects the temporal technology dynamics more accurately [ 39 ]. In addition, to focus the analysis on high-quality technology activities, only utility patents are considered in this paper (for a similar setting, see [ 40 ]).

In this paper, the CPC codes are employed to identify LCET patents. Following Park et al. [ 29 ], the CPC main groups are used to denote the knowledge elements. The CPC system is divided into nine sections, A-H and Y, which are further subdivided into classes, subclasses, main groups, and subgroups [ 41 , 42 ]. Table 1 shows an example of the CPC structure. The CPC system was developed by the European Patent Office (EPO) and USPTO to harmonize patent classifications and to replace the former European Classification System (ECLA) and U.S. Patent Classification (USPC) system. The CPC system is similar to the International Patent Classification (IPC) but is more detailed and comprehensive [ 43 ]. A significant difference between CPC and IPC is that CPC contains the “Y” section. The CPC codes in the “Y” section do not indicate separate technological classes but are additional tags that examiners attach to patents to mark certain special technical subjects. The tags corresponding to the LCET are in CPC subclass “Y02E”. Note that CPC codes in the “Y” section are not treated as knowledge elements in this paper and are only used to identify LCET patents.
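The conventions above (main groups as knowledge elements, "Y02E" tags only as LCET markers) can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes CPC symbols are given as strings such as "H01M10/0525", with the main group written as the part before the slash, as in the codes quoted later in this paper (e.g., "F23D14").

```python
def cpc_main_group(code: str) -> str:
    """Return the main-group part of a CPC symbol, e.g. 'H01M10/0525' -> 'H01M10'."""
    return code.replace(" ", "").split("/")[0]


def is_lcet(codes) -> bool:
    """A patent is treated as LCET if any of its CPC symbols is a Y02E tag."""
    return any(c.replace(" ", "").startswith("Y02E") for c in codes)


def knowledge_elements(codes) -> set:
    """CPC main groups of a patent, excluding all Y-section tags."""
    return {cpc_main_group(c) for c in codes if not c.strip().startswith("Y")}


# Hypothetical patent carrying one battery code and one Y02E tag.
example_codes = ["H01M 10/0525", "Y02E60/10"]
print(is_lcet(example_codes))
print(knowledge_elements(example_codes))
```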

https://doi.org/10.1371/journal.pone.0309420.t001

To identify the ICT patents, the IPC code list of ICT patents employed by Kangas et al. [ 5 ] and Zhang et al. [ 3 ] is used (Table 2 provides the IPC code list of ICT patents). We then use the CPC to IPC concordance table ( https://www.cooperativepatentclassification.org/cpcConcordances , accessed in June 2024) to identify the corresponding CPC codes for ICT patents. Patents assigned those CPC codes are identified as ICT patents. The LCET-ICT patents are defined as LCET patents that cite ICT patents or that can themselves be identified as ICT patents. An LCET-ICT patent is regarded as an instance of LCET digitalization.
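The LCET-ICT definition can be expressed as a simple set operation. The sketch below is illustrative only: the function name, the patent identifiers, and the shape of the citation mapping are assumptions, not part of the original study.

```python
def lcet_ict_patents(lcet_ids, ict_ids, citations):
    """Return LCET patents that are ICT patents themselves or cite at least one ICT patent.

    citations: dict mapping a patent id to an iterable of cited patent ids
    (hypothetical input format).
    """
    ict_ids = set(ict_ids)
    return {
        p for p in lcet_ids
        if p in ict_ids or any(c in ict_ids for c in citations.get(p, ()))
    }


# Made-up identifiers for illustration.
lcet = {"P1", "P2", "P3"}
ict = {"P3", "Q9"}
cites = {"P1": ["Q9", "Q1"], "P2": ["Q1"]}

print(lcet_ict_patents(lcet, ict, cites))  # P1 cites an ICT patent, P3 is itself ICT
```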

https://doi.org/10.1371/journal.pone.0309420.t002

The overall analysis of LCET digitalization

This section describes the LCET innovation and digitalization dynamics over the period 1986–2020. In this period, 173,486 granted LCET patents were identified through the “Y02E” CPC tags, of which 52,709 were LCET-ICT patents. Fig 2(A) shows the annual applications of granted LCET and LCET-ICT patents on a semi-log axis, indicating exponential growth in both LCET and LCET-ICT inventions. However, LCET-ICT inventions grew more slowly than LCET inventions. Thus, in Fig 2(B), one can find a declining trend in the growth rate of the proportion of LCET-ICT patents during 1986–2010. The growth rate of the proportion of LCET-ICT patents increased after 2010, which could partly be explained by the decreased growth rate of LCET patent applications.

(a) presents the annual application number of granted LCET and LCET-ICT patents, as well as the number of annual unique CPC main groups in the corresponding LCET and LCET-ICT patents during 1986–2020; (b) presents the share of granted LCET-ICT patents in the LCET patents applied during 1986–2020; (c) presents the share of unique CPC main groups of granted LCET-ICT patents in the unique CPC main groups of granted LCET patents applied during 1986–2020.

https://doi.org/10.1371/journal.pone.0309420.g002

Similarly, the number of unique CPC codes in LCET and LCET-ICT patents shows an exponential growth trend. However, after 2004 the number of CPC codes in LCET-ICT patents grows more slowly than that in LCET patents, which also leads to a decline in the growth rate of the proportion of CPC codes in LCET-ICT patents after 2004 (see Fig 2(C)). In other words, the overall digitalization degree of LCET has not improved linearly alongside the development of LCET technology. It is necessary to further explore the opportunities and expand the scope of technology convergence between the LCET and ICT domains.

Constructing CPC citation similarity matrix

The collaborative filtering proposed in this paper involves two parameters: the similarity threshold that determines the neighbors of potential technology opportunities, and the number of technology opportunities selected based on the latent digitalization score (LDS) ranking. To determine the parameters, the dataset of LCET patents applied in the period 2011–2015 is employed to identify historical digitalization technology opportunities. The parameters are calibrated on the accuracy of this identification, which is evaluated against the LCET patents applied in the period 2016–2020. The calibrated parameters are then employed to identify the current digitalization technology opportunities, based on the dataset of LCET patents applied in the period 2016–2020.

To identify the historical digitalization technology opportunities, first, a total of 48,821 granted LCET utility patents applied in the period 2011–2015 are identified based on “Y02E” CPC tags. Then, the granted patents cited by those LCET patents are collected.

The CPC main groups are employed to represent the knowledge elements. The CPC main groups in LCET patents (CPC LCET ) and cited patents (CPC REF ) are used to construct the matrix O mentioned in the Methodology section. LCET patents that do not cite other granted utility patents and patents having no CPC information are excluded when constructing matrix O. Based on the 2011–2015 patent data, we construct the matrix O containing 5,011 rows (denoting CPC LCET ) and 7,701 columns (denoting CPC REF ). To rescale matrix O into five-point-scale values, following Park et al. [ 29 ], the parameters in the fuzzy logic algorithm are set as a = 2 and b = 1. The rescaled matrix O, then, is used to measure the similarity between CPC LCET , i.e., CSLCET.

Measuring the latent digitalization score

The higher the LDS, the more likely it is that CPC LCET,i will be used in LCET-ICT patents in subsequent inventions. Note that as long as the vectors corresponding to the two CPC LCET overlap, the similarity between the two CPC LCET is not 0. To identify neighbors with higher similarity to the focal CPC LCET , we set a threshold of similarity, and the similarity values less than the threshold in the CSLCET are set as 0. In this way, the LDS will be calculated based on the DS of neighbors with high similarity to the focal CPC LCET .
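The thresholding step can be illustrated as follows. The LDS formula itself is not reproduced in this text, so the scoring function below is only a stand-in: it uses a generic similarity-weighted average from neighborhood-based collaborative filtering, which is an assumption and not necessarily the authors' definition. The similarity matrix and DS values are made-up toy numbers, and the function names are hypothetical.

```python
import numpy as np


def threshold_similarity(S, s):
    """Set similarity values below the threshold s to 0 (as described in the text)."""
    S = np.asarray(S, dtype=float)
    return np.where(S < s, 0.0, S)


def weighted_neighbor_score(S_thr, ds, focal):
    """ASSUMED stand-in for the LDS: similarity-weighted mean of the neighbors' DS values."""
    w = S_thr[focal].copy()
    w[focal] = 0.0                      # a code is not its own neighbor
    total = w.sum()
    return float(w @ ds / total) if total > 0 else 0.0


# Toy similarity matrix among three CPC_LCET codes and toy DS values.
S = np.array([[1.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.0]])
ds = np.array([0.0, 0.8, 0.4])

S_thr = threshold_similarity(S, s=0.18)   # the 0.1 entries are dropped
score = weighted_neighbor_score(S_thr, ds, focal=0)
print(S_thr)
print(score)
```

With the threshold applied, the focal code in the toy example keeps only its one sufficiently similar neighbor, so weakly related codes no longer influence its score.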

Accuracy of historical opportunity identification

The CSLCET constructed based on LCET patents with application years from 2011 to 2015 contains a total of 5,011 CPC main groups, of which 1,430 did not appear in LCET-ICT patents. That is, the historical potential technology opportunities encompass 1,430 CPC main groups. To calculate the LDS of these 1,430 CPC main groups, we vary the similarity threshold s from 0 to 1 in steps of 0.02, and the number of selected technology opportunities (parameter n ) from 20 to 200 in steps of 5. LCET patents applied in the period 2016–2020 are collected to check whether those CPC main groups are used in LCET-ICT patents in the subsequent inventions. The proportion of the top n CPC main groups with the highest LDS that are involved in LCET-ICT inventions in 2016–2020 is regarded as the accuracy of the digitalization technology opportunity identification.
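The accuracy measure and the parameter grid described above can be sketched as follows. The function name and the toy scores are hypothetical; only the grid ranges and step sizes come from the text.

```python
import numpy as np


def top_n_accuracy(lds, adopted, n):
    """Share of the n highest-scoring codes that later appear in LCET-ICT patents.

    lds: dict mapping a CPC main group to its score.
    adopted: set of CPC main groups observed in LCET-ICT patents in the test window.
    """
    ranked = sorted(lds, key=lds.get, reverse=True)[:n]
    if not ranked:
        return 0.0
    return sum(code in adopted for code in ranked) / len(ranked)


# Parameter grid from the text: s in [0, 1] step 0.02, n in [20, 200] step 5.
s_grid = np.round(np.arange(0, 1.0001, 0.02), 2)
n_grid = list(range(20, 201, 5))

# Made-up scores for four codes, two of which are adopted later.
toy_lds = {"A": 0.9, "B": 0.7, "C": 0.4, "D": 0.1}
toy_adopted = {"A", "C"}
print(top_n_accuracy(toy_lds, toy_adopted, n=2))
```

In a full calibration, the scores would be recomputed for every value in `s_grid` and the accuracy evaluated for every value in `n_grid`.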

Fig 3 shows the accuracy of technology opportunity identification under different parameter combinations, from which we can see that the accuracy decreases as the two parameters increase. Considering the accuracy and the number of identified digitalization technology opportunities, we set 0.18 and 30 as the values of parameters s and n respectively. The accuracy in this setting is about 83.3%, which is significantly higher than the digitalization share (around 34.5%) of the total 1,430 CPC main groups in 2016–2020 (the Two Proportions Z-test p-value is less than 0.001).
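As a rough check of the reported significance, a pooled two-proportion z-test can be computed with the standard library alone. The counts are back-calculated assumptions, not figures stated in the paper: 25 of 30 corresponds to 83.3%, and 493 of 1,430 corresponds to roughly 34.5%. The exact test setup used by the authors may differ.

```python
import math


def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# ASSUMED counts inferred from the reported percentages.
z, p_value = two_proportion_ztest(25, 30, 493, 1430)
print(f"z = {z:.2f}, p = {p_value:.2e}")
```

Under these assumed counts the p-value falls well below 0.001, in line with the result reported above.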

(a) presents the accuracy under different parameter combinations, in which the similarity threshold ranges from 0 to 1, and the selected technology opportunity number ranges from 20 to 200; (b) presents the accuracy under different parameter combinations, in which the similarity threshold ranges from 0 to 0.4, and the selected technology opportunity number is set as 30, 40, 50, 60, and 70.

https://doi.org/10.1371/journal.pone.0309420.g003

Current digitalization technology opportunity identification

After determining the values of s and n, we once again identify the current digitalization technology opportunity of LCET based on the patent data applied in the period 2016–2020. When limiting the patent application years to 2016–2020, a total of 50,403 granted LCET patents are identified with the CPC "Y02E" tags. The CSLCET constructed based on the aforementioned LCET patents contains a total of 5,421 CPC LCET , of which 1,351 did not appear in the LCET-ICT patents. We then set CSLCET values less than 0.18 as 0, and calculate the LDS values for the 1,351 CPC main groups.

We also assign the technology opportunities to the specific LCET fields based on the LCET patent data applied during 2016–2020. This assignment focuses on a set of 11 distinct LCET fields identified by “Y02E” CPC tags: Geothermal (Y02E10/1), Hydro (Y02E10/2), Ocean (Y02E10/3), Solar thermal (Y02E10/4), Solar PV (Y02E10/5), Wind (Y02E10/7), Energy storage (Y02E60/1), Hydrogen (Y02E60/3), Fuel cells (Y02E60/5), Clean combustion (Y02E20), and Non-fossil fuel (Y02E50). Patents containing multiple Y02E tags that indicate different focal LCET fields are counted once for each field. For example, if one patent has two Y02E tags, Y02E10/5 and Y02E10/7, this patent would be recorded as one Solar PV patent and one Wind patent. However, if one patent has two Y02E tags for one focal field, such as Y02E10/541 and Y02E10/542, this patent would only be recorded as one Solar PV patent. One issue that arose during the above data processing is that some patents may be classified in coarse CPC codes, indicating that these patents are multipurpose. In this paper, the coarse CPC codes include Y02E10/00 (Renewables excluding Non-fossil fuel), Y02E10/60 (Solar thermal and PV), and Y02E60/00 (Enabling technologies). Patents classified under coarse CPC codes are counted once in each related focal LCET field. The field of a certain technology opportunity is defined as the field with the highest proportion of patents.
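The counting rules above can be sketched as a prefix lookup. The function names are hypothetical. The mapping of the coarse codes to their member fields is an interpretation of the labels given in the text (for example, "Enabling technologies" is assumed to cover Energy storage, Hydrogen, and Fuel cells), and "highest proportion of patents" is read here as the field into which most patents of a CPC main group fall.

```python
from collections import Counter

# Field prefixes as listed in the text.
FIELD_PREFIXES = {
    "Y02E10/1": "Geothermal", "Y02E10/2": "Hydro", "Y02E10/3": "Ocean",
    "Y02E10/4": "Solar thermal", "Y02E10/5": "Solar PV", "Y02E10/7": "Wind",
    "Y02E60/1": "Energy storage", "Y02E60/3": "Hydrogen", "Y02E60/5": "Fuel cells",
    "Y02E20": "Clean combustion", "Y02E50": "Non-fossil fuel",
}

# ASSUMED expansion of the coarse codes into their related fields.
COARSE = {
    "Y02E10/00": ["Geothermal", "Hydro", "Ocean", "Solar thermal", "Solar PV", "Wind"],
    "Y02E10/60": ["Solar thermal", "Solar PV"],
    "Y02E60/00": ["Energy storage", "Hydrogen", "Fuel cells"],
}


def fields_of_patent(y02e_tags):
    """Set of LCET fields for one patent; each field is counted at most once."""
    fields = set()
    for tag in y02e_tags:
        tag = tag.replace(" ", "")
        if tag in COARSE:                      # coarse code: one count per related field
            fields.update(COARSE[tag])
            continue
        for prefix, field in FIELD_PREFIXES.items():
            if tag.startswith(prefix):
                fields.add(field)
                break
    return fields


def dominant_field(patent_tag_lists):
    """Field with the most patents among those sharing a CPC main group."""
    counts = Counter(f for tags in patent_tag_lists for f in fields_of_patent(tags))
    return counts.most_common(1)[0][0] if counts else None


print(fields_of_patent(["Y02E10/541", "Y02E10/542"]))   # one Solar PV count
print(fields_of_patent(["Y02E10/50", "Y02E10/72"]))     # Solar PV and Wind
```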

The CPC main groups with the top 30 LDS are shown in Table 3 . According to the distribution of technology opportunities, the Energy storage field holds the largest part of digitalization technology opportunities, i.e., 16 of the 30 identified CPC main groups. Additionally, the LCET digitalization technology opportunities are mainly concentrated in the following CPC sections: "B. Performing operations; transporting", "C. Chemistry; metallurgy", and "F. Mechanical engineering; lighting; heating; weapons; blasting".

https://doi.org/10.1371/journal.pone.0309420.t003

Fig 4 presents the number of granted patents applied in the period 2016–2020 among different LCET fields. Fig 4 shows that more than 20,000 granted LCET patents filed during the focal time window are in the Energy storage field, which is significantly higher than other LCET fields. Given the large number of inventions, it is reasonable to expect considerable digitalization technology opportunities in the Energy storage field. However, the invention volume could only partly explain the distribution of digitalization technology opportunities. Comparing Fig 4 and Table 3 , it can be observed that although the number of granted Solar PV patents is the second highest among all LCET fields, there is only one opportunity in the Solar PV field in Table 3 . We posit that the nature of technological inventions in the Solar PV field, specifically, that technological improvements might mainly rely on the development of material science, could explain this result.

https://doi.org/10.1371/journal.pone.0309420.g004

Table 4 presents the LDS value for each technology opportunity, along with the typical CPC code of LCET-ICT patents (CPC LCET-ICT ) that exhibits a high degree of LDS contribution to each technology opportunity. Additionally, it lists the typical ICT CPC codes referenced by LCET patents in these CPC LCET-ICT , indicating potential digital technology solutions for inventions within each technology opportunity. For example, the technology opportunity “F23D2203”, which indicates “Gaseous fuel burners”, closely resembles “F23D14” (Burners for combustion of a gas). The ICT patents cited by LCET patents in “F23D14” are mainly in fields like “G01F23” (Indicating or measuring liquid level or level of fluent solid material). One exemplary patent in “F23D2203” (primarily in the clean combustion field) is titled “Fuel combustion system with a perforated reaction holder”, which provides a solution for holding a combustion reaction that produces very low oxides of nitrogen. The invention involves a fuel and oxidant source to output and mix them into a combustion volume, and a perforated reaction holder with aligned perforations to hold the combustion reaction. The application of digital technologies, such as measuring the level of material, could potentially further enhance the combustion process.

https://doi.org/10.1371/journal.pone.0309420.t004

Despite the salient trend of energy system digitalization, it is still difficult to identify the R&D direction regarding the convergence between the two complex technology sectors, i.e., the energy sector and the ICT sector. Particularly, in pursuit of sustainability and green growth, LCET innovative agents need to follow ICT development and identify opportunities for digitalization development. In response, our methodology identifies digitalization technology opportunities customized to the current LCET field technology portfolio, so that the LCET innovative agents could potentially increase the possibility of success in digital R&D. In this study, a set of highly recommended LCET-related CPC codes were identified using a collaborative filtering technique. In addition, we assigned the identified CPC codes to different LCET fields based on the current LCET technological portfolio.

Over half of the identified CPC codes belong to the Energy storage field. This result makes sense because, first, the share of Energy storage patents is the largest in LCET. Along with the rapid digitalization trend, more inventions in basic technology might imply more ICT convergence opportunities. Second, our finding is in line with previous studies concerning the digitalization trend of energy storage systems. For example, Zhang et al. [ 3 ] found that the digitalization of energy storage systems had accelerated significantly since 2018; Mejia et al. [ 44 ] found that industry research in the energy storage field had been directed toward electric digital data processing for multi-power systems. Moreover, the significantly larger volumes of energy storage patents and digitalization technology opportunities also correspond to previous studies that presented the importance of energy storage digitalization in enhancing system operation and maintenance [ 17 ].

Although a considerable number of Solar PV patents were also applied for in the period 2016–2020, only one of the identified CPC codes is in that field. This result is consistent with previous studies concerning the nature and digitalization of Solar PV technology. Solar PV, which has a high scale of production, follows the life-cycle pattern of mass-produced goods: early product innovations were followed by a surge of process innovations in solar cell production [ 45 ]. The improvement of the energy conversion efficiency and the decrease of solar cell production cost both rely on the advance of basic material technologies. However, these basic technologies may have little interaction with digital solutions. Thus, despite the rapid growth of investments and inventions in solar PV [ 39 ], ICT convergence opportunities are still scarce [ 5 ].

Note that although few digitalization opportunities in fields such as Solar PV and Wind are identified in this analysis, this does not mean that their digitalization is stagnating. The digitalization of Energy storage is one key means of supporting the development of renewable energy technologies. Renewables, such as Solar PV and Wind, are inherently intermittent. Maintaining a high renewable market penetration therefore requires enough flexibility in the power system to keep it reliable and effective [ 1 ]. Digitally enabled demand response and energy storage are expected to facilitate a higher share of solar PV and wind power and reduce CO 2 emissions [ 1 , 46 ].

The identified CPC codes in our analysis illustrate practical R&D directions, which help LCET innovative agents to follow the rapid ICT convergence. For the clean combustion field, the identified digitalization opportunities are mainly related to engines and burners. Typical ICT technologies associated with measuring, controlling, and material analyzing, e.g., G01F23 (indicating or measuring liquid level or level of fluent solid material), G05B13 (adaptive control systems), and G01N11 (investigating flow properties of materials; analysing materials by determining flow properties), could provide potential digital solutions for clean combustion technology. For the energy storage field, the identified digital technology opportunities are mainly related to electrode and electrolyte materials and energy storage devices. In addition to applying measuring, controlling, and material analyzing technologies to material preparation, sorting technology may also play a role in improving the overall performance of energy storage material processing, e.g., B07C5 (sorting according to a characteristic or feature of the articles or material being sorted) could be combined with D01F2 (monocomponent artificial filaments or the like of cellulose or cellulose derivatives) for battery separators. Similarly, measuring, controlling, and material analyzing technologies could also serve as digital solutions for material-related digitalization opportunities in the Fuel cells, Hydrogen, Non-fossil fuel, and Solar PV fields, while computing-related technologies, e.g., G06F9 (arrangements for program control), could be involved to improve the overall performance of LCET systems.

Besides, policy implications could be derived from the analysis. Given the importance of digitally enabled energy storage, as well as the salient digitalization technology opportunities in the energy storage field, it is necessary to encourage inter-sector R&D activities to foster interdisciplinary inventions. For example, policies or demonstration projects that facilitate the collaboration between energy storage firms and renewable energy firms, such as solar PV and wind power firms, are expected to accelerate LCET digitalization and energy system decarbonization. Moreover, along with the rapid development of emerging digital technologies, such as blockchain, big data, and cloud computing, it is also important for both innovative agents and policy-makers to strengthen the practical applications of digital solutions during product and process innovation, as well as throughout the entire chain of LCET.

Conclusions

Technology convergence has become increasingly relevant to technology changes: it provides an opportunity window for latecomers to catch up and can reshape the competitive landscape, especially with the trend of digitalization. The diffusion of ICT has profoundly impacted social life. In the energy sector, rapid digitalization, especially in the LCET, provides a reliable path for transition to a greener energy system. Although the trend of LCET digitalization has been empirically analyzed based on patent data, there are still challenges in identifying the technology opportunities of LCET digitalization, which is of strategic importance for both innovative agents and policy-makers in capturing the forthcoming changes.

To address these challenges, this paper proposes an adapted collaborative filtering method using patent data, from the perspective of technology convergence. The proposed method is applied to a large LCET patent dataset derived from the United States Patent and Trademark Office (USPTO). Specifically, we first empirically verify the effectiveness of the proposed collaborative filtering in identifying historical digitalization technology opportunities, based on LCET patents applied in the period 2011–2015. Then, based on the dataset of 2016–2020, we identify the current digitalization technology opportunities in the LCET domains. The results show that the LCET digitalization technology opportunities identified through the proposed method are primarily concentrated in the field of Energy storage, which accounts for 16 of the 30 identified CPC main groups. In addition, the identified digitalization technology opportunities are mainly found in the CPC sections "B. Performing operations; transporting", "C. Chemistry; metallurgy", and "F. Mechanical engineering; lighting; heating; weapons; blasting".

The proposed method relies on readily available data and is highly replicable. Researchers can further apply it to other technologies to identify technology convergence opportunities. However, this paper still has some limitations. For example, the proposed methodology only considers the technical factors that drive technology convergence, while ignoring potential market factors. Thus, future studies can incorporate dimensions such as market demand to develop a more comprehensive method.

Supporting information

https://doi.org/10.1371/journal.pone.0309420.s001

  • 1. IEA. Digitalisation and Energy. Paris: IEA. https://www.iea.org/reports/digitalisation-and-energy , License: CC BY 4.0, 2017.
  • 30. Resnick P, Iacovou N, Suchak M, Bergstrom P, Riedl J. GroupLens: an open architecture for collaborative filtering of netnews. Proceedings of the 1994 ACM conference on Computer supported cooperative work; Chapel Hill, North Carolina, USA: Association for Computing Machinery; 1994. p. 175–86.
  • 32. Koren Y. Factorization meets the neighborhood: a multifaceted collaborative filtering model. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining; Las Vegas, Nevada, USA: Association for Computing Machinery; 2008. p. 426–34.
  • 46. IEA. World Energy Outlook 2016. Paris: IEA. https://www.iea.org/reports/world-energy-outlook-2016 , Licence: CC BY 4.0, 2016.
