The Harvard USPTO Patent Dataset: A large-scale, well-structured, and multi-purpose corpus of patent applications

M Suzgun, L Melas-Kyriazi, S Sarkar… - Advances in …, 2024 - proceedings.neurips.cc
Innovation is a major driver of economic and social development, and information about
many kinds of innovation is embedded in semi-structured data from patents and patent …

Inventor gender and patent undercitation: Evidence from causal text estimation

Y Hochberg, A Kakhbod, P Li, K Sachdeva - 2023 - nber.org
Implementing a state-of-the-art machine learning technique for causal identification from text
data (C-TEXT), we document that patents authored by female inventors are under-cited …