USPTO’s path from open data to AI

A foundation in open data laid the groundwork for the U.S. Patent and Trademark Office’s artificial intelligence journey, said Scott Beliveau, branch chief of advanced analytics and acting director of data architecture at USPTO.

“AI isn’t magic. There’s no wand to be waved that improves enterprise inefficiencies,” Beliveau said during an Oct. 21 Cognilytica webinar. “Having the technology alone simply isn’t enough,” he said. A collaborative environment is just as important.

He broke USPTO’s AI work into four chapters, beginning with open data, which former Director David Kappos leveraged to decrease a backlog of patent applications under the Obama administration. The idea of building an intelligent agent was raised, but it “didn’t get that far off the drawing board,” Beliveau said, mainly because it didn’t get much general support. It didn’t feel achievable, “like proposing the building of a hotel on Mars, but we didn’t have a ship to get there.”

The second chapter opened when Director Michelle Lee succeeded Kappos in 2015. She applied the agile mindset she learned in Silicon Valley and called on a USPTO team to work like a startup to create new ideas for using data. The strategy had three concepts: making tools available to the public to foster a data ecosystem, using public engagement to build external partnerships that drive a data culture, and building a platform internally to support that culture.

“We were looking to empower innovation through better tools and better information so that the public could make smarter decisions and become better-quality applicants,” Beliveau said.

The approach was two-pronged: open data and big data. In 2015, USPTO would share data on a set of Blu-Ray discs that cost buyers $10,000. To change that, the agency hosted an open data roundtable with New York University’s Governance Lab, which brought together industry representatives, the user community, data academics and civic hackers to understand how to target resources and projects. The result was the Developer Hub, USPTO’s open data portal.

“We went from in 2015, $10,000 for a Blu-Ray we’ll mail to you to this giant portal,” Beliveau said. “Within this portal, we gave people all the datasets, catalogs of [application programming interfaces], visualizations -- there was a community platform where people could share and discuss data. We made this treasure trove available to the public. We also provided step-by-step guides about innovation and how they could share.”

The biggest impact the agency had on open data, he added, was its work with the White House Cancer Moonshot in 2016, when then-President Barack Obama called on the country to harness innovation to diagnose and treat cancer. USPTO released related patent data and challenged people to develop visualizations to drive policy funding decisions based on the most promising cancer treatments. “We connected this patent dataset with cancer research drugs,” Beliveau said.

Today, the agency supports about 200 million annual users for open data.

For big data, the agency focused on patent quality and looked to create a service model so that supervisors could more easily evaluate the quality of examiners’ work. Early on, supervisors had to sort through many dashboards and reports to pull together the data they needed to make their decisions.

USPTO built an index that lets supervisors search documents and become “data fisherman,” he said. For instance, if the same examiner name appears on multiple patent rejections, the supervisor could more easily dig deeper.

The third chapter was written under the leadership of Director Andrei Iancu between 2018 and now. USPTO issued a request for information on techniques that use AI to improve the patent examination process. Because the data was open, RFI respondents had plenty of patent data on which to train their proposed models.

Internally, USPTO developed a cognitive assistant called Unity to use AI and machine learning to build on existing tools.

“The tool is intended to allow patent examiners, through a single-click, to conduct a ‘federated search’ across patents, publications, non-patent literature, and images,” Iancu said in 2019. “And, through AI and machine learning-based algorithms, this would present to the examiner the results in the form of a ‘pre-search’ report.”

Additionally, USPTO enabled employees to create beta tools that would use the agency’s data foundation, Beliveau said.

“It was about the empowerment of employees to be able to homebrew things they found useful that they could share with their friends and neighbors through an analytics sandbox,” he said. “What it did was it actually opened the eyes of a lot of people in the building as well as some folks outside to the art of the possible.”

The fourth chapter is about the future of AI at USPTO and building on lessons learned from the first three: buy-in is never optional, road maps will be fluid and innovation injections work.

“Transformational change can’t be made without a genuine commitment on the part of those most affected, and at some point in the journey, you need to take a break and hand the keys to a passenger maybe sitting in the backseat,” Beliveau said.

About the Author

Stephanie Kanowitz is a freelance writer based in northern Virginia.

