Training AI: How IP.com’s AI Engine Learned to Accelerate Innovation

As AI becomes a well-established business practice across innovative companies, some tasks prove better suited to it than others. For R&D and IP teams pushing their organizations through the innovation lifecycle, AI reduces the average non-patent literature search time from 10 hours to just two. Semantic search powered by natural language processing democratizes the patent search process, expanding access to powerful patent and non-patent literature search beyond trained IP professionals. These benefits, and their impact on the cost of innovation, cannot be ignored or discounted, regardless of society’s (and the C-suite’s) general distrust of AI.

Some artificial intelligence use cases make the distrust of AI understandable. Self-driving cars have accidents. Facebook’s algorithms amplify dangerous cognitive biases. However, we have accepted AI in applications that make our lives easier: Instagram-ready portrait mode from Apple, Spotify and Netflix recommendations, and voice search, to name just a few.

The antidote to employers’ and employees’ distrust of AI is explainable, trained, tested, and observed models—like the IP.com Semantic Gist® engine—that make life easier. When a searcher puts part of an invention disclosure into the search bar in our software suite, the AI engine identifies the most relevant existing technical literature. That searcher may be hesitant to accept those results as comprehensive because, to the average user, AI’s fundamental “black box” nature makes explainability difficult. However, IP.com’s commitment to training AI models in-house and testing, observing, and maintaining them over time with unstructured, up-to-date real-world data increases the transparency of our AI engine overall.
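To make the idea of "identifying the most relevant existing technical literature" concrete, here is a minimal sketch of ranking documents against a free-text disclosure. It is an assumption-laden toy, not the Semantic Gist® engine: it swaps in plain word-count cosine similarity, which is lexical rather than semantic, and the corpus, document titles, and function names are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (score, title) pairs sorted from most to least similar."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(text))), title)
        for title, text in documents.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical mini-corpus, invented for illustration only.
DOCUMENTS = {
    "battery-cooling": "liquid cooling plate for a lithium battery pack in an electric vehicle",
    "drone-rotor": "foldable rotor arm for an unmanned aerial vehicle",
    "coffee-grinder": "burr grinder with adjustable grind size for coffee beans",
}

if __name__ == "__main__":
    disclosure = "A cooling system that circulates liquid around a vehicle battery pack"
    for score, title in rank_documents(disclosure, DOCUMENTS):
        print(f"{score:.2f}  {title}")
```

Because every score here comes from visible word overlap, a searcher can see why a document ranked where it did. A production semantic engine would also need to match on meaning, which this sketch does not attempt.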

Training AI

IP.com started training its NLP engine almost two decades ago to understand the language structure contained within patents and used by the people searching these types of technical documents. The AI was (and continues to be) trained using a high-quality dataset composed of hundreds of millions of data points from corporations, litigation records, patents, and other technical literature. This commitment to uniform, quality data is key to training AI. The artificial intelligence learns from the information it’s given; if inputs are low-quality, outputs will be too.

Many general-use AI engines, including those built with general-purpose NLP libraries like the Natural Language Toolkit (NLTK), learn from more general datasets. Training an engine in-house allowed IP.com to teach its AI modules using application-specific data: in this case, technical documents like patents and publications.

Testing AI

Like human (and even animal) learners, an AI model may simply memorize patterns during training rather than learn to make predictions and uncover insights. Without testing on novel inputs, an AI engine’s author would be unaware of this shortcoming. The testing and evaluation process must use unstructured, real-world data and identify any implicit biases, inconsistencies, or bugs.
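The memorization problem can be illustrated with a toy example. The sketch below is not IP.com’s test procedure. It assumes a deliberately bad, hypothetical learner (`MemorizingModel`) that stores its training pairs verbatim, and it shows how comparing accuracy on the training data with accuracy on held-out inputs exposes that weakness. The even/odd task and all names are invented for illustration.

```python
def label(x):
    """Ground-truth rule the model is supposed to learn: is x even?"""
    return x % 2 == 0


class MemorizingModel:
    """Hypothetical learner that memorizes training pairs instead of the rule."""

    def fit(self, examples):
        # Store every (input, label) pair exactly as seen.
        self.table = dict(examples)

    def predict(self, x):
        # Inputs never seen in training fall back to a fixed guess.
        return self.table.get(x, False)


def accuracy(model, examples):
    """Fraction of examples the model labels correctly."""
    correct = sum(1 for x, y in examples if model.predict(x) == y)
    return correct / len(examples)


# Inputs 0..149 are used for training; 150..199 are novel, held-out inputs.
train = [(x, label(x)) for x in range(0, 150)]
held_out = [(x, label(x)) for x in range(150, 200)]

model = MemorizingModel()
model.fit(train)

train_acc = accuracy(model, train)
held_out_acc = accuracy(model, held_out)
gap = train_acc - held_out_acc

print(f"training accuracy: {train_acc:.2f}")     # 1.00
print(f"held-out accuracy: {held_out_acc:.2f}")  # 0.50
print(f"generalization gap: {gap:.2f}")          # 0.50
```

A perfect training score alongside coin-flip accuracy on unseen inputs is the signature of memorization. Evaluating only on the training data would have hidden it.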

At IP.com, testing ensures the AI models that power the workflow solutions are working as intended, delivering relevant patent search results in a fraction of the time it takes trained human searchers to complete the same tasks. Users can add the software suite to their innovation workflow with confidence, knowing IP.com’s AI understands exactly what they’re looking for.

[Figure: Screenshot of a patent broken apart, highlighting what the AI deems key information]

Observing AI

How people interact with search AI is changing. Think about how you used Google a couple of decades ago. While your query wasn’t strictly Boolean logic, it was probably made up of fairly simple phrases. Now, we type (or speak) complete sentences into Google and expect the AI engine to understand what we’re looking for. The same is true for patent searching!

With hundreds of thousands of patents issued each year, as well as new litigation records and technical literature, the AI engine is constantly evolving. This evolution, the ability to continue learning based on new inputs, is part of what makes AI powerful. It can also end poorly without observation. Just spend a few minutes browsing AI Weirdness and you’ll quickly understand how AI can get sidetracked without supervision. IP.com observes and maintains its AI engine to meet users’ current needs and their rising expectations for what an AI tool can deliver. This final piece of the AI training process is what ensures IP.com continuously delivers top-of-class results.
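One way to picture ongoing observation is to re-score search quality at regular intervals and flag any period that slips below an agreed baseline. The sketch below is an assumption, not a description of IP.com’s monitoring: precision at k is a standard retrieval metric, but the weekly scores, the baseline, the tolerance, and the function names are all hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that a reviewer marked relevant."""
    top_k = ranked_ids[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / len(top_k)


def flag_regressions(scores_by_period, baseline, tolerance=0.05):
    """Return the periods whose score fell more than `tolerance` below baseline."""
    return [
        period
        for period, score in scores_by_period.items()
        if score < baseline - tolerance
    ]


# One reviewed query: 2 of the top 4 results were judged relevant.
print(precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4))  # 0.5

# Hypothetical weekly averages of precision at k, invented for illustration.
weekly_scores = {"week-1": 0.82, "week-2": 0.80, "week-3": 0.71}
print(flag_regressions(weekly_scores, baseline=0.80))  # ['week-3']
```

The tolerance keeps normal week-to-week noise from raising alarms, while a sustained drop, such as one caused by new terminology the model has not yet seen, still surfaces for human review.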