This invention measures similarities between sets of data. The data could be natural-language documents or articles, product descriptions, queries, computer code, metadata, or measurements from any real-world objects or processes.
The technology can determine similarities between data sets without needing to know how the sets overlap (their intersection). Because it omits duplicate pieces of data, it provides more accurate results. It can also identify patterns over time in the data entered. The invention takes a holistic view of the data to make recommendations that are more accurate than those of commonly used methods.
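To make the idea concrete, here is a minimal sketch of one standard way to score similarity without ever computing the intersection directly. It is not the method claimed in the patent. It uses the inclusion-exclusion identity |A ∩ B| = |A| + |B| - |A ∪ B| to derive a Jaccard similarity from set sizes alone, and it drops duplicates first. The function name `jaccard_from_union` and the sample data are hypothetical, chosen only for illustration.

```python
from typing import Hashable, Iterable


def jaccard_from_union(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Illustrative Jaccard similarity computed without building the intersection.

    Assumption: this is a textbook identity, not the patented method.
    """
    # Converting to sets omits duplicate pieces of data.
    set_a = set(a)
    set_b = set(b)

    union_size = len(set_a | set_b)
    if union_size == 0:
        # Convention chosen for this sketch: two empty sets are identical.
        return 1.0

    # Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|.
    overlap_size = len(set_a) + len(set_b) - union_size
    return overlap_size / union_size


if __name__ == "__main__":
    doc_a = ["red", "green", "green", "blue"]  # "green" is a duplicate
    doc_b = ["green", "blue", "yellow"]
    print(jaccard_from_union(doc_a, doc_b))  # prints 0.5
```

A score of 1.0 means the two sets hold the same distinct items and 0.0 means they share none. For very large data sets, sketching techniques that estimate these sizes would be a natural substitute for the exact set operations used here.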
Potential applications include pattern analysis for websites or applications, social network analysis, focused advertising, genetic analysis, and forensic accounting.
- Measures similarity between sets without knowing the intersection of the sets
- US patent 8,799,339 available for license
- Potential for collaboration with NSA scientists and engineers