taxonomise uses embedding-based similarity matching to classify documents against hierarchical taxonomies. It provides a Python API and CLI tool for applying custom taxonomies to any text corpus.
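As a rough illustration of what embedding-based matching against a hierarchical taxonomy can look like, here is a minimal sketch. It is not taxonomise's actual API: the `embed`, `cosine`, and `classify` functions and the `TAXONOMY` structure are hypothetical names invented for this example. The bag-of-words `embed` stands in for a real embedding model, and the greedy top-down descent is only one possible matching strategy.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: sparse bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical taxonomy: each node has a text description and child nodes.
TAXONOMY = {
    "science": {
        "description": "science research physics biology experiments",
        "children": {
            "physics": {"description": "physics quantum particles energy", "children": {}},
            "biology": {"description": "biology cells genes organisms", "children": {}},
        },
    },
    "sports": {
        "description": "sports games teams football tennis matches",
        "children": {
            "football": {"description": "football goal league teams", "children": {}},
            "tennis": {"description": "tennis racket serve set", "children": {}},
        },
    },
}


def classify(text, taxonomy):
    """Descend the taxonomy, picking the most similar child at each level."""
    doc = embed(text)
    path, level = [], taxonomy
    while level:
        name, node = max(
            level.items(),
            key=lambda item: cosine(doc, embed(item[1]["description"])),
        )
        path.append(name)
        level = node["children"]
    return path


print(classify("quantum particles carry energy in physics experiments", TAXONOMY))
# → ['science', 'physics']
```

This sketch always descends to a leaf, even when no category is a good match. A practical classifier would likely add a similarity threshold and stop early when no child scores above it.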
Abstract: Semantic search engines are persistently hampered by lexical ambiguity, imprecise intent mapping, and computational inefficiency, which typically lead to suboptimal recall and relevance. In this ...