Sort by Fuzzy Key Match¶
Example
>>> import sys
>>> from io import StringIO
>>> from techminer2.thesaurus.organizations import CreateThesaurus, SortByFuzzyKeyMatch
>>> # Redirecting stderr to avoid messages
>>> original_stderr = sys.stderr
>>> sys.stderr = StringIO()
>>> # Create thesaurus
>>> CreateThesaurus(root_directory="example/", quiet=True).run()
>>> # Create and run the sorter
>>> sorter = (
... SortByFuzzyKeyMatch()
... #
... # THESAURUS:
... .having_pattern("Texas")
... .having_match_threshold(70)
... #
... # DATABASE:
... .where_root_directory_is("example/")
... )
>>> sorter.run()
>>> # Capture and print stderr output
>>> output = sys.stderr.getvalue()
>>> sys.stderr = original_stderr
>>> print(output)
Reducing thesaurus keys
File : example/thesaurus/organizations.the.txt
Keys reduced from 90 to 90
Keys reduction completed successfully
Sorting thesaurus by fuzzy match
File : example/thesaurus/organizations.the.txt
Keys like : Texas
Match thresold : 70
3 matching keys found
Thesaurus sorting completed successfully
Printing thesaurus header
File : example/thesaurus/organizations.the.txt
Texas AandM Univ Sch of Law (USA)
Texas AandM University School of Law, United States
Univ of North Texas (USA)
Department of Information Technology & Decision Sciences, University of N...
Univ of Texas at Austin (USA)
McCombs School of Business, University of Texas at Austin, United States
[UKN] Brussels, Belgium (BEL)
Brussels, Belgium
[UKN] CESifo, Poschingerstr. 5, Munich, 81679, Germany (DEU)
CESifo, Poschingerstr. 5, Munich, 81679, Germany
[UKN] FinTech HK, Hong Kong (HKG)
FinTech HK, Hong Kong
[UKN] Hochschule für Wirtschaft Fribourg, Switzerland (CHE)
Hochschule für Wirtschaft Fribourg, Switzerland
[UKN] Information Technol, Univeril, Germany (DEU)
Information Technology, Univeril, Germany