Sort By Exact Key Match

Example

>>> import sys
>>> from io import StringIO
>>> from techminer2.thesaurus.user import CreateThesaurus, SortByExactKeyMatch
>>> # Redirect stderr so its messages can be captured during doctests
>>> original_stderr = sys.stderr
>>> sys.stderr = StringIO()
>>> # Reset the thesaurus to initial state
>>> CreateThesaurus(thesaurus_file="demo.the.txt", field="raw_descriptors",
...     root_directory="example/", quiet=True).run()
>>> # Create, configure, and run the sorter
>>> sorter = (
...     SortByExactKeyMatch()
...     .with_thesaurus_file("demo.the.txt")
...     .having_pattern(
...         [
...             "BUSINESS_INFRASTRUCTURE",
...             "BUSINESS_OPPORTUNITIES",
...         ]
...     )
...     .having_case_sensitive(False)
...     .having_regex_flags(0)
...     .having_regex_search(False)
...     .where_root_directory_is("example/")
... )
>>> sorter.run()
>>> # Capture and print stderr output to test the code using doctest
>>> output = sys.stderr.getvalue()
>>> sys.stderr = original_stderr
>>> print(output)
Reducing thesaurus keys
  File : example/thesaurus/demo.the.txt
  Keys reduced from 1729 to 1729
  Keys reduction completed successfully

Sorting thesaurus file by exact key match
     File : example/thesaurus/demo.the.txt
  Pattern : ['BUSINESS_INFRASTRUCTURE', 'BUSINESS_OPPORTUNITIES']
  2 matching keys found
  Thesaurus sorting by exact key match completed successfully

Printing thesaurus header
  File : example/thesaurus/demo.the.txt

    BUSINESS_INFRASTRUCTURE
      BUSINESS_INFRASTRUCTURE; BUSINESS_INFRASTRUCTURES
    BUSINESS_OPPORTUNITIES
      BUSINESS_OPPORTUNITIES
    A_A_THEORY
      A_A_THEORY
    A_BASIC_RANDOM_SAMPLING_STRATEGY
      A_BASIC_RANDOM_SAMPLING_STRATEGY
    A_BEHAVIOURAL_PERSPECTIVE
      A_BEHAVIOURAL_PERSPECTIVE
    A_BETTER_UNDERSTANDING
      A_BETTER_UNDERSTANDING
    A_BLOCKCHAIN_IMPLEMENTATION_STUDY
      A_BLOCKCHAIN_IMPLEMENTATION_STUDY
    A_CASE_STUDY
      A_CASE_STUDY
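
The reordering shown in the header above can be illustrated with a minimal, standalone sketch. It is not techminer2's implementation: the function name `sort_by_exact_key_match`, the in-memory dict representation of the thesaurus, and the choice to keep matched keys in their original relative order are all assumptions made for illustration. It only mirrors the observable behavior, in which keys that exactly match one of the patterns move to the top and the remaining keys keep their order.

```python
def sort_by_exact_key_match(thesaurus, patterns, case_sensitive=False):
    """Return a new dict with exactly matching keys moved to the front.

    Hypothetical helper for illustration only; not part of techminer2.
    """
    # Compare keys as-is, or lowercased when matching is case-insensitive.
    normalize = (lambda s: s) if case_sensitive else str.lower
    wanted = {normalize(p) for p in patterns}

    # Stable partition: both groups keep their original relative order.
    matched = {k: v for k, v in thesaurus.items() if normalize(k) in wanted}
    others = {k: v for k, v in thesaurus.items() if normalize(k) not in wanted}
    return {**matched, **others}


# Small in-memory stand-in for a thesaurus (key -> list of terms).
demo = {
    "A_A_THEORY": ["A_A_THEORY"],
    "A_CASE_STUDY": ["A_CASE_STUDY"],
    "BUSINESS_INFRASTRUCTURE": [
        "BUSINESS_INFRASTRUCTURE",
        "BUSINESS_INFRASTRUCTURES",
    ],
    "BUSINESS_OPPORTUNITIES": ["BUSINESS_OPPORTUNITIES"],
}

result = sort_by_exact_key_match(
    demo,
    ["business_infrastructure", "BUSINESS_OPPORTUNITIES"],
)
print(list(result))
# ['BUSINESS_INFRASTRUCTURE', 'BUSINESS_OPPORTUNITIES', 'A_A_THEORY', 'A_CASE_STUDY']
```

Because the match is exact, a pattern such as `"BUSINESS"` would move nothing in this sketch; only a key equal to the whole pattern (after optional lowercasing) is promoted.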