Miners (Experimental)

class FilingsNarrativeMiner
__init__(theme_labels, start_date, end_date, llm_model, fiscal_year, sources=None, rerank_threshold=None)

Tracks narratives in filings.

Parameters:
  • theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.

  • start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)

  • end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)

  • llm_model (str) – Specifies the LLM to be used in text processing and analysis.

  • fiscal_year (int) – The fiscal year for which filings should be analyzed

  • sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.

  • rerank_threshold (float | None) – Threshold for the cross-encoder reranker. Setting a value in the range [0, 1] enables the cross-encoder.
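The constraints in the parameter list above (YYYY-MM-DD dates, a threshold in [0, 1]) can be checked before a miner is constructed. The following is a minimal sketch, not part of the library: `build_miner_kwargs` is a hypothetical helper, and the rule that `start_date` must not fall after `end_date` is an assumption rather than documented behaviour.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def build_miner_kwargs(
    theme_labels: List[str],
    start_date: str,
    end_date: str,
    llm_model: str,
    fiscal_year: int,
    sources: Optional[List[str]] = None,
    rerank_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments, then return them as kwargs."""
    if not theme_labels:
        raise ValueError("theme_labels must contain at least one label")

    # strptime raises ValueError if the string does not parse as year-month-day.
    start = datetime.strptime(start_date, "%Y-%m-%d").date()
    end = datetime.strptime(end_date, "%Y-%m-%d").date()
    if start > end:  # assumption: an inverted date range is never intended
        raise ValueError("start_date must not be after end_date")

    # None leaves the threshold unset; otherwise it must lie in [0, 1].
    if rerank_threshold is not None and not 0.0 <= rerank_threshold <= 1.0:
        raise ValueError("rerank_threshold must lie in [0, 1]")

    return {
        "theme_labels": theme_labels,
        "start_date": start_date,
        "end_date": end_date,
        "llm_model": llm_model,
        "fiscal_year": fiscal_year,
        "sources": sources,
        "rerank_threshold": rerank_threshold,
    }
```

The returned dictionary uses the documented parameter names, so it could be passed as `FilingsNarrativeMiner(**kwargs)`; the value given for `llm_model` would be whatever model identifier your installation accepts.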

mine_narratives(document_limit=10, batch_size=10, freq='3M', export_to_path=None)

Mines narratives by searching against filings.

Parameters:
  • document_limit (int) – Maximum number of documents to analyze

  • batch_size (int) – Size of batches for processing

  • freq (str) – Frequency for analysis (e.g. 'M' for monthly). Defaults to '3M'.

  • export_to_path (str | None) – Optional path to export results to Excel

Returns:

Dictionary containing analysis results

Return type:

Dict
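To illustrate how `document_limit` and `batch_size` relate, here is a minimal sketch of capping a collection and splitting it into batches. It is an assumption about the general pattern, not the library's internal implementation; `iter_batches` is a hypothetical name.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(
    items: Sequence[T], document_limit: int = 10, batch_size: int = 10
) -> Iterator[List[T]]:
    """Hypothetical helper: keep at most `document_limit` items, then batch them."""
    if document_limit < 0:
        raise ValueError("document_limit must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    capped = list(items[:document_limit])  # apply the cap first
    for start in range(0, len(capped), batch_size):
        # The final batch may be shorter than batch_size.
        yield capped[start:start + batch_size]
```

Under this reading, 25 candidate documents with `document_limit=10` and `batch_size=4` would be processed as batches of 4, 4 and 2 items.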

save_to_excel(file_path)

Save the analysis results to an Excel file.

Parameters:

file_path (str) – Path where the Excel file should be saved

Return type:

None

class NewsNarrativeMiner
__init__(theme_labels, start_date, end_date, llm_model, sources, rerank_threshold)

Tracks narratives in the news.

Parameters:
  • theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.

  • start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)

  • end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)

  • llm_model (str) – Specifies the LLM to be used in text processing and analysis.

  • sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.

  • rerank_threshold (float | None) – Threshold for the cross-encoder reranker. Setting a value in the range [0, 1] enables the cross-encoder.
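The effect of a rerank threshold can be pictured with a small sketch. It assumes that the cross-encoder assigns each search result chunk a relevance score in [0, 1] and that chunks scoring below the threshold are dropped; the source does not spell out these semantics, and `filter_by_rerank_threshold` is a hypothetical name.

```python
from typing import List, Optional, Tuple


def filter_by_rerank_threshold(
    scored_chunks: List[Tuple[str, float]],
    rerank_threshold: Optional[float] = None,
) -> List[str]:
    """Hypothetical helper: keep chunks whose score meets the threshold."""
    if rerank_threshold is None:
        # Assumption: no threshold means no reranking, so every chunk is kept.
        return [text for text, _ in scored_chunks]
    if not 0.0 <= rerank_threshold <= 1.0:
        raise ValueError("rerank_threshold must lie in [0, 1]")
    return [text for text, score in scored_chunks if score >= rerank_threshold]
```

A higher threshold trades recall for precision: fewer chunks reach the labelling step, but those that do are scored as more relevant to the theme labels.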

mine_narratives(document_limit=50, batch_size=10, freq='M', export_to_path=None)

Mines narratives by searching against news.

Parameters:
  • document_limit (int) – Maximum number of documents to analyze

  • batch_size (int) – Size of batches for processing

  • freq (str) – Frequency for analysis ('M' for monthly, which is the default)

  • export_to_path (str | None) – Optional path to export results to Excel

Returns:

DataFrame with the following schema, or None if no relevant content is found:

  • Index: int

  • Columns:
    • Time Period

    • Date

    • Document ID

    • Headline

    • Quote

    • Motivation

    • Label

Return type:

DataFrame | None
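Because the method may return None, callers need to guard before using the result. The sketch below counts rows per (Time Period, Label) pair using the column names documented above. To stay dependency-free it works on a list of row dictionaries, which pandas can produce with `df.to_dict("records")`; `count_labels_by_period` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Column names as listed in the documented schema.
EXPECTED_COLUMNS = (
    "Time Period", "Date", "Document ID", "Headline",
    "Quote", "Motivation", "Label",
)


def count_labels_by_period(
    rows: Optional[List[Dict[str, str]]],
) -> Dict[Tuple[str, str], int]:
    """Hypothetical helper: count rows per (Time Period, Label) pair."""
    if rows is None:  # the miner found no relevant content
        return {}
    counts: Counter = Counter()
    for row in rows:
        missing = [col for col in EXPECTED_COLUMNS if col not in row]
        if missing:
            raise KeyError(f"row is missing columns: {missing}")
        counts[(row["Time Period"], row["Label"])] += 1
    return dict(counts)
```

With a real result, this could be called as `count_labels_by_period(None if df is None else df.to_dict("records"))`.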

save_to_excel(file_path)

Save the analysis results to an Excel file.

Parameters:

file_path (str) – Path where the Excel file should be saved

Return type:

None

class TranscriptsNarrativeMiner
__init__(theme_labels, start_date, end_date, llm_model, fiscal_year, sources=None, rerank_threshold=None)

Tracks narratives in transcripts.

Parameters:
  • theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.

  • start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)

  • end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)

  • llm_model (str) – Specifies the LLM to be used in text processing and analysis.

  • fiscal_year (int) – The fiscal year for which executive transcripts should be analyzed.

  • sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.

  • rerank_threshold (float | None) – Threshold for the cross-encoder reranker. Setting a value in the range [0, 1] enables the cross-encoder.

mine_narratives(document_limit=10, batch_size=10, freq='3M', export_to_path=None)

Mines narratives by searching against transcripts.

Parameters:
  • document_limit (int) – Maximum number of documents to analyze

  • batch_size (int) – Size of batches for processing

  • freq (str) – Frequency for analysis (e.g. 'M' for monthly). Defaults to '3M'.

  • export_to_path (str | None) – Optional path to export results to Excel

Returns:

DataFrame with the following schema, or None if no relevant content is found:

  • Index: int

  • Columns:
    • Time Period

    • Date

    • Document ID

    • Headline

    • Quote

    • Motivation

    • Label

Return type:

DataFrame | None
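The `freq` values 'M' and '3M' look like pandas-style offset aliases, although the source does not confirm how the library interprets them. Under that assumption, the sketch below shows one way a date could be assigned to an n-month window counted from the start month; `month_bucket` is a hypothetical helper and the exact window boundaries used by the miners may differ.

```python
import re
from datetime import date


def month_bucket(day: date, start: date, freq: str = "3M") -> date:
    """Hypothetical helper: first day of the n-month window containing `day`.

    Windows are counted from the month of `start`; `freq` is "M" or "<n>M".
    """
    match = re.fullmatch(r"(\d*)M", freq)
    if match is None:
        raise ValueError(f"unsupported freq: {freq!r}")
    n = int(match.group(1) or 1)  # a bare "M" means one month
    if n < 1:
        raise ValueError("freq multiplier must be at least 1")

    months_since_start = (day.year - start.year) * 12 + (day.month - start.month)
    if months_since_start < 0:
        raise ValueError("day precedes start")

    # Round down to the beginning of the window, measured in months from start.
    window_offset = months_since_start - months_since_start % n
    total = start.month - 1 + window_offset
    return date(start.year + total // 12, total % 12 + 1, 1)
```

For example, with a start of 2024-01-01 and `freq="3M"`, a document dated 2024-05-20 falls into the window beginning 2024-04-01.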

save_to_excel(file_path)

Save the analysis results to an Excel file.

Parameters:

file_path (str) – Path where the Excel file should be saved

Return type:

None