Miners (Experimental)

class FilingsNarrativeMiner

__init__(theme_labels, start_date, end_date, llm_model, fiscal_year, sources=None, rerank_threshold=None)

Tracks narratives in filings.

Parameters:

theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.
start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)
end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)
llm_model (str) – Specifies the LLM to be used in text processing and analysis.
fiscal_year (int) – The fiscal year for which filings should be analyzed
sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.
rerank_threshold (float | None) – Enables the cross-encoder when set to a value in the range [0, 1].
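
The documented formats above (YYYY-MM-DD dates, a threshold in [0, 1]) can be checked before a miner is constructed. The following is a minimal sketch, not part of the library: `build_miner_kwargs` is a hypothetical helper, the rule that `start_date` must not be after `end_date` is an assumption, and the theme label and model name are placeholders.

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_date(value):
    """Parse a YYYY-MM-DD string, raising ValueError on any other shape."""
    if not DATE_PATTERN.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    # strptime additionally rejects impossible dates such as 2024-02-30.
    return datetime.strptime(value, "%Y-%m-%d").date()


def build_miner_kwargs(theme_labels, start_date, end_date, llm_model,
                       fiscal_year, sources=None, rerank_threshold=None):
    """Hypothetical helper: validate arguments and return them as a dict."""
    if not theme_labels or not all(isinstance(t, str) for t in theme_labels):
        raise ValueError("theme_labels must be a non-empty list of strings")
    # Assumption: a start date after the end date is treated as an error.
    if parse_date(start_date) > parse_date(end_date):
        raise ValueError("start_date must not be after end_date")
    if rerank_threshold is not None and not 0.0 <= rerank_threshold <= 1.0:
        raise ValueError("rerank_threshold must lie in [0, 1]")
    return {
        "theme_labels": list(theme_labels),
        "start_date": start_date,
        "end_date": end_date,
        "llm_model": llm_model,
        "fiscal_year": fiscal_year,
        "sources": sources,
        "rerank_threshold": rerank_threshold,
    }


# Placeholder values for illustration only.
kwargs = build_miner_kwargs(
    theme_labels=["Supply chain disruption"],
    start_date="2024-01-01",
    end_date="2024-12-31",
    llm_model="placeholder-model",
    fiscal_year=2024,
    rerank_threshold=0.7,
)
```

The resulting dictionary uses the same keyword names as the `__init__` signature, so it could be unpacked into the constructor as `FilingsNarrativeMiner(**kwargs)`.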

mine_narratives(document_limit=10, batch_size=10, freq='3M', export_to_path=None)

Mine narratives by searching filings.

Parameters:

document_limit (int) – Maximum number of documents to analyze
batch_size (int) – Size of batches for processing
freq (str) – Frequency for analysis (e.g. ‘M’ for monthly). Defaults to ‘3M’, i.e. three-month periods.
export_to_path (str | None) – Optional path to export results to Excel

Returns:

Dictionary containing analysis results

Return type:

Dict

save_to_excel(file_path)

Save the analysis results to an Excel file.

Parameters:

file_path (str) – Path where the Excel file should be saved

Return type:

None

class NewsNarrativeMiner

__init__(theme_labels, start_date, end_date, llm_model, sources, rerank_threshold)

Tracks narratives in the news.

Parameters:

theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.
start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)
end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)
llm_model (str) – Specifies the LLM to be used in text processing and analysis.
sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.
rerank_threshold (float | None) – Enables the cross-encoder when set to a value in the range [0, 1].

mine_narratives(document_limit=50, batch_size=10, freq='M', export_to_path=None)

Mine narratives by searching news.

Parameters:

document_limit (int) – Maximum number of documents to analyze
batch_size (int) – Size of batches for processing
freq (str) – Frequency for analysis (‘M’ for monthly)
export_to_path (str | None) – Optional path to export results to Excel

Returns:

DataFrame with an integer index and the following columns:
- Time Period
- Date
- Document ID
- Headline
- Quote
- Motivation
- Label

If no relevant content is found, returns None.

Return type:

DataFrame | None
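
Because `mine_narratives` may return None, callers need a guard before reading the documented columns. The sketch below is one way to summarize a result by `Time Period` and `Label` using only the standard library. It assumes the returned object is a pandas DataFrame, so that `to_dict("records")` is available; `records_or_empty` and `count_labels_by_period` are hypothetical helpers, and the sample rows are invented for illustration.

```python
from collections import Counter


def records_or_empty(result):
    """Return the result rows as a list of dicts, or [] when nothing was found."""
    if result is None:
        return []
    # Assumption: `result` is a pandas DataFrame.
    return result.to_dict("records")


def count_labels_by_period(records):
    """Count result rows per (Time Period, Label) pair."""
    counts = Counter()
    for row in records:
        counts[(row["Time Period"], row["Label"])] += 1
    return counts


# Hypothetical rows standing in for records_or_empty(miner.mine_narratives()).
# The values are illustrative; real period and label formats may differ.
sample_records = [
    {"Time Period": "2024-01", "Label": "Theme A"},
    {"Time Period": "2024-01", "Label": "Theme A"},
    {"Time Period": "2024-02", "Label": "Theme B"},
]

summary = count_labels_by_period(sample_records)
for (period, label), n in sorted(summary.items()):
    print(period, label, n)
```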

save_to_excel(file_path)

Save the analysis results to an Excel file.

Parameters:

file_path (str) – Path where the Excel file should be saved

Return type:

None

class TranscriptsNarrativeMiner

__init__(theme_labels, start_date, end_date, llm_model, fiscal_year, sources=None, rerank_threshold=None)

Tracks narratives in transcripts.

Parameters:

theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.
start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)
end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)
llm_model (str) – Specifies the LLM to be used in text processing and analysis.
fiscal_year (int) – The fiscal year for which executive transcripts should be analyzed.
sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.
rerank_threshold (float | None) – Enables the cross-encoder when set to a value in the range [0, 1].

mine_narratives(document_limit=10, batch_size=10, freq='3M', export_to_path=None)

Mine narratives by searching transcripts.

Parameters:

document_limit (int) – Maximum number of documents to analyze
batch_size (int) – Size of batches for processing
freq (str) – Frequency for analysis (e.g. ‘M’ for monthly). Defaults to ‘3M’, i.e. three-month periods.
export_to_path (str | None) – Optional path to export results to Excel

Returns:

DataFrame with an integer index and the following columns:
- Time Period
- Date
- Document ID
- Headline
- Quote
- Motivation
- Label

If no relevant content is found, returns None.

Return type:

DataFrame | None

save_to_excel(file_path)

Save the analysis results to an Excel file.

Parameters:

file_path (str) – Path where the Excel file should be saved

Return type:

None
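
The two methods can be combined so that an Excel file is written only when a run finds something. The following is a rough sketch under stated assumptions: `mine_and_save` is a hypothetical helper, `save_to_excel` is assumed to export the results of the most recent `mine_narratives` call, and `StubMiner` is a stand-in that mimics only the documented method signatures so the example runs without the library.

```python
def mine_and_save(miner, file_path, document_limit=10, batch_size=10, freq="3M"):
    """Run mine_narratives and export to Excel only when results exist."""
    result = miner.mine_narratives(
        document_limit=document_limit, batch_size=batch_size, freq=freq
    )
    if result is None:
        # Documented behaviour: None means no relevant content was found.
        return None
    # Assumption: save_to_excel exports the results of the call above.
    miner.save_to_excel(file_path)
    return result


class StubMiner:
    """Hypothetical stand-in exposing the documented method signatures."""

    def __init__(self, result):
        self._result = result
        self.saved_to = None

    def mine_narratives(self, document_limit=10, batch_size=10, freq="3M",
                        export_to_path=None):
        return self._result

    def save_to_excel(self, file_path):
        self.saved_to = file_path


# With results, the export path is recorded; with None, nothing is saved.
found = StubMiner(result=["row"])
empty = StubMiner(result=None)
found_result = mine_and_save(found, "narratives.xlsx")
empty_result = mine_and_save(empty, "narratives.xlsx")
```

Passing `export_to_path` directly to `mine_narratives` is the documented alternative. The helper is only useful when the export should depend on the result.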