Miners (Experimental)
- class FilingsNarrativeMiner
- __init__(theme_labels, start_date, end_date, llm_model, fiscal_year, sources=None, rerank_threshold=None)
This class will track narratives in filings.
- Parameters:
theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.
start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)
end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)
llm_model (str) – Specifies the LLM to be used in text processing and analysis.
fiscal_year (int) – The fiscal year for which filings should be analyzed
sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.
rerank_threshold (float | None) – Enables the cross-encoder when set to a value in the range [0, 1].
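The constructor arguments above carry format constraints (dates as YYYY-MM-DD, a threshold in [0, 1]) that can be checked before any search is run. The following is a minimal sketch of such a check. `build_miner_kwargs` is a hypothetical helper and not part of the library, the `llm_model` string is a placeholder, and the rule that the start date must not follow the end date is an assumption rather than documented behavior.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def build_miner_kwargs(
    theme_labels: List[str],
    start_date: str,
    end_date: str,
    llm_model: str,
    fiscal_year: int,
    sources: Optional[List[str]] = None,
    rerank_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the documented arguments and return them as kwargs."""
    if not theme_labels:
        raise ValueError("theme_labels must contain at least one label")

    # strptime raises ValueError when the string does not match year-month-day.
    start = datetime.strptime(start_date, "%Y-%m-%d")
    end = datetime.strptime(end_date, "%Y-%m-%d")
    if start > end:  # assumption: the range must be ordered
        raise ValueError("start_date must not be after end_date")

    # The documented range for the threshold is [0, 1]; None leaves it unset.
    if rerank_threshold is not None and not 0.0 <= rerank_threshold <= 1.0:
        raise ValueError("rerank_threshold must be between 0 and 1")

    return {
        "theme_labels": theme_labels,
        "start_date": start_date,
        "end_date": end_date,
        "llm_model": llm_model,
        "fiscal_year": fiscal_year,
        "sources": sources,
        "rerank_threshold": rerank_threshold,
    }


kwargs = build_miner_kwargs(
    theme_labels=["Supply Chain", "Tariffs"],
    start_date="2024-01-01",
    end_date="2024-12-31",
    llm_model="placeholder-model",  # placeholder, not a real model identifier
    fiscal_year=2024,
    rerank_threshold=0.7,
)
# With the library installed and configured, the result could be passed on:
# miner = FilingsNarrativeMiner(**kwargs)
```

Validating up front keeps malformed dates or out-of-range thresholds from surfacing only after a search has started.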
- mine_narratives(document_limit=10, batch_size=10, freq='3M', export_to_path=None)
Mine narratives by searching against filings
- Parameters:
document_limit (int) – Maximum number of documents to analyze
batch_size (int) – Size of batches for processing
freq (str) – Frequency for analysis (e.g. 'M' for monthly; the default is '3M')
export_to_path (str | None) – Optional path to export results to Excel
- Returns:
Dictionary containing analysis results
- Return type:
Dict
- class NewsNarrativeMiner
- __init__(theme_labels, start_date, end_date, llm_model, sources, rerank_threshold)
This class will track narratives in the news.
- Parameters:
theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.
start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)
end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)
llm_model (str) – Specifies the LLM to be used in text processing and analysis.
sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.
rerank_threshold (float | None) – Enables the cross-encoder when set to a value in the range [0, 1].
- mine_narratives(document_limit=50, batch_size=10, freq='M', export_to_path=None)
Mine narratives by searching against news
- Parameters:
document_limit (int) – Maximum number of documents to analyze
batch_size (int) – Size of batches for processing
freq (str) – Frequency for analysis (‘M’ for monthly)
export_to_path (str | None) – Optional path to export results to Excel
- Returns:
DataFrame with the following schema, or None if no relevant content is found.
Index: int
- Columns:
Time Period
Date
Document ID
Headline
Quote
Motivation
Label
- Return type:
DataFrame | None
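Because the documented return value is either a DataFrame with the columns listed above or None, calling code needs to handle the empty case before aggregating. The following is a minimal sketch, assuming the rows have first been converted to a list of dictionaries (for example with pandas `DataFrame.to_dict("records")`). `count_by_period_and_label` is a hypothetical helper, and the sample rows are placeholders invented for illustration, not library output.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Record = Dict[str, object]


def count_by_period_and_label(
    records: Optional[List[Record]],
) -> Dict[Tuple[str, str], int]:
    """Count result rows per (Time Period, Label); None or empty input gives {}."""
    if not records:
        return {}
    counts = Counter(
        (str(row["Time Period"]), str(row["Label"])) for row in records
    )
    return dict(counts)


# Placeholder rows using two of the documented column names.
sample = [
    {"Time Period": "period-1", "Label": "label-a"},
    {"Time Period": "period-1", "Label": "label-a"},
    {"Time Period": "period-2", "Label": "label-b"},
]

summary = count_by_period_and_label(sample)
empty = count_by_period_and_label(None)  # the "no relevant content" case
```

Counting per period and label gives a quick view of how often each theme appears over time, which is the kind of trend the miner output is organized around.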
- class TranscriptsNarrativeMiner
- __init__(theme_labels, start_date, end_date, llm_model, fiscal_year, sources=None, rerank_threshold=None)
This class will track narratives in transcripts.
- Parameters:
theme_labels (List[str]) – List of strings which define the taxonomy of the theme. These will be used in both the search and the labelling of the search result chunks.
start_date (str) – The start date for searching relevant documents (format: YYYY-MM-DD)
end_date (str) – The end date for searching relevant documents (format: YYYY-MM-DD)
llm_model (str) – Specifies the LLM to be used in text processing and analysis.
fiscal_year (int) – The fiscal year for which executive transcripts should be analyzed.
sources (List[str] | None) – Used to filter search results by the sources of the documents. If not provided, the search is run across all available sources.
rerank_threshold (float | None) – Enables the cross-encoder when set to a value in the range [0, 1].
- mine_narratives(document_limit=10, batch_size=10, freq='3M', export_to_path=None)
Mine narratives by searching against transcripts
- Parameters:
document_limit (int) – Maximum number of documents to analyze
batch_size (int) – Size of batches for processing
freq (str) – Frequency for analysis (e.g. 'M' for monthly; the default is '3M')
export_to_path (str | None) – Optional path to export results to Excel
- Returns:
DataFrame with the following schema, or None if no relevant content is found.
Index: int
- Columns:
Time Period
Date
Document ID
Headline
Quote
Motivation
Label
- Return type:
DataFrame | None
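All three miner classes document a `mine_narratives` method with the same parameter names but different defaults (`document_limit` of 10 or 50, `freq` of '3M' or 'M'). One way to write code that works with any of them is to pass every argument explicitly. The following is a minimal sketch under that idea. `NarrativeMiner`, `run_miner`, and `StubMiner` are hypothetical names introduced here, and the stub stands in for a real miner so the example runs without the library.

```python
from typing import Any, Optional, Protocol


class NarrativeMiner(Protocol):
    """Structural type for any object exposing the documented method."""

    def mine_narratives(
        self,
        document_limit: int,
        batch_size: int,
        freq: str,
        export_to_path: Optional[str] = None,
    ) -> Any: ...


def run_miner(
    miner: NarrativeMiner,
    freq: str,
    document_limit: int = 10,
    batch_size: int = 10,
) -> Any:
    """Call mine_narratives with explicit arguments so class defaults do not matter."""
    return miner.mine_narratives(
        document_limit=document_limit,
        batch_size=batch_size,
        freq=freq,
        export_to_path=None,
    )


class StubMiner:
    """Stand-in used only to make this sketch runnable; not a library class."""

    def mine_narratives(
        self, document_limit=10, batch_size=10, freq="3M", export_to_path=None
    ):
        # Echo the arguments back instead of running a search.
        return {"freq": freq, "document_limit": document_limit}


result = run_miner(StubMiner(), freq="M")
```

Passing `freq` and `document_limit` explicitly avoids silently comparing monthly news results against quarterly filings or transcripts results.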