Changelog

All notable changes to the bigdata-client package will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[2.13.0] - 2025-04-23

Changed

  • Requests for Transcript and Filing types now generate a new request parameter

Added

  • New search class DocumentVersion

  • Tracking parameter for chat.ask requests

[2.12.0] - 2025-04-09

Added

  • The method chat.ask supports streaming with the parameter streaming. More details in Chat API Reference

  • Filter watchlist by owned parameter

[2.11.1] - 2025-03-28

Changed

  • HTTP header support

[2.11.0] - 2025-03-27

Added

  • Inline attributions to chat responses

  • Inline attribution formatters: DefaultInlineAttributionFormatter

  • Support for sorting by date in ascending order: SortBy.DATE_ASC

[2.10.1] - 2025-03-17

Fixed

  • Typo in ChatScope

[2.10.0] - 2025-03-13

Added

  • Added scope parameter to Chat.ask method

  • Method to find ETFs in the Knowledge Graph: find_etfs

[2.9.1] - 2025-03-04

Fixed

  • A new Clerk session is now created from scratch when an error happens during the refresh process.

[2.9.0] - 2025-02-26

Added

  • Chat experience via API

  • Enhance Bigdata class to manage rate limits, optimizing bandwidth.

  • Enhance autosuggest response with company market identifiers

  • Exception BigdataClientIncompatibleStateError is raised when trying to tag or share a file whose upload and classification process is not COMPLETED

  • New exception BigdataClientSimilarityPayloadTooLarge to notify users when the input from a Similarity query is too large.

[2.8.0] - 2025-02-13

Added

  • Feature to query by chunks instead of by documents. Specify more precise limits to your queries. ChunkLimit

Fixed

  • In multithreading scenarios, more than one thread could start a refreshing token request.
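The fix above concerns several threads starting a token refresh at the same time. One common way to prevent that is a lock with a second validity check inside it. The sketch below illustrates the pattern only; `TokenManager`, `fetch_token` and `ttl_seconds` are hypothetical names, and this is not the bigdata_client implementation.

```python
import threading
import time


class TokenManager:
    """Hypothetical token holder; not the bigdata_client implementation."""

    def __init__(self, fetch_token, ttl_seconds=60.0):
        self._fetch_token = fetch_token  # callable that returns a fresh token
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get_token(self):
        # Fast path: the cached token is still valid, so no locking is needed.
        if self._token is not None and time.monotonic() < self._expires_at:
            return self._token
        with self._lock:
            # Re-check inside the lock: another thread may have refreshed
            # the token while this one was waiting.
            if self._token is None or time.monotonic() >= self._expires_at:
                self._token = self._fetch_token()
                self._expires_at = time.monotonic() + self._ttl
            return self._token
```

With this pattern, threads that arrive during a refresh wait on the lock and then reuse the token obtained by the first thread instead of issuing their own request.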

[2.7.1] - 2025-02-03

Added

  • LICENCE file

[2.7.0] - 2025-01-30

Added

  • Added rerank_threshold param to bigdata_client.search.new.

  • The request size limit at which the client raises RequestMaxLimitExceeds has been raised from 8KB to 64KB.

  • Support for downloading very large documents (> 6MB) with bigdata_client.document.download_annotated_dict method.

  • If a method to delete a file is called and it has not been fully processed yet it will raise BigdataClientIncompatibleStateError.

[2.6.0] - 2025-01-17

Added

  • Added methods for checking the tags of your files: bigdata_client.uploads.list_my_tags. (List my tags).

  • Added methods for checking the tags of files shared with you: bigdata_client.uploads.list_tags_shared_with_me. (List tags shared with me).

  • Added methods for company retrieval as part of the Knowledge Graph service:

    • get_companies_by_isin

    • get_companies_by_cusip

    • get_companies_by_sedol

    • get_companies_by_listing

  • The response from subscription.get_details() will include information about uploaded pages of PDF files: pdf_upload_pages. Check Monitor usage.

Fixed

  • Retrieved watchlists were not setting the company_shared_permission field.

  • The response from subscription.get_details() will correctly inform about uploaded pages of files (other than PDF): file_upload_pages.

  • Scripts batch_file_analytics_download.py and batch_file_upload.py now correctly handle BigdataClientRateLimitError.
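For your own scripts, a rate-limit error is often handled by retrying with exponential backoff. The sketch below is an illustration under assumptions: `RateLimitError` is a local stand-in so the example runs on its own (in real code you would catch BigdataClientRateLimitError instead), and `call_with_backoff` with its parameters is a hypothetical helper, not part of the package or of the bundled scripts.

```python
import time


class RateLimitError(Exception):
    """Stand-in for BigdataClientRateLimitError so the sketch runs alone."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on RateLimitError."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.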

[2.5.0] - 2024-12-20

Added

  • Added new query component FileTag, usage here.

  • Added exceptions for controlling Authentication errors: BigdataClientAuthFlowError, BigdataClientTooManySignInAttemptsError

[2.4.0] - 2024-11-26

Added

  • Enhance API usage monitoring for private files uploads with bigdata_client.subscription.get_details() method.

  • Added methods add_tags, remove_tags and set_tags for File class objects. These methods allow modifying the tags on existing files.

  • Added pagination to bigdata.upload.Uploads.list method to retrieve all the files uploaded by the user.

    Example of usage:

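      import itertools

      from bigdata_client import Bigdata

      bigdata = Bigdata()
      # Retrieve all of your files, one page at a time
      for n in itertools.count(start=1):
          files = bigdata.uploads.list(page_number=n)
          if not files:
              break  # an empty page means there are no more files
          # do_stuff_with_files is a placeholder for your own processing
          do_stuff_with_files(files)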
    

Fixed

  • Validation errors when parsing entities of type bigdata_client.models.entities.Concept in the knowledge graph service since v2.1.0.

  • Entities of type bigdata_client.models.entities.Topic were incorrectly parsed as bigdata_client.models.entities.Concept.

  • Validation errors when parsing edge cases of entities of type bigdata_client.models.sources.Source in the knowledge graph service.

[2.3.0] - 2024-11-15

Added

  • Add new method subscription.get_details() for API usage monitoring.

  • Add new method get_usage() at the class bigdata_client.search.Search that returns the API Query units used for each search instance.

  • Enhance exception handling. Check Exceptions.

[2.2.0] - 2024-10-30

Added

  • Enhance error handling experience. Customers will see error messages besides the HTTP code.

  • Enhance methods in the knowledge_graph service to provide more concise answers.

Changed

  • The following methods from the knowledge_graph service: autosuggest, find_concepts, find_companies, find_people, find_places, find_organizations, find_products, find_sources and find_topics should now be used with a single parameter instead of a list. E.g:

    Example of current usage:

    from bigdata_client import Bigdata
    
    bigdata = Bigdata()
    bigdata.knowledge_graph.autosuggest(["Company 1", "Company 2"])
    

    After the update:

    from bigdata_client import Bigdata
    
    bigdata = Bigdata()
    bigdata.knowledge_graph.autosuggest("Company 1")
    bigdata.knowledge_graph.autosuggest("Company 2")
    

The old implementation is now flagged as deprecated, and compatibility with it will be removed in the future. This update fixes problems that clients encountered when calling the methods above concurrently. A more detailed guide is included in the knowledge graph documentation and how-to guides.

Fixed

  • Enhanced methods in the knowledge_graph service to avoid errors when customers explore the Knowledge Graph in a multithreaded environment.
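With one name per call, several lookups can be issued in parallel from a thread pool. The following is a minimal sketch of that pattern, not a documented recipe: `lookup_all` is a hypothetical helper, and `fake_autosuggest` is a stand-in for bigdata.knowledge_graph.autosuggest so the example runs without credentials.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup_all(lookup, names, max_workers=4):
    """Call lookup(name) once per name in parallel; return {name: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each name with its result.
        return dict(zip(names, pool.map(lookup, names)))


def fake_autosuggest(name):
    # Hypothetical stand-in for bigdata.knowledge_graph.autosuggest(name).
    return [f"suggestion for {name}"]


results = lookup_all(fake_autosuggest, ["Company 1", "Company 2"])
```

In real code the first argument would be the single-parameter method itself; whether concurrent calls behave well depends on the installed version.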

[2.1.0] - 2024-10-09

Added

  • Added url property to Document class

  • Queries using AbsoluteDateRange can now be created with a timezone.

  • Added validation for POST, PATCH and PUT requests to ensure that the JSON request body does not exceed 8KB

  • Added verify_ssl parameter for bigdata_client.Bigdata class to allow skipping SSL verification when using a proxy
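To see whether a request body would pass a size check like the one added in this release, you can measure its serialized size before sending. This is only a sketch: `check_body_size` and `PayloadTooLargeError` are hypothetical names, and treating 8KB as 8 * 1024 bytes of UTF-8 encoded JSON is an assumption about how the client counts.

```python
import json

MAX_BODY_BYTES = 8 * 1024  # assumed meaning of the 8KB limit


class PayloadTooLargeError(ValueError):
    """Hypothetical exception name used only for this sketch."""


def check_body_size(body, limit=MAX_BODY_BYTES):
    """Return the serialized size of body in bytes, or raise if over the limit."""
    size = len(json.dumps(body).encode("utf-8"))
    if size > limit:
        raise PayloadTooLargeError(f"JSON body is {size} bytes; limit is {limit}")
    return size
```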

[2.0.0] - 2024-09-19

Changed

  • The name of the Python package changes from bigdata to bigdata_client. We are doing this to ensure that there are no conflicts with other commonly used Python packages. This change will require action on your part.

    What you need to do:

    Currently, you import classes from the package bigdata. Once you update the package to version 2.0.0, be sure to modify your Python scripts to import classes from bigdata_client instead of bigdata to avoid any issues.

    Example of current imports:

    from bigdata import Bigdata
    from bigdata.query import Entity, TranscriptTypes
    

    After the update:

    from bigdata_client import Bigdata
    from bigdata_client.query import Entity, TranscriptTypes
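While migrating, it can help to check which of the two package names is importable in a given environment. The sketch below uses only the standard library; `first_installed` is a hypothetical helper written for this example.

```python
import importlib.util


def first_installed(candidates):
    """Return the first top-level module name that can be found, else None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prefer the new package name, fall back to the pre-2.0.0 one.
package = first_installed(["bigdata_client", "bigdata"])
print(package or "neither package is installed")
```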
    

[1.6.1] - 2024-09-13

Added

  • Added download_annotated_dict method to Document class

[1.6.0] - 2024-09-02

Added

  • New environment variable to customize the number of parallel requests that can be made to the API: BIGDATA_MAX_PARALLEL_REQUESTS.

  • Added new parameter skip_metadata: Optional[bool] to Uploads.upload_from_disk, which allows skipping the loading of file metadata when uploading a file.
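The general pattern behind an environment variable such as BIGDATA_MAX_PARALLEL_REQUESTS is to read it once and use it to bound concurrency. The sketch below shows that pattern in a standalone script; the helper name, the fallback value of 4 and the parsing rules are assumptions, and the way the client itself reads the variable is not described here.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def max_parallel_requests(default=4):
    """Read BIGDATA_MAX_PARALLEL_REQUESTS; use default if unset or invalid."""
    raw = os.environ.get("BIGDATA_MAX_PARALLEL_REQUESTS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


os.environ["BIGDATA_MAX_PARALLEL_REQUESTS"] = "8"
# Bound the worker pool by the configured value.
with ThreadPoolExecutor(max_workers=max_parallel_requests()) as pool:
    squares = list(pool.map(lambda n: n * n, range(5)))
```

In practice the variable would usually be exported in the shell or set before the client is created.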

[1.5.1] - 2024-08-26

Changed

  • Expose FilingTypes in bigdata.query

[1.5.0] - 2024-08-26

Added

  • New query component FilingTypes. Allows filtering by the current SEC filing types.

[1.4.1] - 2024-08-19

Changed

  • Internal components used to parse knowledge graph responses have been moved to the /api module.

[1.4.0] - 2024-08-12

Added

  • New feature: Define your primary entity when uploading a file.

  • Knowledge graph get_entities(), get_sources(), and get_topics() now return analytic descriptions when available.

  • The environment variable used to log in to Bigdata, BIGDATA_USER, has been renamed to BIGDATA_USERNAME. The old name is still supported but marked as deprecated.

Changed

  • Improved keyword search capability to make phrase searches more accurate.

  • Comention searches respect the date ranges defined in Bigdata.search.new().

[1.3.0] - 2024-07-17

Added

  • New key cluster can now be present in the Document object. It contains related Document objects.

  • New method search.find_concepts to return the first concepts from the autosuggest service.

  • Improved query import. Import Company, Concept, Facility, Landmark, Organization, OrganizationType, Person, Place, Product, ProductType from bigdata.

  • Added timezone information to the datetimes in files.

Fixed

  • There was an error when working with queries that have an empty list as the value for the field date.

[1.2.0] - 2024-06-26

Added

  • New key cluster can now be present in the Document object. It contains related Document objects.

  • Support for custom proxies.

[1.1.0] - 2024-06-10

Added

  • Share/Unshare files directly from the File object or using bigdata.uploads new methods.

  • File objects now have property company_shared_permission to know if they are being shared or not.

  • Files shared with the user can now be retrieved with bigdata.uploads.list_shared.

[1.0.1] - 2024-06-07

Fixed

  • Queries with Watchlists were not displaying results and taking a very long time to complete.

[1.0.0] - 2024-06-06

Added

  • New feature, share/unshare watchlists.

  • Feature: New expression for queries: ReportingEntity.

  • Feature: New expression for queries: SentimentRange

Changed

  • Watchlists now contain field company_shared_permission to know if they are being shared or not.

  • Shared queries now return SharePermission.READ instead of SearchSharePermission.READ

  • Attribute Bigdata.content_search renamed to Bigdata.search

  • Method ContentSearch.new_from_query renamed to ContentSearch.new

  • Helper functions to group query expressions any_, all_ renamed to Any, All

[0.4.0] - 2024-05-28

Added

  • Document query component available for making queries and running searches. e.g: Document("BFA16B80ED117EAA5693E8BA")

  • Knowledge Graph filters: find_companies, find_people, find_places, find_organizations, find_products, find_sources, find_topics.

  • Method for getting results from a search as a list -> Search.run

  • New field volume is now returned for each entity in comentions.

  • New method to_dict for formatting a comentions object to a dictionary.

Changed

  • Renamed DocumentTypes -> TranscriptTypes.

  • Renamed FileType -> DocumentType.

  • TranscriptTypes enum now can be used to create queries. e.g: TranscriptTypes.EARNINGS_CALL

  • SectionMetadata enum can now be used to create queries. e.g: SectionMetadata.QUESTION

  • Renamed Search.limit_stories -> Search.limit_documents

  • Renamed Story -> Document as well as all of its components to be named like “Document”

  • Improved string representation of a Document

Fixed

  • Key people was not being returned by comentions.

[0.3.0] - 2024-05-10

Added

  • Added Search.share_with_company and Search.unshare_with_company to share and “unshare” a search with the user’s company ([#11]).

  • Added method File.get_analytics_dict to get the analytics directly in memory, as a dictionary ([#14]). That is consistent with other methods like File.get_annotated_dict.

  • New expressions to interact with transcripts: SectionMetadata, DocumentType, FiscalQuarter, FiscalYear.

  • Added chunk-level sentiment to objects in Story.chunks ([#13]).

Fixed

  • Parsing backend response from discovery-panel endpoint when there are no results raised an error.

Changed

  • Changed the range of Story.sentiment and Chunk.sentiment to be between -1 and 1 ([#13]).

[0.2.0] - 2024-05-03

Added

  • The chunks in the search results now have a relevance attribute, which is a float bigger than 0. You can access it on story.chunks[x].relevance ([#10]).

Removed

  • Removed the simplified version of the search Bigdata.content_search.new_search, in favor of the more powerful Bigdata.content_search.new_from_query ([#9]).

Fixed

  • Fixed a bug where the scope parameter of Bigdata.content_search.new_from_query was being ignored, causing the search to always have a scope FileType.ALL ([#12]).

[0.1.0] - 2024-04-26

Added

  • Implemented Bigdata.internal_content to support interacting with internal content (uploading documents, listing, getting, deleting, downloading original document/annotations/analytics, etc) ([#7]).

[0.0.8] - 2024-04-22

Added

  • Implemented Bigdata.watchlists to create, update, delete, get and list watchlists ([#5]).

Fixed

  • Autosuggest API response now makes use of a discriminator.

[0.0.7] - 2024-04-17

Fixed

  • Internal change to stop showing emails in the description on PyPI

  • Error when receiving some Product objects in search responses, some of which raised a ValidationError.

[0.0.6] - 2024-04-12

Fixed

  • Fixed a bug using autosuggest in a jupyter notebook, caused by asyncio ([#4]).

[0.0.5] - 2024-04-11

Added

  • New way to inspect the knowledge graph and get autosuggestions on entities, topics, etc. ([#2]).

  • New way of searching and creating queries, using And/Or/Not operators and the method new_from_query from the Search class. ([#3]).

Changed

  • Sentiment added to Story.__str__ so it’s printed when using print(story)

  • Updated the content search to use the breaking change introduced in the latest API version. This means that older versions of the package won’t work with the latest versions of the API

[0.0.4] - 2024-03-22

Added

  • Added missing parameter sortby to the new_search method

  • The search results (stories) now implement the __str__ to have a better looking output.

Changed

  • Changed endpoint used for searching to have more accurate search results

  • Changed the authentication used internally to Clerk ([#1]).

  • Method get_related renamed to get_comentions

  • AbsoluteDateRange’s __init__ now accepts both types datetime and str.

Removed

  • The classmethod from_strings has been removed in favor of __init__ passing strings.

[0.0.3] - 2024-03-20

Fixed

  • Fixed crash on Python 3.10 or lower, caused by a wrong import. Did not affect Python 3.11 and above.

[0.0.2] - 2024-03-19

Changed

  • Enum items are now uppercase

[0.0.1] - 2024-03-19

Added

  • First public release.

  • Basic search functionality via bigdata.content_search.new_search().

  • CRUD operations for saved searches

  • Co-mentions

  • Authentication via username and password