Search Quickstart Guide

Welcome to the Bigdata.com API Quickstart guide!

It will only take you 5 minutes and will guide you through:

  • Install bigdata-client package

  • Authenticate to bigdata.com

  • Query bigdata.com

  • Examine search results

Note

We recommend trying this Quickstart guide directly on Google Colab, or downloading the Jupyter Notebook.

Install bigdata-client package

First, install the bigdata-client package in a Python virtual environment.

Open the terminal and create a virtual environment with the following command:

$ python3 -m venv bigdata_venv

Activate the virtual environment every time you want to use it:

$ source ./bigdata_venv/bin/activate

Then install bigdata-client inside the bigdata_venv environment:

(bigdata_venv) $ pip install bigdata-client
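Before moving on, it can help to confirm that the interpreter is running inside the virtual environment and that the package is visible to it. The following is a minimal sketch using only the Python standard library; the helper names in_virtualenv and installed_version are illustrative and are not part of bigdata-client.

```python
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(f"Inside a virtual environment: {in_virtualenv()}")
    print(f"bigdata-client version: {installed_version('bigdata-client')}")
```

If the version prints as None, the package was most likely installed into a different environment than the one currently active.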

Authenticate to bigdata.com

Start the Python interpreter with the following command:

(bigdata_venv) $ python3

Now you can import the Bigdata object from the bigdata_client package:

>>> from bigdata_client import Bigdata

Then instantiate it with your personal bigdata.com credentials:

>>> bigdata = Bigdata("YOUR_USERNAME", "YOUR_PASSWORD")
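Hardcoding credentials in a script or notebook makes them easy to leak. One common alternative, sketched below, is to read them from environment variables. The variable names BIGDATA_USERNAME and BIGDATA_PASSWORD and the helper load_credentials are assumptions chosen for this illustration; check the bigdata-client documentation for the names it recognizes.

```python
import os


def load_credentials(env=None):
    """Read the username and password from environment variables.

    The variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    username = env.get("BIGDATA_USERNAME")
    password = env.get("BIGDATA_PASSWORD")
    if not username or not password:
        raise RuntimeError(
            "Set BIGDATA_USERNAME and BIGDATA_PASSWORD before starting Python."
        )
    return username, password


# Usage, once both variables are set in the shell:
# from bigdata_client import Bigdata
# bigdata = Bigdata(*load_credentials())
```

Failing early with a clear message is a deliberate choice here: a missing variable is reported before any request is sent.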

Query bigdata.com

bigdata.com processes millions of documents, detecting entities (companies, people, organizations, products, places, topics, etc.) and the events those entities play a role in. You can query all of that data to get direct insights into the subjects you care about, and build workflows that streamline your day-to-day activities.

Query example

As an example, suppose you want to find positive news about RavenPack, the company that powers bigdata.com, published in June 2024.

from bigdata_client.query import Entity, SentimentRange
from bigdata_client.daterange import AbsoluteDateRange

# RavenPack ID: bigdata.com is powered by RavenPack. Another tutorial
#    shows how to look up entity IDs (companies, people, etc.).
RAVENPACK_ENTITY_ID = "2BE1DC"

# Query positive news about RavenPack
query = Entity(RAVENPACK_ENTITY_ID) & SentimentRange([0, 1])

# Define a date range for June 2024 (starting at 08:00 on June 1)
in_june = AbsoluteDateRange("2024-06-01T08:00:00", "2024-07-01T00:00:00")

# Create a bigdata.com search
search = bigdata.search.new(query, date_range=in_june)

# Run the search and retrieve the two top documents
documents = search.run(2)
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")

Output:

Document headline: Asia Awards 2024: Best alternative data provider-RavenPack

Document headline: Fixed Income Portfolio Manager Jobs
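Writing the date boundaries by hand is error-prone when the same search is repeated for several months. The sketch below builds the two ISO 8601 strings for a whole calendar month with the standard library. month_bounds is a hypothetical helper, not part of bigdata-client, and it starts at midnight on the first day rather than at 08:00 as in the example above.

```python
from datetime import datetime


def month_bounds(year: int, month: int):
    """Return (start, end) ISO 8601 strings covering one calendar month.

    The end is midnight on the first day of the following month.
    """
    start = datetime(year, month, 1)
    # December rolls over into January of the next year.
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


start, end = month_bounds(2024, 6)
print(start, end)  # 2024-06-01T00:00:00 2024-07-01T00:00:00
```

The two strings have the same format as the literal values passed to AbsoluteDateRange in the example, so they could be supplied in their place.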

Examine search results

Congratulations! 🎉 You have successfully analyzed millions of documents and found two with the information that you are interested in.

Let’s examine the results more closely. The following code prints each document's headline, the number of chunks that matched your query, and the sentiment and text of each chunk.

# Read all retrieved documents and print some details
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")
    print(f"Number of chunks: {len(doc.chunks)}")
    for chunk in doc.chunks:
        # Print the sentiment detected in the text and the text itself
        print(f"  Chunk sentiment [-1, 1]: {chunk.sentiment}")
        print(f"  Chunk text: {chunk.text}")
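When many documents come back, a per-document summary is often easier to scan than every chunk. The sketch below averages chunk sentiment for each document. To stay runnable without credentials, it uses stand-in Chunk and Document dataclasses that carry only the attributes used in this guide (headline, chunks, sentiment, text). The sample values are invented for illustration, and average_sentiment is a hypothetical helper.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Chunk:
    """Stand-in for a result chunk, with only the fields used here."""
    text: str
    sentiment: float


@dataclass
class Document:
    """Stand-in for a result document, with only the fields used here."""
    headline: str
    chunks: list = field(default_factory=list)


def average_sentiment(doc):
    """Return the mean chunk sentiment of a document, or None without chunks."""
    if not doc.chunks:
        return None
    return mean(chunk.sentiment for chunk in doc.chunks)


# Invented sample data, standing in for real search results.
sample_docs = [
    Document("Example headline A", [Chunk("first", 0.5), Chunk("second", 0.25)]),
    Document("Example headline B", []),
]

for doc in sample_docs:
    print(f"{doc.headline}: {average_sentiment(doc)}")
```

Because the function only reads doc.chunks and chunk.sentiment, it should also work on the documents returned by search.run, assuming those attributes behave as shown in this guide.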

Summary

You have used the bigdata-client package to extract insights from the millions of unstructured documents that bigdata.com has processed.

Next steps

We recommend exploring the following key concepts to empower your searches.

  • Knowledge Graph: It helps you find Entity IDs to query.

  • Query filters: It describes all the possible filters you can use in a query.

  • Search Results: It describes the Document and Chunk structure with many other parameters.

  • Upload your own content: You can also upload your private data. Bigdata.com will process it and use it only to answer your own queries.