Search Quickstart Guide¶
Welcome to the Bigdata.com API Quickstart guide!
It takes only 5 minutes and guides you through these steps:
Install the bigdata-client package
Authenticate to bigdata.com
Query bigdata.com
Examine search results
Note
We recommend trying this Quickstart guide directly on Google Colab, or downloading the Jupyter Notebook.
Install bigdata-client package¶
First, let's install the bigdata-client package in a Python virtual environment.
Open the terminal and create a virtual environment with the following command:
$ python3 -m venv bigdata_venv
Activate the virtual environment every time you want to use it:
$ source ./bigdata_venv/bin/activate
Then install the bigdata-client package within the bigdata_venv environment:
(bigdata_venv) $ pip install bigdata-client
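To confirm that the package landed in the active environment, a quick check from Python can help. The sketch below relies only on the standard library's importlib.metadata; the helper name installed_version is made up for this example and is not part of bigdata-client.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("bigdata-client")
    if found is None:
        print("bigdata-client is not installed in this environment")
    else:
        print(f"bigdata-client {found} is installed")
```

If the script reports that the package is missing, the virtual environment is probably not activated in the current terminal.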
Authenticate to bigdata.com¶
Start the Python interpreter with the following command:
(bigdata_venv) $ python3
Now you can import the Bigdata object from the bigdata_client package:
>>> from bigdata_client import Bigdata
Then initialize it with your personal bigdata.com credentials:
>>> bigdata = Bigdata("YOUR_USERNAME", "YOUR_PASSWORD")
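Hard-coding credentials in a script makes them easy to leak. One possible alternative, sketched here, is to read them from environment variables and pass them to the Bigdata constructor shown above. The variable names BIGDATA_USERNAME and BIGDATA_PASSWORD and the helper load_credentials are assumptions chosen for this example.

```python
import os
from typing import Mapping, Tuple

# Assumed variable names for this sketch; adjust them to your own setup.
REQUIRED_VARIABLES = ("BIGDATA_USERNAME", "BIGDATA_PASSWORD")


def load_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the username and password from environment variables."""
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["BIGDATA_USERNAME"], env["BIGDATA_PASSWORD"]


# Usage, once the variables are set and bigdata-client is installed:
#   from bigdata_client import Bigdata
#   username, password = load_credentials()
#   bigdata = Bigdata(username, password)
```

Failing early with a clear message keeps a missing variable from surfacing later as a harder-to-read authentication error.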
Query bigdata.com¶
bigdata.com processes millions of documents, detecting entities (companies, people, organizations, products, places, topics, etc.) and the events those entities play a role in. You can query all of that data to get direct insights into the subjects you care about, and build workflows around those queries to streamline your day-to-day activities.
Query example
As an example, let's look for positive news in June 2024 about RavenPack, the company that powers bigdata.com.
from bigdata_client.query import Entity, SentimentRange
from bigdata_client.daterange import AbsoluteDateRange
# RavenPack ID: bigdata.com is powered by RavenPack. Another tutorial
# shows how to look up entity IDs (companies, people, etc.).
RAVENPACK_ENTITY_ID = "2BE1DC"
# Query positive news about RavenPack (sentiment between 0 and 1)
query = Entity(RAVENPACK_ENTITY_ID) & SentimentRange([0, 1])
# Define a date range for June 2024 (from 08:00 on June 1 up to the start of July 1)
in_june = AbsoluteDateRange("2024-06-01T08:00:00", "2024-07-01T00:00:00")
# Create a bigdata.com search
search = bigdata.search.new(query, date_range=in_june)
# Run the search and retrieve the two top documents
documents = search.run(2)
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")
Output:
Document headline: Asia Awards 2024: Best alternative data provider-RavenPack
Document headline: Fixed Income Portfolio Manager Jobs
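The date range in the query above is typed by hand as two ISO 8601 strings. If you query month by month, a small helper can build those strings for you. This is a minimal sketch using only the standard library; month_bounds is a hypothetical name, and it always starts at midnight on the first day, unlike the 08:00 start used in the example.

```python
from datetime import datetime
from typing import Tuple


def month_bounds(year: int, month: int) -> Tuple[str, str]:
    """Return ISO 8601 strings for the start of a month and the start of the next one."""
    start = datetime(year, month, 1)
    # Roll over to January of the following year after December
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


start, end = month_bounds(2024, 6)
print(start, end)  # 2024-06-01T00:00:00 2024-07-01T00:00:00
```

The two strings can then be passed to AbsoluteDateRange in the same way as the handwritten values.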
Examine search results¶
Congratulations! 🎉 You have successfully analyzed millions of documents and found two with the information that you are interested in.
Let's examine the results more closely. The following code prints each document's headline, the number of chunks that matched your query, and the sentiment and text of each chunk.
# Read all retrieved documents and print some details
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")
    print(f"Number of chunks: {len(doc.chunks)}")
    for chunk in doc.chunks:
        # Print the sentiment detected in the text and the text itself
        print(f" Chunk sentiment [-1, 1]: {chunk.sentiment}")
        print(f" Chunk text: {chunk.text}")
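Beyond printing each chunk, you may want one line per document. The sketch below averages chunk sentiment per document. So that it runs without a live search, it uses stand-in classes (FakeChunk and FakeDocument, both hypothetical) that mirror only the attributes used in this guide: headline, chunks, sentiment and text. The summarize helper is likewise illustrative.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Tuple


# Stand-ins that mirror only the attributes used in this guide;
# they are not bigdata-client classes.
@dataclass
class FakeChunk:
    text: str
    sentiment: float


@dataclass
class FakeDocument:
    headline: str
    chunks: List[FakeChunk] = field(default_factory=list)


def summarize(documents) -> List[Tuple[str, int, float]]:
    """Return (headline, chunk count, mean chunk sentiment) for each document."""
    summary = []
    for doc in documents:
        sentiments = [chunk.sentiment for chunk in doc.chunks]
        # Avoid calling mean() on an empty list, which raises StatisticsError
        average = mean(sentiments) if sentiments else 0.0
        summary.append((doc.headline, len(sentiments), average))
    return summary


sample = [
    FakeDocument("Example headline", [FakeChunk("first", 0.5), FakeChunk("second", 0.25)]),
    FakeDocument("No matching chunks"),
]
for headline, count, average in summarize(sample):
    print(f"{headline}: {count} chunks, mean sentiment {average:.2f}")
```

Because summarize only reads headline, chunks and sentiment, it should also accept the documents returned by search.run, assuming those attributes behave as shown in this guide.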
Summary¶
You have used the bigdata-client package to extract insights from the millions of unstructured documents that bigdata.com has processed.
Next steps¶
We recommend exploring the following key concepts to get more out of your searches.
Knowledge Graph: It helps you find Entity IDs to query.
Query filters: It describes all the possible filters you can use in a query.
Search Results: It describes the Document and Chunk structure with many other parameters.
Upload your own content: You can also upload your private data. Bigdata.com will process it and use it only to answer your own queries.