Running Millions of Searches with Bigdata API

The True Scale of Market Intelligence in the AI Era

While OpenAI’s Deep Research makes headlines with its ability to scan hundreds of sources in just 10 minutes, the Bigdata API operates at a fundamentally different scale—searching across billions of documents and delivering up to 1 million relevant results for comprehensive analysis.

In a world where OpenAI’s solution might skim the surface with limited document retrieval, true market intelligence requires industrial-strength pipelines capable of processing orders of magnitude more data with deeper reasoning capabilities. The difference isn’t just quantitative—it’s qualitative. When your competitors are making decisions based on thousands of documents, you’ll be identifying patterns, anomalies, and opportunities across millions.

We demonstrate how to harness the Bigdata.com API to retrieve and process 1 million documents for Russell 1000 companies in minutes, transforming raw data into actionable intelligence at a scale that traditional search paradigms simply cannot match. Whether you’re performing sentiment analysis, tracking emerging market trends, or building sophisticated entity relationship networks, the ability to process data at this magnitude represents the new frontier of competitive advantage.

We’ll show you how to:

  - Process documents across the Russell 1000 companies, focusing on Trump tariff impacts
  - Analyze sentiment by sector to provide comprehensive macro color
  - Visualize which sectors have the highest percentage of companies negatively impacted

Welcome to data processing at true enterprise scale.

from IPython.display import display, HTML

html_code = """
<section class="reveal-container">
  <ul class="code">
    <li tabindex="0" class="digit">
      <span>1</span>
    </li>
    <li tabindex="0" class="digit">
      <span>,</span>
    </li>
    <li tabindex="0" class="digit">
      <span>0</span>
    </li>
    <li tabindex="0" class="digit">
      <span>0</span>
    </li>
    <li tabindex="0" class="digit">
      <span>0</span>
    </li>
    <li tabindex="0" class="digit">
      <span>,</span>
    </li>
    <li tabindex="0" class="digit">
      <span>0</span>
    </li>
    <li tabindex="0" class="digit">
      <span>0</span>
    </li>
    <li tabindex="0" class="digit">
      <span>0</span>
    </li>
  </ul>
</section>
"""

css_code = """
<style>
.reveal-container {
  display: grid;
  gap: 4rem;
  align-items: center;
  justify-content: center;
  font-family: "SF Pro Text", "SF Pro Icons", "AOS Icons", "Helvetica Neue", Helvetica, Arial, sans-serif, system-ui;
  padding: 2rem;
  background: hsl(0 0% 0%);
  border-radius: 1rem;
  margin: 2rem 0;
}

.reveal-container .code {
  font-size: 3rem;
  display: flex;
  flex-wrap: nowrap;
  color: hsl(0 0% 100%);
  border-radius: 1rem;
  background: hsl(0 0% 6%);
  justify-content: center;
  box-shadow: 0 1px hsl(0 0% 100% / 0.25) inset;
  list-style: none;
  padding: 0;
  margin: 0;
}

.reveal-container .code:hover {
  cursor: grab;
}

.reveal-container .digit {
  display: flex;
  height: 100%;
  padding: 5.5rem 1rem;
}

.reveal-container .digit:focus-visible {
  outline-color: hsl(0 0% 50% / 0.25);
  outline-offset: 1rem;
}

.reveal-container .digit span {
  scale: calc(var(--active, 0) + 0.5);
  filter: blur(calc((1 - var(--active, 0)) * 1rem));
  transition: scale calc(((1 - var(--active, 0)) + 0.2) * 1s), filter calc(((1 - var(--active, 0)) + 0.2) * 1s);
}

.reveal-container .digit:first-of-type {
  padding-left: 5rem;
}

.reveal-container .digit:last-of-type {
  padding-right: 5rem;
}

.reveal-container {
  --lerp-0: 1; /* === sin(90deg) */
  --lerp-1: calc(sin(50deg));
  --lerp-2: calc(sin(45deg));
  --lerp-3: calc(sin(35deg));
  --lerp-4: calc(sin(25deg));
  --lerp-5: calc(sin(15deg));
  --lerp-6: calc(sin(10deg));
  --lerp-7: calc(sin(5deg));
  --lerp-8: calc(sin(1deg));
}

/* Initial visibility class */
.reveal-container.initial-visible .digit span {
  scale: 1.0;
  filter: blur(0);
}

/* These hover styles need to have higher specificity than the initial-visible class */
.reveal-container .digit:is(:hover, :focus-visible) {
  --active: var(--lerp-0) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit,
.reveal-container .digit:has(+ .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-1) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit + .digit,
.reveal-container .digit:has(+ .digit + .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-2) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit + .digit + .digit,
.reveal-container .digit:has(+ .digit + .digit + .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-3) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit + .digit + .digit + .digit,
.reveal-container .digit:has(+ .digit + .digit + .digit + .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-4) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit + .digit + .digit + .digit + .digit,
.reveal-container .digit:has(+ .digit + .digit + .digit + .digit + .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-5) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit + .digit + .digit + .digit + .digit + .digit,
.reveal-container .digit:has(+ .digit + .digit + .digit + .digit + .digit + .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-6) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit + .digit + .digit + .digit + .digit + .digit + .digit,
.reveal-container .digit:has(+ .digit + .digit + .digit + .digit + .digit + .digit + .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-7) !important;
}
.reveal-container .digit:is(:hover, :focus-visible) + .digit + .digit + .digit + .digit + .digit + .digit + .digit + .digit,
.reveal-container .digit:has(+ .digit + .digit + .digit + .digit + .digit + .digit + .digit + .digit:is(:hover, :focus-visible)) {
  --active: var(--lerp-8) !important;
}
</style>
"""

js_code = """
<script>
// Execute immediately for Jupyter environment
(function() {
  // Add short delay to ensure elements are rendered
  setTimeout(function() {
    // Get the container element
    const container = document.querySelector('.reveal-container');

    // Initially make all digits visible by adding class to container
    container.classList.add('initial-visible');

    // After 2000 milliseconds, remove the visible class so the digits fade to the blurred state
    setTimeout(function() {
      container.classList.remove('initial-visible');
    }, 2000);
  }, 100);
})();
</script>
"""

# Display the HTML, CSS, and JavaScript in Jupyter Notebook
display(HTML(css_code + html_code + js_code))

Step 0: Prerequisites

We need to import the Bigdata client library with the supporting modules:

import base64
import time
from io import BytesIO
from datetime import timedelta


import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import pandas as pd
from IPython.display import display, HTML
from bigdata_client import Bigdata
from bigdata_client.query import Similarity, Entity
from bigdata_research_tools.search import run_search

Step 1: Initialization

We begin by initializing the Bigdata client. Authentication is handled through environment variables, or you can pass your credentials directly:

# Initialize the Bigdata client
# Make sure BIGDATA_USERNAME and BIGDATA_PASSWORD are set in the environment
# Alternatively, you can pass your credentials directly to the Bigdata class
bigdata = Bigdata()
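
Before creating the client, it can help to confirm that the credentials are actually present in the environment. The following is a minimal sketch that uses only the standard library and the two variable names mentioned in the comment above. The helper `missing_credentials` is a hypothetical name introduced for illustration and is not part of the Bigdata client.

```python
import os

# Variable names taken from the comment in the initialization cell above
REQUIRED_VARS = ("BIGDATA_USERNAME", "BIGDATA_PASSWORD")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment
demo_env = {"BIGDATA_USERNAME": "analyst@example.com"}
missing = missing_credentials(demo_env)
print(missing)
```

Calling `missing_credentials()` with no argument checks the real environment, so a notebook could stop early with a clear message when the list is not empty.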

Step 2: Creating Queries for Russell 1000 Companies

This is where the magic begins. We’ll create queries for every company in the Russell 1000 index. Each query combines an entity search with a similarity search for relevant content:

RUSSELL_1000_ENTITIES = [
    'A0B7C8', '03B8CF', 'B4703C', 'A94637', '520632', '665D7D', 'FC1A6E',
    '91B8B1', '66A667', 'C9881C', 'C19D82', 'F39E1E', 'A403CF', '69345C',
    '03596A', '9C8BC3', '30E01D', '4B65EE', 'AD3C93', 'BBBB41', 'ED79D9',
    '903AB4', '728737', '9E5C2C', 'DA48E4', '09E31A', 'BB5271', 'CF6A5A',
    '8E82A6', '228924', 'D93A25', '56EA2B', '221AD7', '9A3FF4', 'AED763',
    'F93C8A', '9EA947', 'E1C16B', '3B20AB', '2C7505', '4A6F00', 'CEF875',
    '2BA977', '0157B1', 'ET1RKC', '45D153', '2336CD', 'IDK7OE', '1Y4363',
    '789A7D', '13C3E0', 'D9B1C9', '4C37C5', '7286BE', 'ED3CA8', '0BC29E',
    'A80FE0', '6DBBBC', '7F2C3E', '9A02C4', '35F4B5', '76F067', '5C8D61',
    'BB07E4', 'E68C3D', '0B4D10', '2158DF', '1A9CAF', '20D00A', 'C5EF8D',
    '9A0429', 'E845D9', 'B290A2', 'WDJL5K', 'D06996', 'D8442A', 'A4BCDE',
    '9A4ECA', '5D43A7', '64CEBB', '2F1299', '61BCA7', '2B7A40', '32E873',
    '3DC887', 'FFAB4A', 'F4F46F', 'AADE0B', 'E6A53A', 'D80QUD', 'DD682D',
    'FAAE77', 'SY36OD', '251988', 'D1173F', 'RDU3ZQ', '7F9C74', '64E346',
    '66ECFD', '88350F', 'ECF709', '2FEA66', '4638EE', 'B6F7F7', '662682',
    '7F3A5F', 'DF6FDD', '45CF5C', '8DB8F5', 'D2AC74', '8F63E7', 'FC01C0',
    '940C3D', 'FEE4B0', '990AD0', '10C19B', '45CF4C', '1FAF22', '873DB9',
    'FD83D1', '85E0A1', 'PXMZTS', 'CA3CB9', '5E7E82', '7176FB', '9D5FA4',
    '85DE00', 'CEDFD5', '830BDB', 'DE4VPR', '1C2593', 'HJ95FV', 'C08256',
    '56EFC7', '7A3633', '8C954B', 'EF5BED', '55438C', '67529E', '034B61',
    'D598E7', '1791E7', 'B275C6', 'C97B2D', '3BB616', '72C2CE', '7A51FE',
    '94637C', '30AB63', '09DE1F', '24C48B', 'DEC749', 'C598D7', 'GIWX7I',
    '859D62', '94983E', '07CA6A', 'B2256A', 'U8X3QM', 'A72AB6', 'D9E036',
    '420168', 'C659EB', 'AC7C4F', 'CC6FF5', '34A959', 'DC5299', '543900',
    '055018', '467C65', '3587B4', 'C70520', '4AC574', 'AA7FE0', '067779',
    '32BBAA', '6E8349', 'A72AEF', '12F98C', '767F86', 'FA9260', 'F1529C',
    '51D876', '289238', '87B81A', '370C50', '1F9258', '7D85A9', 'F5D410',
    '58B46F', '690347', '41F885', 'B13B68', 'D33D8C', 'D88EF3', 'EBC84C',
    '73A71F', 'B7BDA3', 'D54E62', 'A63820', '5D63F1', '160825', 'DE27F9',
    '789ADD', '896771', '0E431C', '86A1B9', '51E682', 'BFAEB4', 'F8B149',
    '12DE76', '58CA9A', 'BFE02C', '1279ED', '5AB53E', 'A398F8', '56765E',
    'VIBYGZ', '4C7DB5', '719477', 'E2866E', '5DD486', '7BAAE7', '2CD1E0',
    'F7B8AC', 'EEA6B3', '97AF94', 'DKDEQ2', '07EC43', 'C0BA36', 'D69946',
    'EB6965', '822E25', '5D0337', 'C83B88', '8CF6DD', '319BE2', '423279',
    'FA40E2', 'QOO7LB', '9F998F', 'FE89E0', '97AAF6', '1D1B07', 'HTA3J9',
    'CD2DA4', '8B4A45', '86RLSL', 'F40EE2', 'D29B44', '3FB145', 'ED0402',
    'B8EF97', '388E00', 'F18844', '382B0C', '24E1EC', '3I4816', 'CFPUMY',
    '57634F', '16AD58', '0C355F', '36ECA4', '275300', '92B047', 'E26FC3',
    '3E15F6', '131443', '384CD3', '69CE71', '06EF42', 'E124EB', '9BBFA5',
    '06C826', 'D6C356', 'EFD406', '94208D', '4595EF', '431B74', '5F2FF7',
    '33AD83', '2BF36E', '14BA06', 'B840EF', 'E15736', 'B01111', '1490F3',
    '4ECD1A', 'A8CBDA', '7A0EC4', '7B6C88', 'DC2B00', '143C52', '977A1E',
    '24D81E', 'DE5611', 'CF4517', '08C87C', '636639', 'BA2D83', '303CE3',
    'HGGE2U', '0A32EA', 'VGTRWJ', '633054', 'DB5CA5', '5DAF89', '493F45',
    'C8A248', 'B73BW7', '095294', 'B34137', '7262F2', 'F9CAF8', '6474BA',
    'E6EDED', 'D4070C', 'C4073B', '972356', '9A602D', 'E10D31', '6137BF',
    '551EEF', '54D11D', 'DB06B0', '583223', '7AB859', '52015A', '4030E2',
    'BE14CF', '9FD2D9', 'C44D01', '6A091A', '6E7060', '420CE9', 'A43906',
    '88923D', '42823F', '366A08', 'AD23DE', '315EB0', 'AD6141', '38FB49',
    'AE4EEB', 'FV5LS8', '2D485F', 'F67165', '14ED2B', 'FF4C20', 'E00373',
    'CD4DA8', 'F1FA25', 'AFD7DD', '2BAE5F', 'FQNOZZ', '45BC35', '85CDC9',
    'B303A6', '66CB62', '39692D', 'F5C8AB', '4FB770', 'E70531', '41B0E2',
    '12A3A3', 'A247F5', '92D3A0', 'DB7014', '6844D2', '316E5D', '9BB35A',
    '7B0BA6', '8377DB', '80D744', '270305', '76CEFB', 'CD06A2', '31D9CC',
    '061366', '7B1E50', '190B91', 'C5D687', '192727', 'F51DB0', 'CA99D7',
    '4017AD', '188394', 'C32A39', 'A6213D', '63F892', '567F3D', '1DGHEG',
    '21DED4', '7BFF81', '5ED6E4', '5B6C11', 'D4463B', '61A586', 'E7CF49',
    'A473AE', '300AC5', 'D42DBA', '6F0A63', '817ED9', 'D21EF3', 'F57F6F',
    '2D8972', '1921DD', 'LJPA1L', 'QRHIPR', '89F693', '304C94', 'DCD97F',
    '9CA619', '1BC12C', 'A5C69D', 'CC339B', 'C15DB3', 'F6E248', '6CE666',
    '1D9E55', 'SUZM4J', '64F2C1', 'E90C84', '2DA651', '0A0D9E', '50070E',
    '4D8313', '7F32C4', '676FFD', '2F5256', '122D09', '279916', 'MSER6L',
    '2B49F4', 'A70BF5', 'DC486E', '766047', 'AA98ED', '1129EA', '1406B8',
    '34B97A', 'D9164D', 'A4386C', '55DD5E', '80B32C', 'D60BB2', '9F03CF',
    'D0909F', 'DBB28E', 'B604DF', 'KQZMPH', '601785', '66E04A', 'C951A2',
    'ACDF88', 'FF6644', 'C4A432', '6BF593', '724F84', 'S0DPD8', 'EC821B',
    '4AC91D', 'E6E012', '4458AA', '00067A', 'C9E107', 'A1EAC8', '8FCA78',
    'D2B9E5', '4B3676', '8D4486', 'EC7FDC', 'E8B21D', 'B7EB38', '353DBB',
    'E6D89E', '6CC55E', 'HMAG7F', 'EEEA9F', 'KEK4ZA', 'F85CC0', 'DC1405',
    '9C5174', '17EDA5', 'DF532D', '9F71E5', '485445', '8E0E32', 'E30B34',
    'BDEC1E', 'F11638', '18EC17', '0BF4BA', '6B236C', '925759', 'A23747',
    '7E3F8F', 'A398B9', '6284B5', '726EEA', '26CC63', '15ABD0', 'C3484D',
    'AEA57C', '72DF04', 'DA199F', 'A6828A', '99333F', '491E56', '619882',
    'EC99D0', '099C88', '9AF3DC', '46D790', '8AM205', '14C7B2', '24CB56',
    'E4CE73', '326EDD', '3DE4D1', '159AE4', 'AA5C8E', 'D03C7A', 'ABBAD1',
    'THC8Q8', '507AE7', 'ED9576', '0F0440', 'C356AC', '95DC1F', 'F164FH',
    'EE6F1C', '9F6B1A', '0B57D7', '55CD6F', '5CC29D', 'FD4E8D', 'C9B932',
    '0079CC', 'C71AD9', 'DAFED3', '9D4EC7', '504FE2', '91C82E', '2E902B',
    'BC948D', '60778B', 'B803B1', 'B1A85D', 'F30508', 'EB61C4', '5C7601',
    '278DC5', '35ZR2C', '5A9F54', 'D06755', '9C25FF', 'A5151E', 'JQUWFX',
    '96F126', 'CE5EB0', '14A113', '76E80F', '009397', 'BD3834', '97693A',
    '092C47', '2EB04E', 'C87ECC', 'D1AE3B', '5F1B7B', 'C5C137', '9E1755',
    '031025', 'C0200F', '4CF10A', '2D2D43', 'F0B2B5', '60DD84', '385DD4',
    'D25249', '9B5968', 'D6534D', 'E7D47B', '3F4497', '1220D2', 'E2F66E',
    '55C9B5', 'D20C8F', '622DBE', 'F4E882', '9D56F2', '954E30', '4A5C8D',
    '135B09', '9972E6', 'A47F2E', 'E68733', '1EBF8D', '12E454', '810E30',
    '6B0784', 'E28F22', '8E8E6E', 'CDFCC9', '49BBBC', '228D42', 'C72B8F',
    '8EF425', 'ACF0B4', '1BDB2A', '8EA478', '69E8E1', '1F716B', '78F9ED',
    '7A10FF', '3A0C6D', '6ABCB8', 'D09938', '3461CF', '9196A2', '0C136E',
    '9C5BA5', 'E49AA3', '23EA2A', 'E9C061', '74E288', '3ED92D', '8DCBBB',
    '25102A', 'CBDB4D', '8B1F37', 'B934BF', 'A2BKRU', '934CC3', 'ECD263',
    '367E1C', '22AB4B', '875F41', 'C3BCD5', 'BAAA60', '911AB8', 'DD1BA1',
    'HO74MH', '9C82E1', '2CB4C9', 'D64C6D', '6ED519', '2893E8', 'D56D6D',
    '422CE3', '2D643C', '3CCC90', 'FC1B7B', '56CC0A', '5D02B7', 'FD39EB',
    'AXWAKS', '986AF6', '59872F', 'FC4652', 'E09E2B', 'C29715', '6E1E61',
    'CA212F', 'F83279', '0BF528', '119CB6', 'F56922', 'C16A8F', '99FC27',
    '44ED36', 'C8257F', '790C34', 'AAEE21', '09F623', 'C5C0E9', 'D6489C',
    'VUA3RK', 'BD8517', 'C20B75', 'D56E4C', '47752F', 'ACF77B', 'B6082A',
    'F1C69A', 'A52B6B', 'UY4OJK', '3C7F5F', '2F7A7E', '6B5379', '15A388',
    '6F6559', '7F9984', 'QK2LOR', '6ADA0F', '03CF95', '31AA84', '2EB88B',
    '1DA44F', '322A44', 'E96E0B', '013528', '9FA83B', '0E5223', '0A9D0A',
    '267718', '652E62', 'CA1620', '646785', '8DBE73', 'A746F6', '342218',
    'ADA50C', 'D71FE7', '72C1A5', '61B81B', 'A8B137', '5DE5A5', 'FC4550',
    'CEC5B9', '39FB23', '5A9A82', 'B9764A', '2F94A5', '3770E8', '59B229',
    'D1706A', '2E61CC', '8FF2EF', '8C5519', '16C7F0', 'FEC475', 'D437C3',
    'AFEC35', 'B560AF', '7D5FD6', 'C3DE7D', '414FFF', '78798F', '7C62A3',
    'CFF15D', '253B2F', '04C4BE', '5F9CE3', 'CMJ18O', 'D69D42', '1782D5',
    'B3CB74', 'E1E36F', '118A3F', '82FD6D', '003B70', 'EA62FC', 'B5DE80',
    '589803', '5EE4A7', '1E68B3', 'E05DC8', '6D4D62', 'A7F7C1', 'D96202',
    '434F38', 'C062D4', 'EFDADF', 'C19D5A', 'A6FB29', '85DA04', '92D9C9',
    'AE8A7E', '962E74', 'A4D173', 'BB036E', '818072', '4EB77D', '416C55',
    '08B4B5', '9F47E2', 'CBA33B', '5E6959', '8C6C1B', '751A74', '263216',
    '58A62D', 'F5D059', '2667B6', '41406A', 'AD9F1D', 'CFF97C', 'C230FE',
    '2E360C', 'BB127B', '9CE4C7', '32F943', '1E169E', 'A7102F', '164D72',
    '4885C8', 'F3FCC3', 'C2E426', 'D8DA3D', '68974B', '992E92', 'B642E8',
    'A9A026', '33A0F1', '51F541', 'BB88B6', 'SYPA5E', '408089', 'D23A30',
    '0E439E', '900356', 'D3C794', '4C6C63', '3224CC', 'FFE543', 'EB5E78',
    '44DA52', '2F98A5', '757034', 'E94704', 'D3D781', 'C85E94', '9X9IT8',
    'F702CA', 'F0C2C3', '147C38', '1D4A78', 'E866D2', '16AAE0', 'CBBFBF',
    '86170D', 'E0207A', 'E12A6E', 'GGK9BT', 'CE1002', '3CBA2A', 'CE2791',
    '5BC2F4', '20BEEA', 'LIVBLM', '553949', 'C81E00', '159739', '30A565',
    '3D9999', '7E1D5D', 'FE7A63', '9D2790', '57DDB9', 'A01664', 'BB0787',
    'EAEBF3', 'F6DCE4', 'D8E003', '890C4E', '9D30A8', 'FD6926', 'C0030A',
    '43A74A', '793C11', '93F143', 'DD3BB1', '722DE3', '39BFF6', '18311E',
    '5088A5', 'E0339F', '8665BA', 'FDF1C0', 'D2E553', '66749D', '40B903',
    'CB1E3E', '5B3A71', 'D75910', 'F3016C', 'DB9829', 'FOUOG7', 'FA4263',
    '0E698B', 'EC7AE2', '5A6336', '9F18FA', 'B37FB4', '0F90E1', 'E206B0',
    '352F2C', 'D4AC65', 'D82EF3', '1A3E1B', 'EE3068', 'F179ED', '442769',
    'AD1ACF', '31643A', '6166D1', '4E2D94', 'D90F43', '883D82', '37727A',
    'B0FE08', '86F3CB', 'CF7292', 'C564E4', '1R86Y0', '41EC04', 'D8F347',
    '6E705B', 'FAE021', '32CB22', '594402', '342D9E', '205AD5', '7843D0',
    '415188', '7448A3', 'AFF7B4', 'FDF28E', '0B73A7', '641F17', '6F0096',
    '3A3447', 'A1E3B3', '7999F3', 'G47HNQ', 'A5B913', '2E0496', '8A8E41',
    '57CAAB', '96D34D', 'PWK2H3', 'D78CCD', '508CFD', '106394', '24FA23',
    '69CDDC', '157D9F', '93D207', 'D64EDF', '6B613D', '860AB6', 'C2609A',
    '57B174', '1F43A1', 'B8F71F', '6EB9DA', '945DE6', 'FACF19', '713810',
    '6CGTFN', 'ADF092', '616E3B', '1F9D90', 'T7PWQW', 'E35610', 'AE7EC7',
    'B5766D', '343996', 'E8846E', 'PL177A', '71E2EF', '0BC853', 'BF79F5',
    'AA247D', 'CE96E7', 'A16DEA', 'A4B899', '8605B0', 'FF4BA4', 'BDD12C',
    '6B549D', 'E21871', 'F748C7', '2DBF98', 'AC642C', '21022F', '272704',
    '0555FF', 'E7A2A5', '4B7006', '564F3E', '9E91C6', '1151F4', '421A1A',
    'E3E68E', 'F46EC9', 'C66A8C', 'B4C673', '070B45', '6BC6F9', '704A09',
    '56277A', '4D371E', 'C03C8B', '362955', '50784E', 'D44D33']
portfolio_entities = bigdata.knowledge_graph.get_entities(
    RUSSELL_1000_ENTITIES)
# Create a combined query for each entity
queries = [Entity(entity_id) & Similarity('Trump 2.0 tariffs impact')
           for entity_id in RUSSELL_1000_ENTITIES]
print(f'Number of queries: {len(queries)}')
Number of queries: 1000
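
Because every ID in the list becomes one query, a duplicated or mistyped ID silently wastes a search. The sketch below shows one way to sanity-check such a list before building queries. The six-character uppercase alphanumeric pattern is an assumption inferred from the IDs listed above, not a documented rule, and `check_entity_ids` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed format, inferred from the IDs listed above (not a documented rule)
ID_PATTERN = re.compile(r"[A-Z0-9]{6}")


def check_entity_ids(entity_ids):
    """Return (duplicates, malformed) for a list of entity ID strings."""
    counts = Counter(entity_ids)
    duplicates = sorted(eid for eid, n in counts.items() if n > 1)
    malformed = sorted(eid for eid in counts if not ID_PATTERN.fullmatch(eid))
    return duplicates, malformed


# Small illustrative sample; the last two entries are deliberately bad
sample_ids = ["A0B7C8", "03B8CF", "B4703C", "A0B7C8", "b4703"]
duplicates, malformed = check_entity_ids(sample_ids)
print(duplicates, malformed)
```

Running the same check on `RUSSELL_1000_ENTITIES` would flag problems before any search is sent.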

Why a Million Documents Matter

Processing vast amounts of data unlocks critical advantages:

  1. Comprehensive Market Coverage – Capture signals from every corner of the market.

  2. Statistical Significance – Identify trends with greater confidence.

  3. Rare Event Detection – Catch the 0.1% of documents that could make or break your strategy.

  4. Real-Time Insights – Process breaking news across the entire market in minutes, not days.

Bonus: Macro-level Analysis of the Results

With a million documents at your disposal, you can build sophisticated sector-wide sentiment analysis. Thanks to the Bigdata API, we can compute sentiment at the text-chunk level:

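# NOTE: `results` is assumed to hold the output of the search step (run_search),
# with one collection of documents per query, in the same order as `queries`.
company_sentiments = {}
for entity, documents in zip(portfolio_entities, results.values()):
    # Calculate the sentiment, at the chunk level, for each queried company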
    weighted_sum = sum(chunk.sentiment * chunk.relevance
                       for doc in documents
                       for chunk in doc.chunks)
    sum_of_weights = sum(chunk.relevance
                         for doc in documents
                         for chunk in doc.chunks)
    try:
        company_sentiments[entity.id] = weighted_sum / sum_of_weights
    except ZeroDivisionError:
        company_sentiments[entity.id] = 0
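
The aggregation above is a relevance-weighted mean: each chunk's sentiment counts in proportion to its relevance. The following self-contained sketch reproduces that arithmetic on two made-up chunks. The `Chunk` dataclass is a hypothetical stand-in that carries only the two attributes used in the loop above, not the API's actual chunk type.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a result chunk; only the two fields used here."""
    sentiment: float
    relevance: float


def weighted_sentiment(chunks, default=0.0):
    """Relevance-weighted mean sentiment; `default` when weights sum to zero."""
    total_weight = sum(c.relevance for c in chunks)
    if total_weight == 0:
        return default
    return sum(c.sentiment * c.relevance for c in chunks) / total_weight


chunks = [Chunk(sentiment=-0.8, relevance=0.75), Chunk(sentiment=0.4, relevance=0.25)]
score = weighted_sentiment(chunks)
print(round(score, 2))
```

Here the highly relevant negative chunk dominates: (-0.8 × 0.75 + 0.4 × 0.25) / (0.75 + 0.25) = -0.5. Checking the weight total explicitly plays the same role as the `ZeroDivisionError` handler in the loop above.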

Now we build the DataFrame to further our analysis:

df = pd.DataFrame(vars(pe) for pe in portfolio_entities)
df = df.dropna(subset=['sector', 'industry_group', 'industry'])
df['sentiment'] = df['id'].map(company_sentiments)
# Total number of companies per sector
total_by_sector = df.groupby('sector').size()
# Count companies with a [meaningful] negative sentiment
negative_by_sector = df[df['sentiment'] < -0.45].groupby('sector').size()
# Compute percentage of negatively impacted companies
percentage = (negative_by_sector / total_by_sector * 100).fillna(0)
# Sort the percentages descending (largest percentage first)
percentage = percentage.sort_values(ascending=False)
# Define a color map from dark red to pure red
custom_red = mcolors.LinearSegmentedColormap.from_list('custom_red',
                                                       [(0.5, 0, 0),
                                                        (1, 0, 0)])
# Normalize the percentage values so they map to [0, 1]
norm = mcolors.Normalize(vmin=percentage.min(), vmax=percentage.max())
# Map each percentage value to a color using the custom colormap
colors = [custom_red(norm(val)) for val in percentage]
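
To see what the sector percentages look like on a small scale, here is a minimal sketch of the same group-and-divide logic on toy data. The sector names and sentiment scores are invented for illustration. Only the -0.45 cutoff is taken from the code above.

```python
import pandas as pd

# Toy data; sector names and scores are made up for illustration
df = pd.DataFrame({
    "sector": ["Tech", "Tech", "Energy", "Energy", "Utilities"],
    "sentiment": [-0.60, -0.10, -0.50, -0.70, 0.20],
})
THRESHOLD = -0.45  # same cutoff as in the cell above

total_by_sector = df.groupby("sector").size()
negative_by_sector = df[df["sentiment"] < THRESHOLD].groupby("sector").size()

# Sectors with no company below the cutoff are absent from negative_by_sector,
# so the division yields NaN for them; fillna(0) turns that into 0%.
percentage = (negative_by_sector / total_by_sector * 100).fillna(0)
percentage = percentage.sort_values(ascending=False)
print(percentage.to_dict())
```

In this toy set, Energy comes out at 100%, Tech at 50%, and Utilities at 0%. The Utilities row shows why `fillna(0)` is needed after the division.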

The code below only prepares and renders the visualization; feel free to skip it if you just want to see the chart:

# Plot the bar chart
fig, ax = plt.subplots(figsize=(10, 6))
percentage.plot(kind='bar', color=colors, ax=ax)
ax.set_ylabel('Percentage of Negatively Impacted Companies')
ax.set_title('Percentage of Most Negatively Impacted Companies by Sector')
ax.set_xticklabels(percentage.index, rotation=70)
plt.tight_layout()

# Save figure to BytesIO (No actual file saved)
buffer = BytesIO()
fig.savefig(buffer, format="png", dpi=300)
buffer.seek(0)

# Convert to Base64
img_base64 = base64.b64encode(buffer.getvalue()).decode("utf-8")
buffer.close()
plt.close(fig)

# Generate HTML with embedded image
html_code = (f'<img src="data:image/png;base64,{img_base64}" '
             f'alt="Negative Chart"/>')
display(HTML(html_code))
[Figure: bar chart, "Percentage of Most Negatively Impacted Companies by Sector"]

Based on this bar chart, the Telecom, Consumer Goods, and Tech sectors appear to be the most negatively impacted by the tariffs introduced during the second administration of President Trump.
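
The embedding step in the plotting cell above (PNG bytes turned into a base64 data URI inside an `<img>` tag) can be isolated into a small helper. This is a minimal sketch using only the standard library. `png_to_img_tag` is a hypothetical helper name, and the 8-byte PNG signature stands in for a real rendered chart.

```python
import base64
from html import escape


def png_to_img_tag(png_bytes, alt_text):
    """Build an <img> tag that embeds PNG bytes as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f'<img src="data:image/png;base64,{encoded}" alt="{escape(alt_text)}"/>'


# The 8-byte PNG signature stands in for a real image here
tag = png_to_img_tag(b"\x89PNG\r\n\x1a\n", "Negative Chart")
print(tag)
```

In the notebook, the bytes would come from `buffer.getvalue()` after `fig.savefig(...)`. Escaping the alt text keeps the tag well-formed if the description contains quotes.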

Practical Applications

This kind of enterprise-strength data processing opens up possibilities that simply aren’t available when you’re limited to a few thousand documents:

  1. Comprehensive Market Sentiment: Track sentiment across the entire Russell 1000 in near real-time.

  2. Supply Chain Monitoring: Detect early warnings across global supply networks.

  3. Competitive Intelligence: Monitor every competitor and adjacent industry simultaneously.

  4. Regulatory Impact Assessment: Analyze how policy changes affect every sector at once.

Conclusion

In the pursuit of alpha—or the next business breakthrough—you need unmatched power and agility. This example illustrates Bigdata.com’s capability to search billions of news articles, corporate filings, and transcripts at an extraordinary scale.

The ability to process millions of searches and documents in minutes isn’t just a technical milestone—it’s a fundamental shift in how financial analysts, researchers, and decision-makers engage with market data.

Ready to process your first million documents? Visit Bigdata.com to get started with our API today.

Happy data processing! 🚀