Data Extraction Consultant, theboardiQ

United States

Job Type

Part Time

About the Role

The project will entail the following steps:

1. Understand the core tenets of the proposed technology of theboardiQ.

2. Understand the functional requirements - terminology (Diversity and Inclusion Parameters), statutory requirements, proxy statements, definitions, etc.

3. Understand the technical requirements - data extraction, public APIs and similar.

4. Populate Explicit Input Parameters

To be populated for every Board Member for Russell 3000 Boards

- Independent Director - Yes / No
- Committees Serving On - [type: prescribed dropdown]
- Other Boards Served in Last 5 Years - [type: free text]
- Other Committees Served in Last 5 Years - [type: prescribed dropdown]
- Board Skills (up to 6) - [type: prescribed dropdown]
- Gender - [type: prescribed dropdown]
- LGBTQ+ (self-declared) - [type: prescribed dropdown]
- Race & Ethnicity - [type: prescribed dropdown]
- Total Compensation (Base, Short Term Incentives - Bonuses, Long Term Incentives - Equity, Total Compensation) - [type: free text]
- Domain - [type: prescribed dropdown]
- Candidate description - [type: free text]
- Designation - [type: free text]
- Age - [type: number]
- Years of experience - [type: number]
- Years of BOD experience - [type: number]
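
As an illustration only, the parameters listed above could be held in a record like the following. This is a minimal sketch: the class name and every field name are hypothetical, and dropdown fields are kept as plain strings because the prescribed values are not listed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BoardMemberRecord:
    """Explicit Input Parameters for one board member (hypothetical schema)."""

    independent_director: bool                      # Yes / No
    committees_serving_on: List[str] = field(default_factory=list)
    other_boards_last_5_years: str = ""             # free text
    other_committees_last_5_years: List[str] = field(default_factory=list)
    board_skills: List[str] = field(default_factory=list)  # up to 6
    gender: Optional[str] = None
    lgbtq_self_declared: Optional[str] = None
    race_ethnicity: Optional[str] = None
    total_compensation: str = ""                    # free text
    domain: Optional[str] = None
    candidate_description: str = ""
    designation: str = ""
    age: Optional[int] = None
    years_of_experience: Optional[int] = None
    years_of_bod_experience: Optional[int] = None

    def __post_init__(self) -> None:
        # The posting caps Board Skills at six entries per member.
        if len(self.board_skills) > 6:
            raise ValueError("board_skills allows at most 6 entries")
        # Numeric fields are optional, but must not be negative when present.
        for name in ("age", "years_of_experience", "years_of_bod_experience"):
            value = getattr(self, name)
            if value is not None and value < 0:
                raise ValueError(f"{name} must be non-negative")
```

Validating at construction time means a malformed row is rejected when it is extracted rather than later, during the audit.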

5. Audit a random sample of the data to check accuracy, through both internal and external audits. This is an extremely important aspect of the project: it ensures the fairness of the core data that will feed into the knowledge graphs of candidates, theboardiQ Composite Score and other workflows.

6. Analytics - derive core data insights for the proprietary theboardiQ Research Report.

7. Conclude.
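
The random-sample audit in step 5 could look roughly like the sketch below. It assumes records are keyed by an ID and that an auditor supplies independently verified values for the sampled records. The function names, the dictionary layout and the fixed seed are all assumptions made for illustration, not part of the project specification.

```python
import random
from typing import Dict, List, Sequence


def draw_audit_sample(record_ids: Sequence[str], sample_size: int, seed: int = 0) -> List[str]:
    """Pick record IDs to audit, without replacement.

    A fixed seed makes the draw reproducible, so internal and external
    auditors can confirm they are checking the same records.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(record_ids))
    return rng.sample(list(record_ids), k)


def field_accuracy(extracted: Dict[str, dict], verified: Dict[str, dict], field_name: str) -> float:
    """Share of audited records whose extracted value matches the verified one."""
    checked = [rid for rid in verified if rid in extracted]
    if not checked:
        raise ValueError("no overlapping records to audit")
    matches = sum(
        1
        for rid in checked
        if extracted[rid].get(field_name) == verified[rid].get(field_name)
    )
    return matches / len(checked)
```

Reporting accuracy per field, rather than one overall number, shows which parameters are hardest to extract reliably.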


- Data Extraction, API Integration, ETL Pipeline, Web Scraper

- Data Extraction Tools

- Scrapy, pandas, html2text, Beautiful Soup, Selenium

- Data Extraction Languages

- Python, PHP, SQL, C#


- Data Extraction, Data Scraping, Microsoft Excel, Web Scraper, Data Science, ETL (Extract, Transform and Load), Tableau, Spreadsheets, Data Visualization, Statistics, Data Analysis


Create proprietary data for theboardiQ Data Platform: extract context-aware data from Public Statutory Documents for Russell 3000 Public Boards (and other US Public Listings) that will serve as Explicit Input Parameters for the Search, Discovery and Match Recommendation Engine of theboardiQ Platform.

About the Company

At theboardiQ, we are on a mission to enable the creation of Inclusive Boards for Businesses.