Software Engineer / Technical Lead Track (Data & Cloud)


Company Background

Burbio is a business intelligence platform that delivers K-12 spending and operations data to companies selling into school districts. Our platform provides critical market intelligence trusted by industry leaders like Houghton Mifflin (HMH), Johnson Controls, PowerSchool, and McKinsey Consulting. The company leverages AI extensively to gather, analyze, and deliver insights, and operates with an AI-first approach across all functions.

Job Summary

We are seeking a talented, growth-minded Software Engineer (Data & Cloud) with 3–4 years of professional experience who is ready to step up, take ownership, and grow into a Technical Lead role. In this role, you will help maintain and extend a robust, highly interconnected data ecosystem, focusing on complex data ingestion, extraction, and database management.

Burbio is an AI-forward organization where every product we build is AI-powered. We gather millions of pages of documents from the web, process them, and transform unstructured data into actionable insights delivered to our clients via customized dashboards, custom APIs, and automated alerts. We don't expect you to invent machine learning models from scratch. Instead, we want a strong, modern engineer who excels at building with AI—utilizing LLMs and RAG (Retrieval-Augmented Generation) frameworks—while actively adopting the latest AI productivity tools to accelerate development velocity.

Our Tech Stack & Ecosystem

While we welcome experience with analogous cloud and scripting technologies, our primary stack includes:

  • Languages: Python & SQL (Both Required)
  • Cloud Platform: Google Cloud Platform (GCP)
  • Database: PostgreSQL (Postgres)

Key Responsibilities

Technical Ownership & AI Engineering

  • Build with AI (LLMs + RAG): Design and implement data pipelines that leverage LLMs and RAG frameworks to extract, structure, and transform unstructured web data into actionable insights.
  • AI Tool Integration: Actively utilize and advocate for modern AI-assisted development tools (e.g., GitHub Copilot, Cursor, agentic workflows) to maximize development velocity and code quality.
  • Drive Best Practices: Champion industry-standard engineering practices within the team, learning to balance speed with long-term code health and to avoid unnecessary technical debt.
  • Code Quality & Collaboration: Participate in technical design meetings, help break down development specs, and drive the team's feedback loops through structured code reviews and GitHub pull requests.
  • CI/CD & Testing: Implement and maintain automated testing pipelines, ensuring rigorous regression testing so new features don't break legacy systems.
  • Documentation: Maintain clear architecture diagrams and technical write-ups to ensure team maintainability and smooth knowledge handoffs.

Web Scraping, Database Architecture & Cloud Operations

  • Information Collection & Processing: Maintain and extend dense, sprawling web scraping and document processing pipelines that ingest millions of pages of documents from the web.
  • Data Modeling: Design, organize, and optimize relational database tables (Postgres) and complex joins, keeping a sharp eye on data integrity and minimizing duplication.
  • AI-Ready Data Pipelines: Help ensure our data schemas and API endpoints are cleanly structured and optimized for downstream LLM consumption and AI agent integration (e.g., via Model Context Protocol).
  • Cloud Infrastructure: Help manage compute levels on Virtual Machines (VMs), structure cloud jobs, and maintain secure Identity and Access Management (IAM) permissions within GCP.

Qualifications & Skills

Technical Requirements

  • Experience: 3–4 years of professional software engineering experience, with a focus on data-heavy or cloud-based applications.
  • Core Languages: Proficient in SQL (required) and Python (or strong experience in an equivalent language with a willingness to master Python quickly).
  • Database Fundamentals: Exceptional proficiency in relational database design, data schema mapping, and query/join performance optimization (experience with Postgres is a major plus).
  • Cloud Computing: Hands-on experience with cloud architecture, specifically GCP (Google Cloud Platform) or AWS.
  • Data & AI Pipelines: Experience with web scraping, text/document processing, or integrating data with LLMs/RAG frameworks.
  • Engineering Tooling: Solid knowledge of Git, GitHub (PR workflows), and automated testing frameworks.

Soft Skills & Growth Potential

  • Systemic Thinker: Ability to think about how code should be sustainably organized and engineered, rather than just writing quick fixes.
  • Collaborative Partner: Excited to work closely in a multi-developer environment, collaborating as a peer with internal developers to share knowledge and upskill the team collectively.
  • Domain Context: Familiarity with the mechanics of the U.S. education system (high schools, districts, etc.), or the ability to learn them quickly, is preferred, since it helps in understanding our data structures at a basic level.

Location: Remote. Indiana or Kentucky residency required.


APPLY BY EMAILING DENNIS@BURBIO.COM

Burbio’s Data has been featured across a wide variety of media outlets including: