Data Architect (SME Level)
Active TS/SCI clearance required.
We’re hiring an SME-level Data Architect to assess, modernize, and operate production data pipelines that pull from enterprise operational systems and turn that raw data into reliable, integrated datasets analysts and decision-makers can trust. A lot of this work is detective work: source systems often have limited documentation, non-standard schemas, and quirky APIs. We’re looking for someone with strong fundamentals and the curiosity to reverse-engineer a messy system until it makes sense, not someone who needs a clean spec to get started. You’ll work hands-on with Python, SQL, and orchestration frameworks like Airflow or Prefect, design dimensional models, and build the integrations that make the data usable.
Highlights
- Strong Python and advanced SQL — complex queries, optimization, performance tuning
- ETL/ELT from diverse sources: SaaS platforms, databases, files, streams
- Comfortable working with semi-structured data (JSON, XML) and unfamiliar schemas
- Cloud data platforms — AWS, Azure, or GCP
- Dimensional modeling and data warehouse fundamentals
- Data quality, validation, and pipeline observability built in from the start (see the validation sketch after this list)
- Production pipelines with Airflow, Prefect, or similar orchestration (see the DAG sketch after this list)
- API integrations — OAuth/SSO, rate limiting, pagination, retry logic, error handling (see the client sketch after this list)
- Git-based workflows and solid software engineering habits
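By “built in from the start” we mean quality checks that run inside the pipeline and fail a batch before it lands, not a report that runs afterward. A minimal sketch of the pattern in plain Python; the column names and thresholds are hypothetical:

```python
def check_batch(rows: list[dict]) -> None:
    """Fail the pipeline run early if a batch violates basic expectations."""
    assert rows, "empty extract: upstream source may be broken"
    ids = [r["ticket_id"] for r in rows]            # hypothetical key column
    assert len(ids) == len(set(ids)), "duplicate primary keys in batch"
    missing = sum(1 for r in rows if r.get("created_at") is None)
    assert missing / len(rows) < 0.01, "too many NULL timestamps in batch"
```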
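For orchestration, a typical pipeline here is a small daily DAG with a quality gate between extract and load. A minimal Airflow 2.x sketch, with made-up DAG and task names:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():   # placeholder: pull from the source system
    ...

def validate():  # placeholder: run data-quality checks before loading
    ...

def load():      # placeholder: upsert into the warehouse
    ...

with DAG(
    dag_id="ops_records_daily",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_validate = PythonOperator(task_id="validate", python_callable=validate)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_validate >> t_load  # quality gate sits between E and L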
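And for API integrations, this is the shape of client we expect you to be comfortable writing: paginated pulls with rate-limit handling and bounded retries. The endpoint, payload shape, and field names below are assumptions for illustration only:

```python
import time

import requests

BASE_URL = "https://example.com/api/v2/records"  # hypothetical endpoint

def fetch_all(token: str, page_size: int = 500) -> list[dict]:
    """Pull every page from a paginated REST API, honoring rate limits
    and retrying transient failures with exponential backoff."""
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {token}"  # OAuth bearer token

    records, offset = [], 0
    while True:
        for attempt in range(5):  # bounded retries per page
            resp = session.get(BASE_URL, params={"limit": page_size, "offset": offset})
            if resp.status_code == 429:  # rate limited: respect Retry-After
                time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
                continue
            if resp.status_code >= 500:  # transient server error: back off
                time.sleep(2 ** attempt)
                continue
            resp.raise_for_status()      # fail loudly on 4xx client errors
            break
        else:
            raise RuntimeError(f"page at offset {offset} failed after retries")

        page = resp.json().get("results", [])  # hypothetical payload shape
        records.extend(page)
        if len(page) < page_size:  # a short page means we've reached the end
            return records
        offset += page_size
```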
Bonus
- Experience extracting data from IT operations or service management systems (ServiceNow, network management platforms, monitoring tools)
- PySpark for distributed processing (see the sketch after this list); dbt for transformations
- Terraform or CloudFormation; CI/CD for data pipelines
- Streaming (Kafka, Kinesis); data quality tooling (Great Expectations, Soda, Monte Carlo)
- Docker/Kubernetes for data workloads
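For the PySpark bullet, the kind of distributed transformation we have in mind looks like the sketch below: deduplicating raw extracts into a curated table. The paths, column names, and app name are all hypothetical:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("dedupe_tickets").getOrCreate()

# Hypothetical input: raw JSON extracts landed by the ingestion job.
raw = spark.read.json("s3://example-bucket/raw/tickets/")

# Rank each ticket's versions by recency, then keep only the newest one.
w = Window.partitionBy("ticket_id").orderBy(F.col("updated_at").desc())
latest = (
    raw.withColumn("updated_at", F.to_timestamp("updated_at"))
       .withColumn("rn", F.row_number().over(w))
       .filter(F.col("rn") == 1)
       .drop("rn")
)

latest.write.mode("overwrite").parquet("s3://example-bucket/curated/tickets/")
```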
Location: Washington, DC metro area
To apply for this job, email your details to jobs@cleverdba.com.