"Our tailored course provided a well rounded introduction and also covered some intermediate level topics that we needed to know. Clive gave us some best practice ideas and tips to take away. Fast paced but the instructor never lost any of the delegates"
Brian Leek, Data Analyst, May 2022
AI failure modes caused by bad data:
real case studies showing how upstream data problems destroy downstream model quality
Ingestion architecture lab:
building working connectors for databases, REST APIs, CSV files, and email attachments in a single pipeline
Cleaning pipeline build:
handling nulls, duplicates, encoding errors, and format normalisation with tested, reusable code
Schema validation workshop:
writing contracts, routing bad records to a rejection queue, and alerting on unexpected format changes
Live source connection lab:
participants connect the pipeline to a real or simulated data source relevant to their actual work environment
Orchestration setup:
configuring scheduling, dependency management, and retry logic in Airflow or Prefect with working examples
Lineage logging:
tagging every data record with source, timestamp, and version so any model run is fully traceable back to its input
Chaos testing:
deliberately injecting bad records, missing fields, and schema breaks to verify error handling works as expected
Pipeline monitoring:
dashboards covering run success rates, row counts, processing latency, and early indicators of data drift
Data contract documentation:
writing a clear specification that an AI or ML team can consume without needing to ask you questions
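As a rough illustration of the schema validation workshop item above, the sketch below checks records against a simple contract and routes failures to a rejection list. The `CONTRACT` fields, the `validate` and `route` helpers, and the date format are all hypothetical choices made for this example, not course material.

```python
from datetime import datetime

# Hypothetical contract: field name -> required Python type.
CONTRACT = {"id": int, "email": str, "signup_date": str}


def validate(record: dict) -> list[str]:
    """Return a list of contract violations (empty if the record is valid)."""
    errors = []
    for name, expected in CONTRACT.items():
        if name not in record or record[name] is None:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {actual}")
    # Only check the date format once the basic type checks have passed.
    if not errors:
        try:
            datetime.strptime(record["signup_date"], "%Y-%m-%d")
        except ValueError:
            errors.append("signup_date: not in YYYY-MM-DD format")
    return errors


def route(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into accepted rows and a rejection queue with reasons."""
    accepted, rejected = [], []
    for record in records:
        errors = validate(record)
        if errors:
            rejected.append({"record": record, "errors": errors})
        else:
            accepted.append(record)
    return accepted, rejected
```

Keeping the reasons alongside each rejected record means the rejection queue can double as the input for alerting on unexpected format changes.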
This hands-on course teaches how to build reliable data pipelines that support high-quality AI and machine learning systems.
Participants will explore how poor data quality impacts AI performance and learn techniques for creating robust data ingestion workflows.
The course covers connecting to databases, APIs, files, and other data sources while building scalable ingestion pipelines.
Learners will implement data cleaning, validation, schema enforcement, and error-handling processes to improve data reliability.
Practical labs include orchestration, lineage tracking, monitoring, and testing pipelines under real-world failure scenarios.
The course emphasises traceability, data governance, and proactive monitoring to detect issues before they affect downstream AI models.
By the end of the course, participants will be able to design, build, document, and maintain production-ready data pipelines for AI and analytics workloads.
Copyright © 2025 JBI Training. All Rights Reserved.
JB International Training Ltd - Company Registration Number: 08458005
Registered Address: Wohl Enterprise Hub, 2B Redbourne Avenue, London, N3 2BS