Pentaho Data Integration training course

A 3-day introduction to data transformation using Pentaho Data Integration (PDI), from the very beginning all the way to developing an ETL framework that ingests files of varying structure.

JBI training course London UK

"The content made the course enjoyable and I liked that the course structure allowed me to spend a lot of time "hands-on" with the programming interface. Having only used Pentaho DI briefly up to this point, the hands-on nature of this course helped improved my confidence in using the toolset.
The trainer was very engaging and his knowledge of the subject matter was very impressive. He spent time answering questions throughout the course - which was much appreciated.
It would have been impossible to cover everything in a 3 day course, but I think we covered most topics, but more impressive was that the course material allowed me to implement some quite advanced solutions in a relatively small amount of time!
All in all, I think this a good, well structured course.."

DF, Software Engineer Lead, Nov 2021

Public Courses

27/01/25 - 3 days - £2000 + VAT
10/03/25 - 3 days - £2000 + VAT
21/04/25 - 3 days - £2000 + VAT

Customised Courses

* Train a team
* Tailor content
* Flex dates
From £1200 / day

  • Transformations
    • Input and output steps
    • Field transformations
    • Joins and lookups
    • Set transformations
    • JSON and XML inputs
    • Variables and portability
    • Logging and performance
    • Metadata injection
  • Jobs
    • Basic orchestration
    • File and database management
    • Iteration and looping in jobs

Introduction

  • Installing and starting PDI. The user interface

Part I – Transformations

Input and output steps

  • Exploration of the various ways to read data into, and write data out of, PDI: CSV files, Excel files, SQL queries, etc. Installing JDBC drivers.
  • Lab 1: CSV Input, MySQL output

Field transformations

  • Overview of various transformation steps: Calculator, string manipulation, adding counters, value mapping, handling nulls, JavaScript and regular expressions.

Joins and lookups

  • Merging two or more data streams and combining the data: managing slowly changing dimensions, in-memory and database lookups, querying HTTP services/APIs, merge joins, row diff, etc.
  • Lab 2: Joins and lookups (enriching data stream)

Set transformations

  • Operations on groups of rows: sorting, grouping, splitting fields into rows, normalising/denormalising data, cloning, appending.
  • Lab 3: Grouping data

JSON and XML inputs

  • Reading XML data via XPath and via the fast StAX parser. JSON parsing via JSONPath.
  • Lab 4: JSON and XML inputs (XPath, StAX parser, JSONPath)

Variables and portability

  • Setting and getting variables; global variables, runtime variables, parameters; portable connections, file paths, and other best practices
  • Lab 5: Portable transformations

Logging and performance

  • Reading PDI logs; analysing performance and runtime metrics; examples of fast and slow steps; identifying bottlenecks; running step copies in parallel

Metadata injection

  • Use cases for metadata injection. Modifying metadata at runtime. Advanced metadata injection options.
  • Lab 6: Flexible CSV loading

Part II – Jobs

Basic orchestration

  • Using PDI jobs to orchestrate tasks; overview of job entries: sub-jobs, sub-transformations, SQL, shell scripts, conditions, error handling, getting/putting files, etc. Wrapper jobs.

File and database management

  • Using lock files; downloading and archiving files; checking database connections; conditionally creating, dropping or modifying database structures; error handling; recording execution results
  • Lab 7: Building a simple job

Iteration and looping in jobs

  • Running a job or transformation for each file in a folder; handling different file types in one go; iterating over API results; looping until a condition is met; running .sh or .bat scripts depending on the OS
  • Lab 8: Developing a powerful ETL framework

New users charged with using Pentaho, or existing users looking to formalise their knowledge: business analysts, data analysts and ETL developers.


5 star

4.8 out of 5 average

“The Trainer was very knowledgeable and was able to help with every issue throughout the entire course.”

JS - Software engineer - Nov 2022

 

 

CONTACT
+44 (0)20 8446 7555

Copyright © 2024 JBI Training. All Rights Reserved.
JB International Training Ltd  -  Company Registration Number: 08458005
Registered Address: Wohl Enterprise Hub, 2B Redbourne Avenue, London, N3 2BS
