Continuous integration testing for Pentaho Kettle projects
This training course teaches automated testing techniques for Pentaho Kettle (PDI) ETL projects.
A well-designed test suite guards against regressions, tracks project stability, and transparently communicates the architecture and design of the project to new team members and stakeholders alike.
Goals
The training course is focused on transferring hands-on working knowledge.
Participants learn to track high-level result trends that give insight into the health and progress of a project. The chart below shows how a project evolves after a test suite is introduced. The number of failing tests declines with each build as the team works to eliminate the defects the tests uncover.
Participants learn to generate and navigate detailed result reports that allow developers to drill down to problematic sections of the solution.
After taking the course, participants are able to:
- structure Kettle solutions to be testable
- manage configuration for different execution environments
- formulate a test strategy
- implement tests using Kettle jobs
- implement tests using the Ruby scripting language
- orchestrate tests in a test suite based on the RSpec testing framework
- set up a Jenkins server for continuous integration
- deploy solutions from Git version control
Target Audience
Intermediate users of Pentaho Kettle. Light scripting experience in any language is beneficial, but not required.
Training outline
Day 1
Testing methodology overview
- different kinds of tests
- designing a solution to be testable
- minimizing external infrastructure dependencies
- fixtures and helpers
- declarative testing
ETL-based testing
- Kettle techniques for unit tests
- Kettle techniques for integration tests
- practical limitations of ETL-based tests
Day 2
Script-based testing
- JRuby as a scripting language
- scripting helpers for command execution and fixture loading
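As a rough illustration of what a command execution helper might look like, here is a minimal Ruby sketch. It is not taken from the course material: the function names `kitchen_command` and `run_command`, the default launcher path, and the parameter names are all hypothetical. It assumes Kettle jobs are started through the `kitchen.sh` launcher with its `-file`, `-level`, and `-param:` options, and that a zero exit status indicates success.

```ruby
require "open3"

# Hypothetical helper: builds the argument list for Kettle's kitchen.sh launcher.
# The launcher path and the parameter names are placeholders for illustration.
def kitchen_command(job_file, params = {}, kitchen_path: "kitchen.sh")
  args = [kitchen_path, "-file=#{job_file}", "-level=Basic"]
  # Each named parameter is passed as -param:NAME=value
  params.each { |name, value| args << "-param:#{name}=#{value}" }
  args
end

# Hypothetical helper: runs a command given as an argument array.
# Returns the combined stdout/stderr text and whether the exit status was zero.
def run_command(args)
  output, status = Open3.capture2e(*args)
  { output: output, success: status.success? }
end
```

A test could then call `run_command(kitchen_command("jobs/load.kjb", { "ENV" => "test" }))` and assert on the `:success` flag. Passing the command as an array rather than a single string avoids shell quoting issues with file paths and parameter values.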
Organizing the test suite
- the RSpec testing framework
- RSpec features
- RSpec reports
Day 3
Continuous testing with Jenkins
- installing the Jenkins CI server
- access control in Jenkins
- useful Jenkins plugins
- setting up Jenkins to run ETL tests
- test reports
Deployment
- deploying from version control
- migration scripts
Tailoring options
We are happy to extend or condense sections of the training material so that more time is spent on the sections you value most.
Would you like a tailored training?
Prerequisites
Participants bring their own laptops. They will need Pentaho Kettle (PDI) 5.x-7.x, MySQL or MariaDB, and Git installed. Internet access and administrator privileges on participants' development machines or VMs are required.
The exercise material is prepared for macOS and Linux. Participants working on Windows are advised to install a Linux desktop VM and verify that Spoon, Git, and MySQL run successfully.
Delivery
The course is delivered by an instructor. It is split into three days of approximately 6 hours of material each. The course is given as a series of online sessions or, alternatively, at customer offices, according to customer preference. Delivery at customer offices incurs the additional cost of travel expenses and a per diem fee.
Deliverables
The slide-deck and a demonstration project with solutions to all exercises are shared with the participants.
Pricing
The course is priced at € 1200 per participant.