diethardsteiner.github.io

Project Hop: Create Environment Definitions

THIS ARTICLE IS OUTDATED: A new version is available here. The team behind Project Hop has done an excellent job of cleaning up and improving the configuration of what was previously known as KETTLE/PDI. Finally we have built-in environment management available, a feature I’ve been waiting on for a decade. I must say Hop is shaping up really nicely: There is a lot of focus on Hop as a product instead of ticking marketing/sales...

diethardsteiner.github.io

Project Hop: Hop on Kubernetes

This is a follow-up article to Hop on Docker. In this article we will explore how to create a simple Kubernetes Job for short-lived data processes and a Kubernetes Deployment to run an always-on server that can execute data processes on request. Our tool of choice for data processing in this case is Hop. Short-lived processes: Kubernetes Job. Sources: “Running pods that perform a single completable task”, Kubernetes in...

diethardsteiner.github.io

Converting PDI Repositories to PDI Standalone Files

This article explains how to convert PDI repositories (PDI managed files) to PDI standalone files (PDI un-managed files). This might not be a complete set of instructions, but the most common conversion steps should be covered. Note: The term repository does not refer to a Git repo here but to a PDI repo. Historically PDI has been able to store jobs and transformations in 4 different ways, some of which are deprecated now. PDI File storage...