The problem with Kettle/PDI

It’s no secret that I’m a fan of Kettle (Pentaho Data Integration), but there is one big problem: it is missing a significant piece of functionality, and that gap is a show stopper for widespread adoption by many customers.

For most needs, Kettle is faster and better than Informatica, and it allows quicker, more agile development. Spend a week with it and you’ll see. (I’m not dismissing the other commercial players but, realistically, Infa rules the space where Kettle wants and needs to be.)

An advantage of Kettle in some tactical situations is that it neither needs nor wants a monolithic, centralized ETL server: if you’ve got a little RAM to dedicate to a JVM, it is off to the races. The other day, I was able to enjoy the benefits of a tool over hand-coded scripts when pulling data from an ERP, from an MES and from some standalone sensors on robotic welders. There is visibility into the transformation/integration logic, and much of the work will be reused in the future. Cool. A great use of Kettle as it is today.

Where Informatica excels and Pentaho doesn’t show up for the party (or apparently even know there is a party) is in the constant monitoring of independent and related processes. The Pentaho crowd will say “oh, that’s easy, Kettle captures all of the operational metadata to do it” (it does), but they miss the point! As simple as it may seem to the Java devs recording it, using the repository data is daunting to many. Few of their customers will start from scratch to build the reports and analytic tools needed to monitor dependencies, jobs and transformations. Instead, they go to Informatica.

A call to Matt Casters or some clever open source developer (related or not to the Pentaho crowd): build a dashboard and reports (in Pentaho’s framework) that do half of what Workflow Manager does and you will soar!!!

1 Comment »

  1. Sadly, you are mistaken, as this functionality is present in the PDI Pentaho enterprise management console. The fact that it is available in our Enterprise Edition only isn’t really relevant in any comparison to Informatica. It is interesting to see, though, that things like monitoring, trending and alerting aren’t easily developed by an open source community. Perhaps in the longer run this will change, but at this time EE will fill the gap.
