Pentaho Data Integration 4 Cookbook
Getting data from a database by providing parameters
Getting data from a database by running a query built at runtime
Inserting or updating rows in a table
Inserting new rows where a simple primary key has to be generated
Inserting new rows where the primary key has to be generated based on stored values
Creating or altering a database table from PDI (design time)
Creating or altering a database table from PDI (runtime)
Inserting, deleting, or updating a table depending on a field
Changing the database connection at runtime
Reading several files at the same time
Reading files having one field by row
Reading files with some fields occupying two or more rows
Providing the name of a file (for reading or writing) dynamically
Using the name of a file (or part of it) as a field
Getting the value of specific cells in an Excel file
Writing an Excel file with several sheets
Writing an Excel file with a dynamic number of sheets
Specifying fields by using XPath notation
Validating well-formed XML files
Validating an XML file against DTD definitions
Validating an XML file against an XSD schema
Generating a simple XML document
Generating complex XML structures
Generating an HTML page using XML and XSL transformations
Copying or moving one or more files
Getting files from a remote server
Putting files on a remote server
Copying or moving a custom list of files
Deleting a custom list of files
Looking for values in a database table
Looking for values in a database (with complex conditions or multiple tables involved)
Looking for values in a database with extreme flexibility
Looking for values in a variety of sources
Looking for values by proximity
Looking for values consuming a web service
Looking for values over an intranet or Internet
Splitting a stream into two or more streams based on a condition
Merging rows of two streams with the same or different structures
Comparing two streams and generating differences
Generating all possible pairs formed from two datasets
Joining two or more streams based on given conditions
Interspersing new rows between existent rows
Executing steps even when your stream is empty
Processing rows differently based on the row number
Executing and Reusing Jobs and Transformations
Executing a job or a transformation by setting static arguments and parameters
Executing a job or a transformation from a job by setting arguments and parameters dynamically
Executing a job or a transformation whose name is determined at runtime
Executing part of a job once for every row in a dataset
Executing part of a job several times until a condition is true
Moving part of a transformation to a subtransformation
Integrating Kettle and the Pentaho Suite
Creating a Pentaho report with data coming from PDI
Configuring the Pentaho BI Server for running PDI jobs and transformations
Executing a PDI transformation as part of a Pentaho process
Executing a PDI job from the Pentaho User Console
Generating files from the PUC with PDI and the CDA plugin
Populating a CDF dashboard with data coming from a PDI transformation
Getting the Most Out of Kettle
Sending e-mails with attached files
Programming custom functionality
Generating sample data for testing purposes
Getting information about transformations and jobs (file-based)
Getting information about transformations and jobs (repository-based)