Thursday, October 31, 2013

How to capture Talend Job Activity into databases - Activity Monitoring Console

It is often useful to capture the statistics of a job to monitor the job performance and error codes. Talend has three ways to capture this information.

  1. On console
  2. In FileSystem
  3. On Databases
My personal choice is to capture the logging information in a database since it will be very useful to store history and easy to track the trends of the job activity with a query. This could be very handy to identify any performance bottlenecks.

So here are the table definitions for the logs, stats and volume catchers used by the AMC. You can easily get these definitions by exporting the meta-data from tlogcatcher, tflowmeter and tstatcatcher to Generic Schemas in the repository and reusing them later. Here they are served without going through the pain of creating the metadata again :)

Create these tables on your database and configure your project to capture this information in the database so that the AMC can retrieve this info from database and display it for you.


CREATE TABLE [dbo].[logCatcher](
[moment] [datetime] NULL,
[pid] [varchar](20) NULL,
[root_pid] [varchar](20) NULL,
[father_pid] [varchar](20) NULL,
[project] [varchar](50) NULL,
[job] [varchar](255) NULL,
[context] [varchar](50) NULL,
[priority] [int] NULL,
[type] [varchar](255) NULL,
[origin] [varchar](255) NULL,
[message] [varchar](255) NULL,
[code] [int] NULL
) ON [PRIMARY]

GO

CREATE TABLE [dbo].[flowMeter](
[moment] [datetime] NULL,
[pid] [varchar](20) NULL,
[father_pid] [varchar](20) NULL,
[root_pid] [varchar](20) NULL,
[system_pid] [bigint] NULL,
[project] [varchar](50) NULL,
[job] [varchar](50) NULL,
[job_repository_id] [varchar](255) NULL,
[job_version] [varchar](255) NULL,
[context] [varchar](50) NULL,
[origin] [varchar](255) NULL,
[label] [varchar](255) NULL,
[count] [int] NULL,
[reference] [int] NULL,
[thresholds] [varchar](255) NULL
) ON [PRIMARY]

GO


CREATE TABLE [dbo].[statCatcher](
[moment] [datetime] NULL,
[pid] [varchar](20) NULL,
[father_pid] [varchar](20) NULL,
[root_pid] [varchar](20) NULL,
[system_pid] [bigint] NULL,
[project] [varchar](50) NULL,
[job] [varchar](50) NULL,
[job_repository_id] [varchar](255) NULL,
[job_version] [varchar](255) NULL,
[context] [varchar](50) NULL,
[origin] [varchar](255) NULL,
[message_type] [varchar](255) NULL,
[message] [varchar](255) NULL,
[duration] [bigint] NULL
) ON [PRIMARY]

GO

These table definitions are for SQL Server.

Happy Talending!
Praveena

No comments:

Post a Comment