Talend Interview Questions

31. Differentiate between Repository and Built-In.
Answer : In Built-In mode, the information is stored locally in the Job, so it can be entered and edited manually. In Repository mode, the information is stored in the Repository, and the Job can only import it as read-only information.

32. Define tDenormalizeSortedRow?
Answer : tDenormalizeSortedRow combines all input sorted rows into a group. It helps save memory by synthesizing the sorted input flow.

33. What is the use of String Handling Routines?
Answer : They allow us to carry out many operations and tests on alphanumeric expressions, relying on Java methods.
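As an illustration, routines such as StringHandling.UPCASE or StringHandling.LEN wrap ordinary Java String methods. The sketch below is plain Java, not the Talend routine class itself; the class name StringChecks and its method names are hypothetical and only show the kind of operations and tests involved.

```java
import java.util.Locale;

public class StringChecks {
    // Test: true when the text is non-empty and contains only letters.
    public static boolean isAlpha(String text) {
        return text != null && !text.isEmpty() && text.chars().allMatch(Character::isLetter);
    }

    // Operation: null-safe upper-case conversion.
    public static String upcase(String text) {
        return text == null ? null : text.toUpperCase(Locale.ROOT);
    }

    // Operation: length of the text, treating null as empty.
    public static int len(String text) {
        return text == null ? 0 : text.length();
    }

    public static void main(String[] args) {
        System.out.println(isAlpha("Talend"));
        System.out.println(isAlpha("Talend7"));
        System.out.println(upcase("talend"));
        System.out.println(len("talend"));
    }
}
```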

34. Difference between tMap and tJoin component in Talend.
Answer :
tMap:
It accepts more than one input: one is the main flow and the rest are lookups.
It can create more than one output.
tMap has “inner join” and “left outer join” join models.
tMap offers three match models:
Unique Match.
First Match.
All Matches.
tMap offers a store-on-disk option for lookup data processing.
In tMap you can filter data using a filter expression.
You can write transformations using the expression builder at each column level.
tJoin:
It accepts only two inputs: one is the main flow and the other is the lookup.
It has two default outputs: one is “Main” and the other is “Inner join reject”.
tJoin offers only “inner join”.
tJoin defaults to Unique Match.
tJoin does not offer the store-on-disk, filter expression, or expression builder features listed above for tMap.
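The three match models can be sketched in plain Java. This is not Talend-generated code: the class MatchModels, the enum Model, and the sample rows are hypothetical, and the sketch assumes that Unique Match keeps the last matching lookup row while First Match keeps the first.

```java
import java.util.ArrayList;
import java.util.List;

public class MatchModels {
    public enum Model { UNIQUE_MATCH, FIRST_MATCH, ALL_MATCHES }

    // Returns the lookup values joined to one main-row key under the given match model.
    // Each lookup row is {key, value}. An empty result means no match: an inner join
    // would reject the main row, a left outer join would keep it with empty lookup columns.
    public static List<String> lookup(List<String[]> lookupRows, String key, Model model) {
        List<String> all = new ArrayList<>();
        for (String[] row : lookupRows) {
            if (row[0].equals(key)) {
                all.add(row[1]);
            }
        }
        if (all.isEmpty() || model == Model.ALL_MATCHES) {
            return all;
        }
        // Assumption: Unique Match keeps the last matching row, First Match the first one.
        String kept = (model == Model.FIRST_MATCH) ? all.get(0) : all.get(all.size() - 1);
        return List.of(kept);
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
            new String[] {"FR", "Paris"},
            new String[] {"FR", "Lyon"},
            new String[] {"DE", "Berlin"});
        System.out.println(lookup(rows, "FR", Model.UNIQUE_MATCH));
        System.out.println(lookup(rows, "FR", Model.FIRST_MATCH));
        System.out.println(lookup(rows, "FR", Model.ALL_MATCHES));
        System.out.println(lookup(rows, "IT", Model.ALL_MATCHES));
    }
}
```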

35. Difference between Built-in and Repository schema.
Answer :
Built-in: all information is stored locally in the Job. You can enter and edit all information manually.
Repository: all information is stored in the repository.
You can import read-only information into the Job from the repository. If you want to modify the information, you must take one of the following actions:
Convert the information from Repository to Built-in and then edit the built-in information.
Modify the information in the Repository. Once you have made the changes, you are prompted to update the changes into the Job.

36. What are the Xms and Xmx parameters in Talend?
Answer :
-Xms sets the initial heap size and -Xmx sets the maximum heap size of the Java Virtual Machine (JVM) that runs Talend Studio. You can modify the memory allocated to Talend Studio by editing the relevant Studio .ini configuration file for your system, such as TOS_DI-win32-x86.ini for 32-bit Windows systems. On Linux, Solaris, and Windows systems, the relevant .ini configuration file is located in your Studio installation folder.
By default the ini file includes the following JVM parameters:
The memory that you can allocate to your Talend Studio depends mostly on your system memory availability. However, from our experience, we can recommend the following settings based on the most usual system memory values.
With 2 GB of memory available on a 32-bit system, bounds can be changed as follows:
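Whatever values are chosen, the effective bounds can be checked from inside a JVM. The following is a minimal sketch, not part of Talend: the class name HeapBounds is hypothetical, and it reports the heap of the JVM it runs in (for example one started with explicit -Xms and -Xmx options), not the Studio's own settings.

```java
public class HeapBounds {
    private static final long MB = 1024L * 1024L;

    // Upper heap bound in megabytes, governed by -Xmx (or the JVM default when unset).
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    // Heap currently reserved by the JVM in megabytes; it starts near -Xms and can grow up to -Xmx.
    public static long committedHeapMb() {
        return Runtime.getRuntime().totalMemory() / MB;
    }

    public static void main(String[] args) {
        System.out.println("Committed heap (MB): " + committedHeapMb());
        System.out.println("Maximum heap (MB): " + maxHeapMb());
    }
}
```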

37. How to improve the performance of a Talend Job with a complex design?
Answer :
1. Remove unnecessary fields/columns as early as possible using the tFilterColumns component.
2. Remove unnecessary data/records as early as possible using the tFilterRow component.
3. Use a SELECT query to retrieve data from the database.
4. Use database bulk components.
5. Use the Store on disk option.
6. Allocate more memory to the Jobs.
7. Use parallelism.
8. Use Talend ELT components when required.
9. Use the SAX parser over Dom4J whenever required.
10. Index database table columns.
11. Split the Talend Job into smaller Subjobs.

38. How to share a database connection?
Answer :
If you have various Jobs using the same database connection, you can factorize the connection by using the Use or register a shared DB Connection option so that the connection can be shared between parent and child Jobs.
Assume that you have two related Jobs (a parent Job and a child Job) that both need to connect to your remote MySQL database. To use a shared database connection in the two Jobs, do the following:
1. Add a tMysqlConnection (assuming that you work with a MySQL database) to both the parent and the child Job, if they are not using a database connection component.
2. Connect each tMysqlConnection to the relevant component in your Jobs using a Trigger > On Subjob Ok link.
3. In the Basic settings view of the tMysqlConnection component that will run first, fill in the database connection details if the database connection is not centrally stored in the Repository.
4. Select the Use or register a shared DB Connection check box, and give a name to the connection in the Shared DB Connection Name field.
5. In the Basic settings view of the other tMysqlConnection component, which is in the other Job, simply select the Use or register a shared DB Connection check box, and fill in the Shared DB Connection Name field with the same name as in the parent Job.
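The idea behind these steps, one connection registered under a name and reused by any Job that asks for the same name, can be sketched in plain Java. This is a hypothetical illustration and not Talend's internal implementation: the class SharedConnections, the method useOrRegister, and the name shared_mysql are invented here, and a plain Object stands in for a real JDBC connection.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class SharedConnections {
    // Hypothetical registry: connection name -> already opened connection object.
    private static final Map<String, Object> REGISTRY = new ConcurrentHashMap<>();

    // Returns the connection registered under the name, opening it only on first use.
    public static Object useOrRegister(String name, Supplier<Object> opener) {
        return REGISTRY.computeIfAbsent(name, key -> opener.get());
    }

    public static void main(String[] args) {
        // The parent Job registers the connection under a shared name.
        Object parent = useOrRegister("shared_mysql", Object::new);
        // The child Job asks for the same name and receives the same instance.
        Object child = useOrRegister("shared_mysql", Object::new);
        System.out.println(parent == child);
    }
}
```

Because the lookup is keyed by name only, both Jobs must use exactly the same Shared DB Connection Name, which matches step 5 above.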

39. What is context variable and context group?
Answer : Context describes the user-defined parameters that are passed to your Job at runtime. Context Variables are the values that may change as you promote your Job from Development, through to Test and Production. Values may also change as your environment changes; for example, passwords may change from time to time.
Every Talend Job has a Default Context. As you add Context Variables to your Job, they are added to the Default Context. You may also add your own contexts, for example, Test and Production. Adding multiple contexts allows you to easily switch the context of your Job.
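A Context defines a collection of Context Variables. A Context Group is such a collection stored in the Repository so that it can be reused across Jobs.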
When you create a String Context Variable, either directly within a Job or as part of a Context Group, Talend assigns a value of null. This appears to imply that the Context Variable is, in fact, a null String pointer; however, this is not the case.
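As a rough model of switching contexts, the sketch below keeps one set of variable values per context and reads a variable from whichever context is selected. It is plain Java with hypothetical names (ContextDemo, db_host, db_name, and the sample values), not the context class that Talend generates; the isBlank helper shows a defensive check for String variables that may be unset or empty.

```java
import java.util.HashMap;
import java.util.Map;

public class ContextDemo {
    // Hypothetical context group: context name -> (variable name -> value).
    private static final Map<String, Map<String, String>> CONTEXTS = new HashMap<>();

    static {
        CONTEXTS.put("Default", Map.of("db_host", "localhost", "db_name", "dev_db"));
        CONTEXTS.put("Production", Map.of("db_host", "prod-db.example.com", "db_name", "sales"));
    }

    // Reads a variable from the chosen context, falling back to Default when the context is unknown.
    public static String get(String context, String variable) {
        Map<String, String> values = CONTEXTS.getOrDefault(context, CONTEXTS.get("Default"));
        return values.get(variable);
    }

    // Defensive check for String variables that may be unset or empty.
    public static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(get("Default", "db_host"));
        System.out.println(get("Production", "db_host"));
        System.out.println(isBlank(get("Production", "db_password")));
    }
}
```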

40. How to generate surrogate key by using Talend?
Answer : Perform the following steps to add a new column to the target for the generated key values:
1. Open the properties window of the target and select the Columns tab.
2. On the Columns tab, click the New column icon. A new column appears at the bottom of the list.
3. Type the name of the new column. This sample uses the name CUSTOMER_GEN_KEY.
4. In the Type column, change the type of the new column to Numeric.
5. To reposition the surrogate key column, select its column number in the list and drag the column up to position 1. The following display depicts the completed Columns tab for the sample job.
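Separately from the column setup, the key values themselves have to be produced. In a Talend Job this is commonly done with a sequence expression such as Numeric.sequence("s1", 1, 1) in a tMap output column. The sketch below imitates that behaviour in plain Java under that assumption; the class KeySequence, the method next, and the sample customer names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeySequence {
    // Hypothetical stand-in for a named sequence: sequence name -> last value handed out.
    private static final Map<String, Integer> LAST = new HashMap<>();

    // Returns start on the first call for a name, then increases by step on each later call.
    public static synchronized int next(String name, int start, int step) {
        Integer previous = LAST.get(name);
        int value = (previous == null) ? start : previous + step;
        LAST.put(name, value);
        return value;
    }

    public static void main(String[] args) {
        // Each incoming row receives the next key in the CUSTOMER_GEN_KEY column.
        for (String customer : List.of("Alice", "Bob", "Carol")) {
            int key = next("customer_key", 1, 1);
            System.out.println(key + ";" + customer);
        }
    }
}
```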