Interview Questions - J2EE

Struts2 mainly originates from the earlier WebWork2 project.

The main concepts in Struts2 are ActionContext, Interceptors, Action, Result, ValueStack and OGNL.

First, every request passes through the interceptor stack. Struts2 provides and pre-configures these interceptors; the main ones include the params, validation and fileUpload interceptors.

Interceptors are a key part of the Struts 2 framework; they are invoked both before and after the action.
Interceptors allow common, cross-cutting tasks to be defined in clean, reusable components that are kept separate from the action code.

Then Struts2 calls the Action that the request maps to. Developers can implement whatever logic they need there; usually the Action calls a business service method to do the real work.

Finally, the Action returns a control string that tells Struts2 which result should render the view that is sent back in the response.
Struts2 then returns the response to the client.
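
The request flow above can be sketched in plain Java. This is only an illustration of the idea, not the Struts2 API: the names InterceptorChainSketch, Interceptor, Invocation and named are hypothetical, and the two interceptor names are used purely as labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of an interceptor stack; these types are NOT the Struts2 API.
class InterceptorChainSketch {

    /** An interceptor wraps the rest of the chain and may act before and after it. */
    interface Interceptor {
        String intercept(Invocation invocation);
    }

    /** Walks the interceptors in order, then calls the action at the end. */
    static class Invocation {
        private final Iterator<Interceptor> remaining;
        private final List<String> trace;

        Invocation(List<Interceptor> interceptors, List<String> trace) {
            this.remaining = interceptors.iterator();
            this.trace = trace;
        }

        String invoke() {
            if (remaining.hasNext()) {
                return remaining.next().intercept(this);
            }
            trace.add("action");
            return "success"; // control string that selects the result
        }
    }

    /** Builds an interceptor that records when it runs around the rest of the chain. */
    static Interceptor named(String name, List<String> trace) {
        return invocation -> {
            trace.add(name + ":before");
            String result = invocation.invoke();
            trace.add(name + ":after");
            return result;
        };
    }

    /** Runs a two-interceptor stack and returns the order in which the steps happened. */
    static List<String> run() {
        List<String> trace = new ArrayList<>();
        List<Interceptor> stack = Arrays.asList(named("params", trace), named("validation", trace));
        String result = new Invocation(stack, trace).invoke();
        trace.add("result:" + result);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each interceptor runs its "before" step on the way in and its "after" step on the way out, so the action sits in the middle of the trace and the first interceptor is also the last to finish.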

In the process, Struts 2 uses the ValueStack as a storage area for application data that will be needed during the processing of a request.
OGNL (Object-Graph Navigation Language) is used to reference and manipulate properties on the ValueStack.

Struts2 VS Struts1

1. An Action is a POJO. It does not need to use any Struts2-specific API, and even implementing the Action interface is optional, so an Action is easy to test without mocking HTTP objects.

2. There are no ActionForms any more.

3. Enhanced Results: besides JSP, Struts2 also supports other view technologies such as FreeMarker, Velocity, PDF etc.

4. Struts2 provides easy Spring integration.

5. Struts2 also provides intelligent defaults: we can use annotations and conventions to drastically reduce XML-based configuration.

6. Struts2 provides an easy plugin mechanism, so we can easily extend its functionality. Many Struts2 plugins already exist, such as REST and JSON.
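
To illustrate the first point in the list, here is a minimal sketch of an action written as a plain class. GreetingActionSketch and GreetingAction are hypothetical names; the class imports nothing from any framework, and the returned strings simply mirror the commonly used result names "success" and "input".

```java
// Hypothetical POJO action: no framework imports, no servlet objects.
class GreetingActionSketch {

    /** A plain class with a setter for request data and an execute() method. */
    static class GreetingAction {
        private String name;
        private String message;

        public void setName(String name) { this.name = name; }
        public String getMessage() { return message; }

        /** Returns a control string naming the result to render. */
        public String execute() {
            if (name == null || name.trim().isEmpty()) {
                return "input"; // send the user back to the form
            }
            message = "Hello, " + name.trim();
            return "success";
        }
    }

    public static void main(String[] args) {
        // The action can be exercised like any other object, without a container.
        GreetingAction action = new GreetingAction();
        action.setName("Ann");
        System.out.println(action.execute() + " -> " + action.getMessage());
    }
}
```

Because the class depends only on its own fields, a unit test just sets properties, calls execute() and checks the returned string.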

Compared with Ant, Maven is much easier to use. With Ant, we have to write the build file, build.xml, from scratch by ourselves, which is painful and tedious.

1. First, Maven provides archetypes: with a single command we can generate a Java project and its pom.xml file in seconds, and the pom.xml is much simpler than Ant's build.xml.

2. Second, Maven follows best practices. It defines a Standard Project Layout, so developers can quickly understand any project's structure.

3. Third, Maven manages the project's dependencies. In the past, if a project needed third-party libraries, developers put the jars directly into the project's lib directory. With Maven, we just declare the dependency in pom.xml; Maven downloads the jars and puts them into the local repository. This saves disk space, and to move to a new version of a jar we only need to modify pom.xml.

4. Fourth, Maven introduces standard build lifecycles. Users only need to learn a small set of commands to build any Maven project. The default lifecycle includes phases such as validate, compile, test, package, install and deploy.
Maven is a plugin execution framework, so it is easy to write a plugin to extend its functionality. There are already hundreds of Maven plugins, invoked with commands such as mvn eclipse:eclipse.

We can also run mvn ant:ant to generate Ant build files directly from pom.xml, and Maven can invoke Ant targets.

EJB 3 learns and benefits a lot from ideas and technologies such as POJOs, IoC, AOP, Spring and Hibernate.
EJB 3's chief advantages include:

1. EJB 3 supports AOP (aspect-oriented programming). First, we can use interceptors to encapsulate common cross-cutting concerns in one place. Second, developers can use annotations to declare transactions and security.

2. EJB 3 supports dependency injection: we can use the annotations @EJB and @Resource to inject and refer to another EJB or an external resource directly.

3. EJB 2 demands that session beans and entity beans implement many unnecessary callback methods, such as ejbActivate, ejbPassivate, ejbCreate etc. Normally these methods are empty. In EJB 3, developers don't need to write these methods at all, and if they really want to provide this function, they can annotate any method with an annotation such as @PreDestroy or @PrePassivate.

4. The Home interface is removed from EJB 3; instead, EJB 3 enables a simple lookup process.

5. The EJB 3 entity bean is totally redesigned. An EJB 3.0 entity bean doesn't have home, local or remote interfaces; it is just a POJO with newly introduced annotations such as @Entity and @Id, and it uses the entity manager to persist and detach Java objects. Because entities are POJOs that can be detached, separate Data Transfer Objects are often no longer needed. Persistence has its own specification, the Java Persistence API (JPA), which other O/R mapping projects such as Hibernate and TopLink support.

6. Annotations are used in EJB 3 as a replacement for XML deployment descriptors, which simplifies configuration drastically. Developers can choose to use all annotations, all XML, or a mix of annotations and XML; EJB deployment descriptors are not required in EJB 3.0.

Spring is a full-stack application framework. Its main purpose is to make J2EE easier to use, and it provides many features.

A key component of Spring is IoC (inversion of control), also called DI (dependency injection).
Spring supports AOP (aspect-oriented programming).
Spring supports declarative transaction management through the Spring AOP framework.
Spring abstracts and integrates with a number of related frameworks: it makes it easy to work with JDBC, JPA, Hibernate, Struts, EJB, JMS, email, scheduling (Quartz), testing and many others.
Spring also has its own web framework, Spring MVC, and its own security framework, Spring Security.
In short, Spring is a very important framework for J2EE enterprise applications.
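
The idea behind dependency injection can be shown without any container. The sketch below uses plain constructor injection; it contains no Spring classes, and DependencyInjectionSketch, MessageSender, EmailSender and OrderService are hypothetical names chosen for illustration.

```java
// Plain-Java sketch of constructor injection; no Spring classes are used.
class DependencyInjectionSketch {

    /** Hypothetical dependency that the service needs. */
    interface MessageSender {
        String send(String to, String body);
    }

    static class EmailSender implements MessageSender {
        public String send(String to, String body) {
            return "email to " + to + ": " + body;
        }
    }

    /** The service declares what it needs; it never constructs the sender itself. */
    static class OrderService {
        private final MessageSender sender;

        OrderService(MessageSender sender) {
            this.sender = sender;
        }

        String confirm(String customer) {
            return sender.send(customer, "order confirmed");
        }
    }

    public static void main(String[] args) {
        // The "container" role: wire the object graph in one place.
        OrderService service = new OrderService(new EmailSender());
        System.out.println(service.confirm("ann"));

        // A test can inject a stub instead of the real sender.
        OrderService stubbed = new OrderService((to, body) -> "stub:" + to);
        System.out.println(stubbed.confirm("bob"));
    }
}
```

A container such as Spring takes over the wiring done by hand in main, driven by XML or annotations, so the service code stays free of lookup and construction logic.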
AOP means aspect-oriented programming.
It complements traditional OOP: OOP is good at implementing core business logic, but not crosscutting concerns.
A crosscutting concern is functionality that spans multiple modules of an application.
Applications contain both core business concerns and crosscutting concerns.
Crosscutting concerns such as logging, validation, authentication and transactions are very common in enterprise applications.
The traditional object-oriented approach causes two problems.

1. The first is code tangling. Methods mix core logic with crosscutting concerns, which leads to poor maintainability and reusability.

2. The other is code scattering. We have to repeat the same statements and logic in multiple modules to fulfill a single requirement; if we later need to change the logic, we have to change it in every place.

AOP provides another way to implement crosscutting concerns.
Instead of the classes and interfaces of OOP, the main programming elements of AOP are aspects.
We implement the crosscutting concerns as aspects and then, through XML or annotations, weave the aspects into various objects. In this way, each crosscutting concern lives in only one place.
Spring has its own AOP implementation, and AspectJ is the most complete and popular AOP framework in the Java community.
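
As a rough illustration of weaving, the sketch below wraps a service in a JDK dynamic proxy so that logging lives in one place. It uses only java.lang.reflect.Proxy, not Spring AOP or AspectJ syntax, and the names LoggingAspectSketch, AccountService, SimpleAccountService and withLogging are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Sketch of a logging aspect applied with a JDK dynamic proxy (not Spring AOP or AspectJ).
class LoggingAspectSketch {

    interface AccountService {
        int deposit(int balance, int amount);
    }

    /** Core business logic only; no logging code is tangled into it. */
    static class SimpleAccountService implements AccountService {
        public int deposit(int balance, int amount) {
            return balance + amount;
        }
    }

    /** Wraps a target behind the given interface and logs around every call. */
    @SuppressWarnings("unchecked")
    static <T> T withLogging(T target, Class<T> type, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("before " + method.getName());
            Object result = method.invoke(target, args);
            log.add("after " + method.getName());
            return result;
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        AccountService service = withLogging(new SimpleAccountService(), AccountService.class, log);
        System.out.println(service.deposit(100, 50)); // prints 150
        System.out.println(log); // prints [before deposit, after deposit]
    }
}
```

The same withLogging call can wrap any other interface-based service, which is the point: the logging statements are written once instead of being scattered through every module.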
Hibernate is an object/relational mapping project from JBoss. Hibernate maps a database table to a Java class, and a table's columns to the class's fields.
Hibernate lets us develop persistent classes following object-oriented idioms, including association, inheritance, polymorphism, composition and collections.
Hibernate allows us to use its portable HQL to express queries in object-oriented terms, using classes and properties of classes. With HQL, we no longer need to write different SQL for different databases. Hibernate also supports native SQL.
Hibernate supports both annotation and XML configuration.
Hibernate also supports the JPA specification and allows us to write queries with the standardized JPA query language.


Dojo is mainly divided into three projects: Dojo, Dijit, and DojoX.

The Dojo module provides many basic and core functions, such as language and browser utilities, DOM utilities, string utilities, asynchronous requests (dojo.xhrGet, dojo.xhrPost), JSON support, drag-and-drop (dojo.dnd), animation and special effects.

Dojo also normalizes browser differences, allowing the same source code to work in several browsers, and fixes several gross browser errors such as memory leaks in IE's event system.

Dijit is Dojo's widget framework. It packs a large library of widgets, such as form widgets, layout widgets (for example tab and split containers), Tree, Calendar, TimePicker, Dialog, Slider, Progress Bar, AutoComplete etc.

We can create Dijit widgets declaratively or programmatically.

Declarative widgets use nonstandard HTML attributes such as dojoType=.

We can also create the same widgets through JavaScript; these are called programmatic widgets.

DojoX is a collection of subprojects for Dojo extensions and experimental code, such as Grid, Charting, highlight, FishEye, ColorPicker etc.


Dojo also provides several tools like ShrinkSafe, Checkstyle.

Dojo's loader, packaging and build system are very powerful.

They let Dojo, a large project, be split into any number of independent source files yet be packaged into only a few highly compressed files.

dojo.require maps a module name to a URL, downloads, and then evals that resource.


Main API:


Asynchronous Request



dojo.connect(myButton, "click", myFunction)

Selenium is a portable software testing framework for web applications. Selenium provides a record/playback tool for authoring tests without learning a test scripting language. 

Selenium IDE is a Firefox extension, and allows recording, editing, and debugging tests.

Selenium Remote Control (RC) makes it possible to write automated tests for a web application in any programming language, including Java, .NET, Python, Ruby, Perl, and PHP, and allows for better integration of Selenium in existing unit test frameworks.

Interview Question - Iterator vs Enumeration

Iterator vs Enumeration
Enumeration is a legacy interface; it is from Java version 1.0. In Java version 1.2, the Collections framework was added, which essentially replaced the old collection classes with new ones (for example: Vector -> ArrayList, Hashtable -> HashMap, Enumeration -> Iterator etc.).

Enumeration can be applied to Vector and Hashtable. Iterator can be used with most Collection classes.
Interface difference
Enumeration is the old Interface for legacy classes like Hashtable and Vector.
Enumeration gives a static read-only view; you can only "read" the contents of the collection.
It contains 2 methods namely hasMoreElements() & nextElement().

Iterator is for the newer classes; it works with most Collection objects such as HashSet, and with the key, value and entry views of a HashMap.
Iterator contains three methods, namely hasNext(), next() and remove().
Using the remove() method, we can safely delete an element while iterating over a collection. The Enumeration interface does not support this: it provides no safe way to remove elements from a collection while traversing it.
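
A short example of removing elements safely during traversal. The class and method names (IteratorRemoveSketch, removeEvens) are made up for this illustration; Iterator.remove() itself is the standard java.util API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveSketch {

    /** Removes every even number in place while iterating. */
    static List<Integer> removeEvens(List<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // removes the element last returned by next()
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(removeEvens(numbers)); // prints [1, 3, 5]
    }
}
```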

Iterator's sub-interface ListIterator also provides previous() and hasPrevious() for backward traversal, and add() and set() to insert and modify elements.
Because such an Iterator needs to check for modifications, performance decreases slightly, but normally that is negligible.
Iterators of the standard java.util collections are fail-fast rather than fail-safe. If the collection is structurally modified after the iterator is created, by another thread or by the same thread, in any way other than through the iterator's own remove() method, the iterator throws a ConcurrentModificationException. Such iterators are known as fail-fast iterators because they fail quickly and cleanly. The check is made on a best-effort basis, so it should be used to detect bugs and not relied on for correctness.
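
The ConcurrentModificationException can be reproduced in a single thread, which shows that the check is about modification behind the iterator's back rather than about threads as such. FailFastSketch and modifyWhileIterating are hypothetical names for this small demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastSketch {

    /** Returns true if modifying the list inside a for-each loop trips the fail-fast check. */
    static boolean modifyWhileIterating() {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String name : names) {
                if (name.equals("a")) {
                    names.remove(name); // structural change that bypasses the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // thrown by the next call to the hidden iterator's next()
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating()); // prints true
    }
}
```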

If possible, we should always use Iterator.


Interview Questions – Databases

Why Normalize?
1. Reduce Redundancy
I. Redundancy would waste more space.
One obvious drawback of data repetition is that it consumes more space and resources than is necessary.
II. Redundancy may cause inconsistency.
Moreover, redundancy introduces the possibility of error and exposes us to inconsistencies whenever the data is not maintained in the same way, for example when redundant copies are not updated at the same time.
2. Prevent insert, delete and update anomalies.
Insert Anomaly
The insert anomaly refers to a situation wherein one cannot insert a new row into a relation because of an artificial dependency on another relation.
Delete Anomaly
It refers to a situation wherein a deletion of data about one particular entity causes unintended loss of data that characterizes another entity.
Update Anomaly
The update anomaly refers to a situation where an update of a single data value requires multiple rows of data to be updated.
Example: Product, Customer, Invoice.
First Normal Form: Eliminating Repeating Data
A relation is said to be in first normal form when it contains no multivalued attributes.
To transform unnormalized relations into first normal form, we must move multivalued attributes and repeating groups to new relations.
Second Normal Form: Eliminating Partial Dependencies
A relation is said to be in second normal form if it meets both the following criteria:
The relation is in first normal form.
All non-key attributes are functionally dependent on the entire primary key.
Second normal form only applies to relations that have concatenated (composite) primary keys.
Once we find a second normal form violation, the solution is to move the attribute(s) that are partially dependent to a new relation where they depend on the entire key instead of part of the key.
Attribute B is functionally dependent on attribute A if, at any moment in time, there is no more than one value of attribute B associated with a given value of attribute A.
In other words, A determines B, or A is a determinant (unique identifier) of B.
In the INVOICE relation, we can easily see that Customer Number is functionally dependent on Invoice Number, because at any point in time there can be only one value of Customer Number associated with a given value of Invoice Number.
INVOICE LINE ITEM: # Invoice Number, # Product Number, Product Description (dependent only on Product Number), Quantity, Unit Price, Extended Amount
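
The definition of functional dependency given above can be tested against sample data. The sketch below is only a check on one snapshot of rows (a real dependency is a rule of the schema that must hold at all times), and the class name, the method determines and the invoice values are all made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FunctionalDependencySketch {

    /**
     * Returns true if column b is functionally dependent on column a in the given rows,
     * i.e. no value of a is paired with two different values of b.
     */
    static boolean determines(List<String[]> rows, int a, int b) {
        Map<String, String> seen = new HashMap<>();
        for (String[] row : rows) {
            String previous = seen.putIfAbsent(row[a], row[b]);
            if (previous != null && !previous.equals(row[b])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows: invoice number, customer number.
        List<String[]> invoices = Arrays.asList(
            new String[] {"INV-1", "C-10"},
            new String[] {"INV-2", "C-10"},
            new String[] {"INV-3", "C-20"});
        System.out.println(determines(invoices, 0, 1)); // invoice -> customer: true
        System.out.println(determines(invoices, 1, 0)); // customer -> invoice: false
    }
}
```

Invoice Number determines Customer Number in this data, but not the other way round, because customer C-10 appears on two invoices.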
Third Normal Form: Eliminating Transitive Dependencies
A relation is said to be in third normal form if it meets both the following criteria:
The relation is in second normal form.
There is no transitive dependence (that is, all the non-key attributes depend only on the primary key).
To transform a second normal form relation into third normal form, simply move any transitively dependent attributes to relations where they depend only on the primary key. Be careful to leave the attribute on which they depend in the original relation as a foreign key. You will need it to reconstruct the original user view via a join.
An attribute that depends on another attribute that is not the primary key of the relation is said to be transitively dependent.
INVOICE: # Invoice Number, Customer Number, Customer Name (dependent on Customer Number, which is not the key of INVOICE), Customer Address..., Terms, Ship Via, Order Date, Total Order Amount
This is the third normal form violation: a non-key attribute determining another non-key attribute.
Boyce-Codd Normal Form (BCNF)
Boyce-Codd normal form (BCNF) is a stronger version of third normal form. It addresses anomalies that occur when a non-key attribute is a determinant of an attribute that is part of the primary key (that is, when an attribute that is part of the primary key is functionally dependent on a non-key attribute).
The Boyce-Codd normal form has two requirements:
The relation must be in third normal form.
No determinants exist that are not either the primary key or a candidate key for the table. That is, a non-key attribute may not uniquely identify (determine) any other attribute, including one that participates in the primary key.
The solution is to split the unwanted determinant to a different table.
#Customer Number, #Product Line, Support Specialist Number
Fourth Normal Form
An additional anomaly surfaces when two or more multivalued attributes are included in the same relation.
Normalization leads to more relations, which translates to more tables and more joins. When database users suffer performance problems that cannot be resolved by other means, then denormalization may be required. Most database experts consider denormalization a last resort, if not an act of desperation.

Possible denormalization steps include the following:
Recombining relations that were split to satisfy normalization rules
Storing redundant data in tables
Storing summarized data in tables
What Is a Transaction?
A transaction is a unit of work that is composed of a discrete series of actions that must be either completely processed or not processed at all.
Transactions have the ACID properties (Atomicity, Consistency, Isolation, Durability).
Atomicity: a transaction must remain whole. That is, it must completely succeed or completely fail. When it succeeds, all changes that were made by the transaction must be preserved by the system. Should a transaction fail, all changes that were made by it must be completely undone.
Consistency: a transaction should transform the database from one consistent state to another consistent state.
Isolation: each transaction should carry out its work independently of any other transaction that might occur at the same time.
Durability: changes made by completed transactions should remain permanent.
Read Committed Isolation Level
Read Committed is the default isolation level in PostgreSQL. When a transaction runs on this isolation level, a SELECT query sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions.
In effect, a SELECT query sees a snapshot of the database as of the instant that that query begins to run. Notice that two successive SELECT commands can see different data, even though they are within a single transaction, if other transactions commit changes during execution of the first SELECT.
The level Serializable provides the strictest transaction isolation. This level emulates serial transaction execution, as if transactions had been executed one after another, serially, rather than concurrently.
In REPEATABLE READ, a transaction does not see changes that other transactions commit after it has started reading.
set session transaction isolation level serializable;
Client 1:
mysql> start transaction;
mysql> select * from tt;
mysql> update tt set d=2 where d=4;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
This time, client 2 cannot update the row. After a few seconds of waiting, the attempt to update a row ends with the error message “Lock wait timeout exceeded”.
In MySQL (InnoDB), neither REPEATABLE READ nor SERIALIZABLE allows dirty reads or non-repeatable reads, and consistent reads at both levels do not see phantom rows. Note that the SQL standard itself permits phantom reads at REPEATABLE READ.
REPEATABLE READ, however, still allows another client to modify the data while the current transaction is running; the change made by the concurrent transaction is simply invisible to the current transaction.
SERIALIZABLE strictly prevents other transactions from changing data that the current transaction is operating on.
If the table has no primary key, the query select * from tt where d between 1 and 4 would lock all rows.
Serializable vs. Snapshot
SERIALIZABLE and SNAPSHOT are two isolation levels that both avoid dirty, non-repeatable and phantom reads, but they do so using different methods.
SERIALIZABLE transactions take range locks in order to prevent phantom reads.
SERIALIZABLE uses locks, whereas SNAPSHOT works on a copy of the committed data. Since no locks are taken, subsequent changes made by concurrent transactions are allowed and not blocked.
If you try to change some data that has already been changed by a concurrent transaction, you get an update conflict error.
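
The update conflict can be pictured with a simple version check. This is a deliberately simplified, in-memory sketch of the idea (commit only if the data is unchanged since it was read), not how any particular database implements snapshot isolation; SnapshotConflictSketch, Row and commit are hypothetical names.

```java
// Simplified illustration of an update conflict; real databases keep row versions internally.
class SnapshotConflictSketch {

    /** A single row with a version counter that increases on every committed change. */
    static class Row {
        private int value;
        private int version;

        Row(int value) { this.value = value; }

        synchronized int value() { return value; }
        synchronized int version() { return version; }

        /** Commits only if nobody changed the row since the given version was read. */
        synchronized boolean commit(int readVersion, int newValue) {
            if (version != readVersion) {
                return false; // update conflict: the snapshot is stale
            }
            value = newValue;
            version++;
            return true;
        }
    }

    public static void main(String[] args) {
        Row row = new Row(4);
        int snapshotA = row.version(); // transaction A takes its snapshot
        int snapshotB = row.version(); // transaction B takes its snapshot
        System.out.println(row.commit(snapshotB, 2)); // B commits first: prints true
        System.out.println(row.commit(snapshotA, 7)); // A is rejected: prints false
        System.out.println(row.value()); // prints 2
    }
}
```

Neither reader blocks the other, which matches the description above; the cost is that the later writer has to handle the conflict, typically by retrying.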
Non-unique indexes
A non-unique index is one in which any key value may occur multiple times. This type of index is defined with the keyword INDEX or KEY.
ALTER TABLE department ADD INDEX dept_name_idx (name);
Unique indexes
Unique indexes help maintain data integrity by ensuring that no two rows of data in a table have identical key values. The exception is that NULL values may occur multiple times.
ALTER TABLE department ADD UNIQUE dept_name_idx (name);
A PRIMARY KEY also is a unique-valued index. It is similar to a UNIQUE index, but has additional restrictions:
A table may have multiple UNIQUE indexes, but at most one PRIMARY KEY.
A UNIQUE index can contain NULL values, whereas a PRIMARY KEY cannot.
A FULLTEXT index is specially designed for text searching.
SHOW INDEX FROM department \G
Index implementation
B-tree indexes
Branch nodes are used for navigating the tree, while leaf nodes hold the actual values and location information.
Bitmap indexes
Although B-tree indexes are great at handling columns that contain many different values, they can become unwieldy when built on a column that allows only a small number of values.
For columns that contain only a small number of values across a large number of rows (known as low-cardinality data), Oracle Database includes bitmap indexes, which generate a bitmap for each value stored in the column.
CREATE BITMAP INDEX acc_prod_idx ON account (product_cd);
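
The idea of one bitmap per distinct value can be sketched with java.util.BitSet. This only illustrates the concept and says nothing about Oracle's actual storage format; BitmapIndexSketch, rowsMatchingEither and the sample product codes are hypothetical.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class BitmapIndexSketch {

    private final Map<String, BitSet> bitmaps = new LinkedHashMap<>();

    /** Builds one bitmap per distinct value; bit i is set when row i holds that value. */
    BitmapIndexSketch(String[] column) {
        for (int row = 0; row < column.length; row++) {
            bitmaps.computeIfAbsent(column[row], v -> new BitSet()).set(row);
        }
    }

    /** Row numbers whose value equals either argument, found by OR-ing two bitmaps. */
    BitSet rowsMatchingEither(String first, String second) {
        BitSet result = new BitSet();
        result.or(bitmaps.getOrDefault(first, new BitSet()));
        result.or(bitmaps.getOrDefault(second, new BitSet()));
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical low-cardinality column: a product code per account row.
        String[] productCd = {"CHK", "SAV", "CHK", "MM", "CD", "CHK", "SAV"};
        BitmapIndexSketch index = new BitmapIndexSketch(productCd);
        System.out.println(index.rowsMatchingEither("SAV", "MM")); // prints {1, 3, 6}
    }
}
```

With only a handful of distinct values, a few compact bitmaps cover the whole table, and combining conditions becomes a cheap bitwise operation.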
Clustered Indexes
A clustered index determines the physical storage order of the data in a table.
A table can only have a single clustered index.
The leaf nodes of a clustered index contain the data pages of the underlying table.
By default, a primary key creates a clustered index.
A query on the clustered index is fast, as it requires only one lookup.
An insert is slower, as the new row must be placed in exactly the right position in the clustered index.
Non-clustered Index
Non-clustered indexes have the same B-tree structure as clustered indexes, except:
The data rows of the underlying table are not sorted and stored in order based on their non-clustered keys.
A table can contain multiple non-clustered indexes (the maximum depends on the database server and version).
The leaf nodes of a non-clustered index do not consist of the data pages; they contain only index pages:
If a clustered index is present, the non-clustered index points back to the data pages in the clustered index.
If there is no clustered index, the non-clustered index points to the actual data in the table.
The logical order of the index does not match the physical stored order of the rows on disk.
A unique key by default creates a non-clustered index.
A non-clustered index has its own storage space, separate from the table storage.
This means that non-clustered indexes require more disk space than clustered indexes.
A query on a non-clustered index requires two lookups: first a lookup in the non-clustered index itself, then a lookup by primary key in the clustered index.
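
The one-lookup versus two-lookup difference can be modelled with two sorted maps. This is a toy model under simplifying assumptions (names are unique, everything is in memory), and IndexLookupSketch with its fields and methods is hypothetical.

```java
import java.util.TreeMap;

class IndexLookupSketch {

    // "Clustered index": rows are kept in primary-key order, and the leaf is the row itself.
    private final TreeMap<Integer, String[]> rowsByDeptId = new TreeMap<>();
    // "Non-clustered index": ordered by name, and the leaf holds only the primary key.
    private final TreeMap<String, Integer> deptIdByName = new TreeMap<>();

    void insert(int deptId, String name, String location) {
        rowsByDeptId.put(deptId, new String[] {name, location});
        deptIdByName.put(name, deptId); // assumes names are unique, for simplicity
    }

    /** One lookup: the clustered index leads straight to the row. */
    String[] findById(int deptId) {
        return rowsByDeptId.get(deptId);
    }

    /** Two lookups: the secondary index first, then the clustered index. */
    String[] findByName(String name) {
        Integer deptId = deptIdByName.get(name);
        return deptId == null ? null : rowsByDeptId.get(deptId);
    }

    public static void main(String[] args) {
        IndexLookupSketch table = new IndexLookupSketch();
        table.insert(2, "Loans", "Boston");
        table.insert(1, "Operations", "Salem");
        System.out.println(table.findById(1)[1]); // prints Salem
        System.out.println(table.findByName("Loans")[1]); // prints Boston
    }
}
```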
Advantages and disadvantages of indexes
The optimizer chooses an index scan if the index columns are referenced in the SELECT statement and if the optimizer estimates that an index scan will be faster than a table scan.
Advantages of Index
Query optimization: indexes make search queries much faster, because index files are generally smaller than the table and take less time to read, particularly as tables grow larger. In addition, the entire index may not need to be scanned. The predicates that are applied to the index reduce the number of rows to be read from the data pages.
Uniqueness: Indexes like primary key index and unique index help to avoid duplicate row data.
Disadvantages of indexes
Every time a row is added, removed or updated, all indexes on that table must be modified. Therefore, the more indexes you have the more work the server needs to do to keep all schema objects up-to-date, which tends to slow things down.
Each index requires storage or disk space. The exact amount depends on the size of the table and the size and number of columns in the index.
Each index potentially adds an alternative access path for a query for the optimizer to consider, which increases the compilation time.
The best strategy is to add an index when a clear need arises. If you need an index for only special purposes, such as a monthly maintenance routine, you can always add the index, run the routine, and then drop the index until you need it again.
A constraint is a rule that end users cannot violate.
There are five types of constraints:
Not null constraint: does not allow NULL values.
Unique constraint: does not allow duplicate values but allows NULL values.
Primary key constraint: does not allow duplicate or NULL values. A table can have only one primary key.
Foreign key constraint: refers to the primary key column of another table; duplicate values are allowed.
Check constraint: restricts the allowable values for a column.
CONSTRAINT fk_product_type_cd FOREIGN KEY (product_type_cd)
  REFERENCES product_type (product_type_cd),
CONSTRAINT pk_product PRIMARY KEY (product_cd)

ADD CONSTRAINT pk_product PRIMARY KEY (product_cd);
ADD CONSTRAINT fk_product_type_cd FOREIGN KEY (product_type_cd)
  REFERENCES product_type (product_type_cd);
Constraints and Indexes
Constraint creation sometimes involves the automatic generation of an index. However, database servers behave differently regarding the relationship between constraints and indexes.
Cascading Constraints
ADD CONSTRAINT fk_product_type_cd FOREIGN KEY (product_type_cd)
  REFERENCES product_type (product_type_cd)
A view is a stored query that is accessible as a virtual table composed of the query's result set. Unlike ordinary tables (base tables), a view does not form part of the physical schema: it is a dynamic, virtual table computed or collated from data in the database.
Advantages of View
Views can provide advantages over tables:  
* Views can be used to make complex queries easy. A user can use a simple query on a view to display data from multiple tables without having the knowledge of how to join tables in queries.
* Views can simplify statements for users: we can define frequently used joins, projections and selections as views so that users do not have to specify all the conditions and qualifications each time an operation is performed on that data.
* Views let users focus on the data that interests them and on the tasks for which they are responsible. Data that is not of interest to a user can be left out of the view.
* Different views can be created from the same data according to requirements, so we can display different data to different users.
* Views provide an additional level of security: through a view, users can query and modify only the data they can see. The rest of the database is neither visible nor accessible, so we can hide sensitive data from certain groups of users.
* Views take very little space to store; the database contains only the definition of a view, not a copy of all the data it presents.
Disadvantages of Views
* Views can affect performance: querying through a view may take more time than querying the table directly.
* A view depends on its underlying tables; when a table is dropped, the view becomes invalid.
* Rows available through a view are not sorted; in some databases we cannot use ORDER BY when defining a view.
Read-only vs. updatable views
We can define views as read-only or updatable.
Oracle and DB2 support materialized views: pre-executed, non-virtual views commonly used in data warehousing.
In a materialized view, the query result is cached as a concrete table that may be updated from the original base tables from time to time. This enables much more efficient access, at the cost of some data being potentially out-of-date. The accuracy of a materialized view depends on the frequency or trigger mechanisms behind its updates.
It is most useful in data warehousing scenarios, where frequent queries of the actual base tables can be extremely expensive.
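
The trade-off between fast reads and possibly stale data can be sketched in a few lines. This is an in-memory analogy only, with hypothetical names (MaterializedViewSketch, addOrder, refresh, total); it does not reflect how Oracle or DB2 refresh materialized views.

```java
import java.util.ArrayList;
import java.util.List;

class MaterializedViewSketch {

    private final List<Integer> orderAmounts = new ArrayList<>(); // the "base table"
    private int cachedTotal; // stored result of the view's query

    void addOrder(int amount) {
        orderAmounts.add(amount);
    }

    /** Re-runs the query and stores the result, like a scheduled or triggered refresh. */
    void refresh() {
        int total = 0;
        for (int amount : orderAmounts) {
            total += amount;
        }
        cachedTotal = total;
    }

    /** Cheap read that may be stale until the next refresh. */
    int total() {
        return cachedTotal;
    }

    public static void main(String[] args) {
        MaterializedViewSketch view = new MaterializedViewSketch();
        view.addOrder(10);
        view.addOrder(20);
        view.refresh();
        view.addOrder(5);
        System.out.println(view.total()); // prints 30 (stale: the last order is not included)
        view.refresh();
        System.out.println(view.total()); // prints 35
    }
}
```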
Stored Procedure
A stored procedure is a precompiled subroutine stored in the database. Stored procedures are used to consolidate and centralize logic that was originally implemented in applications. Large or complex processing that might require the execution of several SQL statements can be moved into stored procedures, and applications then simply call the procedures.
Benefits of using stored procedures
Reduced network usage between clients and servers
The stored procedure performs processing on the database server, without transmitting unnecessary data across the network. Only the records that are actually required by the client application are transmitted. Using a stored procedure can result in reduced network usage and better overall performance.
The more SQL statements that you group together in a stored procedure, the more you reduce network usage and the time that database locks are held.
Stored procedures are tunable. By having procedures that handle the database work for your interface, you eliminate the need to modify application source code to improve a query's performance. Changes can be made to the stored procedures that are transparent to the front-end interface.
Stored procedures are usually written by database developers/administrators. Persons holding these roles are usually more experienced in writing efficient queries and SQL statements. If you have your people performing the tasks to which they are best suited, then you will ultimately produce a better overall application.
Improved security: by including database privileges with stored procedures that use static SQL, the database administrator (DBA) can improve security, because access to sensitive data can be controlled by selecting the privileges a program has when it executes.
Centralized security, administration and maintenance for common routines: by managing shared logic in one place on the server, you can simplify security, administration and maintenance.
A drawback is that, although we encapsulate business logic in stored procedures, they are tightly coupled to the specific database, so it is hard to switch from one database vendor to another.

A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. Triggers are commonly used to prevent changes, log changes, audit changes, enhance changes, enforce or execute business rules.
Advantages of Triggers
The main advantage of a trigger is that it is automatic: whenever the table is affected by an insert, update or delete, the trigger is called implicitly.
Disadvantages of Triggers
Triggers run on every matching change made to the table, so they add load to the database and can cause the system to run slower.
Triggers are hard to track and debug.
Viewing a trigger is difficult compared to tables, views and stored procedures.
Trigger execution is invisible; it is easy to forget about triggers.
Stored procedures and triggers serve two different purposes. A procedure executes only when called. A trigger is fired by a system event and allows us to intercede on insert, update and delete.
We can write a trigger that calls a procedure. We can't really write a procedure that calls a trigger.

Add 3rd Jars to Maven2 Build Path without Installing Them

Sometimes we want to add a third-party jar to our project, but it is not available in a public Maven repository, and we don't want to bother creating a local or intranet repository and installing the jar into it.


For example, we want to add javaparser to our application. It is a library that parses Java source and extracts methods, fields, Javadoc and comments.

We can define its scope as system, and specify its path.








Where ${basedir} is pointing to your project's root.


But this has some limitations. For example, when the Maven assembly plugin is used to generate assemblies, jars with scope "system" are not included.


The solution is to put the dependency in a "file system repository" local to the project: use install-file to install the jar into the default local repository (~/.m2/repository) first, then move the resulting directory tree to ${basedir}/my-repo.

mvn install:install-file  -Dfile=%LIBPATH%\javaparser1.0.8.jar -DgroupId=japa.parser  -DartifactId=javaparser -Dversion=1.0.8 -Dpackaging=jar  -DgeneratePom=true 

I would declare that repository in my pom.xml like this:
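
A sketch of what that declaration might look like, assuming the directory tree was moved to ${basedir}/my-repo as described above. The repository id is an arbitrary, hypothetical name. The dependency is then declared normally, without the system scope.

```xml
<repositories>
  <repository>
    <!-- hypothetical id; any unique name works -->
    <id>my-local-repo</id>
    <url>file://${basedir}/my-repo</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>japa.parser</groupId>
    <artifactId>javaparser</artifactId>
    <version>1.0.8</version>
  </dependency>
</dependencies>
```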















Maven 2 assembly with dependencies: jar under scope “system” not included.

Can I add jars to maven 2 build classpath without installing them?

Defect – /var is full

Log is Deleted When Application Is Running

It was reported that the /var file system on our production machine is full.
This is odd, because the application has a mechanism to trim its log once it exceeds a specified maximum size.

On checking, the application logs are missing; perhaps somebody deleted them for an unknown reason. 'lsof -p $app_pid' shows the application log files in the deleted state, and their size is extremely large.

So I wrote a small test to see what happens if the log file is deleted while the application is running.

The test shows that the OutputStream is unaware that the underlying file has been removed and keeps writing to the already-deleted file. No exception is thrown (when a stream other than PrintStream is used), and no error flag is set (when PrintStream is used, checkError() stays false). Nothing is recorded in any log file that is still visible in the directory!

After the log file is removed, file.length() always returns 0, so the trim never happens and the deleted file keeps growing. Because the Java process still holds an open file handle to the deleted file, Linux cannot free the disk space it occupies.
In the end, /var fills up.
$lsof -p $app_pid | grep $log_dir
java    22533 $USER    6w   REG    8,5      819 1153292 $log_name (deleted)
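
The behaviour described above can be reproduced in isolation with a short, self-contained sketch. The class name DeletedLogDemo and the temp-file name are made up for illustration and are not part of the application. It assumes a Linux-like file system, where a file can be unlinked while a stream is still open; on Windows the delete may simply fail.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

public class DeletedLogDemo {

    /** What was observed after deleting the file under an open stream. */
    static final class Result {
        final boolean deleted;     // did File.delete() succeed?
        final boolean existsAfter; // does the path still exist afterwards?
        final long lengthAfter;    // File.length() after writing to the open stream
        final boolean errorFlag;   // PrintStream.checkError() after that write

        Result(boolean deleted, boolean existsAfter, long lengthAfter, boolean errorFlag) {
            this.deleted = deleted;
            this.existsAfter = existsAfter;
            this.lengthAfter = lengthAfter;
            this.errorFlag = errorFlag;
        }
    }

    static Result run() throws IOException {
        // Hypothetical temp file standing in for the application log.
        File file = File.createTempFile("deleted-log-demo", ".log");
        try (PrintStream ps = new PrintStream(new FileOutputStream(file, true))) {
            ps.println("written before delete");
            ps.flush();

            // Remove the file while the stream is still open.
            boolean deleted = file.delete();

            // Keep writing through the old stream, as the application did.
            ps.println("written after delete");
            boolean errorFlag = ps.checkError(); // checkError() also flushes the stream

            return new Result(deleted, file.exists(), file.length(), errorFlag);
        } finally {
            file.delete(); // clean up in case the delete above did not succeed
        }
    }

    public static void main(String[] args) throws IOException {
        Result r = run();
        System.out.println("deleted=" + r.deleted + " exists=" + r.existsAfter
                + " length=" + r.lengthAfter + " checkError=" + r.errorFlag);
    }
}
```

A size check based on file.length() therefore never sees the data written after the delete, which is why the trim logic stops firing.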

The fix is simple: before writing to the log file, check whether it still exists. If it does not, close the old stream, then recreate the file and the stream.
package logger;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

public class LoggerTest {
    String fileName;
    File file;
    PrintStream ps;
    private static final long maxFileSize = 1024 * 1024;
    private long fileSize;

    public LoggerTest(String fileName) throws IOException {
       this.fileName = fileName;
       configure();
    }

    public void writeForEver(String str) throws IOException,
           InterruptedException {
       while (true) {
           // after file is removed, file.length() would always be 0, so no trim would happen
           // file size would increase continuously
           fileSize = file.length();
           if (fileSize > maxFileSize) {
              trim(true);
           }

           // ps.checkError would not report error, even when the file is deleted
           if (ps.checkError()) {
              throw new IOException("ps.checkError() " + ps.checkError());
           }

           // the fix: if the log file is gone, close the old stream and recreate file and stream
           if (!file.exists()) {
              ps.close();
              configure();
              ps.println(file.getName() + " is deleted, recreate");
           }
           ps.println(str);
           // verified after file is removed, file.length() would always be 0.
           System.out.println("fileSize " + fileSize + " " + str);
           // pause between writes; the interval is arbitrary, chosen for illustration
           Thread.sleep(1000);
       }
    }

    private void trim(boolean b) {
       // trim the log to specified size (body and meaning of b omitted in the original post)
    }

    private void configure() throws IOException {
       file = new File(fileName);
       FileOutputStream fos = new FileOutputStream(file, true);
       ps = new PrintStream(fos);
    }

    public static void main(String[] args) throws IOException,
           InterruptedException {
       new LoggerTest("log.txt").writeForEver("Hello World!");
    }
}
