Handling Java Web Application Input, Part 1
Inadequate data validation is the most common cause of security exploits suffered by web applications today. A staggering number of applications are exploited through weak validation, largely because such attacks are so simple to carry out. No longer do attackers have to spend vast amounts of time researching ways to circumvent the security infrastructure of an application. An attacker can use freely available tools to scan for vulnerable websites. Using these findings, an attacker can use a web browser to ghost straight through firewall rule sets on port 80, altering the intended behavior of an application. This has never been more true than today. A multitude of technologies and frameworks is available, and engineers under increasing pressure to complete work on time rely heavily on such tools. However, such technology may not handle user input adequately in all cases, and as a result may introduce unintentional security vulnerabilities. Therefore, it is of paramount importance that secure coding practices are in place to close any possible doorway that permits such nefarious attacks.
The purpose of this series of articles is to explain common security vulnerabilities associated with application input. This series emphasizes the importance of handling application input correctly. Although the topics covered are nothing new, they are critical to ensuring the security of an application. This series is aimed at practitioners who plan, design, implement, and maintain software systems and who may be unaware of such issues. In this article, part one in the series, we will look at some validation best practices, along with SQL injection attacks. In later articles, we will look at other common attacks, and in particular, part two will deal with cross-site scripting attacks and error-handling techniques.
The most common web application exploits are the result of not validating input effectively. Web applications provide two means for validating user input: client-side validation and server-side validation. Client-side validation is useful in that it enhances the user experience by improving the responsiveness and usability of an application. It is normally implemented in HTML-based user interfaces through a combination of JavaScript and HTML element attributes. However, client-side validation is easily circumvented, is less sophisticated than server-side validation and, if not used in conjunction with server-side validation, can introduce a number of serious vulnerabilities into an application.
For example, consider an online e-commerce application that allows a customer to purchase a number of books. The customer goes through the standard process of checking out: entering payment and delivery details. At each step through this process, client-side validation is performed, and the state at each step is stored in HTML hidden fields. Finally, the user is presented with a means to confirm the transaction. On confirmation, the order is wrapped in an HTTP request and sent across the wire to an unsuspecting server. On receipt, the server performs no validation, but simply accepts the data with no questions asked and goes about its business.
Now, if you haven't already spotted the vulnerability, the threat is quite serious and easy to exploit. Before submitting the order, the attacker can view the HTML source, change the price stored unencrypted in HTML hidden field(s), and remove any JavaScript client-side checks by disabling scripting through the browser or deleting the script and saving the modified version of the checkout page locally. The attacker can then load the newly crafted checkout page in a web browser of their choice and complete the checkout process, by submitting the order to the server for processing.
The above example is very simplistic, and you are probably thinking that never in a million years would you allow something similar to happen in your own application. The point here is that data input into an application crosses a trust boundary; you have no control over what a user inputs, whether well-meaning or otherwise. By accepting that data unchecked, you are making an assumption about the integrity and skill of an end user, which is a sure recipe for disaster. For example, consider a simple message board application. Even a well-meaning user can alter page formatting and possibly deface a web page by passing in content containing HTML markup.
As a result, never gamble: always identify where data flows into your application from an external source, and analyze each such point carefully. If your findings reveal input being used to generate content, carefully consider how its usage could damage your application, and be as stringent and prudent as possible by employing server-side validation.
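To make the bookstore scenario concrete, here is a minimal sketch of a server-side check that ignores any price submitted by the client. The class name OrderPricing, the book IDs, the prices, and the quantity limit are all hypothetical and chosen only for illustration; a real application would load prices from its own data store. It assumes Java 9 or later for Map.of.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical server-side pricing check; names and values are illustrative only.
final class OrderPricing {
    // Authoritative prices held on the server (in practice, read from the database).
    private static final Map<String, BigDecimal> CATALOG = Map.of(
            "book-101", new BigDecimal("29.99"),
            "book-202", new BigDecimal("45.00"));

    /**
     * Computes the order total from server-side prices only.
     * A price sent by the client (for example, in a hidden field) is never consulted.
     */
    static BigDecimal total(Map<String, Integer> quantitiesByBookId) {
        BigDecimal total = BigDecimal.ZERO;
        for (Map.Entry<String, Integer> line : quantitiesByBookId.entrySet()) {
            BigDecimal price = CATALOG.get(line.getKey());
            if (price == null) {
                throw new IllegalArgumentException("Unknown book ID");
            }
            Integer quantity = line.getValue();
            // Assumed business rule: between 1 and 100 copies per order line.
            if (quantity == null || quantity < 1 || quantity > 100) {
                throw new IllegalArgumentException("Quantity out of range");
            }
            total = total.add(price.multiply(BigDecimal.valueOf(quantity)));
        }
        return total;
    }
}
```

The design choice is that the request carries only identifiers and quantities. Everything that affects what the customer is charged is looked up again on the server, so tampering with the hidden fields has no effect on the total.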
A critical validation practice is to always test for valid data rather than invalid data. The reasoning here is that there will most likely be a case you simply cannot foresee. For example, consider a simple file upload application, which for security reasons rejects all files with an .exe extension. If validation is coded to reject only files with this extension, you are opening up the application to new exploits through other types of malicious files. Alternatively, by explicitly checking for files with extensions considered safe and rejecting everything else, you are putting a more secure policy in place. Moreover, by checking for validity, you reduce the potential for exploits through data masquerading. Data masquerading is the process by which insecure data is represented in a way that makes it look secure. This can be achieved by using features of the host platform, such as referring to a file regarded as insecure by a different file name representation (for example, the MS-DOS short name on a Windows system), or by encoding the data using a different character set.
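The file upload policy described above could be sketched roughly as follows. The class name UploadPolicy, the set of permitted extensions, and the length limit are assumptions made for illustration, not a recommendation for any particular application. It assumes Java 9 or later for Set.of.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical allow-list check; the permitted extensions are an assumption.
final class UploadPolicy {
    private static final Set<String> ALLOWED_EXTENSIONS = Set.of("png", "jpg", "gif", "pdf");

    /** Returns true only for a plain file name whose single extension is explicitly allowed. */
    static boolean isAllowed(String fileName) {
        if (fileName == null || fileName.isEmpty() || fileName.length() > 255) {
            return false;
        }
        // Accept only a conservative shape: one base name, one dot, one extension.
        // This rejects path separators, the '~' used in MS-DOS short names,
        // percent-encoded sequences, and double extensions.
        if (!fileName.matches("[A-Za-z0-9_-]+\\.[A-Za-z0-9]+")) {
            return false;
        }
        String extension = fileName.substring(fileName.lastIndexOf('.') + 1);
        return ALLOWED_EXTENSIONS.contains(extension.toLowerCase(Locale.ROOT));
    }
}
```

With these assumptions, UploadPolicy.isAllowed("report.pdf") is accepted, while "setup.exe", "SETUP~1.EXE", and "invoice.pdf.exe" are all rejected. None of them is rejected because it appears on a list of bad names. Each is rejected because it is not on the list of good ones.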
Now, with the above in mind, let's look at a common exploit realized through database interaction.
A common and dangerous exploit suffered by applications is an SQL injection attack. An SQL injection attack exploits an application that accepts input and uses it to dynamically generate an SQL query, which is subsequently executed against a database. As an example, let's consider a very simple, contrived application with a simplistic database schema, as shown in Figure 1.

Figure 1. Example's entity relationship diagram

Figure 2. A simple customer detail view
Figure 2 shows a simple web page that accepts a customer ID and
displays the relevant information about a customer. This simple
implementation uses a query built by pulling the customer ID from
the HttpServletRequest and using it to generate the
SQL.
final String custID = httpRequest.getParameter("custID");
final String sql = "Select * From Customer Where CustomerID = '"
        + custID + "'";
The above code snippet illustrates an important point: we are trusting the user to input valid data, and in doing so we are giving them the capacity to tailor the query to return data they were never intended to see. Consider what would happen if the user entered this:
cust1' or 1=1 --
This input would generate the following SQL:
Select * From Customer Where CustomerID
= 'cust1' or 1=1 -- '
which will result in all data held within the
Customer table being returned, as shown in Figure 3
below. The -- is a comment operator understood by SQL
Server; other vendors may provide a different operator, such as the
# provided by MySQL. This operator will have the
effect of causing any data appearing in the query after the comment
to be ignored.

Figure 3. SQL injection results
A user could also simulate the same result by using the SQL
like operator.
cust1' or CustomerID like '%
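To see the effect of these inputs without a database, the sketch below reproduces the string concatenation from the earlier snippet in a small helper so that the resulting SQL text can be inspected. The class name InjectionDemo and the method buildQuery are hypothetical. The code only builds strings and never executes them.

```java
// Demonstration only: reproduces the vulnerable concatenation so that the
// resulting SQL text can be inspected. No database is contacted.
final class InjectionDemo {
    static String buildQuery(String custID) {
        // The input is spliced directly into the SQL text, quotes and all.
        return "Select * From Customer Where CustomerID = '" + custID + "'";
    }
}
```

Calling InjectionDemo.buildQuery("cust1' or CustomerID like '%") yields Select * From Customer Where CustomerID = 'cust1' or CustomerID like '%'. The trailing quote added by the application neatly closes the attacker's pattern, so no comment operator is needed to produce a condition that matches every row.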
Some databases, such as SQL Server, support executing multiple SQL statements in a single call. For example, entering something along the lines of either of the following:
Cust1'; select * from creditcard
Cust1'; drop table creditcard
would wreak havoc on an application. An attacker could use this technique to obtain a privileged login to a system, delete pertinent data, or even place an order for goods. Fortunately, this exploit relies on support from the database vendor and the data access middleware used. Java uses JDBC as its data access technology, and in its default usage this vulnerability is difficult to introduce, because a java.sql.Statement exposes only one java.sql.ResultSet object at a time. However, it is possible in Java to return multiple result sets, so Java developers beware. This can be done by using a custom Statement implementation or, programmatically, by using features of the Statement interface, as shown below.
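final String sql = "Select * From Customer; " +
        "Select * From CreditCard";
final Statement s = con.createStatement();
// true if the first result is a ResultSet, false if it is an update count
boolean rsReturned = s.execute(sql);
while (true) {
    if (rsReturned) {
        ResultSet rs = s.getResultSet();
        // do something with result set
        rs.close();
    }
    else if (s.getUpdateCount() == -1) {
        // neither a result set nor an update count: no more results, so exit loop
        break;
    }
    else {
        // do something with the update count
    }
    // move to the next result and record whether it is a ResultSet
    rsReturned = s.getMoreResults();
}
s.close();
con.close();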
Another form of SQL injection attack is facilitated through the
use of SQL union statements.
Cust1' union select Customer_ID, Type, Number,
ExpiryDate From CreditCard --
This is perhaps the most difficult of these attacks to achieve, because it requires an understanding of the underlying database: the number of columns returned, and the data type of each, must match those of the original query, as shown above. However, an attacker usually prevails through trial and error, especially if the application naively reports database errors back to a patient attacker. Figure 4 shows what could happen.

Figure 4. SQL union injection returning credit card information
The first step to preventing SQL injection attacks is to take advantage of the features provided by your database system. The most important of these is: never establish a database connection using an administrator account. Instead, provide role-based logins with the minimum privileges required to carry out a particular task. This denies an attacker the opportunity to perform additional operations, as any attempt to do so will result in an error. For example, if a task requires read access, never give write access; if a task requires read access to a particular table, only permit access to that table and no other.
Moreover, always consider where you store configuration information, especially in an external shared hosting environment. For instance, if your application stores SQL statements or database connection settings in a properties or XML file for flexibility, it will be more secure to move them into a Java class, applying techniques to protect against decompilation (which is the process of reconstructing Java source code from a compiled class file using a decompiler such as DJ Decompiler). Alternatively, if possible, use stored procedures and encrypt them (if supported by your database). Using these techniques will help to protect your database configuration data from prying eyes, and the possibility of your application being circumvented through the modification of such data.
The most critical step in preventing attack is to employ server-side validation on any input data used as part of a database call. Validation code should test for validity, ensuring that input data is of the expected type and length, and reject any invalid data. To further guard against attack, prefer java.sql.PreparedStatement over java.sql.Statement, and use java.sql.CallableStatement when calling stored procedures.
final String sql = "Select * from Customer where CustomerID =?";
final PreparedStatement ps = con.prepareStatement(sql);
ps.setString(1,customerID);
Using a java.sql.PreparedStatement as shown above
provides strong type checking through parameterized input via the
setXXX() methods. Also, the SQL above now contains a placeholder
for such input, making obsolete the need to handle different types
of input differently during query construction. This ensures that
any data passed is of the correct type, which limits what an
attacker can do. For instance, an attacker can no longer pass
character data where a numeric value is expected, as this will
result in error. In the case where character data is expected, the
attacker can no longer circumvent a query by passing additional SQL
commands. For example, consider Figure 3 again,
where an attacker passes cust1' or 1=1 -- as input
into the query below:
final String custID = httpRequest.getParameter("custID");
final String sql = "Select * From Customer Where CustomerID = '" + custID + "'";
to produce the result shown below:
final String sql = "Select * From Customer Where CustomerID = 'cust1' or 1=1 --'";
By using a java.sql.PreparedStatement, an attacker can no longer manipulate the query to achieve the same result as before. This is because the input is bound as a benign string value and not as a behavior-altering unit of instruction. The statement behaves as though the following SQL had been issued, with the embedded single quote escaped:
Select * From Customer Where CustomerID = 'cust1'' or 1=1 --'
The input cust1' or 1=1 -- is now treated as the search criteria used to match the CustomerID field. Since no customer has an ID of this form, no results will be returned. A further advantage of this approach is that it eliminates the need for the messy validation code that would be required if a java.sql.PreparedStatement were not used. Such validation code would include a routine to escape any single quote character sent as part of the input string by doubling it, or to replace it with a space character. Moreover, enforcing the use of java.sql.PreparedStatement and java.sql.CallableStatement as a coding standard for all database access reduces the burden placed on a developer who may not be aware of such validation routines when new functionality is added, and who might otherwise introduce unintentional security holes.
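Putting the two recommendations together, a data access helper might look roughly like the sketch below. The class name CustomerLookup, its method names, and the assumed ID format (1 to 20 ASCII letters or digits) are hypothetical; only the Customer table and CustomerID column come from the example above. It assumes Java 7 or later for try-with-resources.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

// Hypothetical data access helper; the customer ID format is an assumption.
final class CustomerLookup {
    // Assumed format: 1 to 20 ASCII letters or digits.
    private static final Pattern CUSTOMER_ID = Pattern.compile("[A-Za-z0-9]{1,20}");

    /** Tests for validity: expected characters and length; everything else is rejected. */
    static boolean isValidCustomerId(String custID) {
        return custID != null && CUSTOMER_ID.matcher(custID).matches();
    }

    /** Validates the input first, then binds it as a parameter instead of concatenating it. */
    static boolean customerExists(Connection con, String custID) throws SQLException {
        if (!isValidCustomerId(custID)) {
            throw new IllegalArgumentException("Invalid customer ID");
        }
        final String sql = "Select CustomerID From Customer Where CustomerID = ?";
        // try-with-resources closes the statement and result set even if an error occurs
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, custID);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

With this arrangement, an input such as cust1' or 1=1 -- is rejected by the validity check before the database is ever contacted. Even if the check were relaxed, the bound parameter would still be compared as a plain string.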
In this article, I talked about some validation best practices when dealing with input from external sources. I also talked about the threat posed through SQL injection attacks and ways to prevent such an attack. In part two of this series, we will look at the very real and dangerous threat of cross-site scripting and some error-handling best practices in J2EE web applications.
Stephen Enright is a Dublin-based software engineer.