Sweeping the File System with NIO-2
Introduction
JSR 203 (NIO-2), being implemented in the OpenJDK project, is shaping the
future of I/O in the upcoming JDK 7. File I/O has been around since JDK 1.0
but lacks many capabilities, and it is being overhauled in NIO-2, which
arrives in the next JDK release. This will have a significant impact on the
way Java applications interact with the file system. NIO-2 provides
enhancements to the file system API, the asynchronous I/O API, and the
socket channel API.
A good number of Java
applications work closely with the file system. The historic file
system management capabilities in the JDK are limited, and therefore even
commonly performed file interactions can require a lot of custom coding
on top of the provided API. For example, let's say
that you need to poll files for changes. You'd have to write
that yourself. Even some of the provided features have deficiencies:
the rename and move operations are not guaranteed to be
atomic. In the event of failure, the original file and the target
file may both exist or the target file may be incompletely written to
the disk. Applications that want to handle these scenarios are forced to
resort to native code and thus lose the platform-independence benefits of
Java. NIO-2 lets applications exploit the native file-system capabilities
for interacting with the file system in a clean way. This brings Java on
par with other programming languages when dealing with the file system.
In
this article, I will
focus
on how the new API brings a
fresh perspective for accessing and manipulating the file system. The
term NIO-2 will be used in this article to refer to the file-system
enhancements provided by NIO-2, though the scope of NIO-2 is much wider
than that. I have tried to provide code snippets frequently to give an
idea of what it feels like to use the new API.
Why another API for file
handling?
The current java.io.File
API is a child left behind, while other
parts of the Java
API have grown robust. It provides minimal file management
capabilities. Some of the areas where the current file API
falls short are:
- Lack of information: The
current API is terse
when passing the information back to the application. For example,
operations like
rename()
and delete()
return a true
or a false
to indicate success or failure.
There is no information on what a false means or why a file could not
be deleted. The application is never told what went wrong.
- Performance: The current file API uses fine-grained method calls to
query information about a file. Questions about file metadata, such as
"are you a file?", "are you a directory?", "are you hidden?", or "what was
your last modified time?", are each sent to the file system separately.
There is no way to fetch all the attributes in a single call, which is
inefficient. When asked for the entries under a directory, the current API
returns a complete list or array (the file.list() operation). This doesn't
scale to large directories, especially on a remote file system.
- Limited capability: The current file API lacks support in several areas
that applications frequently need.
- There is no way to
perform
rename(),
move(),
and other file operations atomically. These could fail with source and
target files co-existing. There is no way to create a file and set its
initial permissions in one atomic operation. This leaves an attack window
between the operation that creates the file and the subsequent operation
that modifies its permissions.
- When copying a file, the
best option available until JDK 6 is
fileChannel.transferFrom(srcChannel,
0, srcChannel.size()). This
uses a byte channel as an intermediate for copying data. This copies
the file data, but does not copy the attributes. Currently,
there
is no
simple option for copying a file, along with its set of file
attributes.
- There is no support for
fetching the metadata of a file or a file
system, other than the basic attributes. For example, there is no
direct way to know the file owner, and the group or user permissions.
- There is no API-level support for handling symbolic links without
writing native code. If the application needs to resolve and traverse
symbolic links, you have to write the algorithms that handle the copy,
move, delete, and rename operations involving symbolic links. Handling
circular references among symbolic links adds further complexity. There is
a good chance that an application that needs to handle symbolic links will
resort to native code that is not portable.
- There is
no support to probe the type of file
content. If you want to guess the type of file content, you have to write your own algorithm to read the bytes and make a good
guess on the content type.
- There is no way to
extend the file API to support other
custom file systems like memory files, encrypted file-system files, and
so on. In essence, if you want to write an application like TrueCrypt,
you get no real support from the existing file API.
- There is no support for
file notifications. The only way to
know about
a change in the file system is to poll it, or
write a native call based on your operating system.
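To make the "lack of information" point concrete, here is a minimal sketch that contrasts the boolean result of java.io.File.delete() with an exception-based delete. It assumes the java.nio.file API as it later shipped in JDK 7 (Files.delete and its specific exception subclasses), which may differ from the draft described in this article. The names DeleteDiagnostics, deleteWithFile, and deleteWithNio are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class DeleteDiagnostics {

    /** Old style: the only feedback is true or false. */
    static boolean deleteWithFile(Path target) {
        return target.toFile().delete();
    }

    /** Exception-based style: the exception type names the cause. */
    static String deleteWithNio(Path target) {
        try {
            Files.delete(target);
            return "deleted";
        } catch (NoSuchFileException e) {
            return "no such file: " + e.getFile();
        } catch (DirectoryNotEmptyException e) {
            return "directory not empty: " + e.getFile();
        } catch (IOException e) {
            return "I/O error: " + e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a non-empty directory so that both deletes fail.
        Path dir = Files.createTempDirectory("nio2-demo");
        Path child = Files.createFile(dir.resolve("child.txt"));

        System.out.println(deleteWithFile(dir));
        System.out.println(deleteWithNio(dir));

        // Clean up in the right order.
        Files.delete(child);
        Files.delete(dir);
    }
}
```

With the old API the caller learns only that the delete failed; with the exception-based call the caller can branch on the cause.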
The NIO-2 Prodigy
The NIO-2 API includes features to correct the shortcomings of the
current file-system API. There are certain elements of
design used in the NIO-2 APIs that are worth mentioning.
- Informative exceptions:
The methods that access the file system
throw checked
IOExceptions.
There are specific
subclasses of IOException
that encapsulate detailed
information about the files in context, the reason for failure, and the
detailed message. This can help the application recover from
specific errors.
- Method
Chaining is used to simplify
repeated object interactions on
the same object. This produces compact code and alleviates the need for
declaring unnecessary variables.
DirectoryStream<Path> dirStream = Paths.get("conf_link")
    .createSymbolicLink(..., ...)
    .resolve()
    .newDirectoryStream();
- Varargs
are used in operations that accept options or flags as
parameters.
This allows passing the options as arrays, or as comma-separated values.
Path path = Paths.get("/app/config");
SeekableByteChannel channel = path.newByteChannel(
    StandardOpenOption.CREATE,
    StandardOpenOption.WRITE,
    StandardOpenOption.APPEND,
    StandardOpenOption.SYNC);
channel.write(...);
Class orchestration
Packages: The NIO-2
APIs for
file management reside under a new package
java.nio.file, with two
sub-packages:
java.nio.file.attribute:
This is the container for classes that support bulk access to file and
file store attributes.
java.nio.file.spi:
This is
the service provider interface subpackage. It provides a contract for
pluggable file system implementations and is a facility for creating
your own file system provider implementations.
The key players of the NIO-2 file management API are shown in the class
diagram of Figure 1.

Figure 1. Class diagram for NIO-2 file management API
The primary classes from the java.nio.file package:
FileStore:
A file store is the underlying storage for files in a particular file
system. It could be a storage pool, device, partition, volume, concrete
file system, etc. The FileStore
class represents the
physical
characteristics of the device, and lets you know about the type of
volume, how much disk space is left, etc. It allows access to the
metadata of the file store.
Path path = Paths.get("/app");
FileStore store = path.getFileStore();
FileStoreSpaceAttributeView fileStoreAttribute =
    store.getFileStoreAttributeView(attributeName);
long unallocated = fileStoreAttribute.readAttributes().unallocatedSpace();
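As a runnable counterpart to the snippet above, the following is a minimal sketch of querying file store space. It assumes the API as it later shipped in JDK 7, where the store is obtained with Files.getFileStore(path) and the space figures are read directly from FileStore; that differs from the draft-era attribute-view calls shown above. The class name FileStoreSpace and the method describe are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileStoreSpace {

    /** Describes the file store that holds the given (existing) path. */
    static String describe(Path path) throws IOException {
        FileStore store = Files.getFileStore(path);
        return String.format("%s (%s): total=%d usable=%d unallocated=%d",
                store.name(), store.type(),
                store.getTotalSpace(),
                store.getUsableSpace(),
                store.getUnallocatedSpace());
    }

    public static void main(String[] args) throws IOException {
        // The current working directory always exists, so it is a safe probe.
        System.out.println(describe(Paths.get(".")));
    }
}
```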
FileSystems:
This is a factory for
file systems. It has operations to get a file system, given a URI.
There are operations to construct new file systems.
FileSystem:
A file system is usually a single
hierarchy of files with one top-level root directory. In some cases it
may have several different file hierarchies, each with its own
top-level
root directory. Further, a file system can span over multiple file
stores that vary in features. Each file system is identified by a URI
in the new API. The default file system is identified by URI file:///.
The default file system creates objects that provide access to the file
systems accessible to the JVM. The FileSystem class provides access to
the associated FileStore instances, and returns a Path instance, given a
path string on the file system.
FileSystem fileSystem = FileSystems.getDefault();
Path path = fileSystem.getPath("/Users/Guest/Public");
for (Path rootDirs : fileSystem.getRootDirectories()) {
    ...
}
for (FileStore fileStore : fileSystem.getFileStores()) {
    ...
}
FileRef:
A FileRef is the basic notion of a reference to a file. A file is usually
located by its Path, but an implementation could locate a file by other
means, such as a file identifier. FileRef has operations to open a file for
reading or writing. These operations are symbolic link aware: the symbolic
link related methods let you specify the behavior when a symbolic link is
encountered. A FileRef acts as a gateway to the associated metadata or file
attributes, allowing bulk access to the file attributes. For example, you
can access the traditional Unix-style file permission attributes.
FileRef fileRef = Paths.get("/Users/Guest/run.sh");
try {
    OutputStream os = fileRef.newOutputStream(
        StandardOpenOption.WRITE,
        StandardOpenOption.APPEND,
        StandardOpenOption.DELETE_ON_CLOSE);
    ...
} catch (IOException e) {
    ...
}
PosixFileAttributeView attributeView = fileRef.getFileAttributeView(
    PosixFileAttributeView.class, LinkOption.NOFOLLOW_LINKS);
PosixFileAttributes attributes = attributeView.readAttributes();
GroupPrincipal groupName = attributes.group();
Set<PosixFilePermission> permissionSet = attributes.permissions();
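The POSIX attribute access above can be sketched in runnable form as follows. This is an illustration only, assuming the API as it later shipped in JDK 7 (Files.setPosixFilePermissions and Files.readAttributes rather than the draft-era FileRef methods) and a file system that supports the "posix" view; the class name PosixView and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFileAttributes;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

class PosixView {

    /** POSIX attributes are not available on every platform, so check first. */
    static boolean posixSupported() {
        return FileSystems.getDefault()
                .supportedFileAttributeViews()
                .contains("posix");
    }

    /** Sets rwxr-x--- on the file, then reads owner, group, and permissions in bulk. */
    static String restrictAndDescribe(Path file) throws IOException {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rwxr-x---");
        Files.setPosixFilePermissions(file, perms);

        // One call fetches all POSIX attributes; symbolic links are not followed.
        PosixFileAttributes attrs = Files.readAttributes(
                file, PosixFileAttributes.class, LinkOption.NOFOLLOW_LINKS);

        return attrs.owner().getName() + ":" + attrs.group().getName()
                + " " + PosixFilePermissions.toString(attrs.permissions());
    }
}
```

Guarding on posixSupported() keeps the same code usable on platforms where only the basic or DOS views exist.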
Path:
This is the class that an application developer will encounter most. Path
is the implementation of FileRef that uses a system path to locate and
access a file. It's the NIO-2 equivalent of java.io.File. A path is
hierarchical and knows about its name. Path has two kinds of operations:
those that access path components and combine paths, and those that
perform file operations. All operations support symbolic link semantics.
The operations that access the file system throw meaningful exceptions
that have details of what went wrong. The code snippets below provide a
peek into the file management capabilities of NIO-2.
Path
is an Iterable
over the name elements
of the entire path.
Path path = Paths.get("/application/apache-tomcat/conf/server.xml");
for (Path pathElement : path) {
    // Gets the path elements application, apache-tomcat, conf, server.xml
    // in this order.
}
The normalize
operation removes redundancies from the
path.
Path path = Paths.get("/app/dir/../../tmp/vm");
Path normalized = path.normalize(); // normalized to /tmp/vm
The resolve
operation resolves the given relative path
against the current path.
Path path = Paths.get("/app/dir");
Path resolved = path.resolve("../dirA/dirB");
// resolved to /app/dir/../dirA/dirB; normalize() then yields /app/dirA/dirB
The relativize operation constructs the relative path that leads from the
original path to a given path.
Path path = Paths.get("/a/b/c");
Path absolutePath = Paths.get("/a/x");
Path relativized = path.relativize(absolutePath); // returns path ../../x
There are a bunch of path
comparison operations like startsWith(),
endsWith(), and isSameFile().
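The path operations above are purely syntactic and do not touch the file system, so they are easy to try out. The following minimal sketch combines them; it assumes the Path and Paths classes as they later shipped in JDK 7, and the class PathAlgebra and its helper methods are hypothetical names used only for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathAlgebra {

    /** Joins a relative path onto a base, then removes "." and ".." elements. */
    static Path resolveNormalized(Path base, String other) {
        return base.resolve(other).normalize();
    }

    /** Returns the relative path that leads from one path to another. */
    static Path between(Path from, Path to) {
        return from.relativize(to);
    }

    /** Element-wise prefix test: /app/dir is under /app, but not under /ap. */
    static boolean isUnder(Path child, Path parent) {
        return child.normalize().startsWith(parent.normalize());
    }

    public static void main(String[] args) {
        Path base = Paths.get("/app/dir");
        System.out.println(base.resolve("../dirA/dirB"));          // not yet normalized
        System.out.println(resolveNormalized(base, "../dirA/dirB"));
        System.out.println(between(Paths.get("/a/b/c"), Paths.get("/a/x")));
        System.out.println(isUnder(Paths.get("/app/dir/x"), Paths.get("/app")));
    }
}
```

Note that startsWith() and endsWith() compare whole name elements, not characters, which is why isUnder() does not treat /ap as a parent of /app/dir.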
Path has operations for copying, moving, and deleting a file. The new API
passes the onus of implementing the operating system's native calls on to
the service provider. For example, a typical service provider would
provide implementations for atomically moving or copying a file on
Windows-based file systems with the MoveFileEx system call, and likewise
with other file systems. The application retains platform independence by
programming against the API. The copy and link options in the code snippet
below provide a sneak peek into the additional capabilities.
/*
 * Copy a file with the attributes.
 */
path.copyTo(targetPath,
    StandardCopyOption.REPLACE_EXISTING,
    StandardCopyOption.COPY_ATTRIBUTES);

/*
 * Move a file atomically.
 */
path.moveTo(targetPath, StandardCopyOption.ATOMIC_MOVE);

/*
 * The exceptions thrown are specific to the cause of failure.
 */
Path link = Paths.get("/app/tool/lib"); // a symbolic link
try {
    link.delete(); // the link is deleted, and not the target of the link.
} catch (DirectoryNotEmptyException e) {
    ...
} catch (NoSuchFileException e) {
    ...
} catch (IOException e) {
    ...
}
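The copy and move calls above can be sketched in runnable form as follows. This is a sketch under the assumption of the API as it later shipped in JDK 7, where the operations are the static methods Files.copy and Files.move instead of the draft-era path.copyTo and path.moveTo. The class CopyAndMove, its method names, and the fallback policy for unsupported atomic moves are illustrative choices, not something the article prescribes.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class CopyAndMove {

    /** Copies a file, replacing any existing target and carrying over attributes. */
    static Path copyWithAttributes(Path source, Path target) throws IOException {
        return Files.copy(source, target,
                StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.COPY_ATTRIBUTES);
    }

    /**
     * Tries an atomic move first. If the file system cannot do it
     * (for example, when moving across file stores), falls back to
     * an ordinary, non-atomic move.
     */
    static Path moveAtomicallyIfPossible(Path source, Path target) throws IOException {
        try {
            return Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            return Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Whether the fallback is acceptable depends on the application; code that truly requires atomicity should let AtomicMoveNotSupportedException propagate instead.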
The checkAccess operation lets you check that a file exists and that
the JVM has the appropriate access privileges to it.
FileRef fileRef = Paths.get("/Users/Guest/run.sh");
try {
    fileRef.checkAccess(AccessMode.READ, AccessMode.EXECUTE);
} catch (NoSuchFileException e) {
    ...
} catch (AccessDeniedException e) {
    ...
} catch (IOException e) {
    ...
}
SeekableByteChannel:
This is the NIO-2 equivalent of RandomAccessFile.
It allows reading and writing bytes from a channel of variable length. The
set of OpenOption flags that can be provided when creating the
SeekableByteChannel provides a glimpse into the capabilities of the new API
(and actually the service provider): READ, WRITE, APPEND,
TRUNCATE_EXISTING, CREATE, CREATE_NEW, DELETE_ON_CLOSE, SPARSE, SYNC,
DSYNC, and NOFOLLOW_LINKS. The SeekableByteChannel can be cast to
FileChannel for advanced operations like file locking, memory mapped I/O,
etc.
Path path = ... ;
SeekableByteChannel channel = path.newByteChannel(
    StandardOpenOption.WRITE,
    StandardOpenOption.APPEND);
channel.write(...);
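To show the channel in use end to end, here is a minimal sketch that appends text and then reads from an arbitrary position. It assumes the API as it later shipped in JDK 7, where the channel is opened with Files.newByteChannel rather than the draft-era path.newByteChannel. The class ChannelAppend and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelAppend {

    /** Appends UTF-8 text to the file, creating the file if needed. */
    static void append(Path file, String text) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            // write() may consume only part of the buffer, so loop until done.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    /** Reads up to length bytes starting at the given position. */
    static String readFrom(Path file, long position, int length) throws IOException {
        try (SeekableByteChannel channel =
                     Files.newByteChannel(file, StandardOpenOption.READ)) {
            channel.position(position);
            ByteBuffer buffer = ByteBuffer.allocate(length);
            // Stop when the buffer is full or end of file is reached.
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }
}
```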
DirectoryStream:
The
provision for
accessing
directories is slightly different than in java.io.File
(file.list()).
A DirectoryStream
is an Iterable
over entries in a directory. The iterator is weakly
consistent, and
may or may not reflect updates to the directory while iterating. The
iterator scales to larger directories, is less demanding
on resources, and improves response time when
accessing remote and network mounted file systems. It
supports
filtering by globs,
regex,
or by a custom filter (DirectoryStream.Filter),
as you
iterate over the directory.
Path dir = Paths.get("/dev/nfs1");
DirectoryStream<Path> stream = dir.newDirectoryStream();
try {
    for (Path entry : stream) {
        ...
    }
} finally {
    stream.close();
}
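Glob filtering, mentioned above, can be sketched as follows. The sketch assumes the API as it later shipped in JDK 7 (Files.newDirectoryStream with a glob argument, plus try-with-resources to close the stream); the class GlobListing and the method namesMatching are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class GlobListing {

    /** Returns the sorted names of directory entries that match the glob. */
    static List<String> namesMatching(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        // The stream holds an open directory handle, so it must be closed.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        // Iteration order is unspecified; sort for a stable result.
        Collections.sort(names);
        return names;
    }
}
```

For example, namesMatching(dir, "*.xml") lists only the XML files, and entries are filtered while iterating rather than after loading the whole directory.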
FileSystemException:
This extends IOException
for backward compatibility, and has useful methods for
inspecting the
cause of exception. FileSystemException
is thrown by the operations that access the file system. It has
specific child exceptions that are thrown by the API with information
pertaining to the exception condition. All other exceptions in NIO-2
API are unchecked.
- Utility
classes: A couple of utility
classes in NIO-2 contain static operations that are handy to use.
Files has utility operations for files and directories. Paths has
operations to get a Path from a path string or a URI.
The primary interfaces from the java.nio.file.attribute package:

Figure 2. Class diagram for NIO-2 file management API
Each of the AttributeView types provides a read-only or updatable view of
attribute values associated with an object in the file system. The
FileAttributeView types are associated with file metadata, and the
FileStoreAttributeView types are associated with the file store metadata.
BasicFileAttributeView:
This provides bulk access to the basic set of file attributes through a
single readAttributes()
operation. The basic attributes
comprise created/modified/access times, and information like
whether the file is a symbolic link, or a directory, or a regular
file. This view allows updating the value of time-based attributes.
DosFileAttributeView:
This lets you access legacy DOS attributes to know if the file
is hidden, read only, archive, or a system file.
PosixFileAttributeView:
This provides a view into the file attributes that are common to the
POSIX compliant file systems. It allows you to inspect the group and the
user that have permissions on the file. The file permissions and the
file owner can be inspected and updated.
FileStoreAttributeView:
This provides a view into the file store attributes like the total,
usable, and unallocated space in a file system.
- Utility class: Attributes contains convenience operations that operate
on or return file and file store attributes.
FileRef file = Paths.get("/app/scripts/exec.sh");
UserPrincipal principal = Attributes.getOwner(file);
BasicFileAttributes basic = Attributes.readBasicFileAttributes(file, LinkOption.NOFOLLOW_LINKS);
System.out.format("isFile:%s size:%s", basic.isRegularFile(), basic.size());
FileStoreSpaceAttributes attrs = Attributes.readFileStoreSpaceAttributes(store);
System.out.format("Total:%s Usable:%s", attrs.totalSpace(), attrs.usableSpace());
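The bulk read of basic attributes can be tried with the following minimal sketch. It assumes the API as it later shipped in JDK 7, where the read is done with Files.readAttributes rather than the draft-era Attributes utility class shown above. The class BulkAttributes and the method summarize are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class BulkAttributes {

    /** Fetches all basic attributes in one call and formats a summary. */
    static String summarize(Path file) throws IOException {
        // A single readAttributes call replaces separate isFile(),
        // isDirectory(), length(), and lastModified() round trips.
        BasicFileAttributes attrs = Files.readAttributes(
                file, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);

        return String.format("regular=%b dir=%b link=%b size=%d modified=%s",
                attrs.isRegularFile(),
                attrs.isDirectory(),
                attrs.isSymbolicLink(),
                attrs.size(),
                attrs.lastModifiedTime());
    }
}
```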
File Notification and Watch Service API
File notification refers to the system's ability to detect and signal
changes to files and folders, such as when Windows Explorer or Mac Finder
magically detects changes in the currently open folder. Windows uses the
ReadDirectoryChangesW system call for this, and most Linux flavors use the
inotify facility.
The Watch Service API in NIO-2
allows applications to watch a file system for changes using native
notifications from the underlying file-system.
The service provider implementation transparently uses the
native file event notification wherever available on a given file
system. When not available, the implementation falls back to polling.
WatchKey and WatchService are thread safe, allowing a thread pool to work
in tandem. Thus multiple threads can use the same WatchService instance
for registering different Watchable instances without worrying about
concurrency issues. The notifications can be consumed in another thread.
The starring cast of the Watch Service API:
WatchService:
The handle to
the watch service from the file system. The actual implementation is
loaded using a service-provider loading facility.
Watchable:
The types that
can be registered with the watch service. Path
is a Watchable.
WatchKey:
The token of registration. It's an association class on the
Watchable-WatchService association.

Figure 3. Class
diagram of Watch Service API
Modus Operandi:
- The application gets a
handle to watch service from the file system.
- The application registers a
Watchable
with this WatchService,
mentioning the events of interest.
During registration, a WatchKey is created and returned to the
application as a token of registration. The initial state of this WatchKey
is non-signaled, meaning that no event of interest has occurred yet.
- When an event is detected,
the state of the
WatchKey
changes from non-signaled to
signaled. The
watch service maintains a queue of signaled watch keys. The watcher
thread does the following:
- Polls the watch service
queue to get the signaled watch
keys.
- Examines the signaled watch key for the event type and attached objects.
- Consumes the event
appropriately.
- Resets the watch key,
which effectively moves the watch
key back to non-signaled.
While the watch key is in the signaled state, the watch service continues
to accumulate events for the watchable. The detection of events, the
preservation of event order, and timeliness are specific to the service
provider implementation and to the system calls available for a given file
system.
/* Step One */
WatchService watchService = FileSystems.getDefault().newWatchService();

/* Step Two */
Path path = FileSystems.getDefault().getPath("/app/config");
WatchKey watchKey = path.register(watchService,
    StandardWatchEventKind.ENTRY_CREATE,
    StandardWatchEventKind.ENTRY_MODIFY);

/*
 * Step Three: In a separate watcher thread.
 */
WatchKey key = watchService.take(); // poll or timed-poll can be used.
List<WatchEvent<?>> events = key.pollEvents();
for (WatchEvent<?> event : events) {
    if (event.kind() == StandardWatchEventKind.ENTRY_MODIFY) {
        // consume the event
    }
}
key.reset(); // reset the key that was taken so it can be signaled again
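The three steps above can be assembled into a small runnable sketch. It assumes the API as it later shipped in JDK 7, where the event constants live in StandardWatchEventKinds (plural) rather than the draft-era StandardWatchEventKind. The class WatchOnce, its methods, and the "KIND:name" string format are hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class WatchOnce {

    /** Steps one and two: obtain a watch service and register the directory. */
    static WatchService watch(Path dir) throws IOException {
        WatchService service = dir.getFileSystem().newWatchService();
        dir.register(service,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_MODIFY);
        return service;
    }

    /**
     * Step three: wait up to the timeout for one signaled key, collect its
     * events as "KIND:name" strings, and reset the key.
     */
    static List<String> drainOnce(WatchService service, long timeout, TimeUnit unit)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = service.poll(timeout, unit);
        if (key == null) {
            return seen; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost or discarded; no context to report
            }
            seen.add(event.kind().name() + ":" + event.context());
        }
        key.reset(); // move the key back to non-signaled
        return seen;
    }
}
```

The timed poll() is used here instead of take() so that a caller is never blocked indefinitely; how quickly an event arrives depends on whether the platform uses native notification or falls back to polling.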
The WatchService
API finds
applicability in many scenarios. Some of them are:
- Sensing the changes in
configuration files and reloading the new configurations.
- Hot deployment of jars and
class files in application servers.
- Applications like a text editor that has loaded file content into
memory. When another application changes the underlying file, the editor
must be notified so that it can pop up an alert box to the user. The same
applies to any application that works closely with the file system.
- When two applications integrate through a file system, and the changes
made by one application to a file must be detected by the other.
Provider interface
The java.nio.file.spi.FileSystemProvider
interface loads and deploys the file system provider implementations.
It uses the service
loader mechanism, and allows
replacing the default
file system provider. It also allows interposing on the default file
system provider for injecting your own caching, access control,
logging, etc.
The provider interface is the point of extension for applications that
want finer control over the file system, or that want to write their own
special-purpose file system, such as a distributed, fault-tolerant file
system. It lets you build and deploy your own concrete implementation of
the provider interface, just like the default service provider's
implementation.
For example, if you want to write a memory-based file system, where
you allocate a chunk of memory for storing files as in a regular file
system, you can extend the provider interface to do that.
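Writing a full provider is beyond a short example, but discovering which providers are installed is simple. The following minimal sketch assumes the API as it later shipped in JDK 7, where FileSystemProvider is an abstract class with a static installedProviders() method; the class InstalledProviders and the method schemes are hypothetical.

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.ArrayList;
import java.util.List;

class InstalledProviders {

    /** Lists the URI schemes of all installed file system providers. */
    static List<String> schemes() {
        List<String> schemes = new ArrayList<>();
        for (FileSystemProvider provider : FileSystemProvider.installedProviders()) {
            schemes.add(provider.getScheme());
        }
        return schemes;
    }

    public static void main(String[] args) {
        // The default provider, identified by the "file" scheme, is always present.
        System.out.println(schemes());
    }
}
```

A custom provider deployed through the service-loader mechanism would show up in this list under its own URI scheme.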
Interoperability
The Java file management API has been around for more than a decade.
The NIO-2 API works with the existing code, without the need for
extensively rewriting it. At the same time, the application has an escape
hatch to NIO-2 wherever needed.
- The
Path
object is bilingual, and works with InputStream/OutputStream
as well as the ByteChannel.
Path path = Paths.get("/app/doc/readme.txt");
InputStream inStream = path.newInputStream();
OutputStream outStream = path.newOutputStream(StandardOpenOption.WRITE);
SeekableByteChannel channel = path.newByteChannel(StandardOpenOption.WRITE);
- The
java.io.File
class has been retrofitted with a toPath()
method that returns the Path
object for the file. This acts as a gateway to NIO-2. It's a touch point
for migrating from the old file I/O to the NIO-2 file I/O.
File sourceFile = ... ; File targetFile = ... ;
Path source = sourceFile.toPath(); Path target = targetFile.toPath();
source.moveTo(target, StandardCopyOption.ATOMIC_MOVE);
- The Scanner class has been updated with constructors that accept a
FileRef.
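The interoperability points above can be combined in one minimal sketch: legacy code holds a java.io.File, bridges to a Path with toPath(), and then uses stream-based and Scanner-based access. It assumes the API as it later shipped in JDK 7, where streams are opened with Files.newOutputStream and Scanner accepts a Path. The class LegacyBridge and its methods are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

class LegacyBridge {

    /** Writes text through a classic OutputStream obtained from a Path. */
    static void writeText(File legacy, String text) throws IOException {
        Path path = legacy.toPath(); // the gateway from java.io.File
        try (OutputStream out = Files.newOutputStream(path)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Reads the first line with a Scanner constructed from a Path. */
    static String firstLine(File legacy) throws IOException {
        try (Scanner scanner = new Scanner(legacy.toPath(), "UTF-8")) {
            return scanner.hasNextLine() ? scanner.nextLine() : "";
        }
    }
}
```

Existing method signatures that take a File stay unchanged, so callers do not need to be rewritten when the implementation moves to the new API.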
Walk the Tree-Walk
The class java.nio.file.Files
has convenience methods for files and
directories. We will discuss two of the methods that are available
in this class.
- The
tree walk: The
walkFileTree()
utility method implements the Visitor
pattern, and provides an internal iterator. It allows you to perform an
operation on each node within a
file tree, rooted at a given starting file. It provides a traversal on
the file system, and you can perform operations on the files, as you
traverse. For instance, this method can be used to copy or move a
directory with all of its entries and metadata to another location.
- The
method walks the file
tree in a depth-first manner. The tree traversal completes when all
accessible files in the tree have been visited, a visitor returns a
result of
FileVisitResult.TERMINATE
or the visitor terminates due to an uncaught exception. When the file
is a directory, the utility method opens it using the DirectoryStream,
and continues.
- The operations that you
desire to perform on the visited files and directories must be
implemented as a concrete
FileVisitor.
The FileVisitor interface has methods like visitFile() to perform an
operation on the file being visited, and preVisitDirectory() and
postVisitDirectory() to perform operations on a directory before and after
its entries are visited. There are additional operations for exception
scenarios. These must be overridden to provide the desired behavior.
Alternatively, extend SimpleFileVisitor, which has default implementations
for the FileVisitor operations, and override only the ones you need.
- A
FileVisitor
also has a say in the traversal by returning a FileVisitResult
(CONTINUE, TERMINATE,
SKIP_SUBTREE, SKIP_SIBLINGS)
that is used in traversal.
- By default, symbolic links are not followed during traversal. The
FileVisitOption parameter can optionally be provided to indicate that you
want to follow links. Following links can lead to infinite loops (cyclic
graphs) and stack overflows, but the API detects cycles when you are
following symbolic links.
- You can optionally
specify
the maximum levels of the tree that you want to traverse.
Path rootedAt = Paths.get("/private/var/tmp");
EnumSet<FileVisitOption> options = EnumSet.of(FileVisitOption.DETECT_CYCLES);
int maxDepth = 10;
Files.walkFileTree(rootedAt, options, maxDepth, new SimpleDeletingVisitor());

static class SimpleDeletingVisitor extends SimpleFileVisitor<Path> {
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        try {
            if (<<check on attributes>>) {
                file.delete();
            }
        } catch (IOException e) {
            ...
        }
        return FileVisitResult.CONTINUE;
    }

    public FileVisitResult preVisitDirectory(Path dir) {
        System.out.format("Visiting directory: %s%n", dir.getName());
        return FileVisitResult.CONTINUE;
    }
}
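For a version of the tree walk that compiles and runs, here is a minimal sketch of a recursive delete. It assumes the API as it later shipped in JDK 7, where the visitor callbacks may throw IOException and postVisitDirectory receives any exception raised while iterating the directory; those signatures differ from the draft-era visitor shown above. The class TreeDelete and the method deleteRecursively are hypothetical, and unlike the snippet above this sketch deletes every entry unconditionally.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class TreeDelete {

    /** Deletes root and everything below it; returns the number of entries removed. */
    static int deleteRecursively(Path root) throws IOException {
        final int[] removed = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // Symbolic links are not followed by default, so a link is
                // deleted as a link and its target is left alone.
                Files.delete(file);
                removed[0]++;
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException failure)
                    throws IOException {
                if (failure != null) {
                    throw failure; // listing the directory failed; stop the walk
                }
                // The walk is depth-first, so the directory is empty by now.
                Files.delete(dir);
                removed[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return removed[0];
    }
}
```

Deleting in postVisitDirectory rather than preVisitDirectory is what makes this work: a directory can only be removed after all of its entries are gone.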
- Know
what you are dealing with:
The probeContentType() utility method is used to probe the content type of
a file. It uses the FileTypeDetector implementations to detect the content
type. The service provider supplies the FileTypeDetector implementations
using the provider interface.
FileRef file = Paths.get("/app/doc/license.pdf");
String type = Files.probeContentType(file);
System.out.format("%s\t%s%n", file, type);
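One practical detail is that probing can fail to determine a type. The following minimal sketch assumes the API as it later shipped in JDK 7, where Files.probeContentType takes a Path and returns null when no installed detector recognizes the file. The class ContentTypeProbe, the method typeOrUnknown, and the "unknown" placeholder are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ContentTypeProbe {

    /** Returns the probed content type, or "unknown" if none could be determined. */
    static String typeOrUnknown(Path file) throws IOException {
        // The result depends on the detectors installed on the platform,
        // so the same file may be reported differently on different systems.
        String type = Files.probeContentType(file);
        return type != null ? type : "unknown";
    }
}
```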
Summary
All in all, the NIO-2 file API works more consistently across
platforms, with operations that weren't supported in the
earlier
file management API. It supports bulk access to file attributes. Larger
sets of advanced file attributes are accessible through the API. The new
file notification API can replace the less efficient, manually
implemented mechanism of polling a file system for changes. The new
exceptions allow the application to handle and recover from exception
scenarios gracefully, on a case-by-case basis. The service provider
interface allows for cleanly developing and deploying custom file systems.
What's more, interoperability with existing code has been taken care of.
The NIO-2 API is still a work in progress, and will be released as a
part of JDK 7. As the API gets into shape to handle the next
generation file management needs, some of the interfaces and
APIs may change
from
what we discussed above.
Acknowledgements
Thanks to Alan Bateman, the specification lead for JSR 203 and the implementation lead for NIO-2 at OpenJDK, for providing a thorough technical review of the article. Alan ensured that I wasn't out of sync with the developments on NIO-2.
Thanks also to James Gould, Shahid Khan, and David Smith for providing a timely technical review and for helping me to modulate the pitch of the article.
Resources
Manish K. Maheshwari is a graduate in Electrical Engineering and has been working in Java EE and Java SE technology since JDK 1.1 and Java Web Server days.