The File Intrinsic Class

The File intrinsic class lets you create, read, and write files.  A "file" is a collection of text or other data stored on a storage device such as a hard disk, typically under a name that programs and users can use to refer to it.

 

In addition to reading and writing ordinary operating system files, the File class can be used to read “resources.”  A resource is essentially a file, but differs in two important ways.  First, a resource is referenced using a URL-style notation, which is identical on all operating systems; “URL” stands for “uniform resource locator,” which is a Web standard.  Resources don’t use true URLs, but rather borrow the standard URL notation for representing relative (subdirectory) paths.  Second, a resource can be embedded in the program’s image file, or in an external resource bundle file, using a tool that comes with the TADS 3 compiler.  The benefit of the resource mechanism is that it lets a developer bundle extra data directly into the program’s image file, which simplifies distributing and installing the program by reducing the number of files that have to be shipped along with it.

 

To use File objects, you must #include the system header file "file.h".

File Formats

TADS 3 provides access to files using three different "formats."  A file's format is simply the way the file's data are arranged; each format is useful in different situations.  The formats are text, data, and raw.

Resources can be read using the Text and Raw formats.

Creating a File Object

A File object gives you working access to a file on disk.  The File keeps track of all of the information involved with your access to the file: the format you're using to read and write the file, the type of access you have to the file, and the current position in the file where you're reading or writing.


You do not use the new operator to create a File object; instead, you use one of the "open" methods of the File class itself:

 

  File.openTextFile(filename, access, charset?)
  File.openDataFile(filename, access)
  File.openRawFile(filename, access)
 

The filename argument is a string giving the name of the file to be opened or created.

 

The access argument gives the type of access you want to the file, and determines whether an existing file is to be used or a new file is to be created.  The access argument can be one of the following constants:

 

 

File.openTextFile(filename, access, charset?) opens a file in text format.  Any access mode may be used with this method.  If the charset argument is given, it must be an object of the CharacterSet intrinsic class giving the character set to be used to translate between the file's character set and the internal TADS Unicode character set.  If this argument is missing, the File object will use the plain ASCII character set by default.

File.openDataFile(filename, access) opens a file in "data" format.  Any access mode may be used with this method.

File.openRawFile(filename, access) opens a file in "raw" format.  Any access mode may be used with this method.

All of the "open" methods check the file safety level settings to ensure that the file access is allowed.  If the file safety level is too restrictive for a requested operation, the method throws a FileSafetyException.  The file safety level is a setting that the user specifies in a manner that varies by interpreter; it allows the user to restrict the operations that a program running under the interpreter can perform, to protect the user's computer against malicious programs.
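As a rough illustration of the open methods and the safety check, the sketch below tries to create a text file and handles a refusal from the file safety level.  The helper name saveNote is hypothetical.  The sketch assumes that FileAccessWrite is the write-access constant defined in file.h, that FileException is the common base class of the File exceptions, and that writeFile() on a text-format file writes a string as given.

```tads3
#include <tads.h>
#include <file.h>

/* hypothetical helper: write one line of text to a new file */
saveNote(fname, txt)
{
    local f;

    try
    {
        /* FileAccessWrite is assumed to be the write-access constant */
        f = File.openTextFile(fname, FileAccessWrite);
    }
    catch (FileSafetyException exc)
    {
        /* the user's file safety level forbids this operation */
        "The file safety level doesn't allow writing to <<fname>>.\n";
        return nil;
    }
    catch (FileException exc)
    {
        /* any other failure to open or create the file (assumed base class) */
        "Unable to open <<fname>>.\n";
        return nil;
    }

    try
    {
        /* write the text followed by a newline */
        f.writeFile(txt + '\n');
    }
    finally
    {
        /* always release the file, even if the write fails */
        f.closeFile();
    }

    return true;
}
```

Catching FileSafetyException separately lets the program explain to the user that the refusal comes from an interpreter setting rather than from a problem with the file itself.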

Opening a Resource

In addition to the methods that open ordinary operating system files, the File object has two methods that can be used to open resources:

 

  openTextResource(resName, charset?)
  openRawResource(resName)

 

The resName argument gives the name of the resource to be opened.  This is given as a URL-style relative path name: the “/” character is used as the path separator, but the path cannot start with a “/”, as it must be relative to the working directory (which is generally the directory containing the image file).  Note that the URL notation is universal: you must always use the same “/” path separator notation, regardless of the operating system.  The File object automatically converts the URL-style path to the correct local conventions.  This means that when you’re opening a resource file, you don’t have to be concerned with the local file system naming rules; simply use the standard URL format, and the File object will automatically adapt to the platform at run-time.

 

The charset argument has the same meaning as it does for openTextFile().

 

Note that the open-resource methods don’t take an access-mode argument, as the open-file methods do, because resource files can only be opened for reading.  FileAccessRead is the only possible access mode for a resource, so the methods don’t need a separate argument for the mode.

 

The open-resource methods are not sensitive to the file safety level.  Since resources can only be read, never written, and are constrained to the image file’s directory and its subdirectories (since their paths are always relative to the image file’s home directory), it's not possible for a T3 program to do any damage to the system using resources, and highly unlikely that a program could use them to gain access to any sensitive system information.  Resources are thus inherently “sandboxed” to a degree that no extra file safety protection is required.
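A minimal sketch of opening a resource follows.  The resource name 'text/intro.txt' and the function name showIntro are hypothetical, and the sketch assumes that readFile() on a text-format file returns the next line as a string.

```tads3
#include <tads.h>
#include <file.h>

/* hypothetical helper: display the first line of a bundled text resource */
showIntro()
{
    /*
     *   The name uses '/' separators on every platform and has no
     *   leading '/'.  No access mode is given, since resources are
     *   always read-only.
     */
    local f = File.openTextResource('text/intro.txt');

    try
    {
        /* read one line; nil means the resource is empty */
        local ln = f.readFile();
        if (ln != nil)
            "<<ln>>";
    }
    finally
    {
        f.closeFile();
    }
}
```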

File Methods

closeFile() – closes the file.  This flushes internal buffers to the external storage device and releases all operating system resources associated with the open file.  On many operating systems, while a program is working with a file, other programs are not allowed to access the same file; this prevents the data corruption that could occur if multiple programs accessed the same data simultaneously without coordinating their activities.  Closing a file tells the operating system that your program is finished with the file, and that it is therefore safe to allow other programs to access it.  You are not strictly required to call this method when finished with a file, because TADS will automatically close the file when the garbage collector determines that the File object is no longer usable.  However, relying on this could tie up system resources for much longer than necessary, so it is always good programming practice to close files explicitly as soon as you know you're done with them.

 

After closing a file, no further operations can be performed on the file.  Any attempts to perform operations on the file will result in a FileClosedException being thrown.

 

getCharacterSet() – returns the CharacterSet object that the file is using for its character translations.  This is useful only with files in text format.

 

getFileSize() – returns the size in bytes of the file.  This is the size of the file as it appears on disk, so this might not be the same as the apparent size of the file's data stream as the program sees it; for example, if the file is being read as a text file, character set translations and newline format conversions will usually make the in-memory representation differ somewhat from the binary representation on disk.

 

getPos() – returns an integer giving the current read/write position in the file; this is simply the byte offset in the file of the next read or write operation.  When a file is first opened, this will return zero, because the first read or write operation will occur at the first byte of the file, which is at offset zero.

 

readBytes(byteArr, start?, cnt?) – this function, which is used only for raw files, reads bytes from the file into byteArr, which must be an object of intrinsic class ByteArray.  If start and cnt are given, they give the starting index in the byte array at which the bytes are to be stored and the number of bytes to be read; if these are omitted, the function reads as many bytes from the file as there are bytes in the byte array, and stores them in the byte array starting at its first element (index 1).

 

This function returns the number of bytes actually read from the file.  If the end of the file is encountered before the request is fulfilled, the return value will be smaller than the number of bytes requested.  If the function returns zero, it simply means that there are no more bytes available in the file.

 

Note that if the file is open for write-only access, a FileModeException will be thrown.
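The return-value behavior described above suggests a simple read loop, sketched here.  The function name countBytes and the 512-byte buffer size are arbitrary choices for illustration.

```tads3
#include <tads.h>
#include <file.h>
#include <bytearr.h>

/* hypothetical helper: count the bytes in a file by reading it in chunks */
countBytes(fname)
{
    local f = File.openRawFile(fname, FileAccessRead);
    local buf = new ByteArray(512);
    local total = 0;
    local n;

    try
    {
        /*
         *   With no start/cnt arguments, readBytes() tries to fill the
         *   whole array; a zero return means there are no more bytes.
         */
        while ((n = f.readBytes(buf)) != 0)
            total += n;
    }
    finally
    {
        f.closeFile();
    }

    return total;
}
```

The final chunk will usually be shorter than the buffer, so code that processes the data should use the returned count rather than the array's length.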

 

readFile() – reads data from the file and returns the value.  This function reads data according to the file's format:

 

 

In any case, when the end of the file is reached, the function returns nil.  If any error occurs reading the file, the method throws a FileIOException.
 

Note that if the file is open for write-only access, a FileModeException will be thrown.
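Because readFile() returns nil at end of file, a read loop can simply test for nil.  The sketch below counts the lines in a text file; it assumes that each call to readFile() on a text-format file returns one line, and the function name countLines is hypothetical.

```tads3
#include <tads.h>
#include <file.h>

/* hypothetical helper: count the lines in a text file */
countLines(fname)
{
    local f = File.openTextFile(fname, FileAccessRead);
    local cnt = 0;

    try
    {
        /* nil signals that the end of the file has been reached */
        while (f.readFile() != nil)
            ++cnt;
    }
    finally
    {
        f.closeFile();
    }

    return cnt;
}
```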

 

setPos(pos) – sets the read/write position in the file to pos, which is an integer giving a byte offset in the file.  The first byte in the file is at offset zero.

 

For text and data format files, this function should be used with caution.  In particular, you should only use this function to set a file position that was previously returned from a call to getPos().  Text and data format files have data structures that span multiple bytes in the file, so setting the file to an arbitrary byte position could cause the next read or write to occur in the middle of one of these multi-byte structures, which could corrupt the file or cause data read to be misinterpreted.

 

For raw files, since your program is responsible for the exact byte layout of the file, you can set the read/write position wherever you want without any danger of confusing the File object.  However, if you are defining your own multi-byte structures, you must naturally take care to move the file position only to the proper boundaries within your own structures.

 

setPosEnd() – sets the read/write position in the file to the end of the file.  Any subsequent writing to the file will place new bytes after the last existing byte in the file.  This function is useful if you want to add new data after all of the existing data in a file, and is also useful to determine the size of a file (which you can do by seeking to the end of the file and then using getPos() to determine the new position in the file).

 

Note that the warnings mentioned in setPos() regarding valid positions generally don't apply to setPosEnd().  It is usually safe to go to the end of a file, because whatever multi-byte data structures occur in the file should be complete units, hence moving to the end of the file should set the position to the end of the last structure.
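The size-finding technique mentioned under setPosEnd() can be sketched as follows.  The function name sizeBySeeking is hypothetical; the sketch restores the original position so that the caller's subsequent reads are unaffected.

```tads3
#include <tads.h>
#include <file.h>

/* hypothetical helper: find the size of an open file by seeking to its end */
sizeBySeeking(f)
{
    /* remember where we are; this value came from getPos(), so it is safe to restore */
    local oldPos = f.getPos();
    local size;

    /* the position at the end of the file is the file's size in bytes */
    f.setPosEnd();
    size = f.getPos();

    /* go back to where we started */
    f.setPos(oldPos);

    return size;
}
```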

 

writeBytes(byteArr, start?, cnt?) – this function, which is used only for raw files, writes bytes from byteArr, which must be an object of intrinsic class ByteArray, to the file.  If start and cnt are given, they give the starting index in the byte array and the number of bytes to be written; if these are omitted, all of the bytes in the byte array are written.  The bytes are written to the file without translation.

 

This function has no return value; if any error occurs writing the bytes, a FileIOException is thrown.

 

Note that if the file is open for read-only access, a FileModeException will be thrown.

 

writeFile(val) – writes the value val to the file.  The value is written according to the file's format:

 

 

Writing an enumerator value to a data format file ties the file to the particular version of your program that wrote the file.  When you compile your program, the compiler assigns an arbitrary internal identifier value to each enumerator, and it is this arbitrary internal value that the writeFile() function stores in the file.  When you use readFile() to read an enumerator value, the system uses the current internal enumerator value assignments made by the compiler.  Because these values are arbitrary, they can vary from one compilation to the next, so it is not guaranteed that a file containing enumerators can be correctly read after you have recompiled your program.  For this reason, you should never write enumerators to a file unless you are certain that the file will only be used by the identical version of your program (so it's safe, for example, to use enumerators in a temporary file that will be read back in during the same run of the program).  If you must store enumerators in a file that might be read by a future version of your program, you should use some mechanism (such as reflection) to translate enumerator values into integers, strings, or other values that you define and can therefore keep stable as you modify your program.
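One way to keep a file stable across recompilations is to translate each enumerator to a value you define before writing it.  The sketch below uses an explicit mapping to strings rather than reflection.  The enumerators and function names are hypothetical, and the sketch assumes that a string written to a data-format file is read back as the same string.

```tads3
#include <tads.h>
#include <file.h>

/* hypothetical enumerators that we want to store in a file */
enum moodHappy, moodSad;

/* translate an enumerator to a stable string that we control */
moodToName(m)
{
    if (m == moodHappy)
        return 'happy';
    if (m == moodSad)
        return 'sad';
    return nil;
}

/* translate a stored string back to the current enumerator value */
nameToMood(s)
{
    if (s == 'happy')
        return moodHappy;
    if (s == 'sad')
        return moodSad;
    return nil;
}

/* write and read the stable string rather than the enumerator itself */
saveMood(f, m)
{
    f.writeFile(moodToName(m));
}

loadMood(f)
{
    return nameToMood(f.readFile());
}
```

Because the strings are chosen by the programmer, a file written by one build of the program can still be interpreted after the enumerators have been renumbered by a later compilation.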

 

If any error occurs writing the data, such as running out of disk space, the method throws a FileIOException.  If the file is open for read-only access, a FileModeException is thrown.

 

setCharacterSet(charset) – sets the CharacterSet object that the file uses for its character translations.  The charset value must be an object of intrinsic class CharacterSet.  Subsequent read or write operations will use the given character set for character translations.

Interaction with Save/Restore, Undo, and Restart

File objects are inherently transient; all instances returned from the creation methods (openTextFile, etc.) are transient and thus not affected by save, restore, restart, or undo.

 

If a File instance is part of the program when pre-initialization completes, and is thus saved to the final image file, the instance will be “unsynchronized” when the program is loaded.  This means that the File object no longer refers to an open operating system file: once the object has been saved with the image file and then reloaded, there is no longer an active association with the system file.  When a File object becomes unsynchronized, it will no longer allow any operation that could be affected by the inconsistency.  In particular, the file cannot be read or written once it is unsynchronized.  To enforce this, the File object throws a FileSyncException if any of these operations is attempted on an unsynchronized file.