White Paper
Summary: This white paper introduces the .NET Framework Class Library’s file I/O
functionality that is available to Microsoft® Visual Basic .NET programmers. Five
examples demonstrate various ways to save and retrieve test data from a file.
Further examples demonstrate reading application configuration files, manipulating
files and directories, and monitoring files.
Contents
Introduction
Saving Test Data Using Comma Separated Value Text Format
Saving Test Data Using Name Value Pair Text Format
Saving Data Using BinaryWriter
Saving Data Using Binary Serialization
Saving Data Using XML Serialization
Reading Application Configuration Settings
File and Directory Operations
Monitoring Files and Directories with FileSystemWatcher
Conclusions
Introduction
Microsoft Visual Basic .NET (VB.NET) is a significant new release and a worthy
successor to Visual Basic 6.0. VB.NET introduces language advances, a new runtime
engine and a significantly more powerful runtime library. The language changes
introduce full support for object-oriented programming and structured exception
handling. The new runtime engine, called the Common Language Runtime (CLR),
provides better memory management and support for multiple threads. The .NET
Framework Class Library (FCL) provides significantly more functionality than the VB
6 runtime library. In particular, the FCL provides more options for storing test data
to file and for manipulating files and directories in general.
Before exploring the new file I/O functionality in the FCL, it is worth pointing out that
the VB 6 approach to file I/O, using statements like Open, Print, Input and Close,
is still available in VB.NET, albeit with some syntax changes. For example, in VB 6
you would open a file, read from it and close it with the code shown in Figure 1.
Agilent Developer Network White Paper 2 of 17
Storing Test Data to File Using Microsoft® Visual Basic® .NET
November 2002
Figure 1 Reading a File using VB 6
The VB 6 file I/O statements have been replaced by a set of compatibility functions
including FileOpen, Print, Input and FileClose among others. For more
information see File Access Types, Functions, and Statements. Despite the syntax
changes, performing file I/O with these functions is basically the same as it was in
VB 6. You can continue to use that approach in VB.NET if you desire. Keep in mind
that the VB 6 compatibility methods mentioned above are merely wrappers on top of
the FCL’s native file I/O functionality. Besides bypassing an extra layer of code,
learning to use the FCL’s file I/O objects and methods directly gives you more
flexibility and an easier mechanism for saving test data, and it allows you to read
file I/O code in other .NET languages like C#. The rest of this article explores those
native FCL file I/O objects and methods.
The 1.0 version of the FCL contains more than 3000 public types that contribute over
16000 public properties, 33000 public methods and 4000 public events (I told you it
contained significantly more functionality than the VB 6 runtime library). Fortunately
this functionality is organized into many different namespaces such as System,
System.Collections, System.Net, System.Text, System.Threading and
System.Windows.Forms, to name just a few. The file I/O functionality is located in
the System.IO namespace.
There are a number of file I/O examples used throughout the rest of this article. All
of the examples dealing with test data use the data format shown in Figure 3 to
illustrate the different ways to save and retrieve test data.
The example code associated with this paper can be downloaded from
http://www.agilent.com/find/adnwhitepaperexamples.
Saving Test Data Using Comma Separated Value Text Format
One very popular text file format is “comma separated value” or CSV. CSV files can
be easily imported by applications like Microsoft Excel. Figure 4 shows how to save
data in a text file using CSV format.
The following line of code from Figure 4 demonstrates the use of the
System.IO.File class to open up a file for writing in text mode.
One interesting detail about the File.CreateText method is that it writes text using
UTF-8 encoding. The advantage of UTF-8 is that ASCII characters require only one
byte per character in the file, while other Unicode characters can still be saved,
although they occupy two or more bytes each. There are ways to force exclusively
ASCII or Unicode encoded text files. You will see an example of how to do this later.
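As an illustration of the CSV approach described above, here is a minimal sketch rather than the paper's own Figure 4 listing. The sample values, the temporary file name and the use of CultureInfo.InvariantCulture are assumptions made for this sketch; the field order (DUT id, timestamp, station id, array size, array values) follows the description of the reading code later in this section.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Module CsvWriteSketch
    Sub Main()
        ' Hypothetical sample values, chosen only for illustration.
        Dim dutId As String = "A00100"
        Dim timestamp As DateTime = DateTime.Now
        Dim stationId As Integer = 1
        Dim data() As Double = {1.1, 2.2, 3.3}

        Dim filename As String = Path.Combine(Path.GetTempPath(), "testdata.csv")

        ' File.CreateText returns a StreamWriter that encodes text as UTF-8.
        Dim writer As StreamWriter = File.CreateText(filename)
        Try
            ' The invariant culture keeps commas out of numbers and dates.
            Dim inv As CultureInfo = CultureInfo.InvariantCulture
            writer.Write(dutId)
            writer.Write(",")
            writer.Write(timestamp.ToString(inv))
            writer.Write(",")
            writer.Write(stationId.ToString(inv))
            writer.Write(",")
            ' Write the array size first so a reader knows how many values follow.
            writer.Write(data.Length.ToString(inv))
            For Each value As Double In data
                writer.Write(",")
                writer.Write(value.ToString("R", inv))
            Next
        Finally
            writer.Close()
        End Try

        ' Show what was written.
        Console.WriteLine(File.ReadAllText(filename))
    End Sub
End Module
```

Formatting with the invariant culture is a design choice of this sketch: on a system whose regional settings use a comma as the decimal separator, culture-sensitive formatting would put extra commas into the file.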
The code to read the test data from this file is shown in Figure 6.
' Read entire file as a string and split the string on commas
Dim strings As String() = reader.ReadToEnd().Split(","c)
The following line of code from Figure 6 opens a file for reading in text mode.
Then the code reads in the entire file using the StreamReader.ReadToEnd method.
This approach is fine for smaller files but it can use large amounts of memory for
large data files. I will show you a better approach to reading a large text file in the
next example. Since the file is in CSV format, this example uses the String.Split
method to split the string on commas. The implicit assumption here is that none of
our string values contain commas. That leaves an array of strings to parse. Since
the dutId variable is of type string no parsing is required. For timestamp, the
DateTime class fortunately provides a shared Parse method that takes a string
containing a date and time and creates a new DateTime object. As it turns out,
most of the VB.NET primitive types like Integer and Double also provide shared
Parse methods that take a string and return a value of the appropriate type. As a
result, reading the stationId is straightforward. For the array, this example parses
the array size first and allocates the appropriate size array. The following line of
code may look a bit unusual to a VB 6 programmer.
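The reading steps described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the Figure 6 listing: the sample file contents, the variable names and the invariant-culture parsing are assumptions. The array allocation near the end is the kind of statement that tends to look unfamiliar coming from VB 6, because the number in parentheses is the upper bound rather than the element count.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Module CsvReadSketch
    Sub Main()
        Dim filename As String = Path.Combine(Path.GetTempPath(), "testdata_read.csv")

        ' Create a small sample file so the sketch is self-contained.
        Dim writer As StreamWriter = File.CreateText(filename)
        writer.Write("A00100,10/11/2002 02:52:26,1,3,1.1,2.2,3.3")
        writer.Close()

        ' Open the file in text mode and read all of it at once.
        Dim reader As StreamReader = File.OpenText(filename)
        Dim contents As String = reader.ReadToEnd()
        reader.Close()

        ' Split on commas; this assumes no value contains a comma.
        Dim strings() As String = contents.Split(","c)

        Dim inv As CultureInfo = CultureInfo.InvariantCulture
        Dim dutId As String = strings(0)
        Dim timestamp As DateTime = DateTime.Parse(strings(1), inv)
        Dim stationId As Integer = Integer.Parse(strings(2), inv)

        ' The array size is stored ahead of the values.
        Dim size As Integer = Integer.Parse(strings(3), inv)
        ' Arrays are declared by upper bound, so size - 1 yields "size" elements.
        Dim data() As Double = New Double(size - 1) {}
        For i As Integer = 0 To size - 1
            data(i) = Double.Parse(strings(4 + i), inv)
        Next

        Console.WriteLine("{0} {1} {2} {3}", dutId, timestamp, stationId, data.Length)
    End Sub
End Module
```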
Saving Test Data Using Name Value Pair Text Format
One of the problems with CSV format is that if you want to store strings that contain
embedded commas, then parsing the file contents becomes much harder. Certainly
String.Split wouldn’t work very well in this case. Another approach to saving data
that doesn’t suffer from problems handling special characters like commas is “name
value” format. Furthermore, since each test data element is saved on its own line in
the file, it is easier to read the file in smaller chunks versus reading the entire file at
once. Figure 7 demonstrates how to save the same test data in name value pair
format.
Notice in this example that a StreamWriter object is created directly rather than
using File.CreateText as shown in this line of code.
Saving test data out in name value format is pretty easy. You just write out a name
for each element of the test data followed by an “=” sign followed by the value for
that element of the test data. The file contents produced by this approach are
shown in Figure 8.
DutId=A00100
Timestamp=10/11/2002 2:52:26 AM
StationId=1
Data=1.1,2.2,3.3
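A minimal sketch of code that could produce a file like the one above is shown here. It is not the paper's Figure 7 listing; the sample values, the temporary file name and the invariant-culture formatting are assumptions. It creates the StreamWriter directly from a filename, as the text describes.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Module NameValueWriteSketch
    Sub Main()
        ' Hypothetical sample values.
        Dim dutId As String = "A00100"
        Dim timestamp As DateTime = DateTime.Now
        Dim stationId As Integer = 1
        Dim data() As Double = {1.1, 2.2, 3.3}

        Dim filename As String = Path.Combine(Path.GetTempPath(), "testdata.txt")
        Dim inv As CultureInfo = CultureInfo.InvariantCulture

        ' Create the StreamWriter directly instead of calling File.CreateText.
        Dim writer As New StreamWriter(filename)
        Try
            ' One name=value pair per line.
            writer.WriteLine("DutId=" & dutId)
            writer.WriteLine("Timestamp=" & timestamp.ToString(inv))
            writer.WriteLine("StationId=" & stationId.ToString(inv))

            ' Join the array values with commas on a single line.
            Dim parts(data.Length - 1) As String
            For i As Integer = 0 To data.Length - 1
                parts(i) = data(i).ToString("R", inv)
            Next
            writer.WriteLine("Data=" & String.Join(",", parts))
        Finally
            writer.Close()
        End Try

        Console.WriteLine(File.ReadAllText(filename))
    End Sub
End Module
```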
The code to read the test data from this file is shown in Figure 9.
' Parse each value - note with this scheme you don't
' have to read the data in the order it was written
Select Case name
    Case "DutId"
        dutId = value
    Case "Timestamp"
        timestamp = DateTime.Parse(value)
    Case "StationId"
        stationId = Integer.Parse(value)
    Case "Data"
        ' Split the comma separated values and parse each one
        Dim dataStr() As String = value.Split(","c)
        data = New Double(UBound(dataStr)) {}
        Dim i As Integer
        For i = 0 To UBound(dataStr)
            data(i) = Double.Parse(dataStr(i))
        Next
End Select
Again, this example directly creates a StreamReader but this time rather than
reading in the entire file contents at once using the StreamReader.ReadToEnd
method, it reads in the file one line at a time using the StreamReader.ReadLine
method. This works nicely because each name value pair was saved on its own line.
The example uses a Select statement to execute different parse code depending
upon the name value pair that was read from each line of the file. One minor
advantage of this approach is that the data doesn’t have to appear in any particular
order in the file. Putting the code in the Case statements is manageable for a small
example like this but if the amount of data starts getting larger you might want to
put the parsing code for each piece of data in its own function.
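The line-at-a-time reading loop described above might look like the following sketch. The sample file, the helper name ParseDoubles and the decision to split each line on the first “=” are assumptions of this sketch, not details taken from Figure 9. The array parsing is moved into its own function to illustrate the suggestion about larger data sets, and the sample file deliberately stores the pairs in a different order than the earlier examples.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Module NameValueReadSketch
    ' Hypothetical helper: parses a comma separated list of doubles.
    Function ParseDoubles(ByVal valueText As String) As Double()
        Dim parts() As String = valueText.Split(","c)
        Dim result(parts.Length - 1) As Double
        For i As Integer = 0 To parts.Length - 1
            result(i) = Double.Parse(parts(i), CultureInfo.InvariantCulture)
        Next
        Return result
    End Function

    Sub Main()
        Dim filename As String = Path.Combine(Path.GetTempPath(), "testdata_nv.txt")

        ' Create a sample file; the order of the lines does not matter.
        Dim writer As New StreamWriter(filename)
        writer.WriteLine("StationId=1")
        writer.WriteLine("DutId=A00100")
        writer.WriteLine("Data=1.1,2.2,3.3")
        writer.Close()

        Dim dutId As String = Nothing
        Dim stationId As Integer = 0
        Dim data() As Double = Nothing

        Dim reader As New StreamReader(filename)
        Try
            ' ReadLine returns Nothing at the end of the file.
            Dim line As String = reader.ReadLine()
            While Not line Is Nothing
                ' Split on the first "=" only, so a value may itself contain "=".
                Dim pos As Integer = line.IndexOf("="c)
                If pos > 0 Then
                    Dim name As String = line.Substring(0, pos)
                    Dim value As String = line.Substring(pos + 1)
                    Select Case name
                        Case "DutId"
                            dutId = value
                        Case "StationId"
                            stationId = Integer.Parse(value, CultureInfo.InvariantCulture)
                        Case "Data"
                            data = ParseDoubles(value)
                    End Select
                End If
                line = reader.ReadLine()
            End While
        Finally
            reader.Close()
        End Try

        Console.WriteLine("{0} {1} {2}", dutId, stationId, data.Length)   ' prints A00100 1 3
    End Sub
End Module
```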
Saving Data Using BinaryWriter
The previous two examples create files that are human readable, which has its
advantages. However, if you have a large amount of test data, reading in and
parsing data stored as text can be slow. If performance is your primary concern
then consider saving your test data to a binary file as shown in Figure 10.
Microsoft recognized this and provided several different reader and writer classes.
StreamReader and StreamWriter work with any type of stream object including
FileStream and provide services for reading and writing data in text format.
StreamWriter allows you to select the text encoding you desire; remember it
defaults to UTF-8. StreamReader can typically determine the encoding type based
on the existence and value of a byte order mark (BOM) in the text file that it is
reading. Of course, you can always explicitly choose which encoding
StreamReader should use.
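To make the encoding options concrete, here is a small sketch that writes the same text as ASCII and as Unicode (UTF-16) and then reads one file back with byte order mark detection. The file names and sample text are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module EncodingSketch
    Sub Main()
        Dim asciiFile As String = Path.Combine(Path.GetTempPath(), "ascii.txt")
        Dim unicodeFile As String = Path.Combine(Path.GetTempPath(), "unicode.txt")

        ' The second argument (False) means overwrite rather than append.
        Dim asciiWriter As New StreamWriter(asciiFile, False, Encoding.ASCII)
        asciiWriter.Write("DutId=A00100")
        asciiWriter.Close()

        ' Encoding.Unicode is UTF-16, two bytes per character for this text.
        Dim unicodeWriter As New StreamWriter(unicodeFile, False, Encoding.Unicode)
        unicodeWriter.Write("DutId=A00100")
        unicodeWriter.Close()

        ' The UTF-16 file is larger than the ASCII file for the same text.
        Console.WriteLine(New FileInfo(asciiFile).Length)
        Console.WriteLine(New FileInfo(unicodeFile).Length)

        ' True asks StreamReader to detect the encoding from a byte order mark.
        Dim reader As New StreamReader(unicodeFile, True)
        Console.WriteLine(reader.ReadToEnd())
        reader.Close()
    End Sub
End Module
```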
You may also notice the StringReader and StringWriter classes, which are very
similar in functionality to a StreamReader and StreamWriter. The primary
difference is that StringReader and StringWriter operate on a string instead of a
Stream.
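A brief sketch of the string-based counterparts follows; the sample text is an assumption. The same WriteLine and ReadLine calls used with files work here against an in-memory string.

```vbnet
Imports System
Imports System.IO

Module StringReaderWriterSketch
    Sub Main()
        ' StringWriter accumulates text in memory instead of in a file.
        Dim writer As New StringWriter()
        writer.WriteLine("DutId=A00100")
        writer.WriteLine("StationId=1")
        Dim contents As String = writer.ToString()

        ' StringReader offers the same ReadLine method as StreamReader.
        Dim reader As New StringReader(contents)
        Dim line As String = reader.ReadLine()
        While Not line Is Nothing
            Console.WriteLine(line)
            line = reader.ReadLine()
        End While
    End Sub
End Module
```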
BinaryReader and BinaryWriter provide services for reading and writing primitive
data types in binary format. As you might notice in Figure 10, writing data to file
using the BinaryWriter is very straightforward. One thing to note about all Stream
related reader and writer objects is that when you call the Close method on them,
they automatically close the stream they are using. The contents of the generated
file are shown in Figure 11. Because it is a binary format, characters that cannot be
displayed are represented by their 3 digit decimal value.
006 A 0 0 1 0 2 022 1 0 / 1 2 / 2 0
0 2 1 1 : 2 5 : 2 0 P M 005 000
000 000 002 000 000 000 f f f f f f 010 @ 017 @
000 000 000 000 000 000 022 @
The code to read the test data from this binary file is shown in Figure 12.
Reading the test data from the binary file is also pretty straightforward. After
creating a FileStream and a BinaryReader, this example uses BinaryReader
methods like ReadString, ReadInt32 and ReadDouble to read the test data. One
thing to note is that BinaryReader and BinaryWriter don’t handle DateTime
objects directly so you have to store the DateTime object as text and then read it
back in as text. Like the first example (Figure 6), you have to read the test data in
the same order in which it was saved out.
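The write and read steps described in this section can be sketched as a single round trip. This is an illustration under assumptions, not the Figure 10 and Figure 12 listings: the sample values, the file name and the invariant-culture date formatting are assumptions. Note that the default text form of a DateTime drops fractions of a second.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Module BinaryRoundTripSketch
    Sub Main()
        Dim filename As String = Path.Combine(Path.GetTempPath(), "testdata.bin")
        Dim inv As CultureInfo = CultureInfo.InvariantCulture
        Dim data() As Double = {3.3, 5.5}

        Dim writer As New BinaryWriter(New FileStream(filename, FileMode.Create))
        Try
            writer.Write("A00102")                     ' length-prefixed string
            writer.Write(DateTime.Now.ToString(inv))   ' DateTime stored as text
            writer.Write(5)                            ' station id as a 4-byte Integer
            writer.Write(data.Length)                  ' array size before the values
            For Each value As Double In data
                writer.Write(value)                    ' 8 bytes per Double
            Next
        Finally
            writer.Close()   ' also closes the underlying FileStream
        End Try

        ' Read the values back in exactly the order they were written.
        Dim reader As New BinaryReader(New FileStream(filename, FileMode.Open))
        Try
            Dim dutId As String = reader.ReadString()
            Dim timestamp As DateTime = DateTime.Parse(reader.ReadString(), inv)
            Dim stationId As Integer = reader.ReadInt32()
            Dim count As Integer = reader.ReadInt32()
            Dim values(count - 1) As Double
            For i As Integer = 0 To count - 1
                values(i) = reader.ReadDouble()
            Next
            Console.WriteLine("{0} {1} {2} {3}", dutId, timestamp, stationId, values.Length)
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```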
Saving Data Using Binary Serialization
You have seen various ways to save test data in both text and binary
format. One thing in common with all of these approaches is that code had to be
written to save each individual element of the test data as well as parse each
individual element when reading the data back from file. Fortunately, the FCL
provides a mechanism referred to as serialization that does all of this tedious work
for us. All you have to do is package the test data in a class and mark that class as
serializable as shown in Figure 13. The FCL introduces a new concept called
attributes. Just decorate the TestData class with the SerializableAttribute using
the notation “<Serializable()>”. This attribute tells the CLR that any objects of
type TestData can be serialized.
All you have to do to serialize the test data is create a FileStream object and a
BinaryFormatter object (located in the
System.Runtime.Serialization.Formatters.Binary namespace) and then tell the
BinaryFormatter object to save our test data array to the FileStream object as
shown in Figure 14.
' Create FileStream on which we will serialize our test data and
' create BinaryFormatter that will do the serialization
Dim stream As FileStream = New FileStream(filename, FileMode.Create)
Dim formatter As BinaryFormatter = New BinaryFormatter()
The following line of code calls a shared method in the TestData class that is not
shown in the TestData class definition in Figure 13.
The sample project associated with this article does contain the GenerateData
method. This method simply creates two TestData objects with fake data and
returns an array of TestData containing these two objects.
Note that the following line is only necessary when you strong name your
assemblies.
formatter.AssemblyFormat = FormatterAssemblyStyle.Simple
When you strong name an assembly that contains a serializable class or structure,
the assembly version is saved out to file with a serialized class or structure instance.
If you then update the version number on that assembly, you will get an error when
you try to deserialize data that was saved by the earlier version.
The file contents of the generated file are shown in Figure 15. Again, because it is a
binary format, characters that cannot be displayed are represented by their 3 digit
decimal value.
000 001 000 000 000 001 000 000 000 000 000 000 000 012 002 000
000 000 006 F i l e I O 007 001 000 000 000 000 001
000 000 000 002 000 000 000 004 015 F i l e I O .
T e s t D a t a 002 000 000 000 009 003 000 000
000 009 004 000 000 000 005 003 000 000 000 015 F i l e
I O . T e s t D a t a 004 000 000 000 005
D u t I d 009 T i m e s t a m p 009
S t a t i o n I d 004 D a t a 001 000
000 007 013 008 006 002 000 000 000 006 005 000 000 000 006 A
0 0 1 0 0 I c j @ 008 001 000 000 000 009 006
000 000 000 001 004 000 000 000 003 000 000 000 006 007 000 000
000 006 A 0 0 1 0 2 I c j @ 008 005 000 000
000 009 008 000 000 000 015 006 000 000 000 003 000 000 000 006
? 001 @ f f f f f f 010 @ 015 008 000 000 000
003 000 000 000 006 017 @ 000 000 000 000 000 000 022 @ f
f f f f f 026 @ 011
The code to read the data from file, a process referred to as deserialization, is shown
in Figure 16.
' Deserialize data & type cast it from System.Object to correct type
Dim data() As TestData = CType(formatter.Deserialize(stream), _
TestData())
This code looks remarkably similar to our serialization code. A FileStream object is
created but this time using FileMode.Open instead of FileMode.Create mode. The
example uses a BinaryFormatter like before but this time the Deserialize method
is called and passed the FileStream object for the data file. The return type of the
Deserialize method is Object so the example uses VB.NET’s CType function to cast
the result to the correct type, an array of TestData.
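A complete serialization round trip might look like the sketch below. It assumes the .NET Framework, where BinaryFormatter is available; current .NET releases mark the class obsolete and disable it for security reasons. The TestData definition is an assumption modeled on the member names visible in the file dump above (DutId, Timestamp, StationId, Data), since Figure 13 is not reproduced here, and the sample values are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Binary

' Assumed shape of the test data class; public fields keep the sketch short.
<Serializable()> _
Public Class TestData
    Public DutId As String
    Public Timestamp As DateTime
    Public StationId As Integer
    Public Data() As Double
End Class

Module BinarySerializationSketch
    Sub Main()
        Dim filename As String = Path.Combine(Path.GetTempPath(), "testdata.ser")

        ' Hypothetical sample data.
        Dim item As New TestData()
        item.DutId = "A00100"
        item.Timestamp = DateTime.Now
        item.StationId = 1
        item.Data = New Double() {1.1, 2.2, 3.3}
        Dim items() As TestData = {item}

        Dim formatter As New BinaryFormatter()

        ' Serialize the whole array in one call.
        Dim stream As New FileStream(filename, FileMode.Create)
        Try
            formatter.Serialize(stream, items)
        Finally
            stream.Close()
        End Try

        ' Deserialize returns Object, so cast it back to an array of TestData.
        stream = New FileStream(filename, FileMode.Open)
        Try
            Dim loaded() As TestData = CType(formatter.Deserialize(stream), TestData())
            Console.WriteLine("{0} {1}", loaded.Length, loaded(0).DutId)
        Finally
            stream.Close()
        End Try
    End Sub
End Module
```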
Saving Data Using XML Serialization
Binary serialization looks pretty good. However, when performance isn’t a
primary concern, there are advantages to saving data in text format. This next
example serializes the test data in XML. Besides being a human readable format,
XML is supported by many tools for working with and displaying data.
Technically, this example could have used our previous definition for TestData to
serialize in XML format. However, to demonstrate the ways that the XML output can
be customized during serialization this example uses a different test data class called
TestDataXml. This class, shown in Figure 17, demonstrates the use of the
XmlAttributeAttribute (located in the System.Xml.Serialization namespace) to
mark certain public fields so that they appear as XML attributes instead of XML
elements in the output. Note also that in the case of the DutId field you can even
change the name of the XML attribute to DutIdentifier.
Imports System.Xml.Serialization

<Serializable()> _
Public Class TestDataXml
    <XmlAttributeAttribute("DutIdentifier")> _
    Public DutId As String
    <XmlAttributeAttribute()> _
    Public Timestamp As DateTime
    Public StationId As Integer
    Public Data() As Double
End Class
The code to serialize the TestDataXml array in XML format is shown in Figure 18.
' Create StreamWriter on which the test data will be serialized and
' create XmlSerializer that will do the serialization
Dim writer As New StreamWriter(filename)
Dim serializer As New XmlSerializer(GetType(TestDataXml()))
The code to deserialize the data is shown in Figure 20. As you can see, the code is
very similar to what was used in the binary deserialization example. Again, the
primary difference is that this code creates an XmlSerializer object instead of a
BinaryFormatter. You must tell the XmlSerializer what type of object you need to
deserialize and then call the XmlSerializer.Deserialize method to retrieve the data.
' Deserialize data & type cast it from System.Object to correct type
Dim data() As TestDataXml = CType(serializer.Deserialize(reader), _
TestDataXml())
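Putting the XML serialization and deserialization fragments together gives a round trip like the sketch below. The TestDataXml class repeats the definition from Figure 17 so the sketch is self-contained; the sample values and file name are assumptions. The class is declared at the top level because XmlSerializer only processes public types.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

<Serializable()> _
Public Class TestDataXml
    <XmlAttributeAttribute("DutIdentifier")> _
    Public DutId As String
    <XmlAttributeAttribute()> _
    Public Timestamp As DateTime
    Public StationId As Integer
    Public Data() As Double
End Class

Module XmlSerializationSketch
    Sub Main()
        Dim filename As String = Path.Combine(Path.GetTempPath(), "testdata.xml")

        ' Hypothetical sample data.
        Dim item As New TestDataXml()
        item.DutId = "A00100"
        item.Timestamp = DateTime.Now
        item.StationId = 1
        item.Data = New Double() {1.1, 2.2, 3.3}
        Dim items() As TestDataXml = {item}

        ' The serializer must be told which type it will handle.
        Dim serializer As New XmlSerializer(GetType(TestDataXml()))

        Dim writer As New StreamWriter(filename)
        Try
            serializer.Serialize(writer, items)
        Finally
            writer.Close()
        End Try

        ' Print the XML to show DutIdentifier and Timestamp as attributes.
        Console.WriteLine(File.ReadAllText(filename))

        Dim reader As New StreamReader(filename)
        Try
            Dim loaded() As TestDataXml = CType(serializer.Deserialize(reader), TestDataXml())
            Console.WriteLine("{0} {1}", loaded.Length, loaded(0).DutId)
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```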
Reading Application Configuration Settings
One fairly common reason for doing file I/O in an application is to read application
configuration data from file. You can easily add an application configuration file to
an application by selecting the “Project -> Add New Item” menu item in VS.NET and
selecting “Application Configuration File”. The contents of the application
configuration file used in this example are shown in Figure 21.
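As a rough sketch of how such settings can be read, the code below uses the ConfigurationSettings.AppSettings collection that shipped with the 1.0 Framework; later Framework versions favor ConfigurationManager.AppSettings instead. The key name DataDirectory, its value and the fallback are hypothetical, since the contents of Figure 21 are not reproduced here.

```vbnet
Imports System
Imports System.Configuration

Module ConfigSketch
    Sub Main()
        ' Assumes an application configuration file with a hypothetical entry:
        '   <configuration>
        '     <appSettings>
        '       <add key="DataDirectory" value="C:\TestData" />
        '     </appSettings>
        '   </configuration>
        Dim dataDirectory As String = ConfigurationSettings.AppSettings("DataDirectory")

        ' AppSettings returns Nothing when the key is missing.
        If dataDirectory Is Nothing Then
            dataDirectory = "."
        End If

        Console.WriteLine(dataDirectory)
    End Sub
End Module
```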
File and Directory Operations
There are two classes related to files in the System.IO namespace. One is File and
the other is FileInfo. File contains only shared methods that allow you to do things
like check if a file exists, create a file, move a file, copy a file, delete a file and get file
attributes. For all of these methods you only need a filename (or two) to
perform the operation. You can do all of this without ever creating a File object.
FileInfo contains very similar functionality except that it contains only instance
methods, which means you have to create a FileInfo object first. You can do this by
passing a filename to the FileInfo constructor. However, the
more common scenario is that you get a FileInfo object back from a call to a
method like DirectoryInfo.GetFiles, which returns an array of FileInfo objects.
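The difference between the shared File methods and FileInfo instances can be seen in a short sketch. The file name and contents are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.IO

Module FileVersusFileInfoSketch
    Sub Main()
        Dim filename As String = Path.Combine(Path.GetTempPath(), "fileinfo_demo.txt")

        ' Create a small file to work with.
        Dim writer As StreamWriter = File.CreateText(filename)
        writer.Write("test")
        writer.Close()

        ' Shared methods on File need only a filename.
        Console.WriteLine(File.Exists(filename))
        Console.WriteLine(File.GetLastWriteTime(filename))

        ' FileInfo offers similar operations through an instance.
        Dim info As New FileInfo(filename)
        Console.WriteLine(info.Length)
        Console.WriteLine(info.Extension)

        ' DirectoryInfo.GetFiles returns an array of FileInfo objects.
        Dim dirInfo As New DirectoryInfo(Path.GetTempPath())
        For Each f As FileInfo In dirInfo.GetFiles("fileinfo_demo.*")
            Console.WriteLine(f.Name)
        Next

        File.Delete(filename)
    End Sub
End Module
```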
There are also two classes related to directories. Just like the File class, the
Directory class has only shared methods that require only a directory name (or two)
to do operations like checking if a directory exists, creating a directory, moving a
directory, deleting a directory, getting the parent directory and getting and setting
the current directory. There are also methods for enumerating the files and
directories contained within a directory. The DirectoryInfo class contains
functionality that is very similar to the Directory class, but, like the FileInfo class,
it contains only instance methods. In order to use these instance methods, you will
need to create a DirectoryInfo object first.
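The same split applies to directories, as the following sketch suggests. The directory and file names are hypothetical.

```vbnet
Imports System
Imports System.IO

Module DirectorySketch
    Sub Main()
        Dim dirName As String = Path.Combine(Path.GetTempPath(), "fileio_demo_dir")

        ' Shared Directory methods need only a directory name.
        If Not Directory.Exists(dirName) Then
            Directory.CreateDirectory(dirName)
        End If
        Console.WriteLine(Directory.GetParent(dirName).FullName)

        ' Put one file in the directory, then enumerate its files.
        Dim writer As StreamWriter = File.CreateText(Path.Combine(dirName, "a.txt"))
        writer.Close()
        For Each name As String In Directory.GetFiles(dirName)
            Console.WriteLine(name)
        Next

        ' DirectoryInfo exposes similar operations as instance members.
        Dim info As New DirectoryInfo(dirName)
        Console.WriteLine(info.Name)

        ' Passing True deletes the directory together with its contents.
        info.Delete(True)
        Console.WriteLine(Directory.Exists(dirName))
    End Sub
End Module
```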
Another useful class in the System.IO namespace is the Path class. It contains
shared methods for operations like getting the filename portion of a path, getting the
directory portion of a path, getting the extension portion of a path, changing the
extension of a filename, getting a temporary filename and getting the temp path on
the system.
When you need to find the location of a special folder like the DesktopDirectory or
the ApplicationData directory, you can use the Environment.GetFolderPath
method. It takes a value from the Environment.SpecialFolder enumeration and
returns the path to that folder on the system.
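A short sketch of the Path and Environment.GetFolderPath calls mentioned above follows. The sample filename is an assumption; the printed directory and folder paths depend on the system.

```vbnet
Imports System
Imports System.IO

Module PathSketch
    Sub Main()
        Dim fullPath As String = Path.Combine(Path.GetTempPath(), "testdata.csv")

        Console.WriteLine(Path.GetFileName(fullPath))        ' testdata.csv
        Console.WriteLine(Path.GetExtension(fullPath))       ' .csv
        Console.WriteLine(Path.GetDirectoryName(fullPath))
        Console.WriteLine(Path.ChangeExtension(fullPath, ".bak"))

        ' GetTempFileName creates an empty, uniquely named file and returns its path.
        Dim tempFile As String = Path.GetTempFileName()
        Console.WriteLine(tempFile)
        File.Delete(tempFile)

        ' Special folders are looked up through the SpecialFolder enumeration.
        Console.WriteLine(Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory))
        Console.WriteLine(Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData))
    End Sub
End Module
```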
The code shown in Figure 23 demonstrates some of the various file and directory
operations that can be done with System.IO types.
' Copy original file to new file but don't overwrite an existing file
Dim copiedFilename As String = Path.ChangeExtension(filename, ".bak")
File.Copy(filename, copiedFilename, False)
newDirInfo.Delete(True)
Monitoring Files and Directories with FileSystemWatcher
Occasionally you may have the need to monitor files or directories and perform an
operation like reloading a file whenever it has been changed outside of the current
application. Microsoft needed this functionality for ASP.NET and they made it
available for everyone else to use. The code in Figure 24 shows how easy it is to
set up a FileSystemWatcher to notify a program whenever the specified file is
written to.
This code creates a FileSystemWatcher object and configures it to monitor the file
specified by filename. The code specifies that it is only interested in events related
to file writes by setting the NotifyFilter property to NotifyFilters.LastWrite.
Whenever the file is written to, the event handler associated with the
FileSystemWatcher gets fired. You may experience more file write related events
being fired than you might expect. It turns out that when you create a file,
FileStream’s constructor writes a zero length file which fires the event handler.
Then, when the FileStream is closed, the event handler is fired again when the file
is flushed out to disk.
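The setup described above can be sketched as follows. This is an illustration under assumptions rather than the Figure 24 listing: the file name, the handler name OnChanged and the use of a ManualResetEvent to wait for the first notification are choices made for this sketch. As the text notes, the handler may run more than once for a single logical write.

```vbnet
Imports System
Imports System.IO
Imports System.Threading

Module WatcherSketch
    ' Signaled by the event handler when a change notification arrives.
    Private changeSeen As New ManualResetEvent(False)

    ' Hypothetical handler; it runs on a thread other than the main thread.
    Private Sub OnChanged(ByVal sender As Object, ByVal e As FileSystemEventArgs)
        Console.WriteLine("{0}: {1}", e.ChangeType, e.FullPath)
        changeSeen.Set()
    End Sub

    Sub Main()
        Dim filename As String = Path.Combine(Path.GetTempPath(), "watched.txt")
        File.CreateText(filename).Close()

        ' Watch one file: Path is its directory, Filter is its name.
        Dim watcher As New FileSystemWatcher()
        watcher.Path = Path.GetDirectoryName(filename)
        watcher.Filter = Path.GetFileName(filename)
        watcher.NotifyFilter = NotifyFilters.LastWrite
        AddHandler watcher.Changed, AddressOf OnChanged
        watcher.EnableRaisingEvents = True

        ' Write to the file so there is a change to report.
        Dim writer As StreamWriter = File.AppendText(filename)
        writer.WriteLine("new data")
        writer.Close()

        ' Wait up to five seconds for the first notification.
        Dim signaled As Boolean = changeSeen.WaitOne(5000, False)
        Console.WriteLine(signaled)

        watcher.EnableRaisingEvents = False
        watcher.Dispose()
        File.Delete(filename)
    End Sub
End Module
```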
Conclusions
In this article, we have looked at a simple way to read application configuration files,
manipulate files and directories as well as monitor files. We have also explored five
different methods for saving and retrieving test data from file in both text and binary
formats.
The System.IO namespace in the FCL provides low level, flexible methods for
creating, opening, writing to and reading from files as well as high level, simple to
use methods for doing the same. Even better, the serialization technique provides a
very simple approach to saving and retrieving test data in which the FCL does most
of the work for you.
Microsoft, Visual Basic, and MSDN are either registered trademarks or trademarks of Microsoft Corporation
in the United States and/or other countries.