Monthly Archives: March 2010

Parsing Key-Value Pairs Via Regular Expressions

Filed under Code Garage, Regular Expressions, VB Feng Shui

I’ve often found myself in need of a Key-Value pair parser. Simple stuff, really. Essentially, the idea is to be able to parse any of the following from a typical command buffer:

Key=Value (no whitespace in the key or value)

Key="Value" (whitespace is ok in the value)

Key (when just the existence of the key signals something)

This sort of parser is fairly easy to write, but this time, I’d just finished playing with regular expressions for another parsing task, so I thought, why not give them a try here?

After a few minutes with the Rad Regular Expression Designer, I’d put together what appeared to be a pretty robust expression for this.

My version keys off the matches instead of the separators. I did this mainly because I wanted the Key and Value parts to be returned as “cleanly” as possible. That means the Key should be just the Key, with no whitespace or “=”, and the Value should never include the leading or trailing quote marks (if they’re there).

The end result is a function that takes a string buffer and returns a generic Dictionary of Key Value string pairs.

Imports System.Text.RegularExpressions


Module RegEx

    Public Function ParseKeyValuePairs(ByVal Buffer As String) As Dictionary(Of String, String)
        Dim Result = New Dictionary(Of String, String)

        '---- There are 3 sub patterns contained here, separated at the | characters
        '     The first retrieves name="value", honoring doubled inner quotes
        '     The second retrieves name=value where value can't contain spaces
        '     The third retrieves name alone, where there is no "=value" part (ie a "flag" key
        '        where simply its existence has meaning)
        Dim Pattern = "(?:(?<key>\w+)\s*\=\s*""(?<value>[^""]*(?:""""[^""]*)*)"") | " & _
                      "(?:(?<key>\w+)\s*\=\s*(?<value>[^""\s]*)) | " & _
                      "(?:(?<key>\w+)\s*)"
        Dim r = New System.Text.RegularExpressions.Regex(Pattern, RegexOptions.IgnorePatternWhitespace)

        '---- parse the matches
        Dim m As System.Text.RegularExpressions.MatchCollection = r.Matches(Buffer)

        '---- break the matches up into Key value pairs in the return dictionary
        For Each Match As System.Text.RegularExpressions.Match In m
            Result.Add(Match.Groups("key").Value, Match.Groups("value").Value)
        Next
        Return Result
    End Function


    Public Sub Test()
        Dim s = "Key1=Value Key2=""My Value here"" Key3=Test Key4 Key5"
        Dim r = ParseKeyValuePairs(s)
        For Each i In r
            Debug.Print(i.Key & "=" & i.Value)
        Next
    End Sub
End Module

I’ve included a simple test function to help validate it.

DISCLAIMER: I’m no RegEx guru, so there are likely much faster ways to assemble the Regex. If you have one, by all means please comment! And finally, it’s entirely possible that I’ve missed some examples of badly formed input that would cause weird parsing results.

For instance, in the above example, notice that “Key3=Test Key4 Key5” will return Key3 set to “Test” and Key4 and Key5 set to empty strings.

If the user meant for Key3’s value to be “Test Key4 Key5”, there would need to be quotes around the value.

But, parsing issues like that will be the norm in any kind of parsing logic for formats such as this, so I’m not terribly worried about it.
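
Two other edge cases are worth knowing about with the function above. Dictionary.Add throws an ArgumentException if the same key shows up twice in the buffer, and a quoted value that contains doubled quotes comes back with the quotes still doubled. Here’s a minimal sketch of a more forgiving variant, assuming “last one wins” is acceptable for repeated keys. The names KeyValueSketch and ParseKeyValuePairsLenient are made up for illustration; the pattern is the same one used above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module KeyValueSketch

    '---- Hypothetical variant of ParseKeyValuePairs (the name is illustrative only)
    Public Function ParseKeyValuePairsLenient(ByVal Buffer As String) As Dictionary(Of String, String)
        Dim Result As New Dictionary(Of String, String)

        '---- same three sub patterns as the original function
        Dim Pattern As String = "(?:(?<key>\w+)\s*\=\s*""(?<value>[^""]*(?:""""[^""]*)*)"") | " & _
                                "(?:(?<key>\w+)\s*\=\s*(?<value>[^""\s]*)) | " & _
                                "(?:(?<key>\w+)\s*)"
        Dim r As New Regex(Pattern, RegexOptions.IgnorePatternWhitespace)

        For Each m As Match In r.Matches(Buffer)
            '---- collapse doubled quotes inside a quoted value down to a single quote
            Dim Value As String = m.Groups("value").Value.Replace("""""", """")

            '---- the indexer overwrites an existing key, where Add would throw
            Result(m.Groups("key").Value) = Value
        Next
        Return Result
    End Function


    Public Sub Main()
        '---- the buffer is: Msg="say ""hi"" now" Key=x Key=y Flag
        Dim r As Dictionary(Of String, String) = _
            ParseKeyValuePairsLenient("Msg=""say """"hi"""" now"" Key=x Key=y Flag")
        Console.WriteLine(r("Msg"))
        Console.WriteLine(r("Key"))
        Console.WriteLine(r.ContainsKey("Flag"))
    End Sub
End Module
```

Whether a repeated key should overwrite, be ignored, or raise an error really depends on the command buffer you’re parsing, so treat the overwrite here as one option, not a recommendation.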

Implicit Casts in VB.net

Filed under Code Garage, VB Feng Shui

I really hadn’t paid much attention to Implicit Casting in VB. I’d heard the term, thought it might be an interesting idea, but then completely forgot about it.

But I was recently reading a blog post discussing the EntitySpaces ORM, and the author happened to offhandedly mention implicit casts with respect to certain features of that ORM. I was intrigued again, so I set off down that road.

Come to find out, Implicit casts were one of the language features new in VS2005! (wow, how’d I miss that?) Essentially, they allow you to explicitly dictate what happens when you implicitly cast one object to a different type.

How’s that again?

VB is rife with instances where you might need to implicitly cast one object to another of a different type. You’ve got the obvious situations, for instance, converting an object from a subtype to a super type (granted, not the most ideal example, but it illustrates the point):

Dim d = New Dog
Dim a As Animal = d

And then there are more subtle examples. Here, I’ve created a Dog object with the name “Rover” but then I’m comparing it directly to a string. The implicit conversion is from Dog to String:

Dim d = New Dog
d.Name = "Rover"

If d = "Rover" Then blah....

Now, before anyone rails on me for encouraging bad programming practices, these are all contrived examples, just to point out the possibilities.

The point is, if you find yourself needing to cast one object into another, and such casting makes logical sense, implicit casting provides a very clean, terse way to accomplish it.

Widening or Narrowing?

Any time you convert one object to another, there are only two possible outcomes:

  • The conversion will always succeed. This is known as a Widening Conversion.
  • The conversion could succeed or fail. This is known as a Narrowing Conversion.

A lot of documentation on the matter will describe a Widening Conversion as one that converts a derived type to one of its base types, and a Narrowing Conversion as one that converts a base type to a derived type, but in practice, the two types don’t have to be related at all.

For instance, you can define either a Narrowing or a Widening conversion operator to convert from a StringBuilder Class to a String and vice versa, but neither type is directly related to the other.
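
To make the distinction concrete, here’s a minimal sketch using the contrived Dog class from earlier. Everything in it is an assumption made for illustration: the Dog class with a public Name field, the rule that an empty string can’t become a Dog, and the ConversionSketch module name. Dog to String is declared Widening because it never fails, while String to Dog is declared Narrowing because it might.

```vbnet
Option Strict On
Imports System

Public Class Dog
    Public Name As String

    '---- Dog -> String can always succeed, so it is declared Widening
    Public Shared Widening Operator CType(ByVal d As Dog) As String
        If d Is Nothing Then Return Nothing
        Return d.Name
    End Operator

    '---- String -> Dog can fail (no name given), so it is declared Narrowing
    Public Shared Narrowing Operator CType(ByVal s As String) As Dog
        If String.IsNullOrEmpty(s) Then
            Throw New InvalidCastException("A Dog needs a name")
        End If
        Dim d As New Dog
        d.Name = s
        Return d
    End Operator
End Class


Module ConversionSketch
    Public Sub Main()
        Dim d As New Dog
        d.Name = "Rover"

        '---- implicit conversion, allowed under Option Strict On because it is Widening
        Dim s As String = d

        '---- the Narrowing conversion has to be requested explicitly with CType
        Dim d2 As Dog = CType("Rex", Dog)

        Console.WriteLine(s)
        Console.WriteLine(d2.Name)
    End Sub
End Module
```

With Option Strict On, writing Dim d2 As Dog = "Rex" without the CType is a compile error, which is the compiler’s way of making you acknowledge that the conversion might fail.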

The Caveats

There’s always at least one, right? Well, first, as you might guess, used with abandon, implicit casting can lead to code that bears a striking (and quite unwelcome) resemblance to old school VB code chock full of variants.

But a less obvious snag is that when you cast, you need to pay special attention to whether you’re creating a new object or just recasting the existing object.

For instance, you might define a Widening Conversion from Dog to Animal as:

Public Shared Widening Operator CType(ByVal InitialData As Dog) As Animal
    Dim a = New Animal
    a.Name = InitialData.Name
    Return a
End Operator

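Note, though, that this doesn’t recast the existing Dog at all. A brand new Animal object with the same Name as the original Dog object is created. That might be what you want, or it might not be. (Also keep in mind that VB only lets you define an operator like this when Dog doesn’t inherit from Animal. Conversions between a type and its own base type are built in, and the compiler won’t let you redefine them.)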

Project Level Implicit Conversion

If you’ve never noticed before, your project actually has an overall level of warning configuration for Implicit Conversion, because it can be such a nasty issue.

You’ll find it on the Compile Tab of the project properties:

image

  1. Set to None, VB won’t warn you at all when you attempt to implicitly convert types.
  2. Set to Warning, VB will show a warning in the Errors list, but will still allow the project to compile.
  3. Set to Error, VB won’t even allow the project to compile.

Generally, you’ll want this set to Error. The good news, however, is that even with this option set to Error, VB will not consider those implicit conversions that are handled explicitly by a Widening Conversion as errors. If you think about it, this makes sense, because a Widening conversion, by its very nature, can’t fail, and thus really isn’t an implicit conversion anymore.

StringBuilder Example

Francesco Balena wrote up an excellent short article here that illustrates using a Widening conversion to make working with StringBuilder objects far easier. He uses Widening conversions to implicitly cast from StringBuilder to String and back, like so:

Public Shared Widening Operator CType(ByVal op As StringBuilder6) As String
    Return op.ToString()
End Operator

Public Shared Widening Operator CType(ByVal str As String) As StringBuilder6
    Dim op As New StringBuilder6()
    op.buffer.Append(str)
    Return op
End Operator

Since these are Widening Conversions, they won’t fail, and thus they won’t be flagged as errors even with Implicit Conversions set to Error in the project properties.

A One Line CSV Parser

Filed under Code Garage, Regular Expressions, Utilities, VB Feng Shui

Parsing up Quote Comma delimited text is a pretty common thing to do, and it seems trivial enough till you realize all the little gotchas that come with the problem (like doubled quotes, commas in quotes, etc, etc). Then it becomes just another laborious exercise in boring coding.

I came across a regex some time ago that makes the process literally one line of code. I can no longer find the original author but the code I found (what little there was of it) was C# and actually split up into a few lines of code, so this is converted to the equivalent VB.net:

    Public Function QCSplit(ByVal Args As String) As String()
        '---- split only on commas that are followed by an even number of quote marks
        Return (New System.Text.RegularExpressions.Regex(",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))")).Split(Args)
    End Function

The regex used here doesn’t actually match on the contents, it matches on the commas that split things up, and it then uses the SPLIT function to actually split those things up.
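
One consequence of splitting on the commas is that the pieces come back exactly as they appeared in the input, surrounding quotes and doubled inner quotes included. Here’s a quick sketch of how that plays out, along with a small cleanup helper. The Unquote function and the QCSplitSketch module are my own hypothetical additions, not part of the original regex recipe.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QCSplitSketch

    '---- same one-liner as above
    Public Function QCSplit(ByVal Args As String) As String()
        Return (New Regex(",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))")).Split(Args)
    End Function


    '---- Hypothetical helper: strips the surrounding quotes from a field
    '     and collapses doubled inner quotes to a single quote
    Public Function Unquote(ByVal Field As String) As String
        Dim f As String = Field.Trim()
        If f.Length >= 2 AndAlso f.StartsWith("""") AndAlso f.EndsWith("""") Then
            f = f.Substring(1, f.Length - 2).Replace("""""", """")
        End If
        Return f
    End Function


    Public Sub Main()
        '---- the line is: Smith,"Arlington, TX","5'10"" tall"
        Dim Line As String = "Smith,""Arlington, TX"",""5'10"""" tall"""

        For Each Part As String In QCSplit(Line)
            '---- prints the raw piece, then the cleaned up version
            Console.WriteLine(Part & "  ->  " & Unquote(Part))
        Next
    End Sub
End Module
```

The comma inside "Arlington, TX" is left alone because an odd number of quote marks follows it, so the lookahead fails at that position.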

I’ve used this a while now and it works great, but I’ve seen a few interesting alternatives since.

The most interesting thus far is this one by Daniel Einspanjer. Not interesting enough yet for me to switch to it, but the regexlib.com website is quite nice as a great repository of good regex recipes.

Regular Expression Tester

And speaking of Regular Expressions, the guys over at RAD Software have a free Regular expression tester for .net style regular expressions that works fantastically. If you’re just starting out in regex’s (and seriously, who isn’t <G>), you owe it to yourself to pick up a decent regex test tool, and this one is as good as I’ve seen so far.

image

As far as I can tell, it can handle all the various options for regex’s, and dynamically shows match results, etc. Very handy for trying out expressions without actually running them in .net.

Add it to your External Tools menu in VS and it’ll be right there, good to go.

Mail Merge and Reports in Word

Filed under Office, VB Feng Shui

If you have a need to generate documents, and I mean lots of documents, you’re likely to eventually investigate the Word mail merge functionality.

It’s ok, but there’s lots of things that it can’t do, especially with respect to generating tables or dealing with table oriented data.

Your next stop might be a report generator, like Crystal Reports. But the big problem with those kinds of packages is they require you to use a proprietary report template editor, usually band oriented (the typical “header/body with repeating lines/footer” type reporting system). Not terribly bad, but not great if you actually want to generate mail merged documents.

Word Documents as Templates

But there are alternatives out there. Two of the most intriguing I’ve seen are an open source package called FlexDoc and a commercial application called Windward Reports.

Both of them allow you to use standard Word documents as “report templates”. But keep in mind, “report template” in this sense is a pretty open ended concept. You could generate everything from a mass mailing letter, to legal documents, to actual reports with tabular data, and even graphs and charts.

And both are lightning fast, since they work directly on the Word document file itself (or more precisely, the DOCX file format, older DOC format files aren’t supported).

The Commercial Product

Windward Reports is a polished commercial product. It’s not cheap, but it’s incredibly flexible, supporting everything from XML to SQL data sources, charting, graphs, tabular report type elements, formatted content (ie insert a field into your Word document that consists of formatted HTML, for example), and a lot more.

Windward uses standard Word Mail Merge fields for its tags, specifically the AUTOTEXTLIST field. Those field types have been in Word since even before Word 2000, and it’s very unlikely they’ll change any time soon, so you’re pretty safe there.

One really nice element of Windward is that they have available a “tagging helper” addin for Word called AutoTag. It can really help speed up the design of your Word report templates. It’s optional. Technically, you don’t absolutely have to have it, but it’s a lot more painful to create reports without it.

The Open Source Project

FlexDoc is open source and not quite as polished as Windward. But it’s got much of the same functionality. It can connect to just about any data source, it’s fast, and it can do tabular data very well, but it can’t do charts or graphs, and it can’t handle formatted data (HTML or otherwise) at all by default, though there are mods that can be made to improve that situation.

The biggest problem with FlexDoc is that it currently relies on the CustomXMLElements functionality of Word that has, as of Jan 10, 2010, been removed from the product because of the lawsuit with I4I.

The author of FlexDoc indicated that there were plans for a ContentControl-based version in the future, but that might be a while off. Still, it’s an open source project, so you could always throw in and help make those changes if you really needed them.

DVDs, Subtitles and Tivo

Filed under Tivo

One of the biggest niceties about a Tivo is that you don’t have to fumble with media. I used to have a stack of VHS tapes that I’d put in rotation to record shows I’d want to watch. I’d usually record over them once I’d watched the show, but dealing with all the media was a big headache (forget about actually programming the VCR).

Fast forward, and now, instead of tapes, we have DVDs, CDs, and BluRay discs. Same basic problem though.

Fortunately, Tivo to the rescue! I’ve written before about the easiest free way to move a DVD movie over to the Tivo, so I won’t rehash that here.

However, recently, I ended up with a subtitled foreign language film that I wanted to move to my Tivo. I moved it just fine, but I ended up with an MPG file that contained the foreign language film and NO SUBTITLES! Ack!

So, I need to amend my previous post with additional information about how to handle subtitled movies, still completely free. It’s a little more involved, but not by much.

Step 1

First, you’ll need to copy the movie off the DVD using DVDFab. This is the same as before. However, you’ll want to be sure to copy the “Full Disc” this time, since you want to make sure you have the subtitle files available for later. Grab a copy of the free “DVDFab Decrypter” here. Install it then run it and choose the “Full Disc” option, as highlighted below.

image

Choose a target folder on your harddrive somewhere and kick it off. We go through this first step for 2 reasons:

  1. if the DVD has been encrypted, this removes that, so that the following utilities will work on the movie.
  2. all the remaining processing is MUCH faster if the source files are on your harddrive and not the DVD.

Step 2

Now that you’ve got all the files from the DVD to your harddrive, you need to “render” the movie with the appropriate subtitles. You see, subtitles are embedded in the VOB files on the DVD as literally bitmapped images of the text that need to be overlaid by the DVD player when you select to show subtitles (if you were to play the movie on an actual DVD player). Since we’ll be ending up with an MPG file, none of that will apply. Tivo doesn’t have clue one about subtitles and whatnot, so the separate subtitle images are useless to it.

Instead, what you need is a copy of the movie with the subtitles “burned onto” the actual frames of the movie. This way, the Tivo can simply “play” the movie, the subtitles will be just a part of the image frames in the movie. This does mean you won’t be able to “turn off” the subtitles, but, at least for my purposes, that’s a pretty minor issue.

In order to burn the subtitles onto the movie, you’ll need the free program AutoGordianKnot. Yeah, weird name, but it does exactly what it says it will do. Essentially, it converts the DVD VOB fileset into a DIVX AVI file, but it can render the subtitles into the output avi file easily.

image

In the above screenshot:

  1. Select the input file (choose the VTS_01_0.IFO file, that will almost always be the proper root file to pick). Select an output folder for the resulting AVI file.
  2. Pick your audio track (if English is listed, you shouldn’t need to be doing any of this!).
  3. Select the subtitles you want rendered into the output avi file. In this case, you’ll likely want English, but there might be more than one choice here.
  4. I set the target quality to 100. This is because we’re going to have to “reconvert” the video again and the higher the quality here, the better the end result will be.
  5. And finally, click the Add Job and then the Start buttons.

Note that in my case, the very first time I ran this, I had a few dialogs pop up that I had to click OK on. If you don’t wait for them, eventually the program will timeout and the conversion will fail. The good thing is that, at least for me, all the dialogs displayed within the first 5 minutes or so of the process, so you shouldn’t have to watch the entire process. It can take a while!

Step 3

Once that’s done, you should end up with an AVI file that is playable and that contains the foreign language audio track, along with the English subtitles. Yeah!

However, there’s one problem. The Tivo doesn’t know how to play these files. Basically, Tivos can ONLY play MPG files, and AutoGK can ONLY render DivX AVI files.

That’s where the last program comes into play, AVItoMPG. (alternately, you might try another free format converter, Super ©, but I haven’t personally tried that one yet).

With AVItoMPG, the easiest option is to download their “Portable” version. It’s the exe and nothing more. Just download it and run it!

image

Click the “Add Video” button, select the AVI file you generated in Step 2 above, then be sure to select the DVD Compatible MPEG2 format in the Output Format box.

image

The other MPG formats should work as well, but I haven’t tested them. Leave the other settings “Auto”. I found no need to change any of them.

Click OK, then click the Convert button and let it go! When it finishes, you should have an MPG file in the output folder, that contains the foreign language audio track and the English subtitles.

Done!

The MPG file is the only one you still need. You can delete all the VOB files from Step 1 and the AVI file from Step 2.

Now, just move that MPG file wherever it needs to go so that you can get to it from your Tivo and you’ll be able to watch that movie, pause, rewind etc. No DVD to hassle with anymore, either!

Cleaning up Messy DataContractSerializer XML

Filed under Code Garage, Software Architecture, VB Feng Shui, XML

I was working with XML serialization of objects recently and was using the good ol’ DataContractSerializer again.

One thing that I bumped into almost immediately is that the XML that it spits out isn’t exactly the neatest, tidiest of XML possible, to say the least.

So I set out on a little odyssey to see exactly how nice and clean I could make it.

(EDIT: I’ve added more information about how the Name property of the Field object is being serialized twice, which is another big reason for customizing the serialization here, and for specialized dictionary serialization in general).

First, the objects to serialize. I’ve constructed a very rudimentary object hierarchy that still illustrates the problem well.

In this case, I have a List of Record objects, called a Records list. Each Record object is a dictionary of Field objects. And each Field object contains two properties, Name and Value. The code for these (and a little extra code to make populating them easy) is as follows.

Public Class Records
    Inherits List(Of Record)


    Public Sub New()
        '---- default constructor
    End Sub

End Class


Public Class Record
    Inherits Dictionary(Of String, Field)


    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal ParamArray Fields() As Field)
        For Each f In Fields
            Me.Add(f.Name, f)
        Next
    End Sub
End Class


Public Class Field

    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal Name As String, ByVal Value As String)
        Me.Name = Name
        Me.Value = Value
    End Sub


    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal value As String)
            _Name = value
        End Set
    End Property
    Private _Name As String



    Public Property Value() As String
        Get
            Return _Value
        End Get
        Set(ByVal value As String)
            _Value = value
        End Set
    End Property
    Private _Value As String

End Class

Yes, I realize there are DataTables, KeyValuePair objects, etc that could do this, but that’s not the point, so just bear with me<g>.

To populate a Records object, you might have code that looks like this:

Dim Recs = New Records
Recs.Add(New Record(New Field("Name", "Darin"), New Field("City", "Arlington")))
Recs.Add(New Record(New Field("Name", "Gillian"), New Field("City", "Ft Worth")))
Recs.Add(New Record(New Field("Name", "Laura"), New Field("City", "Dallas")))

Ok, so far so good.

Now, let’s serialize that with a simple serialization function using the DataContractSerializer:

    ''' <summary>
    ''' Serializes the data contract to a string (XML)
    ''' </summary>
    Public Function Serialize(Of T As Class)(ByVal SerializeWhat As T) As String
        Dim stream = New System.IO.StringWriter
        Dim writer = System.Xml.XmlWriter.Create(stream)

        Dim serializer = New System.Runtime.Serialization.DataContractSerializer(GetType(T))
        serializer.WriteObject(writer, SerializeWhat)
        writer.Flush()

        Return stream.ToString
    End Function

In the test application I put together, I dump the resulting XML to a text box. Yikes!

image

So, what’re the problems here? <g>

  1. You’ve got that “http://www.w3.org/2001/XMLSchema-instance” namespace attribute, among others
  2. lots of random letters
  3. no indenting
  4. You can’t really tell it from this shot, but the Record dictionary is serializing the name property twice, because I’m using it as the Key for the dictionary, but it’s also a property of the objects in the dictionary.

All this noise might be fine for computer to computer communication, but it’s pretty tough on human eyes<g>.

Ok, first thing to do is indent:

    ''' <summary>
    ''' Serializes the data contract to a string (XML)
    ''' </summary>
    Public Function Serialize(Of T As Class)(ByVal SerializeWhat As T) As String
        Dim stream = New System.IO.StringWriter
        Dim xmlsettings = New Xml.XmlWriterSettings
        xmlsettings.Indent = True
        Dim writer = System.Xml.XmlWriter.Create(stream, xmlsettings)

        Dim serializer = New System.Runtime.Serialization.DataContractSerializer(GetType(T))
        serializer.WriteObject(writer, SerializeWhat)
        writer.Flush()

        Return stream.ToString
    End Function

Notice that I added the use of the XmlWriterSettings object. This allows me to set the Indent property, and things are much more readable.

image

But that’s still a far cry from nice, simple, tidy XML. Notice all the “ArrayofArrayOf blah blah” names, and the randomized letter sequences? Plus, it’s now much more obvious how the Name property is being serialized twice. Yuck! Surely, we can do better than this!

Cleaning Up the Single Entity Field Object

The DataContractSerializer certainly works easily enough to serialize the Field object, but unfortunately, it decorates the serialized elements with a load of really nasty looking and completely unnecessary cruft.

My first thought was to simply decorate the class with <DataContract> attributes:

<DataContract(Name:="Field", Namespace:="")> _
Public Class Field

    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal Name As String, ByVal Value As String)
        Me.Name = Name
        Me.Value = Value
    End Sub

    <DataMember()> _
    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal value As String)
            _Name = value
        End Set
    End Property
    Private _Name As String



    <DataMember()> _
    Public Property Value() As String
        Get
            Return _Value
        End Get
        Set(ByVal value As String)
            _Value = value
        End Set
    End Property
    Private _Value As String

End Class

But this yields:

image

So we have several problems:

  • Each field is rendered into a Value element of the Record’s field collection
  • The Key of the Record collection duplicates the Name of the individual Field objects
  • and we still have a noxious xmlns=”” attribute being rendered.

Unfortunately, this is where the DataContractSerializer’s simplicity is its downfall. There’s just no way to customize this any further, using ONLY the DataContractSerializer.

However, we can implement IXmlSerializable on our Field object to customize its serialization. All I need to do is remove the DataContract attribute, and add a simple implementation of IXmlSerializable to the class:

Public Class Field
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal Name As String, ByVal Value As String)
        Me.Name = Name
        Me.Value = Value
    End Sub

    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal value As String)
            _Name = value
        End Set
    End Property
    Private _Name As String


    Public Property Value() As String
        Get
            Return _Value
        End Get
        Set(ByVal value As String)
            _Value = value
        End Set
    End Property
    Private _Value As String


    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function


    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml

    End Sub


    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        writer.WriteElementString("Name", Me.Name)
        writer.WriteElementString("Value", Me.Value)
    End Sub
End Class

And that yields a serialization of:

image

Definitely better, but still not great.

Cleaning up a Generic Dictionary’s Serialization

The problem now is with the Record dictionary.

Public Class Record
    Inherits Dictionary(Of String, Field)
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal ParamArray Fields() As Field)
        For Each f In Fields
            Me.Add(f.Name, f)
        Next
    End Sub

    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function

    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml

    End Sub

    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        For Each f In Me.Values
            DirectCast(f, System.Xml.Serialization.IXmlSerializable).WriteXml(writer)
        Next
    End Sub
End Class

Adding an IXmlSerializable implementation to it as well yields the following XML:

image

Definitely much better! Especially notice that we’ve gotten rid of the duplicated “Name” key. It was duplicated before because we used the Name element of the Field object as the Key for the Record dictionary. This will play an important part in deserializing the Record’s dictionary of Field objects later.

Cleaning up the List of Records

Finally, the only thing really left to do is clean up how the generic list of Record objects is serialized.

But once again, the only way to alter the serialization is to implement IXmlSerializable on the class.

<Xml.Serialization.XmlRoot(Namespace:="")> _
Public Class Records
    Inherits List(Of Record)
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub

    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function

    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml

    End Sub

    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        For Each r In Me
            DirectCast(r, System.Xml.Serialization.IXmlSerializable).WriteXml(writer)
        Next
    End Sub
End Class

Notice that I’ve implemented IXmlSerializable, but I also added the XmlRoot attribute with a blank Namespace parameter. This completely clears the Namespace declaration from the resulting output, which now looks like this:

image

And that is just about as clean as you’re going to get!

But That’s Not all there is To It

Unfortunately, it’s not quite this simple. The thing is, you very well may want to serialize each object independently, not just serialize the Records collection. Doing that as we have things defined right now won’t work. The Start and End elements won’t be generated in the XML properly.

Instead, we need to add XmlRoot attributes to all three classes, and adjust where the WriteStartElement and WriteEndElement calls are made. So we end up with this:

<Xml.Serialization.XmlRoot(Namespace:="")> _
Public Class Records
    Inherits List(Of Record)
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub

    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function

    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml

    End Sub

    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        For Each r In Me
            writer.WriteStartElement("Record")
            DirectCast(r, System.Xml.Serialization.IXmlSerializable).WriteXml(writer)
            writer.WriteEndElement()
        Next
    End Sub
End Class


<Xml.Serialization.XmlRoot(ElementName:="Record", Namespace:="")> _
Public Class Record
    Inherits Dictionary(Of String, Field)
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal ParamArray Fields() As Field)
        For Each f In Fields
            Me.Add(f.Name, f)
        Next
    End Sub

    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function

    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml

    End Sub

    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        For Each f In Me.Values
            writer.WriteStartElement("Field")
            DirectCast(f, System.Xml.Serialization.IXmlSerializable).WriteXml(writer)
            writer.WriteEndElement()
        Next
    End Sub
End Class


<Xml.Serialization.XmlRoot(ElementName:="Field", Namespace:="")> _
Public Class Field
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal Name As String, ByVal Value As String)
        Me.Name = Name
        Me.Value = Value
    End Sub

    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal value As String)
            _Name = value
        End Set
    End Property
    Private _Name As String


    Public Property Value() As String
        Get
            Return _Value
        End Get
        Set(ByVal value As String)
            _Value = value
        End Set
    End Property
    Private _Value As String


    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function


    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml

    End Sub


    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        writer.WriteElementString("Name", Me.Name)
        writer.WriteElementString("Value", Me.Value)
    End Sub
End Class

And Finally, Deserialization

Of course, all this would be for nought if we couldn’t actually deserialize the XML we’ve just spent all this effort to clean up.

Turns out that deserialization is pretty straightforward. I just needed to add code to the ReadXml member of the implemented IXmlSerializable interface on each class. The full code for my testing form is below. Be sure to add a reference to System.Runtime.Serialization, though, or you’ll get “type not defined” errors.

Public Class frmSample

    Private Sub btnTest_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnTest.Click

        '---- populate the objects
        Dim Recs = New Records
        Recs.Add(New Record(New Field("Name", "Darin"), New Field("City", "Arlington")))
        Recs.Add(New Record(New Field("Name", "Gillian"), New Field("City", "Ft Worth")))
        Recs.Add(New Record(New Field("Name", "Laura"), New Field("City", "Dallas")))

        Dim t As String
        t = Serialize(Of Field)(Recs(0).Values(0))
        Dim fld = Deserialize(Of Field)(t)
        Debug.Print(fld.Name)
        Debug.Print(fld.Value)
        Debug.Print("--------------")

        t = Serialize(Of Record)(Recs(0))
        Dim rec = Deserialize(Of Record)(t)
        Debug.Print(rec.Values.Count)
        Debug.Print("--------------")

        t = Serialize(Of Records)(Recs)
        tbxOutput.Text = t

        Dim recs2 = Deserialize(Of Records)(t)
        Debug.Print(recs2.Count)
    End Sub
End Class


<Xml.Serialization.XmlRoot(Namespace:="")> _
Public Class Records
    Inherits List(Of Record)
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub

    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function

    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml
        reader.MoveToContent()
        reader.ReadStartElement("Records")
        reader.MoveToContent()
        Do While reader.NodeType <> Xml.XmlNodeType.EndElement
            Dim Rec = New Record
            DirectCast(Rec, System.Xml.Serialization.IXmlSerializable).ReadXml(reader)
            Me.Add(Rec)
            reader.MoveToContent()
        Loop
        reader.ReadEndElement()
    End Sub

    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        For Each r In Me
            writer.WriteStartElement("Record")
            DirectCast(r, System.Xml.Serialization.IXmlSerializable).WriteXml(writer)
            writer.WriteEndElement()
        Next
    End Sub
End Class


<Xml.Serialization.XmlRoot(ElementName:="Record", Namespace:="")> _
Public Class Record
    Inherits Dictionary(Of String, Field)
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal ParamArray Fields() As Field)
        For Each f In Fields
            Me.Add(f.Name, f)
        Next
    End Sub

    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function

    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml
        reader.MoveToContent()
        reader.ReadStartElement("Record")
        reader.MoveToContent()
        Do While reader.NodeType <> Xml.XmlNodeType.EndElement
            Dim fld = New Field
            DirectCast(fld, System.Xml.Serialization.IXmlSerializable).ReadXml(reader)
            Me.Add(fld.Name, fld)
            reader.MoveToContent()
        Loop
        reader.ReadEndElement()
    End Sub

    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        For Each f In Me.Values
            writer.WriteStartElement("Field")
            DirectCast(f, System.Xml.Serialization.IXmlSerializable).WriteXml(writer)
            writer.WriteEndElement()
        Next
    End Sub
End Class


<Xml.Serialization.XmlRoot(ElementName:="Field", Namespace:="")> _
Public Class Field
    Implements System.Xml.Serialization.IXmlSerializable


    Public Sub New()
        '---- default constructor
    End Sub


    Public Sub New(ByVal Name As String, ByVal Value As String)
        Me.Name = Name
        Me.Value = Value
    End Sub

    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal value As String)
            _Name = value
        End Set
    End Property
    Private _Name As String


    Public Property Value() As String
        Get
            Return _Value
        End Get
        Set(ByVal value As String)
            _Value = value
        End Set
    End Property
    Private _Value As String


    Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema
        Return Nothing
    End Function


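    Public Sub ReadXml(ByVal reader As System.Xml.XmlReader) Implements System.Xml.Serialization.IXmlSerializable.ReadXml
        reader.MoveToContent()
        reader.ReadStartElement("Field")
        reader.MoveToContent()
        '---- read each child element by name, so the order of the
        '     Name and Value elements in the XML doesn't matter
        Do While reader.NodeType <> Xml.XmlNodeType.EndElement AndAlso Not reader.EOF
            Select Case reader.Name
                Case "Name" : Me.Name = reader.ReadElementContentAsString
                Case "Value" : Me.Value = reader.ReadElementContentAsString
                Case Else : reader.Skip()    '---- ignore anything unrecognized
            End Select
            reader.MoveToContent()
        Loop
        reader.ReadEndElement()
    End Sub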


    Public Sub WriteXml(ByVal writer As System.Xml.XmlWriter) Implements System.Xml.Serialization.IXmlSerializable.WriteXml
        writer.WriteElementString("Name", Me.Name)
        writer.WriteElementString("Value", Me.Value)
    End Sub
End Class



Public Module Serialize
    ''' <summary>
    ''' Serializes the data contract to a string (XML)
    ''' </summary>
    Public Function Serialize(Of T As Class)(ByVal SerializeWhat As T) As String
        Dim stream = New System.IO.StringWriter
        Dim xmlsettings = New Xml.XmlWriterSettings
        xmlsettings.Indent = True
        Dim writer = System.Xml.XmlWriter.Create(stream, xmlsettings)

        Dim serializer = New System.Runtime.Serialization.DataContractSerializer(GetType(T))
        serializer.WriteObject(writer, SerializeWhat)
        writer.Flush()

        Return stream.ToString
    End Function


    ''' <summary>
    ''' Deserializes the data contract from xml.
    ''' </summary>
    Public Function Deserialize(Of T As Class)(ByVal xml As String) As T
        '---- the XML declaration written by Serialize says utf-16,
        '     so encode the bytes to match
        Using stream As New System.IO.MemoryStream(System.Text.Encoding.Unicode.GetBytes(xml))
            Return DeserializeFromStream(Of T)(stream)
        End Using
    End Function


    ''' <summary>
    ''' Deserializes the data contract from a stream.
    ''' </summary>
    Public Function DeserializeFromStream(Of T As Class)(ByVal stream As System.IO.Stream) As T
        Dim serializer As New System.Runtime.Serialization.DataContractSerializer(GetType(T))
        Return DirectCast(serializer.ReadObject(stream), T)
    End Function
End Module

Of particular note above is the ReadXml method of the Field class.

It checks the name of the node first and then places the value of the node into the appropriate property of that object. If I didn’t do that, the deserialization process would require the fields in the XML to be in a specific order. This is a minor drawback to the DataContractSerializer that this approach alleviates.

What’s Next?

The one unfortunate aspect of this is that it requires you to implement IXmlSerializable on each class whose XML you want cleaned up.

Generally speaking, the DataContractSerializer will be perfectly fine for those cases where humans aren’t likely to ever have to see the XML you’re generating. And you get a performance boost for sacrificing that flexibility and “cleanliness”.

But for things like data file imports, custom configuration files, and the like, it may be desirable to implement custom serialization like this so that your XML files can be almost as easy to read as those old school INI files!

Code Garage – Case Insensitive Dictionaries

Filed under Code Garage

Something that’s always bothered me a little about the generic dictionary support in .NET is that it’s, by default, case sensitive. I’d never really contemplated it much more than that until today, when I really needed a dictionary that supported a fast, case insensitive lookup.

At first, I used a list, and the FirstOrDefault function along with a lambda expression. It worked, but I soon realized it was wretchedly slow.
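For reference, a list-based lookup along those lines might look something like the sketch below. This is a reconstruction under assumptions, not the original code: the module name ListLookupDemo and the helper FindValue are made up for illustration. It’s slow because FirstOrDefault has to walk the list and compare strings one entry at a time on every single lookup.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ListLookupDemo

    '---- hypothetical helper: scans the pairs one by one until a key
    '     matches, ignoring case. Every call is a linear search.
    Public Function FindValue(ByVal Pairs As List(Of KeyValuePair(Of String, String)), ByVal LookupKey As String) As String
        Dim Hit = Pairs.FirstOrDefault(Function(p) String.Equals(p.Key, LookupKey, StringComparison.CurrentCultureIgnoreCase))
        '---- KeyValuePair is a structure, so "not found" comes back as an
        '     empty pair whose Value is Nothing
        Return Hit.Value
    End Function

    Sub Main()
        Dim Pairs = New List(Of KeyValuePair(Of String, String))
        Pairs.Add(New KeyValuePair(Of String, String)("Bill", "BillTest"))
        Pairs.Add(New KeyValuePair(Of String, String)("Zack", "ZackTest"))

        Console.WriteLine(FindValue(Pairs, "ZACK"))    '---- prints ZackTest
    End Sub
End Module
```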

I knew that surely, there was a way to get case insensitive lookups with a generic dictionary, but I’d never really gone looking for it. But a little searching later, and I’d found the answer.

    Public Sub Test()

        Dim d = New Dictionary(Of String, String)(StringComparer.CurrentCultureIgnoreCase)
        d.Add("John", "JohnTest")
        d.Add("Bob", "BobTest")
        d.Add("Bill", "BillTest")
        d.Add("Zack", "ZackTest")

        Debug.Print(d.Keys.Contains("ZACK"))
        Debug.Print(d("bill"))
    End Sub

You pass an IEqualityComparer(Of String) object to the dictionary’s constructor, and a suitable one is easy to obtain from the shared properties of the StringComparer class, as shown above.

Give it a shot with and without the (StringComparer.CurrentCultureIgnoreCase) clause.
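To make the difference concrete, here’s a minimal sketch that runs the same lookups against a default dictionary and a case insensitive one. The module name ComparerDemo and the helper BuildDict are made up for illustration; ContainsKey and TryGetValue are standard members of the generic Dictionary.

```vbnet
Imports System
Imports System.Collections.Generic

Module ComparerDemo

    '---- hypothetical helper: builds the same dictionary with whatever
    '     comparer is passed in (Nothing means the default, case sensitive one)
    Public Function BuildDict(ByVal Comparer As IEqualityComparer(Of String)) As Dictionary(Of String, String)
        Dim d = New Dictionary(Of String, String)(Comparer)
        d.Add("Bill", "BillTest")
        d.Add("Zack", "ZackTest")
        Return d
    End Function

    Sub Main()
        Dim Sensitive = BuildDict(Nothing)
        Dim Insensitive = BuildDict(StringComparer.CurrentCultureIgnoreCase)

        '---- the default comparer treats "ZACK" and "Zack" as different keys
        Console.WriteLine(Sensitive.ContainsKey("ZACK"))      '---- prints False
        Console.WriteLine(Insensitive.ContainsKey("ZACK"))    '---- prints True

        '---- TryGetValue sidesteps the KeyNotFoundException that d("bill")
        '     would throw on the case sensitive dictionary
        Dim Value As String = Nothing
        If Not Sensitive.TryGetValue("bill", Value) Then Console.WriteLine("bill not found")
        If Insensitive.TryGetValue("bill", Value) Then Console.WriteLine(Value)    '---- prints BillTest
    End Sub
End Module
```

Without the comparer, the d("bill") line in the Test routine above throws a KeyNotFoundException rather than quietly returning Nothing, so TryGetValue is the safer way to experiment.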

Even better, if you’re defining your own strongly typed dictionary based on the generic dictionary, you can force the comparer in your constructor, so that instances of your dictionary will always use the right comparer. Code that instantiates your dictionary won’t have to bother with (or remember to supply) the StringComparer object.

    Public Class StringDict
        Inherits Dictionary(Of String, String)

        Public Sub New()
            MyBase.New(StringComparer.CurrentCultureIgnoreCase)
        End Sub
    End Class


    Public Sub Test2()
        Dim d = New StringDict
        d.Add("John", "JohnTest")
        d.Add("Bob", "BobTest")
        d.Add("Bill", "BillTest")
        d.Add("Zack", "ZackTest")

        Debug.Print(d.Keys.Contains("ZACK"))
        Debug.Print(d("bill"))
    End Sub

It may not be new, but it’s new to me, and awfully nice to know!