Tag Archives: C#

Tips for debugging a WiX MSI

WiX is an excellent technology that simplifies the creation of MSI files using an XML abstraction on top of the Windows Installer APIs. WiX doesn't change the underlying MSI architecture, though, and that architecture can be a huge pain in the ass sometimes. I thought I would share some tips for debugging what's going on with an MSI. These tips aren't specific to WiX; that's just the technology I'm familiar with.

Logging

The first thing you should try is to add the log switch when you run your installer. To do this, open a command prompt in the directory where your MSI file is located and run it with the following parameters:

msiexec /i yourinstaller.msi /L*v install.log

The /L*v switch tells msiexec to write a verbose log to the specified file.

For a full list of msiexec switches, check out MSDN’s documentation for msiexec.

Re-cache MSI

If you think your package was somehow corrupted, or you've made a simple change such as editing a file's GUID or a feature condition, you can re-cache the package with the following command:

msiexec /fv yourinstaller.msi

This is generally used when you install a product and then think "oops, I forgot something…". You don't want to create a whole new release, so you can update the installed/cached MSI.

Be careful to read the documentation. With the /f parameter, you can’t pass properties on the command line.

Increased Debug Messages

You can actually increase the amount of messages exposed to MSI logs by setting a machine policy for Windows Installer.

Open regedit and go to HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\Installer (you may need to create this key). Add a DWORD value named Debug and set it to 7.
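If you'd rather script it, a reg add command along these lines should do the same thing from an elevated prompt (a sketch; adjust for your environment):

reg add "HKLM\SOFTWARE\Policies\Microsoft\Windows\Installer" /v Debug /t REG_DWORD /d 7 /f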

The value of 7 will write all common debugging messages to the log as well as command line information. If you’re passing sensitive data to the installer on the command line, be warned that this information gets dumped via the OutputDebugString function. This means that anyone with physical access to the machine or access to the machine via TCP/IP may be able to read these messages. The documentation for the policy settings warns that this setting is for debugging purposes only and may not exist in future versions of Windows Installer.

For a full list of available settings, check out Microsoft’s Windows Installer Machine Policies.

System Debugging

Passing the logging switch and log file name to an MSI every time you run an MSI can be annoying. You can set the Logging machine policy to have Windows Installer write a log to a temp file, but then you have to track down the temp file.

Instead, you can run Sysinternals’ DebugView. This utility allows you to catch debugging information exposed by the Win32 OutputDebugString function on local and remote machines. This doesn’t just gather information from MSIs. For example, you can open a TFS query editor in Visual Studio 2013 and inspect the queries made via TFS.

Orca

Lastly, the Windows SDK contains a file called Orca.msi. Orca is a tool to inspect the contents of an MSI. An MSI is essentially just a database of tables and fields with values. You can open an existing MSI, edit a condition, and save it. This can make investigation a little easier. For instance, if you’re trying to figure out why a condition on a custom action doesn’t seem to be passing, you can edit the MSI and modify the condition instead of rebuilding the MSI. It’s also helpful if you’re trying to figure out how another MSI implements some logic.

I’m sure there are plenty of other techniques for debugging MSI files.

Software Abstractions take Skill.

I recently read Adaptive Code via C# and posted a review on Amazon:

This book is a new favorite of mine. I’ve always prided myself on writing clean and concise code. I’ve always been fond of SOLID principles. I wanted to read this book to keep my understanding of SOLID principles fresh. It also covers design patterns, although be aware that it’s not an in-depth design patterns book.

The book is broken into three sections. The first section is an introduction to Scrum and SOLID. Before I had even finished the first section, I was already recommending this book to colleagues. This leading content isn’t necessarily targeted toward developers. I think many managers or team leaders could benefit from reading the basics of Scrum and understanding the terminology for SOLID programming.

I already had a pretty solid (sorry for that) understanding of SOLID principles, so I felt like the second section was more of a refresher. In that vein, I think it'd be hard for me to definitively say how easily digestible this section will be for beginners. I think it will greatly help intermediate or expert engineers gain a new understanding of software architecture. The book's intended audience is intermediate and expert engineers, but I think beginners could handle the content. It's so well written and clearly explained that anyone who struggles a little with the concepts could easily fill any gaps in understanding with Wikipedia or blogs. Though, honestly, I felt like there were no gaps. This section may be boring for non-developers, although I know project managers, program managers, and directors who would find it interesting. The biggest thing to keep in mind is that SOLID principles are not rules; they're guidelines.

I thought the last section was excellent. It is split into three chapters in which you’re presented with somewhat realistic dialog (‘fortnight’) that follows a small team through the first two sprints for a chat application. I’ve read a few books on Agile and Scrum methodologies and this section was probably the most fun to read on the topic. It could just be that it’s written as a script with code examples, but it was refreshing and easy to follow.

This book does a great job at explaining technical debt. While reading the book, I realized nobody at my current job has ever said the term ‘technical debt’. I asked around and found that it was a new term to most. The concept of technical debt is one thing I know I had problems understanding as a beginner. As developers become more mature, they begin to understand that the field is usually roughly equal parts business and technology. It’s really important to understand these ‘trade-offs’ and the last section demonstrates pretty well how technical debt occurs.

If you’re on the fence about purchasing this book, you should buy it. It’s a quick read and an understanding of the subject matter will improve your software. I’ve never regretted purchasing a Microsoft Press book.

I gave a short presentation at work recently about SOLID principles, so I was stoked to have a chance to read this book. One of the biggest takeaways I hope anyone has from this book is the ability to abstract software in useful ways.

I recently solved an issue using techniques such as those presented in this book. Specifically, the solution used the Open/Closed Principle, Dependency Injection, and the Single Responsibility Principle, as well as the Decorator pattern. The issue was simple, and one I'm sure many people have encountered in their time with C#: a DataSet's DataRow does not implement IDataRecord the way something like SqlDataReader does.

Suppose you have a domain model of some kind that you need ‘mapped’ from your database into a single object.
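For the sake of discussion, imagine the target is something trivial like this (the Car class here is hypothetical, not from the sample code):

public class Car
{
    public int Year { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}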

Here's a demonstration of the problem. I use a CSV of cars from Wikipedia and, instead of mapping to a domain object, write out to the console.

First, an example using a DataReader:

using (var connection = new OdbcConnection(connectionString))
{
    connection.Open();
    using (var cmd = connection.CreateCommand())
    {
        cmd.CommandText = "select * from Cars.csv";

        using (var reader = cmd.ExecuteReader())
        while (reader.Read())
        {
            OverloadedDumpRow(reader);
        }
    }
}

Now, an example using a DataRow:

using (var connection = new OdbcConnection(connectionString))
{
    connection.Open();

    var adapter = new OdbcDataAdapter("SELECT * FROM Cars.csv", connection);

    DataSet ds = new DataSet("Temp");
    adapter.Fill(ds);

    foreach (DataRow row in ds.Tables[0].Rows)
    {
        OverloadedDumpRow(row);
    }
}

Aside from the implementation of data retrieval, OverloadedDumpRow should look exactly the same for both examples:

Console.WriteLine("A {0} {1} {2}", 
    row["Year"], row["Make"], row["Model"]);

The problem is that these two implementations don't share a useful common base type (DataRow derives directly from Object), so there's no single method signature that can accept both. That isn't really an issue in a small example like this, but it is if you have a complex domain and want to parse data coming back from your database consistently. Think about what you'd have to do for each and every domain model just to parse result sets from IDataRecord and DataRow. How do you determine whether or not you'll even *need* both implementations? To stay DRY, you'd need to pull your data from the following methods into variables or directly into your target domain object:

static void OverloadedDumpRow(IDataRecord row)
{
    Console.WriteLine("A {0} {1} {2}",
                    row["Year"], row["Make"], row["Model"]);
}

static void OverloadedDumpRow(DataRow row)
{
    Console.WriteLine("A {0} {1} {2}",
                    row["Year"], row["Make"], row["Model"]);
}

Obviously, this isn’t reusable. What we’d need is an interface. Something like this:

public interface IStringIndexed
{
  object this[string key] { get; }
}

Then, we could implement the redundant methods above in a single method:

static void DumpRow(IStringIndexed record)
{
  Console.WriteLine("A {0} {1} {2}",
    record["Year"], record["Make"], record["Model"]);
}

To get from IDataRecord and DataRow to this target interface, you’ll need an adapter. An adapter decorates a target type and exposes some new interface for that type. This can be a little confusing in our case because IDataRecord and DataRow have the same functionality (returning an object by string index), but they don’t have a consistent interface allowing us to write abstractions on top of these two types.

Our simple interface follows the single responsibility principle (implementations can only get an object by key) as well as the open/closed principle (you can now write extension methods against IStringIndexed).
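To illustrate that open/closed point, here's a quick sketch of an extension method layered on top of IStringIndexed without modifying the interface or its adapters (GetInt32 is a hypothetical helper, not part of the sample code):

public static class StringIndexedExtensions
{
    // Works for any IStringIndexed source, whether it wraps a DataRow or an IDataRecord.
    public static int GetInt32(this IStringIndexed record, string key)
    {
        return Convert.ToInt32(record[key]);
    }
}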

Writing an adapter to use the interface from above is ridiculously easy. Here’s one of them:

internal class DataRowStringIndexedWrapper : IStringIndexed
{
    private readonly DataRow _row;

    public DataRowStringIndexedWrapper(DataRow row)
    {
        _row = row;
    }

    #region IStringIndexed Members

    object IStringIndexed.this[string key]
    {
        get { return _row[key]; }
    }

    #endregion
}

You would wrap an IDataRecord in exactly the same way.
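That IDataRecord adapter isn't shown here, but the next example uses it; it would look roughly the same (a sketch):

internal class DataRecordStringIndexedWrapper : IStringIndexed
{
    private readonly IDataRecord _record;

    public DataRecordStringIndexedWrapper(IDataRecord record)
    {
        _record = record;
    }

    object IStringIndexed.this[string key]
    {
        get { return _record[key]; }
    }
}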

Here are the two updated examples (notice both of these dump the record to console using the same method):

using (var connection = new OdbcConnection(connectionString))
{
    connection.Open();
    using (var cmd = connection.CreateCommand())
    {
        cmd.CommandText = "select * from Cars.csv";

        using (var reader = cmd.ExecuteReader())
        while (reader.Read())
        {
            DumpRow(new DataRecordStringIndexedWrapper(reader));
        }
    }
}

using (var connection = new OdbcConnection(connectionString))
{
    connection.Open();

    var adapter = new OdbcDataAdapter("SELECT * FROM Cars.csv", connection);

    DataSet ds = new DataSet("Temp");
    adapter.Fill(ds);

    foreach (DataRow row in ds.Tables[0].Rows)
    {
        DumpRow(new DataRowStringIndexedWrapper(row));
    }
}

This is the power of following SOLID principles and studying design patterns. This example isn’t taken from the book, but you’ll learn these skills and more in the book.

Code

A console application of this example is available on github.

You can purchase Adaptive Code via C# from informIT or Amazon.

Data at the root level is invalid. Line 1, position 1.

Recently, I encountered a really weird problem with an XML document. I was trying to load a document from a string:

var doc = XDocument.Parse(someString);

I received this unhelpful exception message:

Data at the root level is invalid. Line 1, position 1.

I verified the XML document and retried two or three times with and without the XML declaration (both of which should work with XDocument). Nothing helped, so I googled for an answer. I found the following answer on StackOverflow by James Brankin:

I eventually figured out there was a byte mark exception and removed it using this code:

string _byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
if (xml.StartsWith(_byteOrderMarkUtf8))
{
    xml = xml.Remove(0, _byteOrderMarkUtf8.Length);
}

This solution worked. I was happy. I discussed it with a coworker and he had never heard of a BOM character before, so I thought “I should blog about this”.

Byte-Order Mark

The BOM is the sequence of bytes returned by Encoding.UTF8.GetPreamble(). Microsoft's documentation explains:

The Unicode byte order mark (BOM) is serialized as follows (in hexadecimal):

  • UTF-8: EF BB BF
  • UTF-16 big endian byte order: FE FF
  • UTF-16 little endian byte order: FF FE
  • UTF-32 big endian byte order: 00 00 FE FF
  • UTF-32 little endian byte order: FF FE 00 00

Converting these bytes to a string (Encoding.UTF8.GetString) allows us to check if the xml string starts with the BOM or not. The code then removes that BOM from the xml string.
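Since the UTF-8 preamble decodes to the single character U+FEFF, an equivalent (and arguably simpler) sketch of the same fix is to trim that character before parsing:

// Strip a leading byte-order mark, if present, then parse.
xml = xml.TrimStart('\uFEFF');
var doc = XDocument.Parse(xml);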

So the BOM is just a few bytes at the start of the text. So what? What does it do?

From Wikipedia:

The byte order mark (BOM) is a Unicode character used to signal the endianness (byte order) of a text file or stream. It is encoded at U+FEFF byte order mark (BOM). BOM use is optional, and, if used, should appear at the start of the text stream. Beyond its specific use as a byte-order indicator, the BOM character may also indicate which of the several Unicode representations the text is encoded in.

This explanation is better than the explanation from Microsoft. The BOM is (1) an indicator that a stream of bytes is Unicode and (2) a reference to the endianness of the encoding. UTF8 is agnostic of endianness (reference), so the fact that the BOM is there and causing problems in C# code is annoying. I didn't research why the UTF8 BOM wasn't stripped from the string (the XML is coming directly from SQL Server).

What is ‘endianness’?

Text is a string of bytes, where one or more bytes represents a single character. When text is transferred from one medium to another (from a flash drive to a hard drive, across the internet, between web services, etc.), it is transferred as a stream of bytes. Not all machines understand bytes in the same way, though. Some machines are 'little-endian' and some are 'big-endian'.

Wikipedia explains the etymology of ‘endianness’:

In 1726, Jonathan Swift described in his satirical novel Gulliver’s Travels tensions in Lilliput and Blefuscu: whereas royal edict in Lilliput requires cracking open one’s soft-boiled egg at the small end, inhabitants of the rival kingdom of Blefuscu crack theirs at the big end (giving them the moniker Big-endians).

For text encoding, ‘endianness’ simply means ‘which end goes first into memory’. Think of this as a direction for a set of bytes. The word ‘Example’ can be represented by the following bytes (example taken from StackOverflow):

45 78 61 6d 70 6c 65

‘Big Endian’ means the first bytes go first into memory:

45 78 61 6d 70 6c 65
<-------------------

‘Little Endian’ means the text goes into memory with the small-end first:

45 78 61 6d 70 6c 65
------------------->

So, when ‘Example’ is transferred as ‘Big-Endian’, it looks exactly as the bytes in the above examples:

45 78 61 6d 70 6c 65

But, when it’s transferred in ‘Little Endian’, it looks like this:

65 6c 70 6d 61 78 45

Users of digital technologies don’t need to care about this, as long as they see ‘Example’ where they should see ‘Example’. Many engineers don’t need to worry about endianness because it is abstracted away by many frameworks to the point of only needing to know which type of encoding (UTF8 vs UTF16, for example). If you’re into network communications or dabbling in device programming, you’ll almost definitely need to be aware of endianness.
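If you want to see endianness from C#, here's a small sketch using BitConverter and the classic host-to-network-order conversion (the output comments assume a little-endian machine, which is what you'll have on x86/x64):

using System;
using System.Net;

class EndiannessDemo
{
    static void Main()
    {
        int value = 0x12345678;

        // BitConverter uses the machine's native byte order.
        Console.WriteLine(BitConverter.IsLittleEndian);                            // True on x86/x64
        Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(value)));    // 78-56-34-12

        // Network protocols generally expect big-endian ("network byte order").
        int network = IPAddress.HostToNetworkOrder(value);
        Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(network)));  // 12-34-56-78
    }
}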

In fact, the endianness of text isn’t constrained by the system interacting with the text. You can work on a Big Endian operating system and install VoIP software that transmits Little Endian data. Understanding endianness also makes you cool.

Summary

I don’t have any code to accompany this post, but I hope the discussion of BOM and endianness made for a great read!

ServiceStack’s Markdown Razor Engine. Wow.

ServiceStack is a pretty sweet-looking alternative to WCF. It provides strongly-typed, well-designed REST/RPC+SOAP services for .NET and Mono. Check out the README in the repository to see how ridiculously easy it is to set up a service. Unfortunately, I haven't yet had a chance to use the stack. I was looking through the code today and saw a Markdown-Razor Engine.

I was like, "Wha….?"

Yes, they offer an alternative view engine which blends Markdown and Razor. These are my two favorite syntaxes, so I had to see how well it compiles.

I compiled the ServiceStack.RazorEngine from source (in Ubuntu 12.04/Mono, no less!) to test a simple markdown file.

I created this markdown file, example.md, to throw at the framework.

# Example ServiceStack.Markdown

## Showing @examples.Count items

@foreach (var item in examples) {
  - @item.Name: @item.Number 
}


**Note: The template requires a space after the item.Number value**

For more information, check out 
[the docs](http://www.servicestack.net/docs/markdown/markdown-razor).

Also, *don't forget* to check out the code at 
[gh:ServiceStack/ServiceStack](https://github.com/ServiceStack/ServiceStack)

Here are some _other_ attempts to **break** 
the markdown **generation_of_html**. **escaped\_underscore\_in\_tags**

A space after double-asterisk (<strong> tags) will ** break **

Now, when I say simple, I mean it has most elements of markdown that I use regularly. It also uses the markdown-razor foreach syntax which iterates over a collection of Example objects. This view is not strongly-typed, so this is equivalent to a dynamic Razor view.
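Based on the Name and Number properties the template uses, the Example class and the collection handed to the view probably look something like this (a sketch; the real code is in the github repo for this post):

public class Example
{
    public string Name { get; set; }
    public int Number { get; set; }
}

// requires System.Collections.Generic
var examples = new List<Example>
{
    new Example { Name = "First", Number = 1 },
    new Example { Name = "Second", Number = 2 }
};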

The code to compile this view is really simple. You create the template engine and provide a markdown page object, which takes a full path (e.g. /fakepath/Debug) and the template contents (e.g. the above markdown snippet).

// Create the markdown-razor template compiler
MarkdownFormat format = new MarkdownFormat();
string contents = File.ReadAllText(template);
var page = new MarkdownPage(format, path, "example", contents );
format.AddPage(page);

Then, you create your view object (ViewBag?).

// Create our view container (ViewBag)
var view = new Dictionary<string, object>() 
{
	{ "examples", examples }
};

All that’s left is to pass in the view object and compile it to html!

// Compile and output. 
// This can be redirected to html file 
// e.g. RazorExample.exe > output.html
var html = format.RenderDynamicPageHtml("example", view);
Console.WriteLine(html);

The Code

As always, the code for this post is available on github. More examples of the Markdown Razor syntax are available in the test project in ServiceStack's repository.

Proxy Objects

I started to familiarize myself with proxy objects a couple of years ago when I started using Fluent NHibernate on a pretty large project. NHibernate itself proxies objects to allow the framework to do its magic. Proxies are a fantastic thing. I spoke with a friend of mine today about some advanced coding techniques, including proxy objects and IoC, which made me want to write a little about some of those topics.

From Wikipedia:

A proxy, in its most general form, is a class functioning as an interface to something else. The proxy could interface to anything: a network connection, a large object in memory, a file, or some other resource that is expensive or impossible to duplicate.

Charles Bretana gave an excellent and succinct definition of proxy objects on Stack Overflow back in 2008:

One purpose is to "pretend" to be the real class so a client component (or object) can "believe" it's talking to the "real" object, but inside the proxy, other stuff (like logging, transactional support, etc.) is being done at the same time… Secondly, a proxy can be very cheap in comparison to the real object, and often is used so that the real objects can be conserved (turned off or released to a pool to be used by other clients) when the client is not using them… The proxy stays "alive" and the client thinks it still has a connection to the real object, but whenever it "calls" the object, it is actually calling the proxy, which goes and gets another real object just to handle the call, and then releases the real object when the call is done.

I’ve created an example (available on github) which demonstrates how to proxy method calls in a few different ways, ranging from very simple to using Castle.DynamicProxy and interceptors. The code is written in C# and although the code doesn’t handle some of the more advanced topics of proxy objects, such as resource pooling as Charles described, it will (I hope) introduce proxy objects in a comprehensible way.
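To give a taste of the simplest end of that spectrum, here's a minimal hand-rolled logging proxy (a sketch with hypothetical names, not the code from the repo):

public interface IWorker
{
    void DoWork();
}

public class RealWorker : IWorker
{
    public void DoWork()
    {
        // The expensive, "real" work happens here.
    }
}

// The proxy implements the same interface, so callers can't tell the difference,
// but it can add behavior (logging, in this case) around the real call.
public class LoggingWorkerProxy : IWorker
{
    private readonly IWorker _inner;

    public LoggingWorkerProxy(IWorker inner)
    {
        _inner = inner;
    }

    public void DoWork()
    {
        Console.WriteLine("Calling DoWork...");
        _inner.DoWork();
        Console.WriteLine("DoWork finished.");
    }
}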

Read on for killer examples.

Debug.WriteLine

I answered a question on StackOverflow last week which made me remember a few years ago when I also wondered, “Where does Debug.WriteLine go?!”

The answer is very simple (and no, I’m not peeved that I was the first to answer and didn’t get accepted).

If you look at the Debug.WriteLine documentation on MSDN, you will see a ConditionalAttribute:

[ConditionalAttribute("DEBUG")]
public static void WriteLine(
	string message
)

This attribute is one of a few* compile-time attributes. In other words, the compiler looks for this attribute specifically and, if the condition is not met (i.e. 'DEBUG' is not defined), CALLS TO THIS METHOD ARE NOT COMPILED into the caller. That's pretty awesome, but you don't need to take my word for it. I've created a quick example.

Program.cs

Program.cs is a small program with three important pieces: Console.WriteLine, Debug.WriteLine, and a final Console.WriteLine. We have a before and after Console.WriteLine to show that Debug.WriteLine is not compiled into Program.exe when DEBUG is not defined.

using System;
using System.Diagnostics;

class Program {
    static void Main(string[] args) {
        Console.WriteLine("Console.WriteLine #1");
        Debug.WriteLine("Debug.WriteLine");
        Console.WriteLine("Console.WriteLine #2");
    }
}

To compile normally, run:

csc Program.cs

To compile with the DEBUG constant defined, run:

csc /define:DEBUG Program.cs

You can run Program.exe after both compiles, and you will see the same console output either way.

The IL

You can use a tool called ildasm to dump the IL code of Program.exe (ildasm is part of the Windows SDK; after installation it can be found at, e.g., C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Bin\NETFX 4.0 Tools).

Just open ildasm, choose File->Open and select Program.exe. Next, choose File->Dump. Choose ‘Dump IL Code’ and any other options you want.

ildasm options

Here is the output of the main method for both versions of the exe:

Program-NoDefines.il

  .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       24 (0x18)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "Console.WriteLine #1"
    IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000b:  nop
    IL_000c:  ldstr      "Console.WriteLine #2"
    IL_0011:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_0016:  nop
    IL_0017:  ret
  } // end of method Program::Main

Program-DEBUGDefined.il

 .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       35 (0x23)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "Console.WriteLine #1"
    IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000b:  nop
    IL_000c:  ldstr      "Debug.WriteLine"
    IL_0011:  call       void [System]System.Diagnostics.Debug::WriteLine(string)
    IL_0016:  nop
    IL_0017:  ldstr      "Console.WriteLine #2"
    IL_001c:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_0021:  nop
    IL_0022:  ret
  } // end of method Program::Main

As you can see, the call to Debug.WriteLine is only compiled into Program.exe when we define the DEBUG constant! Pretty cool, huh?
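You can use the same mechanism on your own methods. A minimal sketch:

using System;
using System.Diagnostics;

static class Tracer
{
    // Calls to this method are removed by the compiler at each call site
    // unless DEBUG is defined when that caller is compiled.
    [Conditional("DEBUG")]
    public static void Trace(string message)
    {
        Console.WriteLine("TRACE: " + message);
    }
}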

* The only other compile-time attribute I know off-hand is ObsoleteAttribute, which generates a compiler warning by default, or a compiler error if you pass true as the second constructor parameter.
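For completeness, a tiny sketch of ObsoleteAttribute's two behaviors (fragments; these would live inside a class):

// Compiler warning at call sites:
[Obsolete("Use NewMethod instead.")]
public void OldMethod() { }

// Compiler error at call sites, because the second argument is true:
[Obsolete("Use NewMethod instead.", true)]
public void ReallyOldMethod() { }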