Tag Archives: LINQ

Data at the root level is invalid. Line 1, position 1.

Recently, I encountered a really weird problem with an XML document. I was trying to load a document from a string:

var doc = XDocument.Parse(someString);

I received this unhelpful exception message:

Data at the root level is invalid. Line 1, position 1.

I verified the XML document and retried two or three times with and without the XML declaration (both of which should work with XDocument). Nothing helped, so I googled for an answer. I found the following answer on StackOverflow by James Brankin:

I eventually figured out there was a byte mark exception and removed it using this code:

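string _byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
// Ordinal comparison: a culture-sensitive StartsWith can ignore the zero-width BOM and match any string
if (xml.StartsWith(_byteOrderMarkUtf8, StringComparison.Ordinal))
{
    xml = xml.Remove(0, _byteOrderMarkUtf8.Length);
}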

This solution worked. I was happy. I discussed it with a coworker and he had never heard of a BOM character before, so I thought “I should blog about this”.

Byte-Order Mark

The UTF-8 BOM is the byte sequence returned by Encoding.UTF8.GetPreamble(). Microsoft’s documentation explains:

The Unicode byte order mark (BOM) is serialized as follows (in hexadecimal):

  • UTF-8: EF BB BF
  • UTF-16 big endian byte order: FE FF
  • UTF-16 little endian byte order: FF FE
  • UTF-32 big endian byte order: 00 00 FE FF
  • UTF-32 little endian byte order: FF FE 00 00

Decoding these bytes with Encoding.UTF8.GetString yields the single character U+FEFF, which lets us check whether the xml string starts with the BOM. The code then removes that BOM from the xml string.
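A minimal sketch of the same check, assuming the string carries at most one leading BOM. The class and method names (BomSketch, StripBom, DecodedPreamble) are made up for illustration and are not part of the StackOverflow answer. It compares the first char directly, so no culture rules are involved.

```csharp
using System;
using System.Text;

public static class BomSketch
{
    // Hypothetical helper: removes one leading U+FEFF, if present.
    public static string StripBom(string text)
    {
        if (!string.IsNullOrEmpty(text) && text[0] == '\uFEFF')
        {
            return text.Substring(1);
        }
        return text;
    }

    // Decodes the UTF-8 preamble bytes (EF BB BF) into a string.
    public static string DecodedPreamble()
    {
        return Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
    }
}
```

With this in place, the load would read something like XDocument.Parse(BomSketch.StripBom(someString)).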

A BOM is a bunch of characters, so what? What does it do?

From Wikipedia:

The byte order mark (BOM) is a Unicode character used to signal the endianness (byte order) of a text file or stream. It is encoded at U+FEFF byte order mark (BOM). BOM use is optional, and, if used, should appear at the start of the text stream. Beyond its specific use as a byte-order indicator, the BOM character may also indicate which of the several Unicode representations the text is encoded in.

This explanation is better than the explanation from Microsoft. The BOM is (1) an indicator that a stream of bytes is Unicode and (2) a reference to the endianness of the encoding. UTF-8 is agnostic of endianness (reference), so the fact that the BOM is there and causing problems in C# code is annoying. I didn’t research why the UTF-8 BOM wasn’t stripped from the string (the XML is coming directly from SQL Server).

What is ‘endianness’?

Text is a string of bytes, where one or more bytes represents a single character. When text is transferred from one medium to another (from a flash drive to a hard drive, across the internet, between web services, etc.), it is transferred as a stream of bytes. Not all machines understand bytes in the same way, though. Some machines are ‘little-endian’ and some are ‘big-endian’.

Wikipedia explains the etymology of ‘endianness’:

In 1726, Jonathan Swift described in his satirical novel Gulliver’s Travels tensions in Lilliput and Blefuscu: whereas royal edict in Lilliput requires cracking open one’s soft-boiled egg at the small end, inhabitants of the rival kingdom of Blefuscu crack theirs at the big end (giving them the moniker Big-endians).

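For text encoding, ‘endianness’ simply means ‘which end of each multi-byte unit goes first into memory’. It is a direction for the bytes inside one unit, not for the text as a whole: the characters always stay in reading order. In UTF-8, every unit is a single byte, so there is nothing to reorder. The word ‘Example’ is always these bytes (example taken from StackOverflow):

45 78 61 6d 70 6c 65

In UTF-16, each of these characters takes two bytes (‘E’ is 0045). ‘Big Endian’ means the big end, the most significant byte, of each unit goes first into memory:

00 45 00 78 00 61 00 6d 00 70 00 6c 00 65

‘Little Endian’ means each unit goes into memory with the small end, the least significant byte, first:

45 00 78 00 61 00 6d 00 70 00 6c 00 65 00

So, when ‘Example’ is transferred as Big Endian UTF-16, the byte pairs read the same way as the code points. When it’s transferred as Little Endian, every pair is swapped. A reader that guesses the wrong order sees 4500 instead of 0045 for ‘E’, which is exactly the ambiguity the BOM resolves: FE FF announces Big Endian and FF FE announces Little Endian.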
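To check byte order yourself, here is a small sketch using the encodings built into .NET. Encoding.Unicode is little-endian UTF-16 and Encoding.BigEndianUnicode is big-endian UTF-16. The EndianSketch class and its method names are made up for illustration. Note that byte order applies inside each two-byte UTF-16 unit, while the UTF-8 bytes have no byte order to choose.

```csharp
using System;
using System.Text;

public static class EndianSketch
{
    // Formats bytes as lowercase hex pairs separated by spaces.
    public static string Hex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", " ").ToLowerInvariant();
    }

    // UTF-8: one byte per ASCII character, no byte order involved.
    public static string Utf8(string s)
    {
        return Hex(Encoding.UTF8.GetBytes(s));
    }

    // UTF-16 little endian: least significant byte of each unit first.
    public static string Utf16Little(string s)
    {
        return Hex(Encoding.Unicode.GetBytes(s));
    }

    // UTF-16 big endian: most significant byte of each unit first.
    public static string Utf16Big(string s)
    {
        return Hex(Encoding.BigEndianUnicode.GetBytes(s));
    }
}
```

For the two-character string "Ex", Utf16Little gives 45 00 78 00 and Utf16Big gives 00 45 00 78. GetBytes does not prepend a BOM, so only the text itself shows up.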

Users of digital technologies don’t need to care about this, as long as they see ‘Example’ where they should see ‘Example’. Many engineers don’t need to worry about endianness either, because frameworks abstract it away to the point where you only need to know which encoding is in use (UTF-8 vs UTF-16, for example). If you’re into network communications or dabbling in device programming, you’ll almost definitely need to be aware of endianness.

In fact, the endianness of text isn’t constrained by the system interacting with the text. You can work on a Big Endian operating system and install VoIP software that transmits Little Endian data. Understanding endianness also makes you cool.

Summary

I don’t have any code to accompany this post, but I hope the discussion of BOM and endianness made for a great read!


DRY! GenericComparer for sorting Generic Lists

I’m a pretty firm believer in the Ruby/Ruby on Rails idea of DRY (“Don’t Repeat Yourself”).

That said, I get pretty tired of writing comparers for sorting lists and generic lists. Every one of these comparers is exactly the same: you specify a list of properties related to the object and a sort direction, then call compare on those properties.

This can be changed with a little reflection:

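    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Reflection;

    public class GenericComparer<T> : IComparer<T>
    {
        // Name of the public property on T to sort by
        public string SortExpression { get; set; }
        public int SortDirection { get; set; } // 0:Ascending, 1:Descending

        public GenericComparer(string sortExpression, int sortDirection)
        {
            this.SortExpression = sortExpression;
            this.SortDirection = sortDirection;
        }
        public GenericComparer() { }

        #region IComparer<T> Members
        public int Compare(T x, T y)
        {
            // Look up the property by name via reflection
            PropertyInfo propertyInfo = typeof(T).GetProperty(SortExpression);
            if (propertyInfo == null)
            {
                throw new ArgumentException("No public property '" + SortExpression + "' on " + typeof(T).Name);
            }
            object obj1 = propertyInfo.GetValue(x, null);
            object obj2 = propertyInfo.GetValue(y, null);

            // Comparer.Default tolerates null property values and calls IComparable.CompareTo otherwise
            if (SortDirection == 0)
            {
                return Comparer.Default.Compare(obj1, obj2);
            }
            else return Comparer.Default.Compare(obj2, obj1);
        }
        #endregion
    }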

This is a code snippet I love to have in my arsenal.

Here’s how you use it:

List<MyObject> objectList = GetObjects(); /* from your repository or whatever */
objectList.Sort(new GenericComparer<MyObject>("ObjectPropertyName", (int)SortDirection.Descending));
dropdown.DataSource = objectList;
dropdown.DataBind();

Note that Sort returns void, so you can’t put the call on the right-hand side of a DataSource assignment. Sort the list first, then set DataSource and call DataBind.
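If the two-step dance bothers you, one way around it is a tiny helper that sorts a copy and hands it back. This is only a sketch: SortSketch and Sorted are names I made up, and it works with any IComparer<T>, not just the GenericComparer above.

```csharp
using System.Collections.Generic;

public static class SortSketch
{
    // Hypothetical helper: copies the source, sorts the copy with the
    // given comparer, and returns it. The original sequence is untouched.
    public static List<T> Sorted<T>(IEnumerable<T> source, IComparer<T> comparer)
    {
        List<T> copy = new List<T>(source);
        copy.Sort(comparer);
        return copy;
    }
}
```

That allows a one-liner such as dropdown.DataSource = SortSketch.Sorted(GetObjects(), new GenericComparer<MyObject>("ObjectPropertyName", 1)); at the cost of one extra list allocation.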


Useful Serialization Methods of LINQ to SQL objects

I am just going through a project for my senior “Projects in Information Systems” class, commenting most of the complex logic. I came across these two methods I wrote to serialize LINQ to SQL objects. They came in pretty handy, so I thought I’d share even though they don’t have error-handling. I’m not too embarrassed.

The way I used them was pretty hacky. I was having trouble getting the objects I needed to work in a Session object (I think it had something to do with an Order_Details table having two foreign keys to the same table, Ledger, for an IN and OUT field). Anyway, I serialized the object to XML and just stored the whole string in a listBox’s value. I know that’s begging for poor performance, and overriding the data validation checks opens the ASP.NET page up to security issues, but the only people who are going to use this are me and my instructor. So, it was a quick workaround. I could have stored it in a Session object as a string, but then I would have had to come up with some funky workaround for the listBox. Anyway, enough about that.

using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

/// <summary>
/// Serializes a LINQ object to an XML string
/// </summary>
/// <typeparam name="T">Type of the Object</typeparam>
/// <param name="linqObject">The LINQ object to convert</param>
/// <returns>string</returns>
public static string SerializeLINQtoXML<T>(T linqObject)
{
   // see http://msdn.microsoft.com/en-us/library/bb546184.aspx
   // The runtime type of the object is used as the contract type
   DataContractSerializer dcs = new DataContractSerializer(linqObject.GetType());

   StringBuilder sb = new StringBuilder();
   // using flushes and closes the writer even if WriteObject throws
   using (XmlWriter writer = XmlWriter.Create(sb))
   {
      dcs.WriteObject(writer, linqObject);
   }

   return sb.ToString();
}

/// <summary>
/// Deserializes an XML string to a LINQ object
/// </summary>
/// <typeparam name="T">The type of the LINQ Object</typeparam>
/// <param name="input">XML input</param>
/// <returns>Type of the LINQ Object</returns>
public static T DeserializeLINQfromXML<T>(string input)
{
   DataContractSerializer dcs = new DataContractSerializer(typeof(T));

   using (TextReader treader = new StringReader(input))
   using (XmlReader reader = XmlReader.Create(treader))
   {
      // true = verify that the root element name matches T's data contract
      return (T)dcs.ReadObject(reader, true);
   }
}
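Here is a self-contained round trip to show the idea end to end. It is a sketch under assumptions: OrderSketch is a made-up stand-in for a LINQ to SQL entity, marked up by hand with [DataContract] and [DataMember], and the ToXml/FromXml helpers are compact copies of the two methods above so the block compiles on its own.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical stand-in for a LINQ to SQL entity.
[DataContract]
public class OrderSketch
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Customer { get; set; }
}

public static class RoundTripSketch
{
    // Serializes any data-contract object to an XML string.
    public static string ToXml<T>(T value)
    {
        DataContractSerializer dcs = new DataContractSerializer(typeof(T));
        StringBuilder sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb))
        {
            dcs.WriteObject(writer, value);
        }
        return sb.ToString();
    }

    // Rebuilds the object from the XML string produced by ToXml.
    public static T FromXml<T>(string xml)
    {
        DataContractSerializer dcs = new DataContractSerializer(typeof(T));
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            return (T)dcs.ReadObject(reader, true);
        }
    }
}
```

Serializing an OrderSketch with Id 7 and Customer "Ann" and reading it back gives a new object with the same two values, which is all the listBox trick needed.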


Displaying multiple fields in a Dropdownlist’s DataTextField

I’ve encountered this problem on occasion, where I want to display more than one field in a dropdownlist’s DataTextField property. In the past, I’ve overcome this problem by rewriting a SQL statement, or adding another column in the database itself to accommodate my needs.

In one of my classes (INFO 465: Projects in Information Systems @ VCU), we’re working from a database which we’re not allowed to change, because the instructor uses the same database for his examples. I could just write another method into my business logic layer, but it would get cluttered pretty quickly. So, I decided to make use of LINQ and found the following solution:

  ddlUsers.DataSource = BLL.Employee.GetEmployees()
                  .Select(be => 
                      new { 
                          ID = be.Id, 
                          FullName = String.Format("{0}{1}{2}", 
                                      be.LastName,
                                      (!string.IsNullOrEmpty(be.FirstName) ? ", " : string.Empty),
                                      be.FirstName)
                          }).AsEnumerable();

This takes the List of Business Entity objects and uses LINQ’s Select method to project each one into an anonymous type. The only downside is that an anonymous type can’t be named outside the method that creates it, so the result is effectively local. But, since I’m only using this in a dropdown, it’s a pretty nifty trick.
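If the projected list ever needs to leave the method, one option is to project into a small named class instead of an anonymous type. The sketch below assumes made-up types (EmployeeSketch for the business entity, NameItem for the result) and uses properties rather than fields, since data binding looks up DataTextField and DataValueField by property name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the business entity.
public class EmployeeSketch
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical named replacement for the anonymous type.
public class NameItem
{
    public int ID { get; set; }
    public string FullName { get; set; }
}

public static class DropdownSketch
{
    // Projects employees into "Last, First" items; the comma is
    // omitted when there is no first name.
    public static List<NameItem> ToNameItems(IEnumerable<EmployeeSketch> employees)
    {
        return employees
            .Select(e => new NameItem
            {
                ID = e.Id,
                FullName = string.IsNullOrEmpty(e.FirstName)
                    ? e.LastName
                    : e.LastName + ", " + e.FirstName
            })
            .ToList();
    }
}
```

The dropdown would then set DataTextField to "FullName" and DataValueField to "ID" before calling DataBind.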


LINQ: more like "Luckily I Never Quit"

I’ve been using LINQ quite a bit lately. As the post title says: luckily, I never quit.

There are a lot of things to get used to with LINQ. For one, it defers work. A query doesn’t run until you enumerate it, and when you make a change in LINQ to SQL, you must call SubmitChanges() before the change is reflected in the database. Yes, this mirrors the actions when working directly with a database (COMMIT), but when you’re thinking of it as application logic, it seems counter-intuitive.
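The deferred part is easy to see without a database. The sketch below uses LINQ to Objects rather than LINQ to SQL, and the DeferredSketch class is made up for illustration: the query is defined before an item is added to the list, yet the added item still shows up because nothing runs until the query is enumerated.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    public static int[] Run()
    {
        List<int> numbers = new List<int> { 1, 2, 3 };

        // Defining the query does not execute it.
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

        // Added after the query was defined, but before it runs.
        numbers.Add(4);

        // ToArray enumerates the query, so it sees the current list.
        return evens.ToArray();
    }
}
```

Run returns 2 and 4, not just 2. Calling ToList or ToArray right away is the usual way to take a snapshot when you want the results frozen at that point.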

<aside>

At school, I’m in a class called INFO 465: Projects in Information Systems. The class is slow-going. At VCU, there are three tracks of studies in Information Systems: Application Development, Business Analysis, and Network Administration. Unfortunately, most of the people in BA and Networking are there because they don’t want to do any programming at all. You would think a class geared to using your course skills would accommodate each track equally?

That’s not the case in this class. There are two projects. One is individual work and the other is group work. The individual project is really just a large programming assignment to create an “Enterprise System” for a landscaping company. Luckily, the company consists of the landscaper, a truck, and his helper.

Anyway, to get back on track, we’re allowed to do our project any way we want. Most of the people in the class chose to do Windows Forms using VB.NET examples provided by the teacher. Good luck with that! I chose to use ASP.NET, AJAX, Web Services, and LINQ. I chose this route because I wanted this class to be a learning experience, not just a copy-paste session for 3 hours a week.

</aside>

Back to LINQ. To be quick about development, I’ve decided to use GridViews, DetailsViews, and LinqDataSources. Every time I try to link a LinqDataSource to a GridView’s selected key, I get this message:

Operator '==' incompatible with operand types 'Int32' and 'Object'

Here is an example of the LinqDataSource that throws this error:

<asp:LinqDataSource ID="srcSelectedOrder" runat="server"
        ContextTypeName="INFO465_First_Web.DatabaseMaps.OrdersDataContext"
        EnableDelete="True" EnableInsert="True" EnableUpdate="True" TableName="Orders"
        Where="Id == @Id">
        <WhereParameters>
            <asp:ControlParameter ControlID="GridView1" Name="Id"
                PropertyName="SelectedValue" Type="Int32" />
        </WhereParameters>
    </asp:LinqDataSource>

The problem is the Where clause "Id == @Id". For some reason, the data source thinks @Id is an object instead of an Int32, even though Type="Int32" is set in the WhereParameters.

The fix is to change this to Where="Id == Int32?(@Id)". I don’t know why this is. I mean, if you’re taking a primary key as the DataKeyName and it is being passed as Int32 in the WhereParameters, why do you have to cast it? Maybe the LinqDataSource automatically converts Id == @Id to an SQL query that would look like "Id == '2'", and you have to cast to Int32 explicitly so it will look like "Id == 2" in the query? Again, this is another place where LINQ just seems backwards.

Here is the fixed data source:

<asp:LinqDataSource ID="srcSelectedOrder" runat="server"
        ContextTypeName="INFO465_First_Web.DatabaseMaps.OrdersDataContext"
        EnableDelete="True" EnableInsert="True" EnableUpdate="True" TableName="Orders"
        Where="Id == Int32?(@Id)">
        <WhereParameters>
            <asp:ControlParameter ControlID="GridView1" Name="Id"
                PropertyName="SelectedValue" Type="Int32" />
        </WhereParameters>
    </asp:LinqDataSource>

(I always format my code with http://www.manoli.net/csharpformat/)
