LINQ to SQL Outer Joins

Tuesday, 21 October 2008 11:32 by slanford

Transitioning from standard SQL to LINQ to SQL is in many ways very intuitive, both conceptually and syntactically. Some routine tasks are however decidedly unintuitive, at least in comparison to standard SQL. One such task is constructing outer joins. As you will soon see the LINQ to SQL syntax for performing outer joins is far less intuitive for most people than the OUTER JOIN keyword syntax of standard SQL. Hopefully in a future version of LINQ to SQL Microsoft will give us an OUTER JOIN operator but until then here are four ways to do outer joins in LINQ to SQL.

Suppose we have a Patients table and a Claims table where a patient can have zero or more claims and we want to return all patients, regardless of whether they have a claim, and any claim information we have if any. This is of course very simple to do in standard SQL with an outer join. Let's see how to accomplish this with LINQ to SQL.

Method 1: Using a Group Join with DefaultIfEmpty()

            MedicalDataContext db = new MedicalDataContext();
            var query = (  from p in db.Patients
                           join c in db.Claims on p.UnqPatientId equals c.PatientId into claims
                           from c in claims.DefaultIfEmpty()
                         select new { Patient = p, Claim = c });
 
            foreach (var item in query)
            {
                System.Diagnostics.Debug.WriteLine(String.Format("Account {0}: #{1}.", item.Patient.AccountNumber, (item.Claim == null ? String.Empty : item.Claim.ClaimNumber)));
            }

A group join is a join that uses the into operator to coalesce parent and child information into a single hierarchical collection so you end up with parent information appearing only once in the collection as opposed to flattened out into multiple instances as you would normally get. To convert this group join into a left join we simply add another from clause to select from the group and apply the DefaultIfEmpty() method to force the inclusion of null child references where parents have no children. Note that this method of performing an outer join produces a flattened result.

Method 2:  Using GroupJoin()

            MedicalDataContext db = new MedicalDataContext();
            var query = db.Patients.GroupJoin(db.Claims,
                                              p => p.UnqPatientId,
                                              c => c.PatientId,
                                              (Patient, Claims) => new { Patient, Claims });
 
            foreach (var item in query)
            {
                System.Diagnostics.Debug.WriteLine(String.Format("Account {0} has {1} claims.", item.Patient.AccountNumber, item.Claims.Count()));
            }

In the above example I am using the GroupJoin() method to define my outer join. The first parameter specifies the inner table to join on, the second parameter specifies the key column of the outer table, the third parameter specifies the key column of the inner table and finally the fourth parameter specifies what to return. Note that this method of performing an outer join produces a hierarchical result.

 Method 3:  Project Directly Into Object Hierarchy

            MedicalDataContext db = new MedicalDataContext();
            var query = (  from p in db.Patients
                         select new
                         {
                             Patient = p,
                             Claims = db.Claims.Where(c => c.PatientId == p.UnqPatientId)
                         });
 
            foreach (var item in query)
            {
                System.Diagnostics.Debug.WriteLine(String.Format("Account {0} has {1} claims.", item.Patient.AccountNumber, item.Claims.Count()));
            }

In this example I am using a Lambda Expression to project the claim data directly into the object hierarchy. Note that this method of performing an outer join produces a hierarchical result.

 Method 4:  Use Association Property

            MedicalDataContext db = new MedicalDataContext();
            var query = (  from p in db.Patients
                         select new
                         {
                             Patient = p,
                             Claims = p.Claims
                         });
 
            foreach (var item in query)
            {
                System.Diagnostics.Debug.WriteLine(String.Format("Account {0} has {1} claims.", item.Patient.AccountNumber, item.Claims.Count()));
            }

In this example I am using the Claims association property of the Patient object to project the claim data into the object hierarchy. Note that this method of performing an outer join produces a hierarchical result.

Categories:   LINQ to SQL
Actions:   E-mail | Permalink | Comments (0) | Comment RSSRSS comment feed

Returning LINQ to SQL Entities from a WCF Service

Monday, 6 October 2008 10:50 by slanford

When returning LINQ to SQL entities from a WCF service using the default properties for your LINQ to SQL entity model you will almost certainly encounter an immediate error when you attempt to call a service method. In fact, the only scenario I can think of in which you may not receive out-of-the box errors is if your model contains no relationships. For example, consider the following entity model and service method.

NOTE: The following scenario will only work if running service pack 1 for the .NET Framework 3.5 because the LINQ to SQL entity model is set to the default serialization mode of "None". This will work with SP1 because SP1 adds implicit DataContractSerializer support for Plain Old CLR Objects (POCO) by serializing public fields and read/write properties. Later in this post you'll see how to change the serialization mode to remove this restriction as well as solve the issue being presented.

LINQ to SQL Entity Model  (with Serialization Mode = "None")

        public List<Data.Employee> GetEmployees()
        {
            Data.NorthwindDataContext context = new Data.NorthwindDataContext();
 
            List<Data.Employee> employees = (  from e in context.Employees
                                             select e).ToList<Data.Employee>();
 
            return employees;
        }

WCF Service Method 

Now at this point if I call GetEmployees() from a client I would be greeted with a "The underlying connection was closed: The connection was closed unexpectedly." WebException. Why is this? The obvious answer is some exception occurred in the service method...but what? My first instinct was that it had something to do with the deferred loading mechanism of LINQ to SQL since the above scenario works without error if I simply remove the entity model relationship. Regardless of the cause I knew that the error must be happening during serialization since I could step through the service method without error. So as a quick way to view the underlying exception without having to enable tracing on the server I modified the service method to serialize my result inline.

        public List<Data.Employee> GetEmployees()
        {
            Data.NorthwindDataContext context = new Data.NorthwindDataContext();
 
            List<Data.Employee> employees = (  from e in context.Employees
                                             select e).ToList<Data.Employee>();
 
            //
            //Serialize the result to debug error
            //
            try
            {
                DataContractSerializer dcs = new DataContractSerializer(typeof(List<Data.Employee>));
                using (System.IO.MemoryStream ms = new System.IO.MemoryStream())
                {
                    dcs.WriteObject(ms, employees);
                }
            }
            catch (Exception ex)
            {
 
            }
 
            return employees;
        }

Modifed Debug Version of WCF Service Method 

Now when stepping through the service method I was able to see the actual exception which was not what I initally expected. The exception was "Object graph for type ... contains cycles and cannot be serialized if reference tracking is disabled." The cause of this error is due to circular references in the entity objects as a result of the defined relationship; Employee objects have a reference to one or more EmployeeTerritory objects and each EmployeeTerritory object has a reference back to its parent Employee object which can't be represented when serialized unless we enable the preservation of theses object references during serialization. There are two ways to enable object reference preservation for our LINQ to SQL classes.

1. The first way is to instruct the DataContractSerializer to preserve object references by setting the preserveObjectReferences of the appropriate constructor overload to true. For details on how to do this in WCF see Sowmy Srinivasan's post here. Note that even though this seems to work in my limited testing it is not supported. To quote the MSDN documentation here..."Unidirectional serialization is the only type of serialization supported by LINQ to SQL."

2. That brings us to the second and recommended way of enabling object reference preservation; change the Serialization Mode for the entity model to "Unidirectional" in Visual Studio.

With unidirectional serialization, as the name implies, bidirectional associations are not maintained. That is, the serialized object will only maintain a one-way association to avoid a circular reference. By convention, the property on the parent side of the association is serialized and the property on the child side of the association is not. So in my example the Employee.EmployeeTerritories[] property would be serialized but the EmployeeTerritory.Employee property would not.

There is however an unintended side-effect of changing the Serialization Mode. That is related data is not automatically fetched as the object graph is walked upon serialization. So in this example if we simply change the serialization mode to "Unidirectional" and do nothing else, what we get on the client side is a list of Employee objects with a null EmployeeTerritories[] property. Fortunately there is a relatively easy way to solve this by instructing LINQ to SQL to retrieve the EmployeeTerritory data along with the Employee data. To do this we need to specify the DataLoadOptions for the DataContext. Here is the modified WCF service method.

 
        public List<Data.Employee> GetEmployees()
        {
            Data.NorthwindDataContext context = new Data.NorthwindDataContext();
            System.Data.Linq.DataLoadOptions dlo = new System.Data.Linq.DataLoadOptions();
            dlo.LoadWith<Data.Employee>(e => e.EmployeeTerritories);
            context.LoadOptions = dlo;
 
            List<Data.Employee> employees = (  from e in context.Employees
                                             select e).ToList<Data.Employee>();
 
            return employees;
        }
Modified WCF Service Method To Include EmployeeTerritory Data In Results 

Now when we call this service method we get both Employee and EmployeeTerritory data at the client as expected.

Categories:   LINQ to SQL | WCF
Actions:   E-mail | Permalink | Comments (5) | Comment RSSRSS comment feed