Tuesday, September 9, 2008

Language Integrated Query (LINQ)





Data is a key element of almost any application. Inevitably, it needs to be listed, sorted, analyzed, or displayed in some fashion; that is the nature of what we, as programmers, do. We accomplish this by performing the appropriate operations manually and relying on the functionality the .NET Framework already provides. We also rely heavily on external data sources, such as SQL Server or XML files.
Before LINQ, code that queried a data source had to send the query to that source as a string, where it would be executed. This separated functionality and control between the application and the data. The .NET Framework has always provided functionality (such as ADO.NET) that made this fairly painless, but developers needed intimate knowledge of the data source and its query language to accomplish their goals. Most developers have become used to working with data in this manner and have adapted accordingly.
Language Integrated Query (LINQ, pronounced “link”) is designed to resolve this situation and is one of the major additions in .NET Framework 3.5. At its core, LINQ is a set of features that, used together, let you query any data source. Data can be queried and joined across multiple, differing data sources, such as a SQL Server database and an XML file. The initial release of VB 9.0 includes several APIs that extend LINQ and support the most common data sources, as listed in Table 6-1. LINQ was also designed to be extended, so you can create full query support for any data source not covered by the included APIs.
LINQ to Objects, represented by the System.Linq namespace, extends the core LINQ framework and provides the mechanisms necessary to query data stored in objects that implement IEnumerable(Of T). Querying objects that implement only the nongeneric IEnumerable interface is also supported but requires an extra step, which is covered in recipe 6-2. A standard query consists of one or more query operators that query the given data source and return the specified results. If you have any familiarity with Structured Query Language (SQL), which LINQ closely resembles, you will quickly recognize these standard operators. Here is an example query, assuming names is an IEnumerable(Of String):
Dim query = From name In names
This query uses the From clause, which designates the source of the data. This clause is structured like a For Each...Next loop, where you specify a variable to be used as the iterator (in this case, name) and the source (in this case, names). As the example shows, you do not need to specify the data type for the iterator because it is inferred from the data type of the source. It is possible to reference more than one data source in a single From clause, which allows you to query each source or a combination of both (see recipe 6-11 for more details).
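As a minimal sketch of a From clause that references two sources, the following pairs every item in one array with every item in another. The module name, the BuildLabels helper, and the sample data are assumptions made for illustration; they are not part of the recipe, and recipe 6-11 may approach this differently.

```vb
Option Infer On
Imports System
Imports System.Linq

Module MultipleSourcesSketch

    ' Hypothetical helper: combines each letter with each number.
    ' Listing two sources in one From clause produces every pairing.
    Public Function BuildLabels(ByVal letters As String(), ByVal numbers As Integer()) As String()
        Dim query = From letter In letters, number In numbers _
                    Select letter & number.ToString()

        ' ToArray runs the query and copies the results into an array.
        Return query.ToArray()
    End Function

    Sub Main()
        ' Sample data, assumed for illustration only.
        Dim letters() As String = {"A", "B"}
        Dim numbers() As Integer = {1, 2}

        For Each label In BuildLabels(letters, numbers)
            Console.WriteLine(label)
        Next
    End Sub

End Module
```

With the sample data shown, the loop writes A1, A2, B1, and B2, one per line: the first source acts as the outer loop and the second as the inner loop.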
It is important to note that the previous example does not actually do anything. After that line of code executes, query is an IEnumerable(Of T) that contains only the information and instructions that define the query. The query is not executed until you actually iterate through the results. Most queries work in this manner, but it is possible to force a query to execute immediately, for example by calling ToList or ToArray on it.
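The difference between deferred and immediate execution can be sketched as follows. The module name, the CountItems helper, and the sample names are assumptions made for illustration. The source list is changed after both queries are defined, so the two results differ.

```vb
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DeferredExecutionSketch

    ' Hypothetical helper: returns the item count seen by a deferred
    ' query (index 0) and by a query forced to run immediately (index 1).
    Public Function CountItems() As Integer()
        Dim names As New List(Of String)(New String() {"Ann", "Bob"})

        ' Defining the query does not run it.
        Dim deferred = From name In names Select name

        ' ToList executes the query right away and stores the results.
        Dim snapshot = (From name In names Select name).ToList()

        ' Change the source after both queries have been defined.
        names.Add("Cid")

        ' The deferred query runs only now, so it sees the added item.
        Return New Integer() {deferred.Count(), snapshot.Count}
    End Function

    Sub Main()
        Dim counts = CountItems()
        Console.WriteLine("Deferred: {0}, snapshot: {1}", counts(0), counts(1))
    End Sub

End Module
```

Here the deferred query reports three items and the snapshot reports two, because only the snapshot was evaluated before the third name was added.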
Like name, the data type for the results (query) is also being inferred. The data type depends on
what is being returned by the actual query. In this case, that would be an IEnumerable(Of String)
since name is a String. When creating queries, you are not required to use type inference. You could
have used the following:
Dim query As IEnumerable(Of String) = From name As String In names Select name
Although that would work, type inference makes the query appear much cleaner and easier to follow. Since the example returns a sequence of values, you execute the query by iterating through it using a For Each...Next loop, as shown here:
For Each name In query
...
Next
If you need to ensure that duplicate data in the source is not part of the results, then you can add
the Distinct clause to the end of your query. Any duplicate items in the source collection will be
skipped when the query is executed. If you did this to the previous example, it would look like this:
Dim query = From name In names Distinct
Both of the previous example queries use what is known as query syntax, which is distinguished
by the use of query clauses (such as From or Distinct). Query syntax is used primarily for appearance
and ease of use. When the code is compiled, however, this syntax is translated to and compiled as
method syntax.
Behind every query operator (clause) is an actual method. The exception to this rule is the From clause, which simply designates the source and the iteration variable, much like the For Each...Next loop shown previously. These methods are extension methods that extend IEnumerable(Of T) and are found in the System.Linq.Enumerable class. The previous example would be compiled as this:
Dim query = names.Distinct
Query syntax is much easier to understand and appears cleaner in code, especially with longer
or more advanced queries. However, with some query operators, method syntax can give you more
fine-tuned control over the operation itself or the results.
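One case where method syntax offers more control is Distinct. The Distinct clause always uses the default equality comparison, whereas the Enumerable.Distinct method has an overload that accepts a comparer. The following is a minimal sketch; the module name, the DistinctIgnoringCase helper, and the sample data are assumptions made for illustration.

```vb
Option Infer On
Imports System
Imports System.Linq

Module MethodSyntaxSketch

    ' Hypothetical helper: removes duplicates without regard to case by
    ' passing a comparer to the Distinct extension method.
    Public Function DistinctIgnoringCase(ByVal names As String()) As String()
        Return names.Distinct(StringComparer.OrdinalIgnoreCase).ToArray()
    End Function

    Sub Main()
        ' Sample data, assumed for illustration only.
        Dim names() As String = {"Ann", "ann", "Bob"}

        ' Query syntax: "Ann" and "ann" are treated as different values.
        Dim query = From name In names Distinct
        Console.WriteLine(query.Count())

        ' Method syntax with a comparer: "ann" is treated as a duplicate.
        Console.WriteLine(DistinctIgnoringCase(names).Length)
    End Sub

End Module
```

With this data, the query-syntax version keeps all three names, while the method-syntax version keeps only two.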
The Code
The following example queries the array of Process objects returned by the Process.GetProcesses function and displays the name of each process on the console:
Imports System
Imports System.Linq
Imports System.Diagnostics

Namespace Apress.VisualBasicRecipes.Chapter06
    Public Class Recipe06_01
        Public Shared Sub Main()
            ' Build the query to return information for all
            ' processes running on the current machine. The
            ' data will be returned as instances of the Process
            ' class.
            Dim procsQuery = From proc In Process.GetProcesses()

            ' Run the query generated earlier and iterate
            ' through the results.
            For Each proc In procsQuery
                Console.WriteLine(proc.ProcessName)
            Next

            ' Wait to continue.
            Console.WriteLine()
            Console.WriteLine("Main method complete. Press Enter.")
            Console.ReadLine()
        End Sub
    End Class
End Namespace
