Tuesday, January 30, 2007

Enterprise Data Access with the Microsoft .NET CF/Mobile database

Microsoft's .NET Compact Framework provides Pocket PC developers with the tools needed to create robust applications, and offers several options for accessing and storing data from your enterprise database. This article will examine the most popular of these options, and is intended to be an overview of the subject for anyone contemplating using Pocket PCs in an enterprise setting.

The .NET framework is a platform which speeds development by giving developers access to a library of tested Classes and Functions, which they can use and re-use in their applications. Besides providing this "pre-built code," it also manages the code the developer writes by handling low-level functions such as memory allocation and disposal. The .NET Compact Framework (.NET CF) is a subset of the .NET framework, and is designed to run on Windows CE devices like the Pocket PC.

Before examining how to access your enterprise database, let's first discuss how data is stored on the device once it's been received. An understanding of storage options may help in choosing an access method, and give an appreciation of how .NET offers an end-to-end solution.

Intermittent Access

The reality of building applications for wireless or mobile devices is that access to your network or to the Internet is often intermittent. You may have some control over your environment and wireless networks in say, a factory or restaurant setting, but more often you are accessing data over the Internet, and then not continuously.

Retrieving and updating data may only occur at the start and end of a shift, in which case wireless functionality is not required. Regardless of your needs, a robust application should be able to store the data locally, use and manipulate it, and send changes (updates, inserts, deletions) back to the data source when a connection is available.

This introduces issues related to moving large amounts of data, and update conflicts, i.e. when several clients (Pocket PC applications) retrieve and modify the same data. In this conflict situation, it becomes a race back to the database, where the slowest user actually wins, and overwrites the faster user! Depending on your business this may not be an issue, but it's something to consider when analyzing your business requirements.

Central to the .NET framework is the idea of being disconnected from the central database and still being data-driven. Its central data object is the Dataset, which is like an in-memory representation of a simple database, complete with tables (called Datatables, containing rows and columns), relations (to create parent-child relationships between Datatables) and constraints (unique keys and foreign keys). The Datatable even has associated events, allowing you to control when the data in columns and rows changes in order to perform validation, enforce business rules, etc.

Mobile device constraints

Pocket PC development is much more constrained by limited memory and processing power than development for desktop computers. Constantly accessing data from a database adds the overhead of connecting, querying and retrieving the data each time. The Dataset allows you to store data in memory, with a powerful system for manipulating it. Changes to rows and columns in a Datatable are tracked, and you can use a DataAdapter object to retrieve the Dataset and, later, to automatically send the changes back to the database from which the data came. Once data is in the Dataset, you can easily bind it to .NET CF controls on the screen like the DataGrid. The point is that Microsoft has built its data model around the Dataset and the Datatable, and this is particularly useful to Pocket PC developers because of limited device resources.

Of course the Dataset is still in-memory data storage, and it should be saved elsewhere, i.e. the database from which it came. Otherwise, if the device does a soft reset, all data is lost. The Dataset is closely linked to standard XML, so you can easily load XML documents into a Dataset, and also save the contents of the Dataset to an XML file. So if you're not storing much data on the device, this may be an option.

SQL Server CE

To store larger amounts of data, as well as to have the features of a powerful Database Management System (DBMS), Microsoft offers SQL Server CE. It is a scaled down version of Microsoft's Enterprise DBMS SQL Server 2000. It supports a subset of the same T-SQL language, so there's little learning curve beyond understanding its limitations. You can use T-SQL to create databases, alter tables and enforce referential integrity, as well as select, join, insert, update, and delete data. It supports auto-incrementing (identity) columns. This alone makes it much more powerful than Pocket Access. It also has multiple column indexes, transactions, password protection of databases, and even 128-bit encryption of your data! This is impressive for a DBMS that only takes up 1.5 MB of space on the device.

What are its limitations? It doesn't support more than 1 connection at a time (not really a problem) and it doesn't support stored procedures, although you can use parameterized queries in your command statements, allowing you to insert values. Microsoft says it can support databases of up to 1 GB in size, but I can't imagine testing that limit on a Pocket PC!

Moving data from enterprise to device

Now that storage has been examined, what are the techniques that .NET provides for getting data from your enterprise into the hands of your mobile workforce? We'll start with the simplest and proceed to the more powerful and complex.



Fig. 1: Four methods of enterprise data access with the Microsoft .NET Compact Framework

Direct Querying

The simplest way to access enterprise data from the device is by directly querying it, providing the data is stored in a Microsoft SQL Server database and you can connect to it through a LAN. .NET CF provides the SqlClient class, which allows you to access data in the same manner as in a Web or desktop application on the network. Updates are immediate-you are directly accessing the source. It's fast, but doesn't give any support for update conflicts, lacks support for other types of databases (Oracle, etc), doesn't provide any protection if the connection fails, and requires your enterprise database to be directly accessible-which it would not be over the Internet. But if your users are working in a controlled atmosphere, i.e. in a warehouse, or are wirelessly connected through a VPN, this may be an option.

XML Web Services

The second option is XML Web Services, and this is in some ways the most flexible approach. You can use different server platforms (i.e. Linux) and different backend databases (mySQL, Oracle, etc), as these details are transparent to the mobile client. Data moves back and forth over the Internet (or Intranet) as XML (text) and can use the SOAP (Simple Object Access Protocol) standard. .NET provides a rich Web Service architecture, and by using Microsoft Visual Studio 2003, a developer can easily generate a proxy class that allows them to access functions on the server as if they were local to the device.

When you use the Microsoft ASP.NET Web Service platform on the server, it becomes possible to pull entire Datasets of information. Web Services also allows the pulling of custom objects, providing they can be transformed into XML (serialized). In this way, you could create a Customer object with a table of related orders as a Dataset on the server, and pull it down to the device, allowing you to further synchronize business logic between your applications.

Web Services provide a layer between your device and the enterprise database (using a Web server) where security and business logic can be controlled. But for moving large amounts of data, Web Services are somewhat slow and they don't allow any built-in means of synchronizing data with the enterprise. Here's where it gets really interesting! While Web Services are powerful, Microsoft provides faster and more powerful tools for getting data to and from the Enterprise.

SQL Server CE to the rescue

SQL Server CE offers two powerful tools for enterprise data access: Remote Data Access (RDA) and Merge Replication. Both give mobile devices access to an Enterprise SQL Server Database over HTTP via an IIS Web Directory (see Fig. 1). SQL Server CE Server Tools must be installed on the Web Server for this to take place. Both share the benefit of data being compressed for faster download (as opposed to the plain text of Web Services), and both can recover from communication failures by restarting the transmission from the last transmitted block within a timeout period.

Remote Data Access

Remote Data Access (RDA) allows you to pull data from a SQL Server into SQL Server CE by submitting a SQL statement to the enterprise database. Changes to the locally stored data can be optionally tracked, allowing you to push the changes back to the server. The push can be designated as batched, whereby no updates are performed if one row fails to be accepted (because of data problems, not because of a conflict with data updates from other users). You can also turn batching off and have individual failed upload rows sent to another table for your application to deal with later.

RDA also allows you to submit SQL statements (such as insert, update and delete) directly to a SQL Server. When you pull data to the device, you must first drop the local table and replace it with the new data (an all-or-nothing approach). With some custom coding and knowledge of your data model though, you could create a solution which only pulls down new or changed data (i.e. constructing a Select statement to filter the data) and merges it into the main table, but RDA wasn't inherently designed to do this. When you need to bring large amounts of data from the enterprise database, and expect it to change frequently, Merge Replication may be a better solution.

Merge Replication

Merge Replication allows multiple SQL Servers, which could be in different physical locations, to work autonomously with their own copies of data and later merge any changes back into one uniform result. It does this by having a central database provide a Publication (a collection of tables which can be filtered to expose only the relevant data), to which other databases become "subscribers." Applying an initial snapshot of the schema and data allows incremental changes to be tracked, resulting in only changed information needing to be subsequently moved across the network during synchronization of the databases. When multiple subscription databases attempt to update the same rows of data, Merge Replication also allows you to apply conflict resolution policies to decide which conflicting data changes will update the main publication. To do this, you can apply default resolvers, or write your own custom ones as stored procedures or custom DLLs.

Merge Replication is extended to mobile clients by allowing SQL Server CE to act as an anonymous subscriber to an enterprise Publication, and the client determines when the synchronization takes place (since only the client knows if it is connected). You can create Publications specific to certain groups or individuals, thus giving them access to only the data they need. You can also pass a value to the publication to further filter only the data needed. After applying the initial snapshot of data to the device, only changes to the master data are brought down on subsequent synchronizations. This is advantageous when working with large data sets that need to be brought down in their entirety if updated using RDA. Of course, you should always try to limit the amount of data downloaded to the device to only that which is necessary for the user. For example, you could set up multiple Publications showing customer data for differing regions, allowing salespeople to download only the data relevant to them. This takes strain off processing and searching on the device, as well as the obvious time/bandwidth issues in moving large amounts of data.

RDA is a powerful and light-weight solution for most situations. But it requires that the working data be downloaded in its entirety during the pull operation, and it doesn't inherently provide for conflict resolution when multiple clients update the same records at the same time. Often though, clients require read-only data, or update data that is relevant only to them and that does not conflict with other users. Merge Replication introduces overhead on the server, and to a lesser extent on the Pocket PC. It can sometimes complicate what could be a simple RDA solution, but it's an elegant solution when the complexity of the problem requires it.

Choose the best approach for your needs

.NET offers several solutions to help you move data between the enterprise and mobile users. Each one has its strengths, and often your solution will employ more than one method. You may need to download large amounts of customer data to the device, thus requiring the speed of RDA. But if you don't need to modify all that data but only, for instance, create new orders for customers, a simple upload using Web Services or even a direct link to a SQL sever on a LAN would work.

You may not even be concerned about wireless access: your delivery drivers might receive orders at the start of the shift using ActiveSync at the office, then carry the data and upload changes to the database at the end of the day. Or you may need your application to be constantly checking for a connection (in the background) and downloading/updating enterprise data. Whatever the challenge, the .NET Compact Framework offers a powerful and flexible platform for solving your mobile business needs.

No comments: