HTML 5 – The Offline Challenge

December 19, 2013 3:54 pm

I. Introduction

Some time ago our team had to add a new feature to one of our web projects: the users asked to be able to use the application in “offline” mode. It was a great opportunity to dive into HTML 5, and more specifically into its “offline” part. The new features of HTML5 are pretty interesting but also “tricky”, which is why I would like to share some of the challenges we met.

First, I would like to say a few words about the technology on which the project is built. It is an ASP.NET application that stores its data in a SQL Server database. It is not a public application, and only authorized users can access it. The system lets the users go through their business process by entering information in structured web forms, and at the end it lets them generate a Word document, which is their final product. The web application is heavily client-oriented and makes extensive use of jQuery and AJAX. The AJAX calls are handled by WCF services. Frankly, this architecture made the implementation of the “offline mode” pretty straightforward.

The requirement was to allow the users to view and modify their data while offline, so we had to make the main functionality of the application available in offline mode. The various administrative functions and the generation of the output document are not included in the offline mode.

Note that HTML 5 only adds the technical capabilities to load a particular web page while offline and to store some data locally in the browser. It is up to you to decide what your application will do in “offline mode” and how exactly that will be implemented. It is not some kind of magic that makes a web application designed to work online keep working when there is no Internet connectivity. You have to decide what information will be available in “offline mode”, when and how that information will be stored locally in the browser, what the user will be able to do with that data, and finally how the data will be synchronized back into the database.

The “offline scenario” of our project is:
- While working online and browsing the main entities in the system (we call them “workshops”), automatically download and store the data for each workshop in the browser’s local storage;
- Catch when the Internet connection is gone and provide the user with a link to continue working in “offline mode”;
- On the login screen, add a similar connectivity check and alert the user if there is no Internet connection. In that case, instead of showing the standard login prompt, show them a link to go “offline”;
- In “offline mode” the system lists all workshops that have already been downloaded locally in the browser. The user must have been online and have downloaded some data locally before being able to work with that data in “offline mode”;
- Flag any modifications to the offline data. When the user logs into the online system, the system checks whether there are any data modifications in the browser’s local storage. If the user has made modifications in “offline mode”, they are taken to a small wizard that asks them to either commit the offline modifications or ignore them.
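
One possible way to catch connectivity changes is sketched below. It is only an illustration and not the project's actual code: it assumes that the browser's navigator.onLine flag and the “online” and “offline” window events are a good enough signal, and the names watchConnectivity and showOfflineLink are made up for the example.

```javascript
// Hypothetical helper: reports connectivity changes to a callback.
// `target` is normally `window` and `nav` is normally `navigator`.
function watchConnectivity(target, nav, onChange) {
  function notify() {
    onChange(nav.onLine);
  }
  target.addEventListener('online', notify, false);
  target.addEventListener('offline', notify, false);
  notify(); // report the initial state right away
}

// Browser usage (showOfflineLink is an assumed UI function):
// watchConnectivity(window, navigator, function (isOnline) {
//   if (!isOnline) { showOfflineLink(); }
// });
```

Keep in mind that navigator.onLine only says whether the device has a network connection at all, so a failed Ajax call is still worth treating as a sign that the server cannot be reached.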

II. Application Cache

The web pages are normally loaded from the web server when an Internet connection is available. So, our first challenge was to get them to show in the user’s browser even when there is no connectivity. To make this happen we used the new HTML 5 feature called the “cache manifest”. The cache manifest file is a simple text file stored on the web server. Note that the web server should serve it with the “text/cache-manifest” MIME type; otherwise some browsers ignore it. The file is a list of all resources that will be required in “offline mode”. To instruct the browser to process the cache manifest file, use the new manifest attribute on the <html> element:

...

Almost all modern browsers understand this, and when they see a manifest file they begin to download and cache all the resources listed in it. The next time the user navigates to the target web address, the browser checks whether the page has already been cached. If it has, the browser displays the cached web page. This is how a web page can be loaded even when the Internet connection is down.

The cache manifest file must list all resources required to load a page in the “offline mode” of your web application. This means including all HTML pages, images, JavaScript and CSS files, etc. A sample cache manifest file could look as follows:

CACHE MANIFEST
#rev 10

CACHE:
login.asp
PWStyle.css
/Roadmap/Styles/JQuery/smoothness/jquery-ui-1.9.2.custom.min.css
/Roadmap/Styles/Site.css
/Roadmap/Styles/jqtree.css
/Roadmap/jqTree/jqtree.css

Each cache manifest file begins with the “CACHE MANIFEST” text. After that there can be a few sections: NETWORK, CACHE and FALLBACK. The NETWORK section lists the resources that should never be cached. They won’t be available in offline mode, and their content is always downloaded from the server. The CACHE section explicitly lists the resources that should be cached and stored in the browser so that they are available even when offline. The FALLBACK section defines what should happen if a particular resource is not available online and hasn’t been cached successfully: it names the resource that should be loaded instead. For example, it can be helpful to specify that “/” should be replaced by “offline.html”. In that case any web page that hasn’t been cached locally for some reason will display the “offline.html” page instead of the browser’s standard error message.
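
To show how the three sections fit together, here is a small sketch that assembles the text of a manifest. The buildManifest helper is hypothetical and the “/Roadmap/Services/” path is an assumption made for the illustration; in practice the manifest is usually just a static text file.

```javascript
// Hypothetical helper that assembles the text of a cache manifest.
function buildManifest(options) {
  var lines = ['CACHE MANIFEST', '#rev ' + options.revision, ''];
  lines.push('CACHE:');
  lines = lines.concat(options.cache);
  lines.push('', 'NETWORK:');
  lines = lines.concat(options.network);
  lines.push('', 'FALLBACK:');
  // Each fallback entry maps a URL prefix to the page shown instead.
  options.fallback.forEach(function (entry) {
    lines.push(entry.prefix + ' ' + entry.page);
  });
  return lines.join('\n') + '\n';
}

var manifest = buildManifest({
  revision: 11,
  cache: ['login.asp', 'PWStyle.css'],
  network: ['/Roadmap/Services/'], // assumed path, for illustration only
  fallback: [{ prefix: '/', page: '/offline.html' }]
});
console.log(manifest);
```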

It is important to know that once the browser has cached the offline resources, it won’t update them until the cache manifest file itself changes. Most often the list of files stays the same and only the content of some existing resources is updated. For example, in our application we usually have to update some of the existing JS files. In that case, to tell the browsers to re-download the updated resources and refresh the local cache, we just change the “version” of the cache manifest file. It is just a number stored in a single comment line:

#rev 10

The browser checks the manifest file each time the web page is loaded, and when the manifest file has changed it starts updating the resources. Firefox was an exception, however: it was caching even the manifest file itself. To prevent this we had to configure the web server to send the “expires: 0” HTTP header when serving the manifest file. This instructs the browser not to cache the manifest file.

Another interesting point is the actual process of updating the application cache. We chose to reference the manifest file from the login (landing) page of our application. That way, each time the user is about to log into the system, the browser checks for a new version of the manifest file and, if there is one, downloads the latest version of the offline part of the system. Downloading all the offline resources can take some time, depending on their size and the network speed. In our case the entire offline application consists of about 100 web resources, and downloading all those files over the Internet takes about 20 seconds. Fortunately, browsers support a few events that indicate the state of the application cache update. We used those events to notify the user while the browser is updating the cache and to report any errors during that process. The window.applicationCache object supports events like “cached”, “checking”, “downloading”, “error”, etc. With some JS code like the following you can intercept these events and add some indication on the page:

var cache = window.applicationCache;
// Route every cache event of interest to the same handler.
cache.addEventListener('cached', logCacheEvent, false);
cache.addEventListener('checking', logCacheEvent, false);
cache.addEventListener('downloading', logCacheEvent, false);
cache.addEventListener('error', logCacheEvent, false);
function logCacheEvent(e) {
...
}

Adding code like this to the page also helps to debug the cache update process.
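
A slightly fuller version of the same idea could look like the sketch below. The watchAppCache helper and the message texts are made up for the illustration and are not the code from our project; the event names and the swapCache() method belong to the application cache API.

```javascript
// Hypothetical helper: wires all application cache events to one reporter.
// `cache` is normally window.applicationCache.
function watchAppCache(cache, report) {
  var messages = {
    checking: 'Checking for a new version...',
    noupdate: 'Offline files are up to date.',
    downloading: 'Downloading offline files...',
    cached: 'Offline files are ready.',
    updateready: 'A new version has been downloaded.',
    obsolete: 'Offline mode is no longer available.',
    error: 'Updating the offline files failed.'
  };
  Object.keys(messages).forEach(function (type) {
    cache.addEventListener(type, function () {
      if (type === 'updateready') {
        cache.swapCache(); // switch to the freshly downloaded cache
      }
      report(type, messages[type]);
    }, false);
  });
}

// Browser usage (statusElement is an assumed DOM element):
// watchAppCache(window.applicationCache, function (type, text) {
//   statusElement.textContent = text;
// });
```

Note that even after swapCache() the page that is already open keeps using the resources it loaded earlier, so the new version becomes visible only after a reload.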

III. Local Storage

The other part of the offline application architecture is the data and its storage in offline mode. HTML5 and modern browsers bring the “local storage” feature, which allows developers to store data locally in the browser through the localStorage JavaScript object. The concept is similar to the well-known cookies, but local storage allows much more data to be stored per website. Most browsers allow storing up to 5 MB per website, which should be enough in most cases. The object stores values as key-value pairs:

localStorage["mykey"] = "myval";
alert(localStorage["mykey"]);

It is important to note that data in localStorage is stored and retrieved as string values. Each developer must decide how exactly to structure the application-specific data and store it in localStorage. One approach is to split the data into individual values and construct the keys dynamically, similar to how data is held in an HTML form where each input has a unique ID and a value. We chose another way, where a complete JS object is serialized to a JSON string and stored under a single localStorage key. That way we avoid dealing with a complex key structure, which saved us a lot of work. In our project each main entity (or “workshop”) is represented as a single JavaScript object. To turn such an object into a JSON string we use JSON.stringify():

localStorage["Workshop_" + workshopId] = JSON.stringify(workshopObj);
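
One thing to keep in mind with this approach is that only plain data survives the round trip through JSON. The following sketch shows what happens to a date and to a function; the object and its property names are made up for the illustration.

```javascript
// A made-up workshop object with a date and a function on it.
var workshop = {
  Id: 7,
  Name: 'Kick-off',
  Created: new Date(0),
  notify: function () {}
};

// Serialize and parse it back, as happens when going through localStorage.
var restored = JSON.parse(JSON.stringify(workshop));

console.log(typeof restored.Created); // "string"
console.log(restored.Created);        // "1970-01-01T00:00:00.000Z"
console.log('notify' in restored);    // false
```

Dates come back as ISO strings and functions are dropped, so the stored objects are best kept as simple data containers.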

Constructing the JavaScript object can be done in various ways depending on the particular project technology. In our case we build the object directly on the server side in C# and return it to JavaScript by calling WCF services from jQuery. This lets us easily transfer our custom C# object to the client and store it in the browser’s local storage.

In our specific project, the entire “workshop” entity gets downloaded and stored in local storage when loading the initial screen for each “workshop”. Next, on each page where some fields get updated we also update the particular data in local storage. This is done by loading the workshop object from localStorage, updating the specific properties and storing it back into the browser’s localStorage:

var wJSON = localStorage["Workshop_" + workshopId];
var w = JSON.parse(wJSON);
w.Name = newName;
localStorage["Workshop_" + workshopId] = JSON.stringify(w);
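
The same load, update and store cycle can be wrapped in a few small helpers that also cover two edge cases: a workshop that has not been downloaded yet and a write that fails, for example because the storage quota is exceeded. This is a sketch under those assumptions, and the helper names are hypothetical.

```javascript
// `storage` is normally window.localStorage.
function loadWorkshop(storage, workshopId) {
  var json = storage.getItem('Workshop_' + workshopId);
  return json === null ? null : JSON.parse(json);
}

function saveWorkshop(storage, workshopId, workshop) {
  try {
    storage.setItem('Workshop_' + workshopId, JSON.stringify(workshop));
    return true;
  } catch (e) {
    return false; // most likely the storage quota was exceeded
  }
}

// Applies the given property changes; returns false if nothing was stored.
function updateWorkshop(storage, workshopId, changes) {
  var workshop = loadWorkshop(storage, workshopId);
  if (workshop === null) {
    return false; // this workshop was never downloaded
  }
  Object.keys(changes).forEach(function (name) {
    workshop[name] = changes[name];
  });
  return saveWorkshop(storage, workshopId, workshop);
}

// Browser usage:
// updateWorkshop(localStorage, workshopId, { Name: newName });
```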

The offline module is very similar to the online part of the application. The UI is almost the same and the code follows a very similar flow. The only difference is that instead of making various Ajax calls to the server to load and store pieces of data, the offline code just uses localStorage to load and store the data. The main reason this was so straightforward is that the online web application was written with intensive use of JavaScript, and all the data was already transferred to and from the server by Ajax calls. In the offline version of the client script we replaced the Ajax calls with localStorage manipulation.
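
One way to picture this replacement is a pair of “stores” with the same interface, one backed by localStorage and one backed by Ajax. The sketch below is an illustration only: the function names, the Id property and the service URLs are assumptions, not the actual code or endpoints of our project.

```javascript
// Offline store: reads and writes go to local storage.
// `storage` is normally window.localStorage.
function createOfflineStore(storage) {
  return {
    load: function (id, done) {
      var json = storage.getItem('Workshop_' + id);
      done(json === null ? null : JSON.parse(json));
    },
    save: function (workshop, done) {
      storage.setItem('Workshop_' + workshop.Id, JSON.stringify(workshop));
      done(true);
    }
  };
}

// Online store: same interface, backed by Ajax calls.
// `ajax` is normally jQuery's $.ajax; the URLs are placeholders.
function createOnlineStore(ajax) {
  return {
    load: function (id, done) {
      ajax({
        url: '/Services/Workshop.svc/Get',
        data: { id: id },
        dataType: 'json',
        success: done
      });
    },
    save: function (workshop, done) {
      ajax({
        url: '/Services/Workshop.svc/Save',
        type: 'POST',
        contentType: 'application/json',
        data: JSON.stringify(workshop),
        success: function () { done(true); }
      });
    }
  };
}
```

With such a split the page code can call store.load() and store.save() without knowing which mode it is running in.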

IV. Sync

The next step, which completes the offline mode flow, is to synchronize the offline changes. For the needs of our application, we decided to keep this process as simple as possible. When the user modifies some data in offline mode, we raise a flag for that entity. Later, when the user chooses to work online, the application checks the entities stored in localStorage, and if at least one entity has been modified in offline mode we redirect the user to the “sync” page. So, the user is not allowed to work online while there are pending offline changes in their browser. On the sync page the user can either commit or ignore the offline changes. After syncing the changes we clear localStorage and redirect the user to online mode, where a fresh copy of the data will be downloaded. To clear all the localStorage data you can use the following line of code:

localStorage.clear();
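
The check for pending offline changes can be sketched as follows. The sketch assumes that each stored workshop carries a boolean ModifiedOffline flag (a made-up property name) and that all workshops use the “Workshop_” key prefix shown earlier. It also includes a variant of the cleanup that removes only the workshop entries, which may be useful if the same site keeps other data in localStorage.

```javascript
// Returns the stored workshops that were modified in offline mode.
// `storage` is normally window.localStorage.
function findModifiedWorkshops(storage) {
  var modified = [];
  for (var i = 0; i < storage.length; i++) {
    var key = storage.key(i);
    if (key.indexOf('Workshop_') !== 0) {
      continue; // not a workshop entry
    }
    var workshop = JSON.parse(storage.getItem(key));
    if (workshop.ModifiedOffline) {
      modified.push(workshop);
    }
  }
  return modified;
}

// Removes only the workshop entries and leaves other keys untouched.
function clearWorkshops(storage) {
  var keys = [];
  // Collect the keys first, because removing items shifts the indexes.
  for (var i = 0; i < storage.length; i++) {
    var key = storage.key(i);
    if (key.indexOf('Workshop_') === 0) {
      keys.push(key);
    }
  }
  keys.forEach(function (key) {
    storage.removeItem(key);
  });
}
```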

V. Final thoughts

At the end of this post, I would like to note a few things that were specific to our application flow but could be useful in many similar web applications that take advantage of the new offline capabilities of HTML5.

In this case, the application is for private use and users are authenticated before being able to work with the system. However, the offline part of the application works entirely in the client’s browser without a connection to the server. This means that there is no way to authenticate the user against the server. So, the offline part of the application is anonymous, and everyone who has access to the client device and web browser can access the data stored locally. Users should be aware of this and should not use the system from public devices where sensitive data could be stored locally. That is why we implemented an option to turn off the local data storage per user session.

Offline data is stored locally in the browser, and if the user decides to clear the browser’s cache, the offline data will be cleared as well. Users should be aware of this and should take care of any offline data before clearing their browser cache for some other reason (e.g. removing the history and downloaded data of other third-party websites).

If the live system is regularly updated, the data structure may change. In that case it is likely that some user has downloaded an old version of the data and tries to sync it into the latest online system. We had such a case in the past, and to catch similar situations we added a version number that is stored in the locally downloaded data. When the user tries to submit that data, the system compares its version with the current system version. If the versions differ, a warning message is shown before the old-structured data is allowed to be uploaded into the system.
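
A minimal sketch of such a version check is shown below. The DataVersion property, the constant and the function names are hypothetical and only illustrate the idea.

```javascript
// Version of the data structure used by the current online system (assumed value).
var CURRENT_DATA_VERSION = 3;

// Called when a workshop is downloaded, before it is stored locally.
function stampVersion(workshop) {
  workshop.DataVersion = CURRENT_DATA_VERSION;
  return workshop;
}

// Called before offline data is submitted back to the system.
function isOutdated(workshop) {
  return workshop.DataVersion !== CURRENT_DATA_VERSION;
}
```

A workshop stored before the version stamp was introduced has no DataVersion at all, and isOutdated() reports it as outdated too.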

Also, a particular entity may get updated online while a user is working on it offline. In this situation the user who submits the offline changes later could overwrite recent changes made in the online system by other users. To avoid this we added timestamps that record when the data was downloaded and when it was last modified in the online system. If they differ, the system shows an alert asking the user whether they prefer to overwrite the recent changes or keep them. This depends on the particular application implementation, but it is something you should consider when building offline web applications.
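
The timestamp comparison can be as simple as the following sketch. It assumes that the download time is stored with the workshop in a DownloadedAt property and that the server reports its last modification time at sync time, both as ISO 8601 strings; the names are made up for the illustration.

```javascript
// Returns true when the online copy changed after the local copy was downloaded.
function hasServerChangedSince(workshop, serverLastModified) {
  var downloadedAt = new Date(workshop.DownloadedAt).getTime();
  var modifiedAt = new Date(serverLastModified).getTime();
  return modifiedAt > downloadedAt;
}

// Example: downloaded on December 1st, changed online on December 2nd.
var conflict = hasServerChangedSince(
  { Id: 7, DownloadedAt: '2013-12-01T10:00:00Z' },
  '2013-12-02T09:00:00Z'
);
console.log(conflict); // true
```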

I hope this information was helpful and that you enjoyed reading it!
