How to Display a Huge Amount of Data in a Web Grid Quickly
A Little Background
A client asked me for a way to display a huge amount of data (say, 10,000 records) in a web grid quickly. They had implemented a similar solution using ASP.NET AJAX and were complaining about its performance. I had a couple of untested ideas in mind for solving this kind of problem, so I spent some time playing around with them.
On Pulling Data & Bottleneck
To avoid pulling data from an actual web service or database tier, I created a quick in-page random data generator that spits out x JSON objects in an array. After some testing, I found that the bottleneck has nothing to do with data generation; it is mostly the browser's rendering process.
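A minimal sketch of such a generator might look like the following. The function name and the field names (id, name, amount) are made up for illustration; the actual prototype's record shape may differ.

```javascript
// Hypothetical in-page generator: returns `count` plain objects in an array.
// Field names are illustrative only, not taken from the prototype.
function generateRows(count) {
  const rows = [];
  for (let i = 0; i < count; i++) {
    rows.push({
      id: i + 1,
      name: "Item " + Math.random().toString(36).slice(2, 8),
      amount: Math.round(Math.random() * 100000) / 100
    });
  }
  return rows;
}

const data = generateRows(10000);
// Serializing gives roughly the payload a real web service might return.
const json = JSON.stringify(data);
```

Timing this call separately from the rendering step is an easy way to confirm where the time actually goes.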
Prototype 1 (The Not So Fast and Furious, but Still Hella Damn Fast):
I don’t particularly like this solution since it relies on the browser's (IE) quirks mode, but it sort of works.
To Block Or Not To Block
In quirks mode, I found that I can push tons of non-block (inline) elements such as SPAN into the browser rendering pipeline very quickly. Changing them to block elements such as DIV slows the rendering down tremendously. I believe this is caused by the browser layout engine, which re-lays out the web page each time a block element is inserted into the DOM.
One-Two Punch Combo
To further relieve some pressure on the processor during screen rendering, I use setTimeout to batch the DOM insertions. Roughly, for 10,000 total rows, the rendering happens in 10 passes spread over a period of time, and each pass inserts about 1,000 rows of HTML elements into the DOM at once. I found that IE’s insertAdjacentHTML method was faster than jQuery's append method, so I branched the code by browser.
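The batching idea can be sketched roughly as below. This is not the prototype's code: the function names, the row fields (id, name), and the injectable `schedule` parameter are assumptions made so the logic can be shown on its own. The actual DOM call is passed in as `insertHtml`, so the browser-specific branch lives in one place.

```javascript
// Escape text so row values cannot break the generated markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build one chunk of inline (SPAN) markup for rows in [start, end).
function buildChunkHtml(rows, start, end) {
  const parts = [];
  for (let i = start; i < end; i++) {
    parts.push('<span class="row">' + escapeHtml(rows[i].id) + " " +
               escapeHtml(rows[i].name) + "</span><br>");
  }
  return parts.join("");
}

// Insert rows in batches, yielding to the browser between batches.
// `schedule` defaults to setTimeout; it is injectable only for illustration.
function renderInBatches(rows, batchSize, insertHtml, schedule) {
  schedule = schedule || function (fn) { setTimeout(fn, 0); };
  let start = 0;
  function step() {
    const end = Math.min(start + batchSize, rows.length);
    insertHtml(buildChunkHtml(rows, start, end));
    start = end;
    if (start < rows.length) schedule(step);
  }
  if (rows.length > 0) schedule(step);
}

// Browser usage sketch (hypothetical container id "grid"):
// const grid = document.getElementById("grid");
// renderInBatches(data, 1000, function (html) {
//   if (grid.insertAdjacentHTML) grid.insertAdjacentHTML("beforeend", html);
//   else $(grid).append(html); // jQuery fallback
// });
```

With 10,000 rows and a batch size of 1,000, `insertHtml` is called 10 times, which matches the pass count described above.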
Previously, the client had a lot of controls in each grid row, which slowed the browser rendering even further. To mitigate this problem, I made the initial display read-only and created a single row template containing the controls needed for editing. When needed (i.e., when the user clicks on a particular row), I overlay this control template on top of that read-only row and push the underlying data into the template for editing. Once editing is done, I push the data back to the read-only row behind the template.
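Stripped of the DOM positioning, the single-editor idea boils down to something like this sketch. All names here are hypothetical; `renderRow` stands in for whatever code redraws a read-only row, and the overlay placement itself (for example, moving the template to the clicked row's position) is left out.

```javascript
// One shared editor for the whole grid instead of controls in every row.
// `rows` is the backing data; `renderRow(index, row)` redraws one read-only row.
function createRowEditor(rows, renderRow) {
  let editingIndex = -1;
  let draft = null;

  return {
    // Copy the row's data into the editor; the read-only row is untouched.
    begin: function (index) {
      editingIndex = index;
      draft = Object.assign({}, rows[index]);
      return draft;
    },
    // Push the edited values back and redraw only the affected row.
    commit: function () {
      if (editingIndex < 0) return;
      rows[editingIndex] = draft;
      renderRow(editingIndex, draft);
      editingIndex = -1;
      draft = null;
    },
    // Drop the draft without touching the underlying row.
    cancel: function () {
      editingIndex = -1;
      draft = null;
    }
  };
}
```

In the browser, a click handler on the grid would call `begin` with the clicked row's index and bind the returned draft to the template's controls.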
The Good Stuff
Using this technique, we have a pretty lightweight grid that can render a huge amount of data to the screen quickly.
The Bad Stuff
The disadvantage of this method is that when you are dealing with rows lower down in the grid (e.g., record #10,000), there seems to be a noticeable slowdown when navigating to that particular record. Also, it relies on quirks mode, which is not guaranteed to stay compatible in the future.
Prototype 2 (The Viewport):
I don’t know if most of us still remember the old days of programming console-based applications such as DataFlex, where screen real estate was basically limited. Sometimes you only had 80 characters by 25 rows per screen, yet even in the old days you still needed to display more data than there was screen space for. So, what do you do? The answer to this question is what I like to call a viewport. A viewport is sort of like paging, but the nice part is that you can scroll it one record or multiple records at a time.
The way it works is that when it’s time to scroll, you literally re-render x records to the screen to simulate the scrolling. So, if your viewport can only display 15 rows at a time, you just re-render 15 records starting from record x (depending on whether you go down or up, etc.).
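As a rough sketch of that logic (the names are hypothetical, and the actual drawing of rows is left to the caller), a viewport only needs to track the index of its first visible record and clamp it when scrolling:

```javascript
// A viewport over `records` that exposes `pageSize` rows at a time.
function createViewport(records, pageSize) {
  let first = 0; // index of the first visible record
  const maxFirst = Math.max(0, records.length - pageSize);

  // The slice of records that should currently be on screen.
  function visible() {
    return records.slice(first, first + pageSize);
  }

  // Move by `delta` records (negative scrolls up), clamped to the data range.
  function scrollBy(delta) {
    first = Math.min(maxFirst, Math.max(0, first + delta));
    return visible();
  }

  return {
    visible: visible,
    scrollBy: scrollBy,
    position: function () { return first; }
  };
}

const sampleRecords = [];
for (let i = 0; i < 10000; i++) sampleRecords.push("Record " + (i + 1));

const viewport = createViewport(sampleRecords, 15);
const firstPage = viewport.visible();   // records 1 to 15
const afterOneStep = viewport.scrollBy(1); // records 2 to 16
```

Each scroll re-renders only `pageSize` rows, no matter how many records sit behind the viewport.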
The Good Stuff
This method is quite fast since you are always showing a limited number of items at any given point in time.
The Bad Stuff
The disadvantage of this method is that you need to create a keyboard handler, or some other mechanism, for scrolling the items in the viewport.
Some more testing still needs to be done to lessen the lag when pulling data from a real data source. In my opinion, if you can pull all the (necessary) data across in the first try, or through some kind of asynchronous data batching mechanism, it might net you more of a performance gain (or at least the appearance of one, which most of the time is what the user of the application perceives as performance). But then again, this is a design question that is better left to the solution architect.
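The key handling itself can stay small. The sketch below is an assumption, not the prototype's code: it maps the standard `KeyboardEvent.key` names to a scroll distance and clamps the resulting position. Older browsers such as IE8 only expose numeric `keyCode` values, so a real handler for them would need a different lookup.

```javascript
// Translate a key name into how many records to scroll (0 = ignore the key).
function scrollDeltaForKey(key, pageSize) {
  switch (key) {
    case "ArrowDown": return 1;
    case "ArrowUp":   return -1;
    case "PageDown":  return pageSize;
    case "PageUp":    return -pageSize;
    case "Home":      return -Infinity; // clamped to the first record
    case "End":       return Infinity;  // clamped to the last full page
    default:          return 0;
  }
}

// Compute the new first-visible index, clamped to the valid range.
function nextFirstIndex(current, delta, totalRecords, pageSize) {
  const maxFirst = Math.max(0, totalRecords - pageSize);
  return Math.min(maxFirst, Math.max(0, current + delta));
}

// Browser usage sketch (renderFrom is a hypothetical redraw function):
// let first = 0;
// document.addEventListener("keydown", function (event) {
//   const delta = scrollDeltaForKey(event.key, 15);
//   if (delta === 0) return;
//   event.preventDefault();
//   first = nextFirstIndex(first, delta, data.length, 15);
//   renderFrom(first);
// });
```

A mouse wheel or a fake scrollbar could feed the same `nextFirstIndex` function with its own delta.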
Anyone got better ideas?
Included in this post are the prototype examples for the solutions.
Both prototypes were tested only on IE8 and FF3 (on Windows 7 Beta). Mind you, the code is a very rough prototype and might cause blindness when read. You’ve been warned ;-).