GRIDSCALE FAQ

Functionality

System Requirements

Functionality (Questions and Answers)

1. How is GRIDSCALE different from database clustering solutions for High Availability (HA)?

Database clusters address availability concerns with a traditional active/passive arrangement: the designated primary database server is available and active, while the other database server (the standby) is idle and becomes active only if the primary database server fails. With the GRIDSCALE Database Load Balancer, all of the database servers in the cluster are in an active/active arrangement: they are available and active all the time. This enables GRIDSCALE to outperform traditional HA solutions while providing continuous availability.

2. How is GRIDSCALE different from database partitioning (e.g. DB2 DPF)?

Partitioning solutions like IBM DB2 DPF (Database Partitioning Feature) distribute a database's content, subdividing it between multiple data servers. Scalability improves only if the data can be partitioned in the first place (e.g. an HR database might be partitioned by department, such as accounting, administration, etc.) so that queries can work in parallel on the different partitions. GRIDSCALE is designed to handle any data without assuming whether it can or cannot be partitioned. Each database server in a GRIDSCALE cluster maintains an identical, consistent, and complete copy of the database; that is, the database is not partitioned across the nodes in the cluster. Scalability results because queries can be load balanced against the multiple copies of the data.
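
The difference in routing can be illustrated with a minimal Python sketch. It is not GRIDSCALE or DB2 code: the server names, the hash-based partitioning scheme, and both function names are assumptions made for illustration only.

```python
import zlib

# Hypothetical server names, for illustration only.
SERVERS = ["db1", "db2", "db3"]

def partitioned_candidates(department):
    """Partitioned layout: exactly one server owns a department's rows.

    Hash partitioning is assumed here; real systems may use other schemes.
    """
    owner = SERVERS[zlib.crc32(department.encode()) % len(SERVERS)]
    return [owner]

def replicated_candidates(department):
    """Fully replicated layout: every server holds every row,
    so any server can answer the query."""
    return list(SERVERS)

print(partitioned_candidates("accounting"))  # a single owning server
print(replicated_candidates("accounting"))   # all servers are candidates
```

In the partitioned layout a query for one department can only be served by that department's owner, whereas in the replicated layout a load balancer is free to choose among all servers.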

3. How is GRIDSCALE different from IBM DB2 ICE?

IBM DB2 ICE (Integrated Cluster Environment) is a packaged and pre-tested solution from IBM that leverages the DB2 DPF capabilities. See the question on DPF in this FAQ for information on IBM DB2 DPF.

4. How is GRIDSCALE different from database replication solutions (e.g. DB2 HADR)?

Database replication solutions (e.g. log shipping) like IBM DB2 HADR ensure that a standby database server is sufficiently up to date so that it can take over data processing tasks with minimal loss of data if the primary database server fails. Clients cannot use the standby database server while the primary is functioning - database performance is restricted to how well the single primary database server can process requests. In a GRIDSCALE cluster, all the database servers are active at all times. GRIDSCALE replicates all the data to every single database server in the cluster, and further ensures that the data will be identical on each server. This allows clients to access any database server in the cluster without the risk of accessing stale data.

5. How is GRIDSCALE different from Q Replication?

IBM DB2 Q Replication utilizes MQ Series to ship DB2 transaction log statements to one or more DB2 servers asynchronously. This allows clients to use the non-primary database servers for read requests. However, Q Replication does not guarantee a consistent replicated point-in-time image; hence, reads against the non-primary database can potentially return stale data. With GRIDSCALE, reads against any database server in the cluster are guaranteed to return the most recent and correct data.

6. Does GRIDSCALE support IBM's DB2 DPF? What about BCU? What about Data Warehouse Edition?

GRIDSCALE can be used with the Database Partitioning Feature (DPF, aka Balanced Configuration Unit (BCU), aka Data Warehouse Edition) to provide continuous availability for a DPF cluster.

7. How does GRIDSCALE provide scalability?

By ensuring that each database server in the cluster is consistent and up-to-date, GRIDSCALE can load balance read operations across the database servers in the cluster. Having multiple, consistent sources of data allows applications to retrieve data with the shortest wait time.

8. How does GRIDSCALE ensure that the database servers are 100% consistent and up-to-date?

Write requests (inserts, updates, and deletes) are broadcast to each database server in the cluster, thereby ensuring that each server has the same data. However, the first database server to process the write provides the response to the application. The other servers also process the write request, but the application does not wait for them, unlike the case with a two-phase commit. Read requests are sent only to the database server that is best able to serve up consistent data in the shortest time.
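
The write path described above can be sketched in Python as follows. This is a simplified simulation under stated assumptions, not GRIDSCALE's implementation: the Replica class, the broadcast_write function, and the simulated delays are all hypothetical, and the sketch does not model how reads are kept consistent while slower servers catch up.

```python
import concurrent.futures as cf
import time

class Replica:
    """Hypothetical stand-in for one database server in the cluster."""

    def __init__(self, name, write_delay):
        self.name = name
        self.write_delay = write_delay  # simulated processing time, seconds
        self.rows = {}

    def apply_write(self, key, value):
        time.sleep(self.write_delay)
        self.rows[key] = value
        return self.name

def broadcast_write(pool, replicas, key, value):
    """Send the write to every replica and return once the first finishes."""
    futures = [pool.submit(r.apply_write, key, value) for r in replicas]
    done, _pending = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
    first = next(iter(done)).result()
    return first, futures

replicas = [Replica("db1", 0.05), Replica("db2", 0.01), Replica("db3", 0.10)]
with cf.ThreadPoolExecutor(max_workers=len(replicas)) as pool:
    # The application gets its answer here, after one replica has responded.
    first, futures = broadcast_write(pool, replicas, "emp:42", "accounting")
    print("acknowledged by", first)
    # The remaining replicas complete the same write in the background.
    cf.wait(futures)

consistent = all(r.rows.get("emp:42") == "accounting" for r in replicas)
print("all replicas hold the write:", consistent)
```

The point of the sketch is the ordering: the caller is released after the fastest server answers, yet every server ends up applying the same write.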

9. Does GRIDSCALE support Stored Procedures, Views, Triggers, etc.?

Yes, GRIDSCALE supports the following database constructs:

  • Stored Procedures
  • Views
  • Triggers
  • Foreign Keys
  • Materialized Tables

10. Can patches/upgrades be performed on a node in the cluster without affecting the database service?

Yes. Any database server in the cluster can be taken offline for maintenance without affecting database availability, assuming at least one other database server remains online. GRIDSCALE also allows different database versions to coexist in the database cluster, which lets customers perform rolling upgrades without any scheduled downtime.

11. How do I make changes to the database schema using GRIDSCALE?

GRIDSCALE provides a command line interface (xdsql) for replicating SQL statements to the nodes in the cluster. This command line interface can be used for performing DDL (Data Definition Language) tasks.

12. Does GRIDSCALE support Transaction Isolation Levels?

Yes.

13. How does GRIDSCALE handle deadlocks?

GRIDSCALE keeps track of all client queries and monitors transactions that may cause deadlocks. Any query that may cause a deadlock in the database is rejected by GRIDSCALE.
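
One common way to reject requests that would lead to a deadlock is to keep a wait-for graph and refuse any new wait that would close a cycle. The Python sketch below shows that general technique only. Whether GRIDSCALE works this way internally is an assumption, and the WaitForGraph class and its method names are hypothetical.

```python
from collections import defaultdict

class WaitForGraph:
    """Tracks which transaction is waiting on which (hypothetical helper)."""

    def __init__(self):
        self.waits_on = defaultdict(set)  # txn -> txns it is waiting for

    def _reaches(self, start, target):
        """Depth-first search: can we get from start to target?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.waits_on.get(node, ()))
        return False

    def try_wait(self, waiter, holder):
        """Record 'waiter waits on holder' unless that would close a cycle."""
        if self._reaches(holder, waiter):
            return False  # would deadlock, so the request is rejected
        self.waits_on[waiter].add(holder)
        return True

    def finish(self, txn):
        """Remove a completed transaction from the graph."""
        self.waits_on.pop(txn, None)
        for targets in self.waits_on.values():
            targets.discard(txn)

graph = WaitForGraph()
admitted_1 = graph.try_wait("T1", "T2")  # T1 waits on T2
admitted_2 = graph.try_wait("T2", "T3")  # T2 waits on T3
rejected = graph.try_wait("T3", "T1")    # would form T1 -> T2 -> T3 -> T1
graph.finish("T2")                       # T2 completes, chain is broken
admitted_3 = graph.try_wait("T3", "T1")  # no cycle any more
print(admitted_1, admitted_2, rejected, admitted_3)
```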

14. Does GRIDSCALE require its own database login?

No. Database requests from the application are passed through and sent to the database server using the credentials provided by the application.

15. Is GRIDSCALE a single point of failure?

No. Two GRIDSCALE appliances can be clustered and configured in a hot-standby configuration.

16. Is GRIDSCALE a performance bottleneck?

No. Performance testing has shown that performance is dependent on the sizing of the database servers in the cluster and not on GRIDSCALE itself.

17. How many database server nodes can be clustered with GRIDSCALE?

Performance is application dependent. We have built eight (8) node clusters exhibiting up to 85% scalability. Modeling suggests that upwards of thirty (30) nodes can be clustered while still delivering horizontal scalability.
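
As a rough worked example, the 85% figure can be read as scaling efficiency, i.e. cluster throughput divided by (number of nodes x single-server throughput). That reading is an assumption, as is the function name below; under it, the eight-node figure corresponds to roughly 6.8 times the throughput of one server.

```python
def effective_speedup(nodes, efficiency):
    """Throughput relative to one server, assuming
    efficiency = cluster throughput / (nodes * single-server throughput)."""
    return nodes * efficiency

speedup_8 = effective_speedup(8, 0.85)
print(f"8 nodes at 85% efficiency: about {speedup_8:.1f}x one server")
```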

18. Is there a limitation on where the database server nodes can be located geographically?

No. GRIDSCALE uses standard TCP/IP to communicate with all database servers in the cluster. If there is significant latency between GRIDSCALE and a database server, then that database server will take a smaller share of the read load than the other database servers. In the extreme case, the database server would receive no read requests and only have updates sent to it.
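
The effect of latency on read distribution can be sketched in Python. This is an illustration under assumptions rather than GRIDSCALE's actual policy: the inverse-latency weighting, the 200 ms cutoff, the server names, and both function names are invented for the example.

```python
import random

def read_weights(latencies_ms, cutoff_ms=200.0):
    """Weight each server by inverse latency (an assumed policy).

    Servers slower than the assumed cutoff get no reads at all,
    mirroring the extreme case where a server only receives updates.
    """
    raw = {
        name: (1.0 / ms if ms <= cutoff_ms else 0.0)
        for name, ms in latencies_ms.items()
    }
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no server is within the latency cutoff")
    return {name: weight / total for name, weight in raw.items()}

def pick_read_server(weights, rng=random):
    """Choose a server for one read, proportionally to its weight."""
    names = [name for name, weight in weights.items() if weight > 0]
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical round-trip latencies in milliseconds.
latencies = {"local-a": 1.0, "local-b": 1.0, "remote": 50.0, "far": 400.0}
weights = read_weights(latencies)
for name, share in weights.items():
    print(f"{name}: {share:.1%} of reads")
print("next read goes to", pick_read_server(weights))
```

With these numbers the two local servers split almost all of the reads, the remote server takes a small share, and the far server receives none while still being sent every write.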

19. How does GRIDSCALE differ from other competitive solutions?

Some solutions are active/passive (traditional primary/standby) - these provide high availability (HA) but no scalability; GRIDSCALE provides both.

Some solutions are active/active like GRIDSCALE but do not scale very well since they are not completely asynchronous in writing data. GRIDSCALE writes asynchronously to all nodes in a cluster and does true load balancing of reads, providing up to 85% scalability.

Some solutions do not guarantee 100% consistency in the data they return to applications due to the way they sequence concurrent writes and reads. This presents obvious risk to the integrity of user applications.

GRIDSCALE always ensures consistency, in addition to on-demand scalability and non-stop availability.

System Requirements (Questions and Answers)

1. What databases, operating systems and architectures does GRIDSCALE support?

GRIDSCALE currently supports DB2 v8 (and up) on x86 Linux (SUSE LINUX Enterprise Server or Red Hat Enterprise Linux), Microsoft Windows, Sun Solaris 10 and IBM AIX 5 (and above). Future releases will support other databases (e.g. Sybase, MS SQL Server, and Oracle) as well as additional operating systems (e.g. MS Windows).

2. Does GRIDSCALE require any special networking equipment?

No. GRIDSCALE utilizes standard Ethernet networking interfaces. Gigabit Ethernet is recommended.

3. Does GRIDSCALE require any application code changes?

For most applications, only a configuration change is required to use xkoto's client drivers, which transparently connect the application to GRIDSCALE.

4. Does GRIDSCALE only support DB2 UDB Express Edition?

No. GRIDSCALE supports all DB2 UDB editions from Express to Enterprise Server Edition.