Today, Chris Reynolds, head of the Secretary of State's Political Reform Division, sent around a document explaining why their technical staff believes Internet access to the system went down and what they are doing to restore it as quickly as possible. His email also says that his office is available to assist people in the meantime by phone, online contact form, fax, or in-person visit. The main number is (916) 653-6224, the contact form is at http://www.sos.ca.gov/webcontact/general/question.aspx, the fax is (916) 653-5045, and the street address is 1500 11th Street, Sacramento, CA 95814.
Here is the memo:
---------
What happened to CAL-ACCESS?
CAL-ACCESS (the California Automated Lobbying and Campaign Contribution and Expenditure Search System) is a suite of applications developed in 13 different programming languages. CAL-ACCESS runs on a server cluster and associated components that are more than 12 years old, and runs on an uncommon version of the Unix operating system called Tru64.
On November 30, 2011, the disk array controller experienced a physical memory failure that led to the loss of its disk array configuration and the loss of three physical disk drives. The disk array contains a total of 90 disk drives with 15 disk drives installed in each of six drive enclosures. (The array configuration defines which combination of physical disk drives form the logical disk drive that is presented to the operating system.) When CAL-ACCESS was originally architected in 1999, it was common to locate the operating system on the disk array rather than on locally attached disks. This configuration created a single point of failure in the array controller.
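To make the role of the array configuration concrete, here is a minimal sketch of the idea. It is only an illustration: the function names, the grouping of five drives per logical disk, and the naming scheme are assumptions made for this example and do not describe the actual CAL-ACCESS layout. Only the figures of six enclosures with 15 drives each come from the memo.

```python
# Hypothetical model of a disk array configuration.
# Only the enclosure and drive counts come from the memo; everything else is assumed.
ENCLOSURES = 6
DRIVES_PER_ENCLOSURE = 15


def build_physical_drives():
    """Return an (enclosure, slot) identifier for every physical drive."""
    return [(e, s) for e in range(ENCLOSURES) for s in range(DRIVES_PER_ENCLOSURE)]


def build_array_config(drives, group_size):
    """Group physical drives into logical disks of `group_size` members each."""
    return {
        f"logical{i // group_size}": drives[i:i + group_size]
        for i in range(0, len(drives), group_size)
    }


def visible_logical_disks(config):
    """Logical disks the operating system can see; none if the config is lost."""
    return sorted(config) if config else []


drives = build_physical_drives()
config = build_array_config(drives, group_size=5)  # group size is an arbitrary assumption

print(len(drives), "physical drives,", len(config), "logical disks")
print("after losing the configuration:", visible_logical_disks({}))
```

In this toy model the physical drives still exist after the configuration is lost, but nothing maps them to logical disks any more, so the operating system has nothing to mount. That is the sense in which a controller holding the only copy of the mapping is a single point of failure.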
After replacing the failed memory equipment, staff were able to reconfigure a very small portion of the disk array that permitted the server cluster to start. The portion of the disk array that houses the area where the databases reside was not immediately recovered since a more extensive amount of time was needed to remap the entire disk array. To make the system available by Internet again as soon as possible, staff ported the server cluster to use an alternate network-attached storage device and used a backup to restore the data by December 7. The configuration functioned for about 30 hours before it failed again on December 9. Staff tried multiple approaches to recover this configuration throughout Friday, Saturday and Sunday.
On Monday, December 12, staff initiated three concurrent recovery methods to restore services:
1. Porting CAL-ACCESS off of the Tru64 cluster to a modern hardware architecture, which involves modifying the database and reprogramming websites and applications in as many as 13 different coding languages.
2. Virtualizing the Tru64 Unix environment to move off of the aged equipment, which includes building new servers and installing and configuring software that can emulate the DEC Alpha architecture to run on an Intel architecture. Once this environment has been established the Tru64 operating system can be installed and configured to match the old production environment, and the databases and applications can be restored from backup.
3. Rebuilding the original disk array, which is expected to take 10 to 14 days.
Work on the first method started on December 12. The second and third methods require contracted specialists and a state contract approval is expected by December 15.
Then will CAL-ACCESS be permanently fixed?
Created in 1999, CAL-ACCESS is now very old and fragile, and few people in the United States are familiar with the antiquated technology used to build and operate the system. The recovery efforts will make CAL-ACCESS stable and get it running, but it can never be more robust or feature-laden. Ideally, we need a fresh start with an all-new CAL-ACCESS.