The guys at AppAssure asked me to write this blog post about how I came to buy their product. I had a really bad day at work (I lost a critical disk and could not recover from my backups, not even my tape backups), and I was committed to never having a day like that again. That’s when I found Replay AppImage. If you have recently had a disaster like mine, or if you want to make sure you don’t, you should give the AppAssure guys a shout or download a free trial of their software to check it out.

So here’s the story of my worst day at work:

When my company’s Microsoft Exchange Server failed at the end of the quarter, it could not have happened at a worse time. It began with the VP of Sales yelling, “Email is down, and customers can’t send us their orders!” Then my Blackberry started going off: calls, emails, IMs. It was relentless. When I logged on to the Exchange Server, I found that some of my most critical mail stores were no longer mounted. When I tried to remount them, I received the ambiguous yet ominous JET -1601 JET_errRecordNotFound error message. I immediately connected to the replication server that runs at one of the company’s remote sites, only to find that I couldn’t mount those mail stores either.

When I called Microsoft, technicians prescribed the standard procedure of running Eseutil. They warned me, however, that the error message probably indicated a corruption problem deep within the database and that running Eseutil might wipe all user data from the stores. I took the leap, on the chance that it would be quicker than getting the restore process underway. Running Eseutil took hours, then failed with the even more ambiguous JET -1003 JET_errInvalidParameter. At that point, I knew I HAD to go to the backup.

My company runs full backups every Saturday night and incremental backups the rest of the week. I started by recovering the most recent full backup, then applying the incrementals until I had the backup from the night before the failure. As you can imagine, the calls, emails, and so on kept coming the whole time I was copying the mail stores from my disk-to-disk backup, although they did taper off a bit after 11:00 P.M., when our west-coast office closed.
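For anyone who wants to see that restore order spelled out, here is a minimal Python sketch of how I picked the chain: the most recent full backup before the failure, followed by every incremental taken after it. The `Backup` class and `restore_chain` function are hypothetical names I made up for illustration; they are not part of any backup product.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    kind: str  # "full" or "incremental"


def restore_chain(backups, failure_time):
    """Return the backups to restore, oldest first, to reach the last
    backed-up state before failure_time."""
    # Only backups taken before the failure are candidates.
    usable = sorted(
        (b for b in backups if b.taken_at < failure_time),
        key=lambda b: b.taken_at,
    )
    full_indexes = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_indexes:
        raise ValueError("no full backup available before the failure")
    # The most recent full backup plus every incremental taken after it.
    return usable[full_indexes[-1]:]
```

The longer the week goes on, the longer that chain gets, and every link in it has to be good for the restore to work.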

Once our data was back on the primary server, it was time to roll the logs and mount the database. However, when the logs were about 80 percent applied, they failed with the JET -501 JET_errLogFileCorrupt error. At that point, Microsoft support could only suggest running Eseutil through my entire log chain, noting the corrupted log, deleting everything except log files from the log directory, and then deleting the corrupted log and all the logs created after it. Then I could finally restart the log roll operation from scratch. This procedure took more than six hours. In the end, my company lost two days of email messages, and recovery took more than thirty hours. The cause turned out to be a problem with the RAID controller driver that had taken months to manifest itself after a previous server upgrade.
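The log cleanup that Microsoft support described boils down to a simple rule, sketched below in Python. This is only a dry-run planner under my own assumptions: it takes the log chain in creation order and the name of the log that the integrity check flagged, and it reports what to keep and what to delete without touching any files. The function name `plan_log_cleanup` and the file names in the usage example are hypothetical.

```python
def plan_log_cleanup(log_dir_files, log_chain, first_corrupt):
    """Split a log directory into files to keep and files to delete.

    log_dir_files: every file name found in the log directory.
    log_chain: the log file names, in the order they were created.
    first_corrupt: the log that the integrity check flagged as corrupt.
    """
    if first_corrupt not in log_chain:
        raise ValueError(f"{first_corrupt} is not in the log chain")
    cut = log_chain.index(first_corrupt)
    # Keep only the logs created before the corrupt one.
    keep = log_chain[:cut]
    keep_set = set(keep)
    # Everything else goes: non-log files, the corrupt log, and all later logs.
    delete = [name for name in log_dir_files if name not in keep_set]
    return keep, delete


# Hypothetical directory listing, for illustration only.
files = ["E00.chk", "E0000001.log", "E0000002.log", "E0000003.log", "E0000004.log"]
keep, delete = plan_log_cleanup(files, files[1:], "E0000003.log")
print("keep:", keep)
print("delete:", delete)
```

Every log in the delete list is mail that never gets replayed, which is how we ended up losing two days of messages.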

As you can imagine, executive management figured the outage cost the company about $50K, so they definitely wanted to know what had happened, how it could have been prevented, and how it would be prevented from happening again. Let’s just say “wanted to know” means that if I didn’t have a good answer, my name was going on the top of the next layoff list. I was seriously committed to finding a better recovery solution.

The Right Exchange Recovery Solution: Not Just Backups but Usable Data
After evaluating several potential solutions at prices ranging from $199 to $100K, I liked AppAssure’s Replay AppImage right away. It’s a block-based imaging recovery solution that captures the entire Exchange server environment and supports recovery of anything from bare metal to an individual email message in just a few clicks, simplifying the entire recovery process. What was totally cool was the Exchange “health checks,” which made sure the data I was backing up was actually mountable. Before I made the recommendation to my boss, I wanted to be sure I was making the right choice, so I checked out a few of their customers. Jim Poehlman, Director of IT at Ubicom, needed to protect Exchange from user error and its databases from corruption. Jay Wessel, VP of Technology for the Boston Celtics, needed those capabilities in addition to being able to respond to legal and business discovery requests. Both said they found the solution they required in Replay AppImage and were happy with their choice.

In addition to capturing and validating your Exchange data, Replay AppImage employs a unique instant-replay capability that dramatically reduces volume recovery times from hours to minutes regardless of the data set size being recovered. After a rollback is initiated, the volume and storage groups are automatically and immediately mounted from the Replay AppImage server, providing users with access to email during the recovery process. Apparently they call it Live Replay; all I know is it saved me from being Dead Admin.
I’ve got Replay AppImage up and running now, and it lets me:
• Instantly roll back a server to a point in time before an outage occurred.
• Allow my users to access applications (including e-mail) during a live recovery.
• Recover entire applications from bare-metal in just a few clicks.
So here’s what I learned on my worst day as a Network Admin: You can have multiple copies of your data (on replicated servers, on disk, and on tape), but if you can’t mount the copies, they aren’t any good.

Comment

Comment by Joe Squirrel on August 5, 2009 at 10:48am
I wish I had AppAssure Replay AppImage. My manager believes Microsoft DPM (Doesn't Protect Much) is superior. I fear the day we need to restore from backups. That will be the day I begin looking for a new job, because the buck will stop with me even with all of my protesting. If anyone has an open position, please post, because I do not want to be around here when something goes wrong.
Comment by Clarke Holmes on May 27, 2009 at 7:57am
My Worst Day:

Exchange 5.5 SP3 running on NT 4.0 SP 5 or 6, don't remember.

Technet disks arrive..I open the box, all the usual suspects are there, plus Exchange 5.5 SP4, along with a terse note: Apply this Service Pack IMMEDIATELY...."MMMMMMmm OK" I put the disk in the drive, install SP4, and restart the Exchange Server....After the obligatory 15 minute down-and-boot cycle, I see that no mail is flowing "Hmmmm What is up?"

I notice that the IMC and MTA Stacks services are not running. I select the services and attempt to start them...each runs about 10 seconds and stops, with no error or warning in the event viewer.

I look through Add/Remove Programs...no way to remove SP4. Place call to MS...after several transfers, I finally get to a tech. "Hey, I just put on Exchange Server SP4 and rebooted, now the IMC & MTA Stacks won't stay running. What gives?"

MS Tech: "Is this by any chance a Multi-Home Exchange Server?"

Me: "Yes, NIC 1 is on the 207.x.x.x subnet, and NIC 2 is on 192.168.1.x subnet. Why?"

MS Tech (In perfect British Accent): "Oh, Bloody Hell..."
2 days later, when MS finally released a tool to cleanly remove SP4, I COULD have fixed my Exchange Server easily..... Part of the delay was that MS had shut down parts of their various websites, due to some DDoS attacks. A snapped tape and many wraps of media inside the drive prevented an easy restore.

Lessons Learned...
{1.} NEVER buy or install V1.0 of ANYTHING.
{2.} Make sure you have a clear path to back off a Service Pack
{3.} If you think a tape drive is getting "flaky", it is already kaput...While threading up a tape, I heard one of the drive motors going into super-turbo warp speed, and then POP!...the sound of a 4mm (DDS3) tape snapping and wrapping itself around all the innards of the drive

© 2010 Created by Fred on Ning.