This could be the one event that shapes Tabulas' future indefinitely.

The past service downtime to Tabulas has made one thing painfully clear: I cannot be expected to be a reliable systems administrator on top of doing development and customer support. In a sense, although I can write Tabulas to be scalable, I am not scalable - I cannot simply throw myself at every issue that comes up and be expected to solve them all.

The main per-user cost of running Tabulas is storage. If I were backing up your static files (which I am not), I'd be paying roughly $1-2/GB per month (I don't feel like looking this up at EV1Servers right now, but I'm generally paying $150/month for a server with up to 240GB of storage, which is realistically only 120GB once it's mirrored for backups). That setup would probably use RAID-1, which still wouldn't protect me from the idiocy of "oops, I deleted everything on the main drive and now RAID won't save me" - mirroring replicates deletions just as faithfully.
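The back-of-envelope math above works out like this (these are the round figures quoted, not exact EV1Servers pricing):

```python
# Rough per-GB monthly cost of self-hosted storage, using the figures above.
server_cost = 150.0             # $/month for the dedicated server
raw_storage_gb = 240            # advertised capacity
usable_gb = raw_storage_gb / 2  # mirrored for backups -> 120GB usable

cost_per_gb = server_cost / usable_gb
print(f"${cost_per_gb:.2f}/GB per month")  # $1.25/GB, inside the $1-2 range
```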

This is the reason I can't offer "unlimited*" photohosting the way other sites do (the asterisk indicating, of course, that it's not really unlimited). Even ignoring the cost of backing up each file, Tabulas creates three copies of it (thumbnail, "web," and "large" sizes), so every 60KB image you upload nearly doubles in storage footprint. A long time ago, I decided that if I ever were to back up files, I'd back up only the original; if the image server craps out, I can retrieve the original "big" and regenerate the smaller images (CPU cost is cheaper than the storage cost).
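To make the "nearly doubles" claim concrete - the derivative sizes here are my guesses for a typical photo, not measured figures:

```python
original_kb = 60   # the uploaded image
web_kb = 45        # assumed size of the ~450px "web" copy
thumb_kb = 5       # assumed thumbnail size

total_kb = original_kb + web_kb + thumb_kb
multiplier = total_kb / original_kb
print(f"{total_kb}KB stored, {multiplier:.2f}x the upload")  # 110KB, 1.83x
```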

Amazon is now offering a web service that essentially serves as a large data storage house. They offer redundancy, an easy API to store/view/delete data, and all of this at a cost of $0.15/GB per month for storage and $0.20/GB for transfers (outbound, I'm presuming).
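At those rates, the monthly bill is a simple function (a sketch that ignores any tiers or per-request fees):

```python
def s3_monthly_cost(stored_gb: float, transfer_out_gb: float) -> float:
    """Estimate a monthly S3 bill at the quoted 2006 rates:
    $0.15/GB-month stored, $0.20/GB transferred out."""
    return stored_gb * 0.15 + transfer_out_gb * 0.20

# e.g. 100GB of originals sitting untouched (no restores this month):
print(f"${s3_monthly_cost(100, 0):.2f}/month")  # $15.00
```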

Since the original images will very rarely be requested (only in the case of server failure), this is a very attractive storage opportunity. Because the system handles the scaling and the redundancy, that's time I don't have to spend on server work.

In a sense, this could be what sparks a HUGE increase in storage space for both free and patron Tabulas accounts - I could realistically (I'm doing the math off the top of my head) offer up to 500 images for free users and up to 5,000 images for patron accounts without breaking a sweat. I would still maintain the front-end servers that do the actual display (because EV1Servers offers bandwidth pricing that is more competitive than Amazon's), while the backend storage would be handled by S3.
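The "off the top of my head" math might look something like this - the average original size is an assumption, not a Tabulas statistic:

```python
avg_original_kb = 250     # assumed average size of a "big" original
s3_storage_rate = 0.15    # $/GB-month

def monthly_storage_cost(images: int) -> float:
    """Monthly S3 storage cost for one user's backed-up originals."""
    gb = images * avg_original_kb / (1024 * 1024)  # KB -> GB
    return gb * s3_storage_rate

print(f"free (500 images):    ${monthly_storage_cost(500):.4f}/user-month")
print(f"patron (5,000 images): ${monthly_storage_cost(5000):.4f}/user-month")
```

Under those assumptions a free user's backups cost about two cents a month, and a patron's under twenty cents - which is what makes those quotas plausible.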

I love Amazon. Now I need to find someone to write me a PHP class that interacts with S3; anybody up for it? (I can pay, but not well)
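For whoever takes this up: the S3 REST API of this era authenticates each request with an HMAC-SHA1 signature over a canonical string, sent as `Authorization: AWS <access_key>:<signature>`. Here is a minimal sketch of the signing step in Python (a PHP class would mirror it; the key, bucket, and object names are placeholders):

```python
import base64
import hashlib
import hmac

def sign_s3_request(secret_key: str, verb: str, content_md5: str,
                    content_type: str, date: str, resource: str) -> str:
    """Build the signature for an S3 REST request (2006-era AWS auth scheme):
    base64(HMAC-SHA1(secret, verb\\nmd5\\ntype\\ndate\\nresource))."""
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

# Sign a PUT of a backed-up original (placeholder key/bucket/path):
sig = sign_s3_request("SECRET", "PUT", "", "image/jpeg",
                      "Tue, 14 Mar 2006 10:59:00 GMT",
                      "/mybucket/originals/photo.jpg")
print(sig)  # a 28-character base64 string
```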

. . .

Matt's asked me to link, so I'll do it: Judge Orange County.

Currently listening to: With Broken Wings - A Beautiful Tragedy
Posted by roy on March 14, 2006 at 10:59 AM in Tabulas | 7 Comments


Comment posted on March 15th, 2006 at 12:46 PM
Hey Roy, I don't know too much about Amazon API's... but would this help you?

<a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=126&categoryID=47">shortened link</a> [developer.amazonwebservices.com]

hapy (guest)

Comment posted on March 14th, 2006 at 09:54 PM
back everything up on moonies' premium xanga site.
Comment posted on March 14th, 2006 at 02:05 PM
"[...] if the image server craps out, I can retrieve the original 'big' and regenerate the smaller images (CPU cost is cheaper than the storage cost)."

Obviously not cheap enough to generate smalls on demand, thereby halving storage requirements?
Comment posted on March 14th, 2006 at 02:32 PM
The smalls are like 5KB each max, so I wouldn't even consider that. But for the "web" sizes (~450px), I could theoretically build them.

The reason I don't do that is because it's expensive to trigger the CPU to resize on demand when the data doesn't change - data that doesn't change often shouldn't be generated dynamically (the argument for caching).

However, how often is that regeneration going to be triggered? It'll only happen if the drive fails - and if that happens, you have downtime regardless, so it's better not to pay to store the smaller sizes when they can easily be regenerated from the larger version.
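The tradeoff being described is the classic regenerate-on-miss pattern; a sketch of the restore path, with a stand-in "resize" in place of a real image library:

```python
# Derivatives are cheap to rebuild from the original, so they aren't backed
# up: after a wipe, the first request pays a one-time CPU cost to recreate
# them. The resize() here is a stand-in for an actual image resize.
cache = {}        # derived images currently on disk
regen_count = 0   # how many times we paid the CPU cost

def resize(original: bytes) -> bytes:
    return original[: len(original) // 2]  # stand-in: "web" copy is smaller

def get_web_image(name: str, original: bytes) -> bytes:
    global regen_count
    if name not in cache:          # miss: only true after a drive failure
        cache[name] = resize(original)
        regen_count += 1
    return cache[name]

photo = b"x" * 60_000              # a 60KB "original" pulled from backup
get_web_image("p.jpg", photo)      # miss: regenerate once
get_web_image("p.jpg", photo)      # hit: no CPU cost
print(regen_count)                 # 1
```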
Comment posted on March 14th, 2006 at 03:40 PM
By 'smalls' I meant any downsizing from the original, although I see how it would make better sense for that to refer to the 'web' size.

I suppose I'm rambling off-topic, but what are the statistics for image use (thumb views, 'web' views, normal views, uploads, etc)?

I was just poking at that assertion that CPU cycles are 'cheaper than storage'.
Comment posted on March 14th, 2006 at 06:23 PM
Gotcha.

I guess I should amend that to:

Assuming catastrophic failures are rare (rare meaning at most once every few months), the cost of one-time CPU to regenerate different image sizes is cheaper than storing each file type indefinitely, especially when I'm paying Amazon on a per-gig basis.
Comment posted on March 14th, 2006 at 12:53 PM
I'll write and teach that class, you better not be late you SOB. Also, pay me 10 pph (10 potatoes per hour). I will accept no less.