Discussion Forum: Suggestions: Message 497508
 Author: Locutis
 Posted: Dec 11, 2010 12:27
 Subject: Re: Stop purging data
 Viewed: 79 times
 Topic: Suggestions
I forgot to mention something. Last year I looked ahead and investigated High
Availability Linux for our servers. It is actually quite easy to set up, and
guides for it are everywhere. What does this do for you? Well, using standard
off-the-shelf hardware and open source software, you can build a cluster of
Linux web and database servers for less than $5k that lets you run and maintain
a website that can serve over 100,000 concurrent database and web
connections/second. It also provides data backup and redundancy across 2+
computers. If one fails, something called Heartbeat lets another server simply
take the load while you replace the failed unit. Hard drives can be swapped in
and out and added to the pool of data storage easily. Load balancing is built
in to this system. Just Google "how to build a high availability linux cluster"
and you will find lots of instructions. I purchased 4 identical servers in 2009
with the hope of eventually putting them into place. Currently we are using 2
of the 4 units, with the other 2 fully assembled and configured, ready for an
in-place replacement in case of a catastrophic failure. What did all of this
cost (including 15x 500 GB hard drives for instant replacement of failed
drives)? Around $4k.
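
To make the failover idea concrete, here is a minimal Python sketch. It is not
the Heartbeat software itself, only an illustration of the principle: each
server reports in regularly, and a server that stays silent too long is treated
as failed so the next one takes over. The node names and the 3-second timeout
are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Seconds of silence before a node is treated as failed (assumed value).
DEAD_AFTER = 3.0


@dataclass
class Node:
    """One server in the cluster; the names used below are hypothetical."""
    name: str
    last_beat: float  # time at which the last heartbeat was received


def alive(node: Node, now: float) -> bool:
    """A node counts as alive if its last heartbeat is recent enough."""
    return now - node.last_beat <= DEAD_AFTER


def pick_active(nodes: List[Node], now: float) -> Optional[Node]:
    """Return the first live node in priority order, or None if all are down."""
    for node in nodes:
        if alive(node, now):
            return node
    return None


if __name__ == "__main__":
    cluster = [Node("web1", last_beat=1.0), Node("web2", last_beat=1.0)]
    # Both nodes reported at t=1, so the first one keeps the service.
    print(pick_active(cluster, now=2.0).name)
    # web1 goes silent while web2 keeps reporting, so web2 takes over.
    cluster[1].last_beat = 6.0
    print(pick_active(cluster, now=7.0).name)
```

A real setup also has to move the service address and the data to the
surviving machine, which this sketch leaves out.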

I don't know if Bricklink already employs this, but at work I was looking into
it and never found the time to implement the cluster system. A cluster is easy
to maintain once it is set up. If you find your site is getting busier and the
load on the server is overwhelming, simply set up another cluster machine,
bring it online, and add it to the pool. If you had 2 machines and you add 1,
you've increased your available processing capacity by 50% and done nothing to
the existing 2 machines. As far as users are concerned, adding or removing
machines from the cluster only affects how long the site takes to respond. Need
to perform maintenance and upgrade the OS or other software? Leave one machine
online (which means the site still works), take the rest offline, upgrade and
test those "spares", then switch everything over to the upgraded machines.
Finally, bring the old one up to date and add it back as a "spare" to the new
system.
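
The arithmetic above can be sketched in a few lines of Python. This is an
idealized picture that assumes identical machines and requests spread evenly
across them; the function names and the round-robin scheme are my own choices
for the example, not something the cluster software is known to use.

```python
from itertools import cycle
from typing import Dict, List


def capacity_gain(current: int, added: int) -> float:
    """Percentage increase in capacity when adding identical machines."""
    return 100.0 * added / current


def assign(requests: List[str], pool: List[str]) -> Dict[str, str]:
    """Hand out requests round-robin across whatever machines are in the pool."""
    servers = cycle(pool)
    return {request: next(servers) for request in requests}


if __name__ == "__main__":
    # Going from 2 machines to 3 adds half again as much capacity.
    print(capacity_gain(current=2, added=1))
    # The third machine starts taking requests without touching the first two.
    print(assign(["r1", "r2", "r3", "r4"], ["web1", "web2", "web3"]))
```

Real gains are usually a bit lower than this, since the machines spend some
effort keeping each other in sync.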

The clustering mechanism in Linux scales easily from 2 machines to as many as
you need.

The software is called "DRBD" and "Heartbeat". When employed with RAID and a
dedicated gigabit connection between the computers, you have a system that
responds very quickly, scales easily with load over time, and is highly
reliable because of the fault tolerance inherent in having more than 1
identical webserver online. You can even have your servers set up this way in
different locations, different cities, even different countries! All of the
data is mirrored and kept in sync across all machines, and is online and
accessible all of the time.

I'm just saying it's possible, can be done by virtually anyone in the computer
field who is knowledgeable on website setup with the guides that are out there,
and it's very inexpensive.

In Suggestions, locutis writes:
  Our internal work server (used for invoicing, accounting, etc.) has only 5 users.
However, our external webserver (online store, auction site, gold/silver bullion
quotes, etc.) receives many millions of hits per month. We send over 60 GB of
HTML data over our fibre internet connection every month.

Our customer database consists of 80,000 + active customers.

On our external server (same hardware as the internal, 2.8 GHz processor, 2x500
GB RAID 1 hard drives) we operate a standard webserver, an auction website (with
a current catalog of 4,000 active items and 40,000 archived items) which accepts
and processes bids in real time, an online store with over 2,500 active items,
and a gold and silver bullion website which provides real-time quotes to tens
of thousands of customers every minute.

Our website is high profile as well. I am one person programming in "spare time",
and have made all of this work, and work efficiently.

Cameron

In Suggestions, AggieSava writes:
  In Suggestions, locutis writes:
  In Suggestions, B0RIS writes:
  In Suggestions, Timothy_Smith writes:
  My suggestion: stop purging data.
It's the 21st century, mass storage is cheap.
There's no reason at all to purge data ever.

Data purging is not done to save storage cost. It is done to cut data access
time. My guess is all bricklink visitors are served by only 1 processor.

Boris.

I don't know how Bricklink works, but I know on my work server, I can search
through AND display over 20,000 records in less than 1 second. We have one single
server, with 2x 500GB hard drives on RAID 1 (which means they duplicate the data
over 2 drives), running Apache2, with a 2.8 GHz Pentium processor.

If I want to search the "archived data", which takes 10,000 records and
magically combines them into 1 for speedier access (I programmed it to archive
records over 2 years old, 10,000 at a time, into one database entry), there are
200,000 records to search, and it takes only several seconds to search,
compile, and display the information.
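
For anyone curious what that kind of batch archiving can look like, here is a
minimal Python sketch. It is not the actual code from the work server: storing
each batch as compressed JSON is an assumption made for the example, and the
record fields are invented. Only the batch size of 10,000 comes from the
description above.

```python
import json
import zlib
from typing import Any, Dict, Iterator, List

BATCH_SIZE = 10_000  # records per archive entry, as described in the post


def pack(records: List[Dict[str, Any]]) -> List[bytes]:
    """Combine records into one compressed blob per BATCH_SIZE records."""
    return [
        zlib.compress(json.dumps(records[i:i + BATCH_SIZE]).encode("utf-8"))
        for i in range(0, len(records), BATCH_SIZE)
    ]


def search(archive: List[bytes], key: str, value: Any) -> Iterator[Dict[str, Any]]:
    """Unpack each blob in turn and yield the records that match."""
    for blob in archive:
        for record in json.loads(zlib.decompress(blob).decode("utf-8")):
            if record.get(key) == value:
                yield record


if __name__ == "__main__":
    # Hypothetical old records; the field names are made up.
    old = [{"id": i, "customer": f"c{i % 500}"} for i in range(200_000)]
    archive = pack(old)
    print(len(archive), "archive entries")
    print(len(list(search(archive, "customer", "c42"))), "matching records")
```

The trade-off is that the database holds far fewer rows to index, while a
search of the archive has to unpack every batch, which fits with archive
searches taking seconds rather than under a second.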

Again, I don't know how Bricklink works, but I'm far from a professional programmer,
and I made the site at work operate efficiently this way. We never purge ANY
data, and have access to all data going back to 1997 when we started computerizing.
I'm one single computer person at work, and computer work isn't even my job,
I do it in my "spare time" while trying to manage the company.

Cameron

But how many people are using your work server? There are 152,351 registered
members of Bricklink as I'm writing this, and Bricklink already has an extensive
database to go through for every single person. If, as suggested, BL has only
one processor, I can imagine that a database containing all the orders and all
the forum posts since the beginning would slow down considerably for all these
users.

But as also already suggested, hardware is getting cheaper and cheaper.

--Tony

