We already know that the Large Hadron Collider (LHC) will be the biggest, most expensive physics experiment ever carried out by mankind. Colliding relativistic particles at energies previously unimaginable (up to the 14 TeV mark by the end of the decade) will generate millions of particles (known and as yet to be discovered), that need to be tracked and characterized by huge particle detectors. This historic experiment will require a massive data collection and storage effort, re-writing the rules of data handling. Every five seconds, LHC collisions will generate the equivalent of a DVD-worth of data, that's a data production rate of one gigabyte per second. To put this into perspective, an average household computer with a very good connection may be able to download data at a rate of one or two megabytes per second (if you are very lucky! I get 500 kilobytes/second). So, LHC engineers have designed a new kind of data handling method that can store and distribute petabytes (million-gigabytes) of data to LHC collaborators worldwide (without getting old and grey whilst waiting for a download).
In 1990, the European Organization for Nuclear Research (CERN) revolutionized the way in which we live. The previous year, Tim Berners-Lee, a CERN physicist, wrote a proposal for electronic information management. He put forward the idea that information could be transferred easily over the Internet using something called "hypertext." As time went on Berners-Lee and collaborator Robert Cailliau, a systems engineer also at CERN, pieced together a single information network to help CERN scientists collaborate and share information from their personal computers without having to save it on cumbersome storage devices. Hypertext enabled users to browse and share text via web pages using hyperlinks. Berners-Lee then went on to create a browser-editor and soon realised this new form of communication could be shared by vast numbers of people. By May 1990, the CERN scientists called this new collaborative network the World Wide Web. In fact, CERN was responsible for the world's first website:
http://info.cern.ch/ and an early example of what this site looked like can be found via the World Wide Web Consortium website.
So CERN is no stranger to managing data over the Internet, but the brand new LHC will require special treatment. As highlighted by David Bader, executive director of high performance computing at the Georgia Institute of Technology, the current bandwidth allowed by the Internet is a huge bottleneck, making other forms of data sharing more desirable. "If I look at the LHC and what it's doing for the future, the one thing that the Web hasn't been able to do is manage a phenomenal wealth of data," he said, meaning that it is easier to save large datasets on terabyte hard drives and then send them in the post to collaborators. Although CERN had addressed the collaborative nature of data sharing on the World Wide Web, the data the LHC will generate will easily overload the small bandwidths currently available.
How the LHC Computing Grid works (CERN/Scientific American)
This is why the LHC Computing Grid was designed. The grid handles vast LHC dataset production in tiers, the first (Tier 0) is located on-site at CERN near Geneva, Switzerland. Tier 0 consists of a huge parallel computer network containing 100,000 advanced CPUs that have been set up to immediately store and manage the raw data (1s and 0s of binary code) pumped out by the LHC. It is worth noting at this point, that not all the particle collisions will be detected by the sensors, only a very small fraction can be captured. Although only a comparatively small number of particles may be detected, this still translates into huge output.
more:
http://www.universetoday.com/2008/09/04/the-lhc-will-revolutionize-physics-can-it-revolutionize-the-internet-too/