Open Compute Project 2nd meeting in New York, NY October 27th, 2011
What is the Open Compute Project (OCP) - http://opencompute.org: Hacking Conventional Computing Infrastructure. Started by Facebook a little over a year ago with a pretty big goal: to build one of the most efficient computing infrastructures at the lowest possible cost. We decided to honor our hacker roots and challenge convention by custom designing and building our software, servers and data centers from the ground up – and then share these technologies as they evolve.
Funders: Facebook, Microsoft, Goldman Sachs, Intel, Arista Networks, Rackspace
Result (so far): A data center full of vanity-free servers that is 38% more efficient and 24% less expensive to build and run than other state-of-the-art data centers.
Goal: By releasing Open Compute Project technologies as open hardware, our goal is to develop servers and data centers following the model traditionally associated with open source software projects.
Agenda for Oct 27th, 2011:
Keynotes / Presentations
Frank Frankovsky, FB Tech Director – What if Hardware Were Open
- Pace of innovation would increase
- Less environmental impact (less material and power)
- Increased efficiency at scale
What is needed?
- intellectual property structure, contributions, enablement
- missions and principles
- Facebook's new Oregon data facility (total cost ~$700 billion) savings: CAPEX of 24%, OPEX of 37%
Structure of OCP and what does it take to contribute
- Download OCP contributor agreement, sign it and fax back
Board of Directors: Andy Bechtolsheim (Arista Networks), Don Duet (Goldman Sachs), Frank Frankovsky (Facebook), Jason Waxman (Intel), Mark Roenigk (Rackspace)
The rack is becoming the chassis
“innovate more, differentiate less”
OCP has so far focused on “greenfield” builds of new data centers; in the future it needs to focus on transitioning existing data centers from closed designs to OCP ones.
Andy Bechtolsheim, Arista Networks
Innovation is our friend, gratuitous differentiation the enemy
What OCP is not:
- not a standards body
- not a customer panel (look to the Open Data Center Alliance for that)
- not a marketing organization
OCP focus is on open standards and open specifications
Focus on large scale problems
Design data centers as a system (optimized as a whole rather than as individual parts)
James Hamilton, VP Amazon Web Services
We need warehouse-scale computing on demand
OCP can increase the pace of innovation in data center technology
Data centers have seen more innovation in the last 5 years than in the previous 15
Amazon has dropped its AWS price 11 times in the last 4 years
Need to think and act at SCALE!
Any workload that can be run at the marginal cost of power on a server is a good deal. Because the server is a sunk cost, keep every server as busy as possible, paying only the cost of electricity.
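Hamilton's sunk-cost argument comes down to simple arithmetic. The sketch below is only an illustration: the wattage, the electricity price and the workload value are made-up numbers, not figures from the talk, and the function names are hypothetical.

```python
def marginal_power_cost(extra_watts: float, hours: float, price_per_kwh: float) -> float:
    """Cost of the extra electricity a workload draws on an already-purchased server."""
    return extra_watts / 1000.0 * hours * price_per_kwh


def worth_running(value: float, extra_watts: float, hours: float, price_per_kwh: float) -> bool:
    """The server is a sunk cost, so only the added power bill counts against the workload."""
    return value > marginal_power_cost(extra_watts, hours, price_per_kwh)


# Assumed numbers: the job adds 150 W over idle for 24 hours at $0.07 per kWh.
cost = marginal_power_cost(extra_watts=150, hours=24, price_per_kwh=0.07)
print(f"marginal cost of the job: ${cost:.3f}")
print("worth running:", worth_running(0.50, 150, 24, 0.07))
```

The purchase price of the server never appears in the comparison, which is the whole point: once the hardware is bought, any job worth more than its power bill is a net gain.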
Brian Stevens, VP/CTO Red Hat
Red Hat has certified OCP server stacks for use with RHEL
Intel, Jason Waxman, GM High Density Computing
Intel has released its motherboard specs as open source hardware
Dell, Jimmy Pike, Chief Architect & Tech.
OCP + Open Data Center Alliance + OpenStack = where Dell plays
Technical workshops (I only attended one, below)
- Storage: Building for the 100-year standard.
- Open Rack: Mechanical design and modular power distribution.
- The Adjacent Possible: Virtual IO.
- Datacenter Design: Building for different geographies.
- Systems Management: Defining open systems management for the enterprise.
The systems management technical track is basing itself on the Apache Incubator process.
Overview & Mission Statement: Data Center Systems Management is focused on:
- Platform Management
- Hardware Management
- Firmware lifecycle
- Elements of payloads on machines
- Fault alerting
- Remote access
- Data centers
- Strategic / Forward Looking
Problems to solve:
- eliminate the data center “clown car” of carts with many CDs for upgrading firmware
- Stop/limit scripting madness
- Standards and options to look at:
- DMTF – SMBIOS
- Intel – DCMI, IPMI, IPMB, ICMB
- AMD – OPMA
- SAforum – DFMS, WSDM
- SMIF – SMBus (originally Intel), PMBus
- PICMG – ATCA (IPMI, IPMB/ICMB mgmt)
Subtrack: Firmware Lifecycle
- common lifecycle for motherboard BIOS/firmware configuration, deployment, updating and auditing (WS-Man, XML)
- common configuration, deployment, updating and auditing of RAID controller, MC and other firmware
- management framework / API for providing common service
- scalable to 100K + servers
Look at the HPM.1 specification
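To make the "management framework / API for providing common service" idea a bit more concrete, here is a minimal sketch of what a common firmware lifecycle interface might look like. Everything in it is an assumption for illustration: `FirmwareManager`, `InMemoryManager` and `out_of_date` are hypothetical names, nothing here comes from an OCP spec, and a real backend would talk to hardware (perhaps over WS-Man) instead of a dictionary.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class FirmwareVersion:
    component: str  # e.g. "bios" or "raid-controller"
    version: str


class FirmwareManager(ABC):
    """Hypothetical common interface that each vendor backend would implement."""

    @abstractmethod
    def audit(self, host: str) -> List[FirmwareVersion]:
        """Report the firmware currently installed on a host."""

    @abstractmethod
    def update(self, host: str, target: FirmwareVersion) -> None:
        """Install the target firmware version on a host."""


class InMemoryManager(FirmwareManager):
    """Stand-in backend for illustration only; it just edits a dictionary."""

    def __init__(self, inventory: Dict[str, Dict[str, str]]) -> None:
        self._inventory = inventory

    def audit(self, host: str) -> List[FirmwareVersion]:
        return [FirmwareVersion(c, v) for c, v in sorted(self._inventory[host].items())]

    def update(self, host: str, target: FirmwareVersion) -> None:
        self._inventory[host][target.component] = target.version


def out_of_date(manager: FirmwareManager, hosts: Iterable[str],
                target: FirmwareVersion) -> List[str]:
    """Audit every host and return those not yet at the target version."""
    stale = []
    for host in hosts:
        installed = {fw.component: fw.version for fw in manager.audit(host)}
        if installed.get(target.component) != target.version:
            stale.append(host)
    return stale
```

With an interface like this, the same audit-then-update loop could run against any vendor's hardware, which is what would replace the cart full of CDs. Scaling it to 100K+ servers would need batching and parallelism that this sketch leaves out.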
Subtrack: Fault Alerting
- define consistent alert numbers and associated event information; e.g., 404 can mean different things for different BIOSes
- maybe SNMP for ‘base’ functions & WS-Man for ‘extended’
- for SNMP, would a standard OCP MIB define these messages?
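The "404 can mean different things" problem can be pictured as a translation table from vendor-specific codes to one shared numbering. The sketch below is purely illustrative: the vendor names, alert numbers and descriptions are invented, and `normalize` is a hypothetical helper, not part of any SNMP or WS-Man library.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CommonAlert:
    alert_id: int     # hypothetical shared alert number
    severity: str
    description: str


# Invented mapping: (vendor, vendor-specific code) -> common alert.
ALERT_MAP: Dict[Tuple[str, int], CommonAlert] = {
    ("vendor_a", 404): CommonAlert(1001, "critical", "Uncorrectable memory error"),
    ("vendor_b", 404): CommonAlert(2003, "warning", "Fan speed below threshold"),
    ("vendor_b", 17): CommonAlert(1001, "critical", "Uncorrectable memory error"),
}


def normalize(vendor: str, code: int) -> CommonAlert:
    """Translate a vendor-specific code into the shared numbering."""
    try:
        return ALERT_MAP[(vendor, code)]
    except KeyError:
        # Unmapped codes are reported explicitly so they are not silently dropped.
        return CommonAlert(0, "unknown", f"Unmapped alert {code} from {vendor}")


print(normalize("vendor_a", 404).description)
print(normalize("vendor_b", 404).description)
```

Here the same raw code 404 resolves to two different common alerts, while two different raw codes resolve to the same one. A standard MIB would in effect publish the right-hand side of this table so that vendors emit it directly.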
Subtrack: Remote Management
- 2 levels of remote machine access
- Base: power on/off
- Standardize method to access capabilities
- Discovery: a machine's hardware/firmware configuration could be written and queried in a standard way (WS-Man)
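One way to picture the base level plus discoverable capabilities is a handle that always supports power control and advertises anything beyond that. This is a sketch under assumptions: `RemoteMachine` and its methods are hypothetical and do not wrap a real IPMI or WS-Man client.

```python
from typing import Dict, Iterable, Optional, Set


class RemoteMachine:
    """Hypothetical handle for one managed server (no real network calls)."""

    BASE = frozenset({"power_on", "power_off"})  # every machine must support these

    def __init__(self, name: str, extended: Iterable[str] = (),
                 inventory: Optional[Dict[str, str]] = None) -> None:
        self.name = name
        self.powered = False
        self._extended = set(extended)
        self._inventory = dict(inventory or {})

    def capabilities(self) -> Set[str]:
        """Standard way to ask a machine what it can do."""
        return set(self.BASE) | self._extended

    def power_on(self) -> None:
        self.powered = True

    def power_off(self) -> None:
        self.powered = False

    def describe(self) -> Dict[str, str]:
        """Extended level: return the hardware/firmware configuration in one shape."""
        if "inventory" not in self._extended:
            raise NotImplementedError(f"{self.name} does not support inventory")
        return dict(self._inventory)


node = RemoteMachine("node1", extended={"inventory"}, inventory={"bios": "1.2"})
if "inventory" in node.capabilities():
    print(node.describe())
```

A management tool would check `capabilities()` first and fall back to base power control on machines that offer nothing more.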
Subtrack: Strategic / Forward Looking
- how to operate at SCALE
- I2C bus, SMBus
- Consolidated hardware management practices