Need for Hyper-Scale Computing in the Enterprise

What is hyper-scale computing, and how does it revamp the enterprises of today? Learn more from this presentation by Chirag Jog, VP of Engineering for MSys, at SNIA's Data Storage Innovation Conference 2015.

Published in: Technology

Need for Hyper-Scale Computing in the Enterprise

  1. 1. Bringing HyperScale Computing to the Enterprise The need for Enterprises to overhaul their IT systems
  2. 2. Global Presence Atlanta Pune Bangalore Chennai About MSys Established In: 2007 Self-funded, profitable Over 350 employees Global Presence Product Engineering Services Technology Storage Services Product Engineering / Development Test Engineering Rapid Prototyping Maintenance & Support UI Engineering Cloud Computing Big Data Analytics Storage DevOps Infrastructure Tech Filesystem Development Kernel Development Backup & DR Cloud Storage Storage QA Storage Certifications MSys Advantages •  End-to-end Storage Expertise •  75% revenue from storage companies •  Dedicated team with focused skills
  3. 3. About the Speaker Chirag Jog VP, Engineering at MSys (Clogeny) Pune, India [email protected] @chirag_jog http://in.linkedin.com/in/chiragjog Architect with 10+ years of experience. Built cutting-edge products in the Storage, Cloud and Big Data space Products: Cloud Mobility, Cloud Backup Products Open source contributor
  4. 4. Agenda •  Changing nature of Applications and Data in Enterprise •  Time to re-think the Approach to Infrastructure •  Current Challenges and Limitations •  Trends in Software and Hardware
  5. 5. Next-gen Applications and Data Ref: http://dionhinchcliffe.com/2012/10/23/imagining-the-future-of-the-enterprise/
  6. 6. Next-gen Application and Data •  Web-Scale IT http://www.itbusinessedge.com/imagesvr_ce/6074/Gartner2014Trends09.jpg
  7. 7. Next-gen Application and Data •  Web-Scale IT –  Available as-a-Service –  Scale on demand –  Always on operation –  Automated Provisioning –  Agility, Flexibility http://www.itbusinessedge.com/imagesvr_ce/6074/Gartner2014Trends09.jpg
  8. 8. Next-gen Application and Data •  Web-Scale IT •  Real-time Data streams Ref: http://dionhinchcliffe.com/2012/10/23/imagining-the-future-of-the-enterprise/
  9. 9. Next-gen Application and Data •  Real-time Data streams –  customers –  partners –  supply chains –  applications –  internet of things http://wethedata.org/wp-content/uploads/2012/12/CITRIS-JAN-2013-PANEL-DATA-STREAMS.jpg
  10. 10. Next-gen Application and Data •  Web-Scale IT •  Real-time Data streams •  Near real-time recommendations Ref: http://dionhinchcliffe.com/2012/10/23/imagining-the-future-of-the-enterprise/
  11. 11. Next-gen Application and Data •  Near real-time recommendations •  real-time transactional and analytical systems http://jeremystierwalt.com/2014/10/02/drive-real-time-right-time-decisions-with-analytics/
  12. 12. Next-gen Application and Data •  Web-Scale IT •  Real-time Data streams •  Near real-time recommendations •  Massive amounts of data Ref: http://dionhinchcliffe.com/2012/10/23/imagining-the-future-of-the-enterprise/
  13. 13. Next-gen Application and Data •  Massive amounts of data –  structured, unstructured –  archive –  readily available for analytics, compliance, outages http://www.coolinfographics.com/blog/2009/7/11/how-much-is-a-petabyte.html
  14. 14. Next-gen Application and Data •  Web-Scale IT •  Real-time Data streams •  Near real-time recommendations •  Massive amounts of data •  Cost effective solutions Ref: http://dionhinchcliffe.com/2012/10/23/imagining-the-future-of-the-enterprise/
  15. 15. Next-gen Application and Data •  Cost effective solutions –  cost-effective performance –  Minimize cost of Human resources –  Minimize costs of infrastructure http://kteceducation.com/wp-content/uploads/2013/12/cost.png
  16. 16. Next-gen Application and Data •  Web-Scale IT •  Real-time Data streams •  Near real-time recommendations •  Massive amounts of data •  Cost effective solutions Ref: http://dionhinchcliffe.com/2012/10/23/imagining-the-future-of-the-enterprise/
  17. 17. Current Constraints - Storage •  Multiple Storage Array Platforms and Arrays Ref: http://wikibon.org/w/images/f/f3/Softwareledstorage.png
  18. 18. Current Constraints - Storage •  Multiple Storage Array Platforms and Arrays •  Slow Access Speed and throughput of Disk Drives Ref: http://wikibon.org/w/images/f/f3/Softwareledstorage.png
  19. 19. Current Constraints - Storage •  Multiple Storage Array Platforms and Arrays •  Slow Access Speed and throughput of Disk Drives •  Difficulty of migrating current management of data on DAS and SAN to more flexible topologies Ref: http://wikibon.org/w/images/f/f3/Softwareledstorage.png
  20. 20. Current Constraints - Storage •  Multiple Storage Array Platforms and Arrays •  Slow Access Speed and throughput of Disk Drives •  Difficulty of migrating current management of data on DAS and SAN to more flexible topologies •  Difficulty in managing the volume, growth and complexity of unstructured data Ref: http://wikibon.org/w/images/f/f3/Softwareledstorage.png
  21. 21. Current Constraints - Storage •  Multiple Storage Array Platforms and Arrays •  Slow Access Speed and throughput of Disk Drives •  Difficulty of migrating current management of data on DAS and SAN to more flexible topologies •  Difficulty in managing the volume, growth and complexity of unstructured data •  A fragmented and often fragile Data backup process Ref: http://wikibon.org/w/images/f/f3/Softwareledstorage.png
  22. 22. Current Limitations - Applications •  Interconnected and Inter-dependent – Apps, Infra, Physical DC •  Apps in a silo – Optimized Infrastructure •  Rigid and easy to knock down •  1 admin per 300-700 servers. •  Managed infrastructure by hand – updating software, rearranging configuration, SLAs •  Inflexible to innovate – decisions are made for 15-20 years
  23. 23. How are exabytes of data going to be: – backed up? – restored? – accessed? Million Dollar Question or Exabyte Question
  24. 24. How did “they” do it?
  25. 25. Enterprise learning from Web-Scale Spend $ Save Time Spend time Save $
  26. 26. Enterprise learning from Web-Scale Spend $ Save Time Spend time Save $ Flash Server SAN Software-Defined
  27. 27. Enterprise learning from Web-Scale Spend $ Save Time Spend time Save $ Flash Server SAN Software-Defined
  28. 28. Enterprise learning from Web-Scale Spend $ Save Time Spend time Save $ Flash Server SAN Software-Defined Solutions
  29. 29. Software Led Infrastructure •  Dynamic Infrastructure – Scale Out
  30. 30. Software Led Infrastructure •  Dynamic Infrastructure – Scale Out •  Automation – Software led solutions
  31. 31. Software Led Infrastructure •  Dynamic Infrastructure – Scale Out •  Automation – Software led solutions •  Simplification – Rack Level convergence
  32. 32. Software Led Infrastructure •  Dynamic Infrastructure – Scale Out •  Automation – Software led solutions •  Simplification – Rack Level convergence •  New Design/Architecture – Flash – Cloud – Network – Distributed Systems
  33. 33. What did “they” do ? •  Commodity Hardware •  Software Defined Software (infrastructure) •  Avoid RAID •  Hyper-convergence •  Leverage Flash for Cache/Hot data
  34. 34. Enterprise Goals ! •  All Compute/Storage hardware is compatible across the data center •  Rack-level configurations are not dependent on any one piece of code. •  1 administrator for every 20,000 servers. •  Deploy hardware/software very quickly •  Invest in technical innovation •  Avoid silos through transparency •  Use Automation
  35. 35. Micro-services Internet Scale Applications Real-time Analytics and Batch processing Real Time streams of data Containers Internet of things Supply Chains Click streams Security
  36. 36. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs
  37. 37. SSDs SSDs •  Flash Storage and Solid State Drives
  38. 38. SSDs Controller VM/ Software Servers All Flash/ Hybrid arrays SSDs HDDs •  Flash Storage and Solid State Drives •  All Flash or Hybrid Flash Arrays
  39. 39. SSDs HDDs Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays SSDs HDDs •  Flash Storage and Solid State Drives •  All Flash or Hybrid Flash Arrays •  Software Defined Software
  40. 40. SSDs HDDs Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays SSDs HDDs •  Flash Storage and Solid State Drives •  All Flash or Hybrid Flash Arrays •  Software Defined Software •  Hyper converged Infrastructure
  41. 41. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs •  Flash Storage and Solid State Drives •  All Flash or Hybrid Flash Arrays •  Software Defined Software •  Hyper converged Infrastructure •  IT-as-a-Service
  42. 42. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs Virtual Private Computing Platform
  43. 43. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance
  44. 44. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance •  Object Storage •  Erasure Coding •  Information Dispersal Algorithms (IDA) •  No RAID – but RAIN !
  45. 45. What about the Software to use ?
  46. 46. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance
  47. 47. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Storage– Ceph, Gluster, Lustre, HDFS
  48. 48. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Storage– Ceph, Gluster, Lustre, HDFS •  Distributed Parallel Filesystems •  None of the complexities of RAID •  Replication, Zero Downtime and Self-healing •  Commodity hardware with lowest cost
  49. 49. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Systems – Mesos, Zookeeper Distributed Storage– Ceph, Gluster, Lustre, HDFS
  50. 50. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Systems – Mesos, Zookeeper Distributed Storage– Ceph, Gluster, Lustre, HDFS •  Apache Mesos - Operating system for the Datacenter •  Apache Zookeeper – Distributed Co-ordination Service •  Consul, Eureka – Service Discovery
  51. 51. Mesos – Server Abstraction
  52. 52. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Systems – Mesos, Zookeeper Distributed Storage– Ceph, Gluster, Lustre, HDFS
  53. 53. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Systems – Mesos, Zookeeper Distributed Storage– Ceph, Gluster, Lustre, HDFS In-memory Computing Burst Buffer Solutions
  54. 54. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Systems – Mesos, Zookeeper Distributed Storage– Ceph, Gluster, Lustre, HDFS In-memory Computing Burst Buffer Solutions •  Tachyon – Reliable data sharing at memory-speed across clusters
  55. 55. Tachyon – Current issues •  Data sharing is a bottleneck due to slow writes to disk •  Cache loss when process crashes •  In-memory data duplication and Java Garbage Collection
  56. 56. Tachyon – memory centric storage Reliable data sharing at memory-speed within and across cluster frameworks/ jobs
  57. 57. Tachyon – memory centric storage •  Memory-speed data sharing among jobs in different frameworks •  Keep in-memory data safe, even when a job crashes. •  No in-memory data duplication, much less GC
  58. 58. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM VM VM SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage Servers Erasure Coding RAIN Architecture High Performance Distributed Systems – Mesos, Zookeeper Distributed Storage– Ceph, Gluster, Lustre, HDFS In-memory Computing Burst Buffer Solutions •  Tachyon – Reliable data sharing at memory-speed across clusters •  Burst Buffers - Latency reduction, greater bandwidth and high IOPS performance
  59. 59. Burst Buffer Designs •  Introduce fast buffer layer •  Layer between memory and persistent storage •  pre-stage application data •  Buffer writes from memory to fast devices •  Store immediate application data •  Still a “mount point” – POSIX Compliance Reference: https://www.youtube.com/watch?v=l_aRU5x_SEo
  60. 60. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM Distributed Systems – Mesos, Zookeeper Micro-services Internet Scale Applications Distributed Storage– Ceph, Gluster, Lustre, HDFS VM VM Real-time Analytics and Batch processing SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage In-memory Computing Burst Buffer Solutions Servers Erasure Coding RAIN Architecture High Performance Real Time streams of data Containers Universe of things Supply Chains Click streams Security
  61. 61. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM Distributed Systems – Mesos, Zookeeper Micro-services Internet Scale Applications Distributed Storage– Ceph, Gluster, Lustre, HDFS VM VM Real-time Analytics and Batch processing SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage In-memory Computing Burst Buffer Solutions Servers Erasure Coding RAIN Architecture High Performance Real Time streams of data Containers Universe of things Supply Chains Click streams Security
  62. 62. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM Distributed Systems – Mesos, Zookeeper Micro-services Internet Scale Applications Distributed Storage– Ceph, Gluster, Lustre, HDFS VM VM Real-time Analytics and Batch processing SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage In-memory Computing Burst Buffer Solutions Servers Erasure Coding RAIN Architecture High Performance Real Time streams of data Containers Universe of things Supply Chains Click streams Security
  63. 63. SSDs HDDs Hypervisor VM Controller VM/ Software Controller VM/ Software Servers Servers Hyper-converged / Software defined All Flash/ Hybrid arrays VM VM Distributed Systems – Mesos, Zookeeper Micro-services Internet Scale Applications Distributed Storage– Ceph, Gluster, Lustre, HDFS VM VM Real-time Analytics and Batch processing SSDs HDDs SSDs HDDs Scalable Exascale Distributed Object Storage In-memory Computing Burst Buffer Solutions Servers Erasure Coding RAIN Architecture High Performance Real Time streams of data Containers Universe of things Supply Chains Click streams Security
  64. 64. Questions & Queries!
  65. 65. Thank You! MSys Chennai Bristol IT Park, 4th Floor, Plot No. 10, South Phase, Thiru Vi Ka Industrial Estate, Guindy, Chennai 600032 Ph. +91-44-39167015 MSys Bangalore No: 56/3, Ground Floor, Vakil Square, Bannerghatta Road, Bangalore -560 029 Ph. +91-80-41158363 MSys Georgia 4385 Kimball Bridge Rd, Suite 203, Johns Creek, Georgia – 30022 Ph. +1 770-809-3217 E: [email protected] W: www.msys-tech.com /msystech @msys_tech Clogeny Pune Plot no. 34/2, Rajiv Gandhi Infotech Park – Phase 1, Hinjewadi, Pune - 411 057 Ph. +91 20 661 43 482 US Ph.  +1 408 556 9645
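
Slide 44 contrasts RAID with RAIN and erasure coding. As a rough illustration of the idea, the sketch below splits an object into k data shards, adds one XOR parity shard, and rebuilds a shard after a simulated node loss. This is a deliberately simplified stand-in: production object stores typically use stronger codes (for example Reed-Solomon with several parity shards), and the function names `encode`, `recover` and `decode` are hypothetical, not taken from the presentation or from any product.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal shards and append one XOR parity shard.

    Each of the k + 1 shards would live on a different node (RAIN).
    """
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost shard")
    rebuilt = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the lost one.
        rebuilt[missing[0]] = reduce(xor_bytes, survivors)
    return rebuilt


def decode(shards: List[bytes], k: int, length: int) -> bytes:
    """Concatenate the k data shards and strip the padding."""
    return b"".join(shards[:k])[:length]


# Usage: lose one "node" and still read the object back.
obj = b"hyper-scale object"
nodes = encode(obj, k=4)
nodes[2] = None  # simulate a failed node
assert decode(recover(nodes), k=4, length=len(obj)) == obj
```

With k = 4 the storage overhead in this sketch is 5/4 of the object size, compared with 2x or 3x for plain replication, which is the usual cost argument for erasure coding on commodity nodes.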

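Slide 59 describes a burst buffer as a fast layer between memory and persistent storage that pre-stages data and absorbs writes. The toy model below illustrates that flow under loud assumptions: both tiers are plain Python dictionaries, `capacity` counts entries rather than bytes, and the class and method names (`BurstBuffer`, `prestage`, `drain`) are hypothetical. It does not model POSIX semantics or any real burst buffer product.

```python
class BurstBuffer:
    """Toy write-absorbing buffer in front of a slow backing store."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing    # stands in for the slow persistent tier
        self.capacity = capacity  # max entries held in the fast tier
        self.fast = {}            # stands in for flash / fast devices
        self.dirty = set()        # keys written but not yet drained

    def _make_room(self, key) -> None:
        # Simplest possible policy: flush everything when the tier is full.
        if key not in self.fast and len(self.fast) >= self.capacity:
            self.drain()

    def prestage(self, key) -> None:
        """Copy application input from the slow tier before it is needed."""
        if key not in self.fast:
            self._make_room(key)
            self.fast[key] = self.backing[key]

    def write(self, key, value: bytes) -> None:
        """Absorb a write at fast-tier speed; persistence happens later."""
        self._make_room(key)
        self.fast[key] = value
        self.dirty.add(key)

    def read(self, key) -> bytes:
        """Serve from the fast tier when possible, else from the slow tier."""
        if key in self.fast:
            return self.fast[key]
        return self.backing[key]

    def drain(self) -> None:
        """Flush buffered writes to the slow tier and empty the fast tier."""
        for key in self.dirty:
            self.backing[key] = self.fast[key]
        self.dirty.clear()
        self.fast.clear()


# Usage: a checkpoint lands in the buffer first and reaches storage on drain.
storage = {"input": b"in"}
bb = BurstBuffer(storage, capacity=2)
bb.prestage("input")
bb.write("ckpt-1", b"a")
assert "ckpt-1" not in storage
bb.drain()
assert storage["ckpt-1"] == b"a"
```

A real design would drain asynchronously and evict selectively; the flush-everything policy here only keeps the sketch short.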