Disksim with SSD extension
          -- A developer's perspective


                    Jiannan Ouyang
                     PhD CS@PITT
                         2011/04/07
Outline

  Overview

  Disksim implementation

  SSD extension
Disksim

Disksim: An open source disk simulator originally developed at
UMich. and enhanced at CMU.
Disksim features

  Various device models, including: disk, simpledisk,
  memsmodel

  Controller model: simple, smart(with cache)

  Trace synthesis and different trace file formats

  DIXtrac: automatic disk characterization
ssdmodel

 Developed by Microsoft.

 NOT for any specific SSD Device

 For an idealized SSD that is parameterized by the
 properties of NAND flash chips

 Cache is NOT natively supported
Source Dir


     src/         disksim source (disksim_*.c/h)

     ssdmodel/    ssd extension source (ssd_*.c/h)

     diskmodel/   diskmodel layout and mech

     memsmodel/   MEMS device model

     libparam/    parameter processing lib

     ...
Outline

  Overview

  Disksim implementation

  SSD extension
Disksim source: src/


    disksim_main*         main entrance     main()

    disksim_iodriver*     driver            iodriver_send_event_down_path()

    disksim_bus*          bus               bus_deliver_event()

    disksim_controller*   controller        controller_event_arrive()

    disksim_diskctlr*     disk controller   disk_event_arrive()

    ...
Disksim Control Path

Event Based System:
   various types of events: io, interrupt, timer...
   all events are stored in a global queue in time order
   addtointq() and removefromintq() are used to access the
   global queue

Equivalent code:
while ((curr = getnextevent()) != NULL) {   /* take the earliest pending event */
  switch (curr->type) {                     /* dispatch on the event type */
     case IO_REQUEST_ARRIVE:
          iodriver_request(curr); break;
  }
}
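
The time-ordered global queue described above can be sketched as a sorted singly linked list. This is only an illustration: the `event` struct and the `queue_add()` / `queue_pop()` helpers are hypothetical names, not DiskSim's actual addtointq() / removefromintq() implementation.

```c
#include <stddef.h>

/* Hypothetical event record; DiskSim's real event struct has more fields. */
typedef struct event {
    double time;            /* simulated time at which the event fires */
    int type;               /* e.g. an I/O request or an interrupt */
    struct event *next;
} event;

/* Insert ev so the list stays sorted by ascending time.
   Events with equal times keep their insertion order. */
static void queue_add(event **head, event *ev)
{
    while (*head != NULL && (*head)->time <= ev->time)
        head = &(*head)->next;
    ev->next = *head;
    *head = ev;
}

/* Detach and return the earliest event, or NULL if the queue is empty. */
static event *queue_pop(event **head)
{
    event *ev = *head;
    if (ev != NULL) {
        *head = ev->next;
        ev->next = NULL;
    }
    return ev;
}
```

A simulator main loop would call `queue_pop()` repeatedly and switch on `type`, as in the equivalent code above; handlers schedule follow-up work by calling `queue_add()` with a later `time`.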
Example

src/disksim_iosim.c io_internal_event()
 case IO_ACCESS_ARRIVE:
   iodriver_schedule(0, curr);
   break;

src/disksim_iodriver.c iodriver_schedule()
   iodriver_send_event_down_path(curr);

src/disksim_iodriver.c iodriver_send_event_down_path()
   bus_deliver_event(busno.byte[0], slotno.byte[0], curr);
Example (cont.)

src/disksim_bus.c bus_deliver_event()
  case CONTROLLER:
   controller_event_arrive(devno, curr);
   break;

 case DEVICE:
  ASSERT(devno == curr->devno);
  device_event_arrive(curr);
  break;


This control flow simulates one event travelling from the I/O driver over the bus to a controller or device.
Disksim & Device Interface

INLINE void device_event_arrive (ioreq_event *curr)
{
  ASSERT1 ((curr->devno >= 0) && (curr->devno < numdevices),
           "curr->devno", curr->devno);
  return disksim->deviceinfo->devices[curr->devno]->event_arrive(curr);
}



Function pointer! Tracing the call dynamically with gdb shows that:
For disk, it jumps to disk_event_arrive()
For ssd, it jumps to ssd_event_arrive()
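
The dispatch pattern can be reproduced in a few lines. The sketch below is a toy model under stated assumptions: `ioreq`, `device_ops`, the `toy_*` functions and the two-entry device table are all hypothetical and only mirror the shape of the DiskSim code, not its real types.

```c
#include <stddef.h>

/* Hypothetical request record; handled_by records which handler ran. */
typedef struct ioreq {
    int devno;
    int handled_by;
} ioreq;

/* Per-device operations table holding the event handler. */
typedef struct device_ops {
    void (*event_arrive)(ioreq *curr);
} device_ops;

enum { HANDLED_BY_DISK = 1, HANDLED_BY_SSD = 2 };

static void toy_disk_event_arrive(ioreq *curr) { curr->handled_by = HANDLED_BY_DISK; }
static void toy_ssd_event_arrive(ioreq *curr)  { curr->handled_by = HANDLED_BY_SSD; }

static const device_ops disk_ops = { toy_disk_event_arrive };
static const device_ops ssd_ops  = { toy_ssd_event_arrive };

/* Assumed configuration: device 0 is a disk, device 1 is an SSD. */
static const device_ops *devices[] = { &disk_ops, &ssd_ops };
enum { NUM_DEVICES = sizeof devices / sizeof devices[0] };

/* Generic entry point: look up the device and call through its pointer.
   Returns 0 on success, -1 if devno is out of range. */
static int toy_device_event_arrive(ioreq *curr)
{
    if (curr->devno < 0 || curr->devno >= NUM_DEVICES)
        return -1;
    devices[curr->devno]->event_arrive(curr);
    return 0;
}
```

The caller never names a concrete device type, which is why a new device model can plug in by supplying its own handler.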
event_arrive: disk vs. ssd

disk_event_arrive()
 case IO_ACCESS_ARRIVE:
  disk_request_arrive(curr);
 case DEVICE_OVERHEAD_COMPLETE:
  disk_request_arrive(curr);
 case DEVICE_BUFFER_SEEKDONE:
  disk_buffer_seekdone(currdisk, curr);
 case DEVICE_BUFFER_SECTOR_DONE:
  disk_buffer_sector_done(currdisk, curr);
 case DEVICE_GOTO_REMAPPED_SECTOR:
  disk_goto_remapped_sector(currdisk, curr);
 case DEVICE_GOT_REMAPPED_SECTOR:
  disk_got_remapped_sector(currdisk, curr);
 case DEVICE_PREPARE_FOR_DATA_TRANSFER:
  disk_prepare_for_data_transfer(curr);
 case DEVICE_DATA_TRANSFER_COMPLETE:
  disk_reconnection_or_transfer_complete(curr);
 case IO_INTERRUPT_COMPLETE:
  disk_interrupt_complete(curr);

"buffer" events are cache related.
"remapped sector" events seem to be related to data layout (not sure).

ssd_event_arrive()
 case DEVICE_OVERHEAD_COMPLETE:
  ssd_request_arrive(curr);
 case DEVICE_ACCESS_COMPLETE:
  ssd_access_complete(curr);
 case DEVICE_DATA_TRANSFER_COMPLETE:
  ssd_bustransfer_complete(curr);
 case IO_INTERRUPT_COMPLETE:
  ssd_interrupt_complete(curr);
 case SSD_CLEAN_GANG:
  ssd_clean_gang_complete(curr);
 case SSD_CLEAN_ELEMENT:
  ssd_clean_element_complete(curr);

"clean" events are related to garbage collection and wear-leveling.
"Gang" and "Element" specify the allocation and reclaim unit.
Outline

  Overview

  Disksim implementation

  SSD extension
ssdmodel features

 Add an auxiliary level of parallel elements, each with a
 closed queue, to represent flash elements or gangs

 Add logic to serialize request completions from these
 parallel elements

 For each element, maintain data structures to represent
 SSD logical block maps, cleaning state and wear-leveling
 state

 Delay is introduced when a request is processed

 Parameters include background cleaning, gang size, gang
 organization, interleaving, overprovisioning
Flash Package Internal
Flash Chip Performance

1. Latency
   bus<->data reg        100us
   media->reg: read      25us
   reg->media: write     200us
   erase                 1.5ms

2. Two-plane commands
   can be executed on their plane pairs 0&1 or 2&3

3. Support background copy
   on the same plane

4. Bandwidth and Interleave
   src plane -> dest plane: 4-page copying (100us per page)
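
As a rough worked example, the latencies above can be combined into per-page service times. The sketch assumes that every page pays both the bus transfer and the media access, with no interleaving or overlap between pages; the function names are hypothetical.

```c
/* Latencies from the slide, in microseconds. */
enum {
    BUS_TRANSFER_US = 100,    /* bus <-> data register, one page */
    MEDIA_READ_US   = 25,     /* media -> register */
    MEDIA_WRITE_US  = 200,    /* register -> media */
    BLOCK_ERASE_US  = 1500    /* erase */
};

/* Assumption: pages are served strictly one after another. */
static long pages_read_us(long pages)
{
    return pages * (MEDIA_READ_US + BUS_TRANSFER_US);
}

static long pages_write_us(long pages)
{
    return pages * (BUS_TRANSFER_US + MEDIA_WRITE_US);
}
```

Under these assumptions a single page read costs 125us and a single page write costs 300us. Interleaving would lower the multi-page figures.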
SSD Simulation

 Logical Block Map
    allocation pool

 Cleaning
    greedy or wear-leveling aware

 Parallelism and Interconnect Density
    ganging, interleaving, background cleaning

 Persistence
    saving mapping information per block in DRAM
Interconnection - Ganging

  A gang of flash packages can be utilized in synchrony
  to optimize a multi-page request.

  Allow multiple packages to be used in parallel while
  sharing one request queue

  A request queue can be associated with each gang or
  with each element (full interconnection mode)
Logical Block Map

 Use an allocation pool to think about how an SSD allocates
 flash blocks to service write requests

 An allocation pool can be a flash package or a gang

 Static: a portion of each LBA constitutes a fixed mapping to
 a specific allocation pool

 Dynamic: the non-static portion of an LBA is the lookup key
 for a mapping within a pool
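
The static/dynamic split can be illustrated with a small helper. Which part of the LBA is static is a design choice. The sketch assumes, purely for illustration, that the pool is chosen by the LBA modulo the number of pools and the quotient is the lookup key; `lba_split` and `split_lba()` are hypothetical names.

```c
/* Result of splitting an LBA into its static and dynamic portions. */
typedef struct lba_split {
    unsigned pool;        /* static portion: fixed allocation pool */
    unsigned long key;    /* dynamic portion: lookup key inside that pool */
} lba_split;

/* Assumption: consecutive LBAs rotate across pools (modulo mapping).
   num_pools must be non-zero. */
static lba_split split_lba(unsigned long lba, unsigned num_pools)
{
    lba_split s;
    s.pool = (unsigned)(lba % num_pools);
    s.key  = lba / num_pools;
    return s;
}
```

With four pools, LBA 10 lands in pool 2 with key 2. The pool's own map then translates that key to a physical flash page.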
Garbage Collection (Cleaning)

  active block: block available to hold incoming writes in a
  pool

  superseded page: out-of-date page

  cleaning efficiency: (superseded pages / total pages) in a block

  a pure greedy approach: choose blocks to clean based on
  potential cleaning efficiency
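
A minimal sketch of the greedy choice, assuming each block only tracks its superseded and total page counts. `block_state` and both function names are hypothetical, and this is not the code in ssd_clean.c.

```c
#include <stddef.h>

/* Hypothetical per-block bookkeeping. */
typedef struct block_state {
    unsigned superseded_pages;    /* out-of-date pages in the block */
    unsigned total_pages;         /* pages per block */
} block_state;

/* Cleaning efficiency = superseded pages / total pages. */
static double cleaning_efficiency(const block_state *b)
{
    if (b->total_pages == 0)
        return 0.0;
    return (double)b->superseded_pages / b->total_pages;
}

/* Return the index of the block with the highest cleaning efficiency,
   or -1 if there are no blocks. Ties go to the lowest index. */
static int pick_greedy_victim(const block_state *blocks, size_t n)
{
    int best = -1;
    double best_eff = -1.0;
    for (size_t i = 0; i < n; i++) {
        double eff = cleaning_efficiency(&blocks[i]);
        if (eff > best_eff) {
            best_eff = eff;
            best = (int)i;
        }
    }
    return best;
}
```

Cleaning the most superseded block first means the fewest valid pages have to be copied out before the erase.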
Wear-Leveling

   average remaining lifetime (ARL) of the blocks
   age variance (say 20%) of the ARL
   retirement age (say 85%) of the ARL

Wear-aware garbage collection:
1. If a block's remaining lifetime < retirement age, migrate cold
   data into this block from a migration-candidate queue, and
   recycle the head block of the queue. Populate the queue with
   new blocks with cold data.

2. Otherwise, if the block's remaining lifetime < age-variance
   threshold, then restrict recycling of the block with a
   probability that increases linearly as the remaining lifetime
   drops to 0. (80% of average ~ Prob of recycle = 1;
   0% of average ~ 0)
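
The linear rule in the second case can be written as a small function. This is a sketch of one reading of the slide: the probability of recycling is 1 at or above (1 - age variance) of the average and falls linearly to 0 as the remaining lifetime reaches 0. The name `recycle_probability()` and its parameters are hypothetical.

```c
/* Probability that a block picked for cleaning is actually recycled.
   remaining:    remaining lifetime of the block (e.g. erase cycles left)
   average:      average remaining lifetime over all blocks
   age_variance: allowed deviation below the average, e.g. 0.20 */
static double recycle_probability(double remaining, double average,
                                  double age_variance)
{
    double threshold = (1.0 - age_variance) * average;

    if (threshold <= 0.0 || remaining >= threshold)
        return 1.0;                 /* young enough: never restricted */
    if (remaining <= 0.0)
        return 0.0;                 /* worn out: never recycled */
    return remaining / threshold;   /* linear ramp in between */
}
```

With a 20% age variance the ramp starts at 80% of the average, which matches the endpoints quoted on the slide.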
Source: ssdmodel/

ssdmodel is very simple; all C files are listed below:

      ssd.c          main                        ssd_event_arrive()

      ssd_clean.c    garbage collection and      ssd_clean_blocks_greedy()
                     wear leveling

      ssd_gang.c     several flash packages      ssd_activate_gang()
                     organized as a gang

      ssd_timing.c   timing model                ssd_compute_access_time()

      ssd_utils.c    util

      ssd_init.c     init
Example

event sequences for one request:
ssd_request_arrive -> ssd_interrupt_complete (reconnect) -> ssd_bustransfer_complete
-> ssd_access_complete -> ssd_interrupt_complete (completion)

ssd_bustransfer_complete() -> ssd_media_access_request ();
ssdmodel/ssd.c: ssd_media_access_request ()
     case SSD_ALLOC_POOL_PLANE:
     case SSD_ALLOC_POOL_CHIP:
       ssd_media_access_request_element(curr);
     break;
     case SSD_ALLOC_POOL_GANG:
#if SYNC_GANG
       ssd_media_access_request_gang_sync(curr);
#else
       ssd_media_access_request_gang(curr);
#endif
     break;
Example (cont.)

ssd_media_access_request_element()
  -> ssd_activate_element()
       -> ssd_invoke_element_cleaning()
       -> ssd_compute_access_time(currdisk, elem_num, read_reqs, read_total);
       -> add completion events to the global event queue
       -> ssd_compute_access_time(currdisk, elem_num, write_reqs, write_total);
       -> add completion events to the global event queue

"Parallel processing, sequential completion" is achieved by processing a batch of
requests in parallel while generating the ACCESS_COMPLETE events sequentially.
References

Disksim: http://www.pdl.cmu.edu/DiskSim/
Disksim Manual: http://www.pdl.cmu.edu/PDL-FTP/DriveChar/CMU-PDL-08-101.pdf
Disksim implementation doc: src/doc/Outline.txt
SSD Extension: http://research.microsoft.com/en-us/downloads/b41019e2-1d2b-44d8-b512-ba35ab814cd4/
SSD Extension paper: Design Tradeoffs for SSD Performance, N Agrawal, 2008
Cache over SSD project: Group 6 on http://www-users.cselabs.umn.edu/classes/Spring-2009/csci8980-ass/
Thanks

Q&A?
Block striping
// blocks can be concatenated (chained) from each plane
//
// plane 0     plane 1     plane 2     plane 3
// ------------------------------------------------
// blk 0       blk 2048    blk 4096    blk 6144
// blk 1       blk 2049    blk 4097    blk 6145
// ...         ...
// blk 2047    blk 4095    blk 6143    blk 8191

// blocks can be striped across all the planes
//
// plane 0     plane 1     plane 2     plane 3
// ------------------------------------------------
// blk 0       blk 1       blk 2       blk 3
// blk 4       blk 5       blk 6       blk 7
// ...         ...
// blk 8188    blk 8189    blk 8190    blk 8191
//
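
The two layouts in the comment above reduce to a division and a modulo. The sketch below assumes the geometry shown there (4 planes of 2048 blocks each); the function names are hypothetical.

```c
/* Geometry taken from the layout comment: 4 planes, 2048 blocks per plane. */
enum { PLANES = 4, BLOCKS_PER_PLANE = 2048 };

/* Concatenated layout: plane 0 holds blocks 0..2047,
   plane 1 holds blocks 2048..4095, and so on. */
static unsigned plane_concatenated(unsigned blk)
{
    return blk / BLOCKS_PER_PLANE;
}

/* Striped layout: consecutive blocks rotate across the planes. */
static unsigned plane_striped(unsigned blk)
{
    return blk % PLANES;
}
```

Under striping, neighbouring block numbers fall on different planes, so a run of consecutive blocks spreads across all four planes instead of staying on one.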

Contenu connexe

Tendances

12.mass stroage system
12.mass stroage system12.mass stroage system
12.mass stroage systemSenthil Kanth
 
Hpux AdvFS On Disk Structure Scoping
Hpux AdvFS On Disk Structure ScopingHpux AdvFS On Disk Structure Scoping
Hpux AdvFS On Disk Structure ScopingJustin Goldberg
 
Optimize Oracle On VMware (Sep 2011)
Optimize Oracle On VMware (Sep 2011)Optimize Oracle On VMware (Sep 2011)
Optimize Oracle On VMware (Sep 2011)Guy Harrison
 
Optimize oracle on VMware (April 2011)
Optimize oracle on VMware (April 2011)Optimize oracle on VMware (April 2011)
Optimize oracle on VMware (April 2011)Guy Harrison
 
W1.1 i os in database
W1.1   i os in databaseW1.1   i os in database
W1.1 i os in databasegafurov_x
 
Persistent Memory Programming with Java*
Persistent Memory Programming with Java*Persistent Memory Programming with Java*
Persistent Memory Programming with Java*Intel® Software
 
Backing Up the MySQL Database
Backing Up the MySQL DatabaseBacking Up the MySQL Database
Backing Up the MySQL DatabaseSanjay Manwani
 
Adaptec’s maxCache™ 3.0 Read and Write SSD Caching Solution
Adaptec’s maxCache™ 3.0 Read and Write SSD Caching SolutionAdaptec’s maxCache™ 3.0 Read and Write SSD Caching Solution
Adaptec’s maxCache™ 3.0 Read and Write SSD Caching SolutionAdaptec by PMC
 
SysInternals Disk2vhd - docs.pdf
SysInternals Disk2vhd - docs.pdfSysInternals Disk2vhd - docs.pdf
SysInternals Disk2vhd - docs.pdfhtdvul
 
Unitrends Overview 2012
Unitrends Overview 2012Unitrends Overview 2012
Unitrends Overview 2012Tracy Hawkey
 
Volatile Uses for Persistent Memory
Volatile Uses for Persistent MemoryVolatile Uses for Persistent Memory
Volatile Uses for Persistent MemoryIntel® Software
 
Mass storage structurefinal
Mass storage structurefinalMass storage structurefinal
Mass storage structurefinalmarangburu42
 
Ch11 - Silberschatz
Ch11 - SilberschatzCh11 - Silberschatz
Ch11 - SilberschatzMarcus Braga
 
OS Slide Ch12 13
OS Slide Ch12 13OS Slide Ch12 13
OS Slide Ch12 13庭緯 陳
 
Solid state devices
Solid state devicesSolid state devices
Solid state devicesAqib Mir
 
Seagate 7200 vs wd 5400
Seagate 7200 vs wd 5400Seagate 7200 vs wd 5400
Seagate 7200 vs wd 5400jebtang
 
Eonstor GSc family introduction
Eonstor GSc family introductionEonstor GSc family introduction
Eonstor GSc family introductioninfortrendgroup
 

Tendances (20)

12.mass stroage system
12.mass stroage system12.mass stroage system
12.mass stroage system
 
Dba tuning
Dba tuningDba tuning
Dba tuning
 
Hpux AdvFS On Disk Structure Scoping
Hpux AdvFS On Disk Structure ScopingHpux AdvFS On Disk Structure Scoping
Hpux AdvFS On Disk Structure Scoping
 
Optimize Oracle On VMware (Sep 2011)
Optimize Oracle On VMware (Sep 2011)Optimize Oracle On VMware (Sep 2011)
Optimize Oracle On VMware (Sep 2011)
 
Optimize oracle on VMware (April 2011)
Optimize oracle on VMware (April 2011)Optimize oracle on VMware (April 2011)
Optimize oracle on VMware (April 2011)
 
W1.1 i os in database
W1.1   i os in databaseW1.1   i os in database
W1.1 i os in database
 
Persistent Memory Programming with Java*
Persistent Memory Programming with Java*Persistent Memory Programming with Java*
Persistent Memory Programming with Java*
 
Backing Up the MySQL Database
Backing Up the MySQL DatabaseBacking Up the MySQL Database
Backing Up the MySQL Database
 
Adaptec’s maxCache™ 3.0 Read and Write SSD Caching Solution
Adaptec’s maxCache™ 3.0 Read and Write SSD Caching SolutionAdaptec’s maxCache™ 3.0 Read and Write SSD Caching Solution
Adaptec’s maxCache™ 3.0 Read and Write SSD Caching Solution
 
SysInternals Disk2vhd - docs.pdf
SysInternals Disk2vhd - docs.pdfSysInternals Disk2vhd - docs.pdf
SysInternals Disk2vhd - docs.pdf
 
Bare metal restore.
Bare metal restore.Bare metal restore.
Bare metal restore.
 
Unitrends Overview 2012
Unitrends Overview 2012Unitrends Overview 2012
Unitrends Overview 2012
 
Volatile Uses for Persistent Memory
Volatile Uses for Persistent MemoryVolatile Uses for Persistent Memory
Volatile Uses for Persistent Memory
 
Mass storage structurefinal
Mass storage structurefinalMass storage structurefinal
Mass storage structurefinal
 
Ch11 - Silberschatz
Ch11 - SilberschatzCh11 - Silberschatz
Ch11 - Silberschatz
 
OS Slide Ch12 13
OS Slide Ch12 13OS Slide Ch12 13
OS Slide Ch12 13
 
Solid state devices
Solid state devicesSolid state devices
Solid state devices
 
2 db2 instance creation
2 db2 instance creation2 db2 instance creation
2 db2 instance creation
 
Seagate 7200 vs wd 5400
Seagate 7200 vs wd 5400Seagate 7200 vs wd 5400
Seagate 7200 vs wd 5400
 
Eonstor GSc family introduction
Eonstor GSc family introductionEonstor GSc family introduction
Eonstor GSc family introduction
 

En vedette

Supporting Debian machines for friends and family
Supporting Debian machines for friends and familySupporting Debian machines for friends and family
Supporting Debian machines for friends and familyFrancois Marier
 
Swift at Scale: The IBM SoftLayer Story
Swift at Scale: The IBM SoftLayer StorySwift at Scale: The IBM SoftLayer Story
Swift at Scale: The IBM SoftLayer StoryBrian Cline
 
How to build Debian packages
How to build Debian packages How to build Debian packages
How to build Debian packages Priyank Kapadia
 
Dockerize the World - presentation from Hradec Kralove
Dockerize the World - presentation from Hradec KraloveDockerize the World - presentation from Hradec Kralove
Dockerize the World - presentation from Hradec Kralovedamovsky
 
Debian Cloud - building the Debian AMIs
Debian Cloud - building the Debian AMIsDebian Cloud - building the Debian AMIs
Debian Cloud - building the Debian AMIsJames Bromberger
 
Debian 套件打包教學指南 v0.19 - 繁體中文翻譯
Debian 套件打包教學指南 v0.19 - 繁體中文翻譯Debian 套件打包教學指南 v0.19 - 繁體中文翻譯
Debian 套件打包教學指南 v0.19 - 繁體中文翻譯SZ Lin
 
SR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/StableSR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/Stablejuet-y
 
Debian Packaging tutorial
Debian Packaging tutorialDebian Packaging tutorial
Debian Packaging tutorialnussbauml
 
Deep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceDeep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceAmazon Web Services
 
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)Shinya Takamaeda-Y
 
Embedded Linux/ Debian with ARM64 Platform
Embedded Linux/ Debian with ARM64 PlatformEmbedded Linux/ Debian with ARM64 Platform
Embedded Linux/ Debian with ARM64 PlatformSZ Lin
 
Optimizing Oracle databases with SSD - April 2014
Optimizing Oracle databases with SSD - April 2014Optimizing Oracle databases with SSD - April 2014
Optimizing Oracle databases with SSD - April 2014Guy Harrison
 
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLYoshinori Matsunobu
 

En vedette (17)

Supporting Debian machines for friends and family
Supporting Debian machines for friends and familySupporting Debian machines for friends and family
Supporting Debian machines for friends and family
 
Swift at Scale: The IBM SoftLayer Story
Swift at Scale: The IBM SoftLayer StorySwift at Scale: The IBM SoftLayer Story
Swift at Scale: The IBM SoftLayer Story
 
How to build Debian packages
How to build Debian packages How to build Debian packages
How to build Debian packages
 
MySQL and SSD
MySQL and SSDMySQL and SSD
MySQL and SSD
 
Dockerize the World - presentation from Hradec Kralove
Dockerize the World - presentation from Hradec KraloveDockerize the World - presentation from Hradec Kralove
Dockerize the World - presentation from Hradec Kralove
 
Debian Cloud - building the Debian AMIs
Debian Cloud - building the Debian AMIsDebian Cloud - building the Debian AMIs
Debian Cloud - building the Debian AMIs
 
Debian 套件打包教學指南 v0.19 - 繁體中文翻譯
Debian 套件打包教學指南 v0.19 - 繁體中文翻譯Debian 套件打包教學指南 v0.19 - 繁體中文翻譯
Debian 套件打包教學指南 v0.19 - 繁體中文翻譯
 
SR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/StableSR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/Stable
 
Debian Packaging tutorial
Debian Packaging tutorialDebian Packaging tutorial
Debian Packaging tutorial
 
Deep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceDeep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS Performance
 
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
 
Embedded Linux/ Debian with ARM64 Platform
Embedded Linux/ Debian with ARM64 PlatformEmbedded Linux/ Debian with ARM64 Platform
Embedded Linux/ Debian with ARM64 Platform
 
Solid state drives
Solid state drivesSolid state drives
Solid state drives
 
Optimizing Oracle databases with SSD - April 2014
Optimizing Oracle databases with SSD - April 2014Optimizing Oracle databases with SSD - April 2014
Optimizing Oracle databases with SSD - April 2014
 
Linux introduction
Linux introductionLinux introduction
Linux introduction
 
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQL
 
SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)
 

Similaire à Disksim with SSD_extension

Operation System
Operation SystemOperation System
Operation SystemANANTHI1997
 
Physical Memory Management.pdf
Physical Memory Management.pdfPhysical Memory Management.pdf
Physical Memory Management.pdfAdrian Huang
 
Open Source Systems Performance
Open Source Systems PerformanceOpen Source Systems Performance
Open Source Systems PerformanceBrendan Gregg
 
U boot porting guide for SoC
U boot porting guide for SoCU boot porting guide for SoC
U boot porting guide for SoCMacpaul Lin
 
Ajuste (tuning) del rendimiento de SQL Server 2008
Ajuste (tuning) del rendimiento de SQL Server 2008Ajuste (tuning) del rendimiento de SQL Server 2008
Ajuste (tuning) del rendimiento de SQL Server 2008Eduardo Castro
 
SQL Server Performance Analysis
SQL Server Performance AnalysisSQL Server Performance Analysis
SQL Server Performance AnalysisEduardo Castro
 
Debugging & Tuning in Spark
Debugging & Tuning in SparkDebugging & Tuning in Spark
Debugging & Tuning in SparkShiao-An Yuan
 
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.Natalino Busa
 
SparkR - Play Spark Using R (20160909 HadoopCon)
SparkR - Play Spark Using R (20160909 HadoopCon)SparkR - Play Spark Using R (20160909 HadoopCon)
SparkR - Play Spark Using R (20160909 HadoopCon)wqchen
 
Sector Sphere 2009
Sector Sphere 2009Sector Sphere 2009
Sector Sphere 2009lilyco
 
sector-sphere
sector-spheresector-sphere
sector-spherexlight
 
Ch14 OS
Ch14 OSCh14 OS
Ch14 OSC.U
 
What every data programmer needs to know about disks
What every data programmer needs to know about disksWhat every data programmer needs to know about disks
What every data programmer needs to know about disksiammutex
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...AMD Developer Central
 

Similaire à Disksim with SSD_extension (20)

Les 01 Arch
Les 01 ArchLes 01 Arch
Les 01 Arch
 
Operation System
Operation SystemOperation System
Operation System
 
Physical Memory Management.pdf
Physical Memory Management.pdfPhysical Memory Management.pdf
Physical Memory Management.pdf
 
Vmfs
VmfsVmfs
Vmfs
 
Operation System
Operation SystemOperation System
Operation System
 
Open Source Systems Performance
Open Source Systems PerformanceOpen Source Systems Performance
Open Source Systems Performance
 
U boot porting guide for SoC
U boot porting guide for SoCU boot porting guide for SoC
U boot porting guide for SoC
 
Memory
MemoryMemory
Memory
 
Ajuste (tuning) del rendimiento de SQL Server 2008
Ajuste (tuning) del rendimiento de SQL Server 2008Ajuste (tuning) del rendimiento de SQL Server 2008
Ajuste (tuning) del rendimiento de SQL Server 2008
 
SQL Server Performance Analysis
SQL Server Performance AnalysisSQL Server Performance Analysis
SQL Server Performance Analysis
 
Debugging & Tuning in Spark
Debugging & Tuning in SparkDebugging & Tuning in Spark
Debugging & Tuning in Spark
 
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
 
SparkR - Play Spark Using R (20160909 HadoopCon)
SparkR - Play Spark Using R (20160909 HadoopCon)SparkR - Play Spark Using R (20160909 HadoopCon)
SparkR - Play Spark Using R (20160909 HadoopCon)
 
Sector Sphere 2009
Sector Sphere 2009Sector Sphere 2009
Sector Sphere 2009
 
sector-sphere
sector-spheresector-sphere
sector-sphere
 
OSCh14
OSCh14OSCh14
OSCh14
 
Ch14 OS
Ch14 OSCh14 OS
Ch14 OS
 
OS_Ch14
OS_Ch14OS_Ch14
OS_Ch14
 
What every data programmer needs to know about disks
What every data programmer needs to know about disksWhat every data programmer needs to know about disks
What every data programmer needs to know about disks
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
 

Dernier

Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 

Dernier (20)

Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Computer 10: Lesson 10 - Online Crimes and Hazards

Disksim with SSD_extension

  • 1. Disksim with SSD extension -- A developer's perspective Jiannan Ouyang PhD CS@PITT 2011/04/07
  • 2. Outline Overview Disksim implementation SSD extension
  • 3. Disksim Disksim: An open source disk simulator originally developed at UMich. and enhanced at CMU.
  • 4. Disksim features: various device models, including disk, simpledisk, memsmodel; controller models: simple, smart (with cache); trace synthesis and different trace file formats; DIXtrac: automatic disk characterization
  • 5. ssdmodel Developed by Microsoft. NOT for any specific SSD Device For an idealized SSD that is parameterized by the properties of NAND flash chips Cache is NOT natively supported
  • 6. Source Dir src/ disksim source (disksim_*.c/h) ssdmodel/ ssd extension source (ssd_*.c/h) diskmodel/ diskmodel layout and mech memsmodel/ MEMS device model libparam/ parameter processing lib ...
  • 7. Outline Overview Disksim implementation SSD extension
  • 8. Disksim source: src/
      disksim_main*         main entrance     main()
      disksim_iodriver*     driver            iodriver_send_event_down_path()
      disksim_bus*          bus               bus_deliver_event()
      disksim_controller*   controller        controller_event_arrive()
      disksim_diskctlr*     disk controller   disk_event_arrive()
      ...
  • 9. Disksim Control Path
    Event-based system:
      various types of events: io, interrupt, timer...
      all events are stored in a global queue in time order
      addtointq() and removefromintq() are used to access the global queue
    Equivalent code:
      while (curr = getnextevent()) {
        switch (curr->type) {
          case IO_REQUEST_ARRIVE:
            iodriver_request(curr); break;
        }
      }
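The time-ordered global queue described on this slide can be sketched as a sorted linked list. This is a minimal illustration under stated assumptions, not DiskSim's implementation: `event_t`, `queue_add()` and `queue_pop()` are hypothetical names standing in for the real event structure and for addtointq() / removefromintq().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record; the real DiskSim structures carry more fields. */
typedef struct event {
    double time;           /* simulated time at which the event fires */
    int type;              /* e.g. an I/O request or an interrupt */
    struct event *next;
} event_t;

/* Insert ev so the list stays sorted by ascending time.
 * Events with equal times keep their insertion order. */
static void queue_add(event_t **head, event_t *ev)
{
    assert(head != NULL && ev != NULL);
    while (*head != NULL && (*head)->time <= ev->time)
        head = &(*head)->next;
    ev->next = *head;
    *head = ev;
}

/* Remove and return the earliest event, or NULL if the queue is empty. */
static event_t *queue_pop(event_t **head)
{
    event_t *ev;
    assert(head != NULL);
    ev = *head;
    if (ev != NULL) {
        *head = ev->next;
        ev->next = NULL;
    }
    return ev;
}
```

A simulator main loop would call `queue_pop()` repeatedly and switch on `type`, while handlers schedule follow-up events with `queue_add()`.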
  • 10. Example
    src/disksim_iosim.c io_internal_event()
      case IO_ACCESS_ARRIVE:
        iodriver_schedule(0, curr);
        break;
    src/disksim_iodriver.c iodriver_schedule()
      iodriver_send_event_down_path(curr);
    src/disksim_iodriver.c iodriver_send_event_down_path()
      bus_deliver_event(busno.byte[0], slotno.byte[0], curr);
  • 11. Example (cont.)
    src/disksim_bus.c bus_deliver_event()
      case CONTROLLER:
        controller_event_arrive(devno, curr);
        break;
      case DEVICE:
        ASSERT(devno == curr->devno);
        device_event_arrive(curr);
        break;
    This control flow simulates the delivery of one event.
  • 12. Disksim & Device Interface
      INLINE void device_event_arrive (ioreq_event *curr)
      {
        ASSERT1 ((curr->devno >= 0) && (curr->devno < numdevices), "curr->devno", curr->devno);
        return disksim->deviceinfo->devices[curr->devno]->event_arrive(curr);
      }
    Function pointer! By dynamic tracing with gdb, we found that:
      for disk, it jumps to disk_event_arrive()
      for ssd, it jumps to ssd_event_arrive()
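The function-pointer dispatch on this slide can be sketched with a small device table. All names here (`ioreq_t`, `device_t`, `dispatch_event()`, the two handlers and their return codes) are hypothetical and chosen only for illustration; the real DiskSim structures differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request record holding only what the dispatch needs. */
typedef struct ioreq {
    int devno;   /* index of the target device */
    int type;    /* event type, unused in this sketch */
} ioreq_t;

/* Each device model registers its own event handler. */
typedef struct device {
    const char *name;
    int (*event_arrive)(ioreq_t *curr);
} device_t;

/* Stand-ins for disk_event_arrive() and ssd_event_arrive(). */
static int disk_handler(ioreq_t *curr) { (void)curr; return 1; }
static int ssd_handler(ioreq_t *curr)  { (void)curr; return 2; }

static device_t devices[] = {
    { "disk", disk_handler },
    { "ssd",  ssd_handler  },
};
enum { NUM_DEVICES = sizeof devices / sizeof devices[0] };

/* Bounds-check the device number, then jump through the function pointer. */
static int dispatch_event(ioreq_t *curr)
{
    assert(curr != NULL && curr->devno >= 0 && curr->devno < NUM_DEVICES);
    return devices[curr->devno].event_arrive(curr);
}
```

The caller never names a concrete device model, which is why a new model such as ssdmodel can be added without touching the bus code.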
  • 13. event_arrive: disk vs. ssd
    disk_event_arrive():
      case IO_ACCESS_ARRIVE:                    disk_request_arrive(curr);
      case DEVICE_OVERHEAD_COMPLETE:            disk_request_arrive(curr);
      case DEVICE_BUFFER_SEEKDONE:              disk_buffer_seekdone(currdisk, curr);
      case DEVICE_BUFFER_SECTOR_DONE:           disk_buffer_sector_done(currdisk, curr);
      case DEVICE_GOTO_REMAPPED_SECTOR:         disk_goto_remapped_sector(currdisk, curr);
      case DEVICE_GOT_REMAPPED_SECTOR:          disk_got_remapped_sector(currdisk, curr);
      case DEVICE_PREPARE_FOR_DATA_TRANSFER:    disk_prepare_for_data_transfer(curr);
      case DEVICE_DATA_TRANSFER_COMPLETE:       disk_reconnection_or_transfer_complete(curr);
      case IO_INTERRUPT_COMPLETE:               disk_interrupt_complete(curr);
    ssd_event_arrive():
      case DEVICE_OVERHEAD_COMPLETE:            ssd_request_arrive(curr);
      case DEVICE_ACCESS_COMPLETE:              ssd_access_complete(curr);
      case DEVICE_DATA_TRANSFER_COMPLETE:       ssd_bustransfer_complete(curr);
      case IO_INTERRUPT_COMPLETE:               ssd_interrupt_complete(curr);
      case SSD_CLEAN_GANG:                      ssd_clean_gang_complete(curr);
      case SSD_CLEAN_ELEMENT:                   ssd_clean_element_complete(curr);
    Notes:
      "buffer" events are cache related.
      "clean" events are garbage collection and wear-leveling.
      "remapped sector" events seem to be related to data layout.
      "Gang" and "Element" specify the (not sure) allocation and reclaim unit.
  • 14. Outline Overview Disksim implementation SSD extension
  • 15. ssdmodel features
      Add an auxiliary level of parallel elements, each with a closed queue, to represent flash elements or gangs
      Add logic to serialize request completions from these parallel elements
      For each element, maintain data structures to represent SSD logical block maps, cleaning state and wear-leveling state
      Delay is introduced when a request is processed
      Parameters include background cleaning, gang size, gang organization, interleaving, overprovisioning
  • 17. Flash Chip Performance
      1. Latency
           bus <-> data reg: 100us
           media -> reg: read 25us
           reg -> media: write 200us
           erase: 1.5ms
      2. Two-plane commands can be executed on their plane pairs 0&1 or 2&3
      3. Support background copy on the same plane
      4. Bandwidth and Interleave
           src plane -> dest plane: 4 page copying (100us per page)
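As a worked example of these latency figures, the sketch below adds up the per-page costs for a strictly serial access with no interleaving or overlap: a one-page read costs 25us + 100us = 125us and a one-page write costs 100us + 200us = 300us. The function names are hypothetical, and this is not the timing model in ssd_timing.c, which also accounts for interleaving.

```c
#include <assert.h>

/* Latencies from the slide, in microseconds. */
enum {
    BUS_XFER_US    = 100,   /* bus <-> data register, per page */
    PAGE_READ_US   = 25,    /* media -> register */
    PAGE_WRITE_US  = 200,   /* register -> media */
    BLOCK_ERASE_US = 1500   /* block erase, 1.5 ms */
};

/* Serial read: sense each page into the register, then ship it over the bus. */
static long read_pages_us(int npages)
{
    assert(npages >= 0);
    return (long)npages * (PAGE_READ_US + BUS_XFER_US);
}

/* Serial write: ship each page over the bus, then program it. */
static long write_pages_us(int npages)
{
    assert(npages >= 0);
    return (long)npages * (BUS_XFER_US + PAGE_WRITE_US);
}

/* Erase one block, then rewrite npages into it (assumed worst-case path). */
static long erase_and_write_us(int npages)
{
    return BLOCK_ERASE_US + write_pages_us(npages);
}
```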
  • 18. SSD Simulation Logical Block Map allocation pool Cleaning greedy or wear-leveling aware Parallelism and Interconnect Density ganging, interleaving, background cleaning Persistence saving mapping information per block in DRAM
  • 19. Interconnection - Ganging
      A gang of flash packages can be utilized in synchrony to optimize a multi-page request.
      Allows multiple packages to be used in parallel while sharing one request queue
      A request queue can be associated with each gang or with each element (full interconnection mode)
  • 20. Logical Block Map
      Use the allocation pool to think about how an SSD allocates flash blocks to service write requests
      An allocation pool can be a flash package or a gang
      Static: a portion of each LBA constitutes a fixed mapping to a specific allocation pool
      Dynamic: the non-static portion of an LBA is the lookup key for a mapping within a pool
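The static/dynamic split of an LBA can be sketched as follows. The slide does not say which portion of the LBA is static, so this sketch assumes the low-order part selects the pool (a modulo) and the remainder is the lookup key; `lba_split_t` and `split_lba()` are hypothetical names.

```c
#include <assert.h>

/* Result of splitting an LBA into its static and dynamic portions. */
typedef struct {
    unsigned pool;       /* static portion: fixed allocation pool */
    unsigned long key;   /* dynamic portion: lookup key within the pool */
} lba_split_t;

/* Assumption: consecutive LBAs rotate across pools (low-order selection). */
static lba_split_t split_lba(unsigned long lba, unsigned npools)
{
    lba_split_t s;
    assert(npools > 0);
    s.pool = (unsigned)(lba % npools);
    s.key = lba / npools;
    return s;
}
```

With four pools, LBA 10 lands in pool 2 with key 2; the key is what a per-pool mapping table would translate to a physical flash page.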
  • 21. Garbage Collection (Cleaning)
      active block: the block available to hold incoming writes in a pool
      superseded page: an out-of-date page
      cleaning efficiency: (superseded pages / total pages) in a block
      a pure greedy approach: choose blocks to clean based on potential cleaning efficiency
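The greedy policy on this slide can be sketched as a scan for the block with the highest cleaning efficiency. `block_info_t`, `cleaning_efficiency()` and `pick_block_greedy()` are hypothetical names; the ssdmodel routine ssd_clean_blocks_greedy() works on its own data structures and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-block bookkeeping. */
typedef struct {
    int superseded;   /* number of out-of-date pages */
    int total;        /* number of pages in the block */
} block_info_t;

/* Cleaning efficiency = superseded pages / total pages. */
static double cleaning_efficiency(const block_info_t *b)
{
    assert(b != NULL && b->total > 0);
    assert(b->superseded >= 0 && b->superseded <= b->total);
    return (double)b->superseded / b->total;
}

/* Return the index of the most efficient block; the first one wins ties. */
static int pick_block_greedy(const block_info_t *blocks, int n)
{
    int best = 0;
    assert(blocks != NULL && n > 0);
    for (int i = 1; i < n; i++) {
        if (cleaning_efficiency(&blocks[i]) > cleaning_efficiency(&blocks[best]))
            best = i;
    }
    return best;
}
```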
  • 22. Wear-Leveling
      average remaining lifetime (ARL) of a block
      age variance (say 20%) of the ARL
      retirement age (say 85%) of the ARL
    Wear-aware garbage collection:
      1. If ARL < retirement, migrate cold data into this block from a migration-candidate queue, and recycle the head block of the queue. Populate the queue with new blocks with cold data.
      2. Otherwise, if ARL < age variance, restrict recycling of the block with a probability that increases linearly as the remaining lifetime drops to 0 (80% of average ~ prob. of recycle = 1; 0% of average ~ 0).
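The linear rate limit in the second rule can be sketched as a recycle probability. This reads the slide's parenthetical literally: a block at or above 80% of the average remaining lifetime is always recycled (probability 1), and the probability falls linearly to 0 as its remaining lifetime reaches 0. The function name and the 0.8 constant placement are assumptions made for illustration, not the ssdmodel code.

```c
#include <assert.h>

/* Probability of allowing a block to be recycled, given its remaining
 * lifetime and the average remaining lifetime across blocks.
 * Assumed interpretation: 1.0 at >= 80% of the average, 0.0 at 0. */
static double recycle_probability(double remaining, double avg_remaining)
{
    const double threshold = 0.8 * avg_remaining;
    assert(remaining >= 0.0 && avg_remaining > 0.0);
    if (remaining >= threshold)
        return 1.0;
    return remaining / threshold;
}
```

A cleaner would draw a uniform random number in [0, 1) and skip the block when the draw is not below this probability, so heavily worn blocks are recycled less and less often.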
  • 23. Source: ssdmodel/
    ssdmodel is very simple; all C files are listed below:
      ssd.c          main                                       ssd_event_arrive()
      ssd_clean.c    garbage collection and wear leveling       ssd_clean_blocks_greedy()
      ssd_gang.c     several flash packages organised as gang   ssd_activate_gang()
      ssd_timing.c   timing model                               ssd_compute_access_time()
      ssd_utils.c    util
      ssd_init.c     init
  • 24. Example
    Event sequence for one request:
      ssd_request_arrive -> ssd_interrupt_complete (reconnect) -> ssd_bustransfer_complete -> ssd_access_complete -> ssd_interrupt_complete (completion)
    ssd_bustransfer_complete() -> ssd_media_access_request()
    ssdmodel/ssd.c: ssd_media_access_request()
      case SSD_ALLOC_POOL_PLANE:
      case SSD_ALLOC_POOL_CHIP:
        ssd_media_access_request_element(curr);
        break;
      case SSD_ALLOC_POOL_GANG:
      #if SYNC_GANG
        ssd_media_access_request_gang_sync(curr);
      #else
        ssd_media_access_request_gang(curr);
      #endif
        break;
  • 25. Example (cont.)
    ssd_media_access_request_element()
      -> ssd_activate_element()
      -> ssd_invoke_element_cleaning()
      -> ssd_compute_access_time(currdisk, elem_num, read_reqs, read_total);
      -> add completion event into the global event queue
      -> ssd_compute_access_time(currdisk, elem_num, write_reqs, write_total);
      -> add completion event into the global event queue
    Parallel processing with sequential completion is achieved by processing a batch of requests in parallel but generating the ACCESS_COMPLETE events sequentially.
  • 26. References
      Disksim: http://www.pdl.cmu.edu/DiskSim/
      Disksim Manual: http://www.pdl.cmu.edu/PDL-FTP/DriveChar/CMU-PDL-08-101.pdf
      Disksim implementation doc: src/doc/Outline.txt
      SSD Extension: http://research.microsoft.com/en-us/downloads/b41019e2-1d2b-44d8-b512-ba35ab814cd4/
      SSD Extension paper: Design Tradeoffs for SSD Performance, N Agrawal, 2008
      Cache over SSD project: Group 6 on http://www-users.cselabs.umn.edu/classes/Spring-2009/csci8980-ass/
  • 28. Block striping
      // blocks can be concatenated (chained) from each plane
      //
      //  plane 0     plane 1     plane 2     plane 3
      //  ------------------------------------------
      //  blk 0       blk 2048    blk 4096    blk 6144
      //  blk 1       blk 2049    blk 4097    blk 6145
      //  ...         ...
      //  blk 2047    blk 4095    blk 6143    blk 8191

      // blocks can be striped across all the planes
      //
      //  plane 0     plane 1     plane 2     plane 3
      //  ------------------------------------------
      //  blk 0       blk 1       blk 2       blk 3
      //  blk 4       blk 5       blk 6       blk 7
      //  ...         ...
      //  blk 8188    blk 8189    blk 8190    blk 8191
      //
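The two layouts in the tables above reduce to a division and a modulo. The sketch below assumes the geometry shown on the slide (4 planes, 2048 blocks per plane); the constant and function names are hypothetical and not taken from the ssdmodel source.

```c
#include <assert.h>

/* Geometry taken from the tables on the slide. */
enum { PLANES = 4, BLOCKS_PER_PLANE = 2048 };

/* Concatenated layout: each plane holds one contiguous run of blocks. */
static int plane_concatenated(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk / BLOCKS_PER_PLANE;
}

/* Striped layout: consecutive blocks rotate across the planes. */
static int plane_striped(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk % PLANES;
}
```

Under striping, neighbouring block numbers fall on different planes, which lets sequential accesses use several planes at once.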