Mark Kilgard, Principal System Software Engineer
OpenGL for 2015
Page 2
Thirteen new standard OpenGL extensions for 2015
•New ARB extensions
- New shader, texture, and graphics pipeline functionality
- Proven standard technology
- Mostly existed previously as vendor extensions
- Now officially standardized by Khronos
- Ensures OpenGL is a proper super-set of ES 3.2
•Not a new core standard update but
- Eighth consecutive year of Khronos
updates to OpenGL at SIGGRAPH
- Also did Vulkan this year 
- Core version remains OpenGL 4.5
Page 3
Khronos 2015 Announcement for OpenGL
• August 10, 2015
- At SIGGRAPH
• “A set of OpenGL
extensions will …
expose the very
latest capabilities of
desktop hardware.”
Page 4
Same Day: NVIDIA has driver with full support
• August 10, 2015
- Tradition that NVIDIA releases “zero
day” driver with full functionality at
Khronos announcement
- Done for past several OpenGL
releases
• Ready today for developers to begin
coding against latest standard
extensions
- Technically a “beta” driver but fully
functional
- Intended for developers
- Official support for end-user drivers
coming soon
Page 5
Broad Categories of New OpenGL Functionality
•NEW graphics pipeline operation
•NEW texture mapping functionality
•NEW shader functionality
Page 6
NEW Graphics Pipeline Operation
• Fragment shader interlock
- ARB_fragment_shader_interlock
• Programmable sample positions for rasterization
- ARB_sample_locations
• Post-depth coverage version of sample mask
- ARB_post_depth_coverage
• Vertex shader viewport & layer output
- ARB_shader_viewport_layer_array
• Tessellation bounding box
- ARB_ES3_2_compatibility
Details…
Page 7
Fragment Shader Interlock
•NEW extension: ARB_fragment_shader_interlock
- Provides reliable means to read/write fragment’s pixel state
within a fragment shader
- GPU managed, no explicit barriers needed
•Uses
- Custom blend modes
- Deferred shading algorithms
- E.g. screen space decals
•Adds GLSL functions to begin/end interlock
- void beginInvocationInterlockARB(void);
- void endInvocationInterlockARB(void);
•Why is a fragment shader interlock needed? ...
Image credit: David Bookout (Intel), Programmable Blend with Pixel Shader Ordering
Figure: shared exponent (rgb9e5) format blending via fragment shader interlock
Page 8
Pixel Update Preserves Primitive Rasterization Order
• Same pixel, covered by 3 overlapping primitives
• OpenGL requires stencil/depth/blend operations to be observed to match rendering order, so the pixel is updated in primitive rasterization order:
- rasterized primitive #1, then rasterized primitive #2, then rasterized primitive #3
Page 9
Yet Fragment Shading is Massively Parallel
GPU Fragment Shading: parallel execution of fragment shader threads
• Fragments from scores of other primitives, plus 1000’s of other fragments, form a batch executing in parallel
• Conventional approach
- Batch as many fragments in parallel as possible, for maximum efficiency
Page 10
Post-Shader Pixel Updates Respect Rasterization Order
Fragment Shading: parallel execution of fragment shader threads (+ 1000’s of other fragments)
• Shader results feed fixed-function Pixel Update (stencil test, depth test, & blend)
- Updates to the same pixel are applied in rasterization order: 1st blend, 2nd blend, 3rd blend
Page 11
However, Shader Access to Framebuffer Unsafe!
GPU Fragment Shading: parallel execution of fragment shader threads (+ 1000’s of other fragments)
• Pixel updates made with imageLoad and imageStore by fragment shader instances executing in parallel cannot guarantee primitive rasterization order!
• Exact behavior varies by GPU and is timing dependent for any particular GPU, so it is both undefined & unreliable
Page 12
Interlock Guarantees Pixel Ordering of Shading
GPU Fragment Shading: parallel execution of fragment shader threads
• Interlock approach
- Batch, but disallow fragments for the same pixel from executing the fragment shader interlock in parallel
- Fragments from scores of other primitives that cover the same pixel are spread across batch #1, batch #2, and batch #3
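Why the ordering guarantee matters for a read-modify-write of pixel state can be seen with a small CPU-side model. This is plain C rather than GL code, and the `blend_over` helper, the single-channel pixel, and the example primitives are illustrative assumptions, not part of the extension. The sketch applies the same blends in whatever order it is given; different orders generally produce different pixel values.

```c
#include <stddef.h>

/* Hypothetical single-channel "over" blend:
   result = src * alpha + dst * (1 - alpha). */
static float blend_over(float dst, float src, float alpha)
{
    return src * alpha + dst * (1.0f - alpha);
}

/* Apply n blends to one pixel in the order given by order[].
   Each step reads the current pixel value and writes a new one,
   like an imageLoad followed by an imageStore. */
float shade_pixel(const float *src, const float *alpha,
                  const size_t *order, size_t n)
{
    float pixel = 0.0f; /* cleared framebuffer value */
    for (size_t i = 0; i < n; ++i)
        pixel = blend_over(pixel, src[order[i]], alpha[order[i]]);
    return pixel;
}
```

With three half-transparent primitives of intensity 1.0, 0.0, and 0.5, rasterization order yields 0.375 while the reverse order yields 0.5625, so a shader that updates the pixel itself needs the interlock to get a repeatable result.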
Page 13
Fragment Shader Interlock Example
• We want to draw a grid of Stanford bunnies…
…stamped with a few brick normal maps … and then bump-map shaded
Image credit: Jiho Choi (NVIDIA), GameWorks NormalBlendedDecal example
Page 14
Motivation: Bullet holes and dynamic scuffs
• Desire: Dynamically add apparently geometric details as “after effects”
Figure panels: without screen-space decals and with screen-space decals, each showing the normal map and the shaded color result
Image credit: Pope Kim, Screen Space Decals in Warhammer 40,000: Space Marine
Page 15
Screen Space Decal Approach
• Draw scene to G-buffer
- Renders world-space normals to “normal image” framebuffer
• Draw screen-space box for each screen space decal
- If pixel’s world-space position in G-buffer isn’t in box, discard fragment
- Avoids drawing decal on incorrect surface (one too close or too far)
- Fetch decal’s tangent-space normal from decal’s normal map
- Within fragment shader interlock
- Fetch pixel’s world-space normal from “normal image” framebuffer
- Rotate decal normal to world space
- Using tangent basis constructed from world-space normal
- Then blend (and renormalize) decal normal with pixel’s normal
- Replace pixel’s world-space normal in “normal image” with blended normal
• Do deferred shading on G-buffer, using “normal image” perturbed by decals
Page 16
Screen Space Decal Approach Visualized
Figure panels:
- Visualization of decal boxes overlaid on scene
- “Normal image” before blended normal decals
- Brick pattern normal map decals applied to decal boxes
- “Normal image” after blended normal decals (brick normals blended with fragment shader interlock)
- Final shaded color result (bunny shading includes brick pattern)
Page 17
GLSL Fragment Interlock Usage
• Fragment interlock portion of screen space decal GLSL fragment shader
beginInvocationInterlockARB(); {
    // Read "normal image" framebuffer's world space normal
    vec3 destNormalWS = normalize(imageLoad(uNormalImage, ivec2(gl_FragCoord.xy)).xyz);
    // Read decal's tangent space normal, remapping texel values from [0,1] to [-1,1]
    vec3 decalNormalTS = normalize(textureLod(uDecalNormalTex, uv, 0.0).xyz * 2.0 - 1.0);
    // Rotate decal's normal from tangent space to world space
    vec3 tangentWS = vec3(1.0, 0.0, 0.0);
    vec3 newNormalWS = normalize(mat3x3(tangentWS,
                                        cross(destNormalWS, tangentWS),
                                        destNormalWS) * decalNormalTS);
    // Blend world space normal vectors
    vec3 destNewNormalWS = normalize(mix(newNormalWS, destNormalWS, uBlendWeight));
    // Write new blended normal into "normal image" framebuffer
    imageStore(uNormalImage, ivec2(gl_FragCoord.xy), vec4(destNewNormalWS, 0.0));
} endInvocationInterlockARB();
Page 18
Blend Equation Advanced vs. Shader Interlock
Shader Interlock (2015)
• Advantages
- Arbitrary shading operations allowed
- Very powerful & general
- No explicit barrier needed
• Disadvantages
- Requires putting color blending in every
fragment shader
- Lengthens shader
- Not orthogonal to multisampling
- Fragment shader responsible for
reading/writing every color sample
- Unavailable for legacy fixed-function
- Needs latest GPU generation
Blend Equation Advanced (2014)
• Advantages
- Support for established blend modes
- Same as Photoshop, PDF, Flash, SVG
- Optimized for their numeric precision
requirements
- Orthogonal to fragment shading
- Just like conventional blending
- Just works with multisampling & sRGB
- Works with fixed-function rendering in
compatibility context
- Same “KHR” extension for OpenGL ES
- Available on older hardware
- But needs glBlendBarrierKHR
• Disadvantages
- Blend modes limited to a pre-defined set
- Limited to 1 color attachment
Similar, but different functionality
Each extension makes sense
in its intended context
Page 19
Programmable Sample Positions
• Conventional OpenGL
- Multisample rasterization has fixed sample positions
• NEW ARB_sample_locations extension
- glFramebufferSampleLocationsfvARB specifies sample positions on sub-pixel grid
Figure: default 8x multisample pattern vs. application-specified 8x multisample pattern, oriented for horizontal sampling
- Same triangle, but it covers the two sample patterns differently
Page 20
Application: Temporal Antialiasing
• Reprogram sample positions differently every frame and render continuously
• Done well, can double effective antialiasing quality “for free”
- Needs vertical refresh synchronization
- And app must render at rate matching refresh rate (e.g. 60 Hz)
Figure (animated in slideshow mode): default 2x multisample pattern alternating with an alternative 2x multisample pattern, giving temporal virtual 4x antialiasing
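The "virtual 4x" effect can be illustrated with a one-dimensional CPU-side model. This is a sketch in plain C, not GL code: the two sample patterns and the half-plane primitive covering `x < edge` are made-up assumptions chosen for illustration. Averaging two frames that alternate between two different 2x patterns resolves coverage in quarter steps, where a single fixed 2x pattern only resolves half steps.

```c
#include <stddef.h>

/* Hypothetical 1D sample offsets within a pixel (0..1), two per frame. */
static const float pattern_a[2] = {0.125f, 0.625f};
static const float pattern_b[2] = {0.375f, 0.875f};

/* Fraction of samples covered by a primitive occupying x < edge. */
static float coverage(const float *samples, size_t n, float edge)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; ++i)
        if (samples[i] < edge)
            ++hits;
    return (float)hits / (float)n;
}

/* Coverage seen when one fixed 2x pattern is used every frame. */
float static_2x(float edge)
{
    return coverage(pattern_a, 2, edge);
}

/* Average of two consecutive frames that alternate the 2x patterns. */
float temporal_2x(float edge)
{
    return 0.5f * (coverage(pattern_a, 2, edge) +
                   coverage(pattern_b, 2, edge));
}
```

For an edge at 0.3 the fixed pattern reports 0.5 coverage while the alternating patterns average to 0.25, which is closer to the true value. The averaging here stands in for the eye integrating consecutive frames, which is why the slide requires rendering in step with the display refresh.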
Page 21
Post Depth Coverage
• Normally in OpenGL stencil and depth tests are specified to be after fragment
shader execution
- Allows shader to discard fragments prior to these tests
- So avoids the depth and stencil buffer update side-effects of these tests
• OpenGL 4.2 added the ability for a fragment shader to force the stencil and depth tests
to run before the shader executes
- Part of ARB_shader_image_load_store extension
- Indicated in GLSL fragment shader by layout(early_fragment_tests) in;
• NEW extension ARB_post_depth_coverage
- Controls whether the fragment shader sample mask gl_SampleMaskIn[] reflects the
coverage before or after application of the early depth and stencil tests
- Allows shader to know what samples survived stencil & depth tests
- What you really want if you are using early fragment tests + sample mask
- Indicated in GLSL fragment shader by layout(post_depth_coverage) in;
Page 22
Early Fragment Tests & Post Depth Coverage
• Default behavior
- Pipeline: rasterizer → fragment shader → stencil test → depth test → color blending
- Late stencil-depth tests
- Rasterizer determines sample mask gl_SampleMaskIn
• layout(early_fragment_tests) in;
- Pipeline: rasterizer → stencil test → depth test → fragment shader → color blending
- Early stencil-depth tests
- Rasterizer determines sample mask gl_SampleMaskIn
• layout(early_fragment_tests) in; layout(post_depth_coverage) in;
- Pipeline: rasterizer → stencil test → depth test → fragment shader → color blending
- Early stencil-depth tests
- Post-depth coverage determines sample mask gl_SampleMaskIn
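The difference between the last two configurations comes down to one bitwise AND. The following CPU-side sketch in plain C models the mask the shader would read; the function names and the idea of passing the per-sample test results as a bitmask are illustrative assumptions, not API from the extension.

```c
#include <stdint.h>

/* Count set bits in a sample mask. */
static int count_samples(uint32_t mask)
{
    int n = 0;
    for (; mask != 0; mask &= mask - 1)
        ++n;
    return n;
}

/* Model of the value the shader reads from gl_SampleMaskIn[0].
   raster_mask:         samples covered by the primitive.
   depth_stencil_pass:  samples that pass the early stencil and depth tests.
   post_depth_coverage: nonzero when layout(post_depth_coverage) is declared. */
uint32_t sample_mask_in(uint32_t raster_mask, uint32_t depth_stencil_pass,
                        int post_depth_coverage)
{
    return post_depth_coverage ? (raster_mask & depth_stencil_pass)
                               : raster_mask;
}

/* Fraction of a pixel's samples that the shader's mask reports as covered. */
float covered_fraction(uint32_t mask, int samples_per_pixel)
{
    return (float)count_samples(mask) / (float)samples_per_pixel;
}
```

With 4x multisampling, a primitive covering all four samples (0x0F) of which two survive the tests (0x05) reports 0x0F without post-depth coverage and 0x05 with it, so only the second case lets the shader weight its output by what is actually visible.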
Page 23
Vertex Shader Viewport & Layer Output
• NEW extension ARB_shader_viewport_layer_array
• Previously geometry shader needed to write viewport index and layer
- Forced layered rendering to use geometry shaders
- Even if a geometry shader wasn’t otherwise needed
• New vertex shader (or tessellation evaluation shader) outputs
- out int gl_ViewportIndex
- out int gl_Layer
Page 24
ES 3.2 Compatibility (tessellation, queries)
• NEW extension ARB_ES3_2_compatibility
• Command to specify bounding box for evaluated tessellated vertices in clip space
- glPrimitiveBoundingBoxARB(float minX, float minY, float minZ, float minW,
float maxX, float maxY, float maxZ, float maxW)
- Initial bounding box accepts the entirety of Normalized Device Coordinate (NDC) space (effectively not limiting tessellation)
- Implementations may be able to optimize performance, assuming accurate bounds
- ES 3.2 added this to make tessellation more friendly to mobile use cases
- Hint: Today’s desktop GPUs are likely to simply ignore this, but the API matches ES 3.2
• Bonus:
- OpenGL ES 3.2 adds two implementation-dependent constants related to multisample line rasterization
- GL_MULTISAMPLE_LINE_WIDTH_RANGE_ARB
- GL_MULTISAMPLE_LINE_WIDTH_GRANULARITY_ARB
- Same token values as ES 3.2
- These queries supported for completeness (yawn)
Page 25
NEW Texture Mapping Functionality
• Texture Reduction Modes: Min/Max
- ARB_texture_filter_minmax
• Sparse Textures, done right
- ARB_sparse_texture2
• Sparse Texture Clamping
- ARB_sparse_texture_clamp
Details…
Page 26
New Texture Reduction Modes: Min/Max
•Standard texturing behavior
- Texture fetch result = weighted average of sampled texel values
- What you want for color images, etc.
•NEW extension: ARB_texture_filter_minmax
- Texture fetch result = minimum or maximum of all sampled texel values
•Adds NEW “reduction mode” for texture parameter
- Choices: GL_WEIGHTED_AVERAGE_ARB (initial state), GL_MIN, or GL_MAX
- Use with glTexParameteri, glSamplerParameteri, etc.
•Example applications
- Estimating variance or range when sampling data in textures
- Conservative texture sampling
- E.g. Maximum Intensity Projection for medical imaging
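The three reduction modes can be compared on a single 2x2 bilinear footprint. This is a CPU-side sketch in plain C under simplifying assumptions (one channel, one mipmap level, and hypothetical names such as `filter_2x2` and `reduction_mode`); it is not how any driver implements the filter.

```c
/* Texture reduction modes modelled on a 2x2 bilinear footprint. */
typedef enum {
    REDUCE_WEIGHTED_AVERAGE,
    REDUCE_MIN,
    REDUCE_MAX
} reduction_mode;

/* t[0..3] are the texels at (0,0), (1,0), (0,1), (1,1);
   fx and fy are the fractional sample position inside the footprint. */
float filter_2x2(const float t[4], float fx, float fy, reduction_mode mode)
{
    if (mode == REDUCE_WEIGHTED_AVERAGE) {
        /* Conventional bilinear filtering. */
        float top = t[0] * (1.0f - fx) + t[1] * fx;
        float bottom = t[2] * (1.0f - fx) + t[3] * fx;
        return top * (1.0f - fy) + bottom * fy;
    }
    /* Min/max ignore the weights and reduce over the sampled texels. */
    float result = t[0];
    for (int i = 1; i < 4; ++i) {
        if (mode == REDUCE_MIN ? t[i] < result : t[i] > result)
            result = t[i];
    }
    return result;
}
```

For texels 1, 2, 3, 8 sampled at the footprint center, the weighted average is 3.5 while the minimum and maximum are 1 and 8, which is the kind of conservative bound the example applications rely on.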
Page 27
Application: Maximum Intensity Projection
• Radiologists interpret 3D visualizations
of CT scans
• Volume rendering simulates opacity
attenuated ray casting
- Good for visualizing 3D structure
• Maximum Intensity Projection (MIP)
rendering shows maximum intensity along
any ray
- Good for highlighting features without
regard to occlusion
- Avoids missing significant features
Figure: volume rendering (texture reduction mode GL_WEIGHTED_AVERAGE_ARB) vs. Maximum Intensity Projection (texture reduction mode GL_MAX)
Image credit: Fishman et al. Volume Rendering versus Maximum Intensity
Projection in CT Angiography: What Works Best, When, and Why
Page 28
Maximum Intensity Projection vs.
Volume Rendering Visualized
Axial view of human middle torso
- Volume Rendering: provides more 3D feel by accounting for occlusion
- Maximum Intensity Projection: good at mapping arterial structure, despite occlusion
Image credit: Fishman et al. Volume Rendering versus Maximum Intensity
Projection in CT Angiography: What Works Best, When, and Why
Page 29
Sparse Textures Visualized
• Textures can be HUGE
- Think of satellite data
- Or all the terrain in a huge game level
- Or medical or seismic imaging
• We never expect to be looking at
everything at once!
- When textures are huge, can we just
make resident what we need?
- YES, that’s sparse texture
• ARB_sparse_texture standardized in 2013
- Reflected limitations of original sparse
texture hardware implementations
- Now we can do better…
Mipmap chain of a sparse texture
Only limited number of pages are resident
Image credit: AMD
Page 30
Sparse Textures, done right
• NEW extension ARB_sparse_texture2
- Builds on prior ARB_sparse_texture (2013) extension
- Original concept: intended for enormous textures, allows less than the
complete set of “pages” of the texture image set to be resident
- Primary limitation:
- Fetching non-resident data returned undefined results without indication
- So no way to know if non-resident data was fetched
- This reflected hardware limitations of the time, fixed in newer hardware
• Sparse Texture version 2 is friendly to dynamically detecting non-resident access
- Fetch of non-resident data now reliably returns zero values
- sparseTextureARB GLSL texture fetch functions return a residency information integer
- And 11 other variations of sparseTexture*ARB GLSL functions as well
- sparseTexelsResidentARB GLSL function maps the returned integer to a Boolean residency
- Now supports sparse multisample and multisample texture arrays
Page 31
Sparse Texture, done even better
• NEW extension ARB_sparse_texture_clamp
• Adds new GLSL texture fetch variant functions
- Each includes an additional level-of-detail (LOD) clamp parameter to provide a per-fetch floor
on the hardware-computed LOD
- I.e. the minimum LOD, passed as the lodClamp parameter
- Sparse texture variants
- sparseTextureClampARB, sparseTextureOffsetClampARB,
sparseTextureGradClampARB, sparseTextureGradOffsetClampARB
- Non-sparse texture versions too
- textureClampARB, textureOffsetClampARB, textureGradClampARB,
textureGradOffsetClampARB
• Benefit for sparse texture fetches
- Shaders can avoid accessing unpopulated portions of high-resolution levels of detail
when knowing texture detail is unpopulated
- Either from a priori knowledge
- Or feedback from previously executed "sparse" texture lookup functions
Page 32
Sparse Texture Clamp Example
• Naively fetch sparse texture until you get a valid texel
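vec4 texel;
int code = sparseTextureARB(sparseTexture, uv, texel);
float minLodClamp = 1.0;
// Retry with a coarser LOD floor until the fetch hits resident texels
// (naive: never terminates if no level is resident)
while (!sparseTexelsResidentARB(code)) {
    code = sparseTextureClampARB(sparseTexture, uv, minLodClamp, texel);
    minLodClamp += 1.0;
}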
Figure: depending on which mipmap levels are resident, the loop takes 1 fetch; 2 fetches, 1 missed; or 3 fetches, 2 missed
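The fetch counts above can be reproduced with a CPU-side model of the retry loop. This is a sketch in plain C with several simplifying assumptions: LOD is treated as an integer mipmap level, residency is one flag per level rather than per page, and `fetch_until_resident` is a hypothetical name. Unlike the naive loop, it stops once the clamp runs past the last level.

```c
#include <stdbool.h>

/* Returns the number of fetches issued. *level_out receives the mipmap
   level that finally supplied resident texels, or -1 if none did.
   Precondition: 0 <= hw_lod < num_levels. */
int fetch_until_resident(const bool *resident, int num_levels,
                         int hw_lod, int *level_out)
{
    int level = hw_lod; /* first fetch uses the hardware-computed LOD */
    int fetches = 1;
    for (int clamp = 1; !resident[level]; ++clamp) {
        if (clamp >= num_levels) { /* nothing coarser left to try */
            *level_out = -1;
            return fetches;
        }
        /* The clamp is a floor on the hardware-computed LOD. */
        level = clamp > hw_lod ? clamp : hw_lod;
        ++fetches;
    }
    *level_out = level;
    return fetches;
}
```

When only levels 2 and above are resident and the hardware selects level 0, the model issues 3 fetches and misses twice. A shader that already knows levels 0 and 1 are unpopulated could start with a clamp of 2 and get its texel in a single fetch.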
Page 33
NEW Shader Functionality
• OpenGL ES 3.2 Shading Language Compatibility
- ARB_ES3_2_compatibility
• Parallel Compile & Link of GLSL
- ARB_parallel_shader_compile
• 64-bit Integers Data Types
- ARB_gpu_shader_int64
• Shader Atomic Counter Operations
- ARB_shader_atomic_counter_ops
• Query Clock Counter
- ARB_shader_clock
• Shader Ballot and Broadcast
- ARB_shader_ballot
Details…
Page 34
ES 3.2 Compatibility (shader support)
• NEW extension ARB_ES3_2_compatibility
• Just say #version 320 es in your GLSL shader
- Develop and use OpenGL ES 3.2’s GLSL dialect from regular OpenGL
- Helps desktop developers target mobile and embedded devices
• ES 3.2 GLSL adds functionality already in OpenGL
- KHR_blend_equation_advanced, OES_sample_variables,
OES_shader_image_atomic, OES_shader_multisample_interpolation,
OES_texture_storage_multisample_2d_array, OES_geometry_shader,
OES_gpu_shader5, OES_primitive_bounding_box,
OES_shader_io_blocks, OES_tessellation_shader,
OES_texture_buffer, OES_texture_cube_map_array,
KHR_robustness
- Notably Shader Model 5.0, geometry & tessellation shaders
Page 35
Parallel Compile & Link of GLSL
• NEW extension ARB_parallel_shader_compile
- Lets OpenGL implementations distribute GLSL shader compilation and program
linking across multiple CPU threads to improve compilation throughput
- Allows apps to better manage GLSL compilation overheads
- Benefit: Faster load time for new shaders and programs on multi-core CPU systems
- Good practice: Construct multiple GLSL shaders/programs, then defer querying their state or using
them for as long as possible, or until completion status is true
• Part 1: Tells OpenGL’s GLSL compiler how many CPU threads to use for parallel compilation
- void glMaxShaderCompilerThreadsARB(GLuint threadCount)
- Initially allows implementation-dependent maximum (initial value 0xFFFFFFFF)
- Zero means do not use parallel GLSL compilation
• Part 2: Shader and program query if compile or link is complete
- Call glGetShaderiv or glGetProgramiv on GL_COMPLETION_STATUS_ARB parameter
- Returns true when compile is complete, false if still compiling
- Unlike other queries, will not block for compilation to complete.
Page 36
64-bit Integer Data Types in GLSL
• GLSL has had 32-bit integer and 64-bit floating-point for a while…
• Now adds 64-bit integers
- NEW extension ARB_gpu_shader_int64
• New data types
- Signed: int64_t, i64vec2, i64vec3, i64vec4
- Unsigned: uint64_t, u64vec2, u64vec3, u64vec4
- Supported for uniforms, buffers, transform feedback, and shader input/outputs
• Standard library extended to 64-bit integers
• Programming interface
- Uniform setting
- glUniform{1,2,3,4}i64{,v}ARB
- glUniform{1,2,3,4}ui64{,v}ARB
- Direct state access (DSA) variants as well
- glProgramUniform{1,2,3,4}i64{,v}ARB
- glProgramUniform{1,2,3,4}ui64{,v}ARB
- Queries for 64-bit uniform integer data
Page 37
Shader Atomic Counter Operations in GLSL
• NEW ARB_shader_atomic_counter_ops extension
- Builds on ARB_shader_atomic_counters extension (2011, OpenGL 4.2)
- Original atomic counters quite limited
- Could only increment, decrement, and query
• New operations supported on counters
- Addition and subtraction: atomicCounterAddARB, atomicCounterSubtractARB
- Minimum and maximum: atomicCounterMinARB, atomicCounterMaxARB
- Bitwise operators (AND, OR, XOR, etc.)
- atomicCounterAndARB, atomicCounterOrARB, atomicCounterXorARB
- Exchange: atomicCounterExchangeARB
- Compare and Exchange: atomicCounterCompSwapARB
Page 38
Query Clock Counter in GLSL
• NEW extension ARB_shader_clock
• New functions query a free-running “clock”
- 64-bit monotonically incrementing shader counter
- uint64_t clockARB(void)
- uvec2 clock2x32ARB(void)
- Avoids requiring 64-bit integers, instead returns two 32-bit unsigned integers
• Similar to Win32’s QueryPerformanceCounter
- But within the GPU shader complex
• Can allow shaders to monitor their performance
- Details implementation-dependent
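When only the two-component form is available, the 64-bit counter has to be rebuilt from its halves before two samples can be compared. The sketch below is plain C on the CPU side, and it assumes the first component carries the low 32 bits and the second the high 32 bits; the helper names are hypothetical.

```c
#include <stdint.h>

/* Rebuild the 64-bit counter from a clock2x32ARB-style pair.
   Assumption: lo is the first component (low 32 bits), hi the second. */
uint64_t combine_clock(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Elapsed ticks between two samples. Unsigned subtraction stays correct
   even if the counter wraps once between the samples. */
uint64_t elapsed_ticks(uint64_t start, uint64_t end)
{
    return end - start;
}
```

Subtracting only the low halves would be wrong whenever a carry into the high half happens between the two samples, which is why the halves are combined first. What a tick means in wall-clock time is implementation-dependent, so the result is best used for relative comparisons.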
Page 39
Shader Ballot and Broadcast
• NEW extension ARB_shader_ballot
- Assumes 64-bit integers
• Concept
- A group of invocations (shader threads) that execute in lockstep can do limited forms of
cross-invocation communication via a group broadcast of an invocation’s value, or a broadcast of
a bit array representing a predicate value from each invocation in the group
- Allows efficient collective decisions within a group of invocations
• New built-in variables
- Uniform: gl_SubGroupSizeARB
- Integer input: gl_SubGroupInvocationARB
- Mask input: gl_SubGroupEqMaskARB, gl_SubGroupGeMaskARB, gl_SubGroupGtMaskARB,
gl_SubGroupLeMaskARB, gl_SubGroupLtMaskARB
• New GLSL functions
- uint64_t ballotARB(bool value)
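A common use of a ballot is to let each invocation find its rank among the invocations that voted true, for example to compact output. The following is a CPU-side model in plain C, not shader code: the array of predicates stands in for the per-invocation values, `lt_mask` models a "less than" mask as a plain 64-bit value, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit i of the result is set when invocation i voted true (n <= 64). */
uint64_t ballot(const bool *value, int n)
{
    uint64_t mask = 0;
    for (int i = 0; i < n; ++i)
        if (value[i])
            mask |= (uint64_t)1 << i;
    return mask;
}

/* Bits for all invocations numbered below id (id in 0..63). */
uint64_t lt_mask(int id)
{
    return ((uint64_t)1 << id) - 1;
}

/* Number of lower-numbered invocations whose predicate was true. */
int rank_in_group(uint64_t votes, int id)
{
    uint64_t below = votes & lt_mask(id);
    int count = 0;
    for (; below != 0; below &= below - 1)
        ++count;
    return count;
}
```

With votes true, false, true, true from invocations 0 to 3, the ballot is 13 (binary 1101) and invocation 3 has rank 2, so it would write to slot 2 of a compacted output. Every invocation derives its slot from the same 64-bit value, with no atomics or shared memory involved.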
Page 40
GLEW Support Available NOW
•GLEW = The OpenGL Extension Wrangler Library
- Open source library
- http://glew.sourceforge.net/
- Your one-stop-shop for API support for all OpenGL extension APIs
•GLEW 1.13.0 provides API support for all 13 extensions NOW
•Thanks to Nigel Stewart and Jon Leech for this
Page 41
In Review
•OpenGL in 2015 has 13 new standard extensions
• Graphics pipeline operation
- ARB_fragment_shader_interlock
- ARB_sample_locations
- ARB_post_depth_coverage
- ARB_ES3_2_compatibility (tessellation bounding box, multisample line width query)
- ARB_shader_viewport_layer_array
• Texture mapping functionality
- ARB_texture_filter_minmax
- ARB_sparse_texture2
- ARB_sparse_texture_clamp
• Shader functionality
- ARB_ES3_2_compatibility (ES 3.2 shading language support)
- ARB_parallel_shader_compile
- ARB_gpu_shader_int64
- ARB_shader_atomic_counter_ops
- ARB_shader_clock
- ARB_shader_ballot
Page 42
GPU Hardware Support
Extension                         Fermi   Kepler   Maxwell 1, K1*   Maxwell 2, X1*
ARB_ES3_2_compatibility           ✓       ✓        ✓                ✓
ARB_parallel_shader_compile       ✓       ✓        ✓                ✓
ARB_gpu_shader_int64              ✓       ✓        ✓                ✓
ARB_shader_atomic_counter_ops     ✓       ✓        ✓                ✓
ARB_shader_clock                  -       ✓        ✓                ✓
ARB_shader_ballot                 -       ✓        ✓                ✓
ARB_fragment_shader_interlock     -       -        -                ✓
ARB_sample_locations              -       -        -                ✓
ARB_post_depth_coverage           -       -        -                ✓
ARB_shader_viewport_layer_array   -       -        -                ✓
ARB_texture_filter_minmax         -       -        -                ✓
ARB_sparse_texture2               -       -        -                ✓ †
ARB_sparse_texture_clamp          -       -        -                ✓ †
* = Tegra driver support later
† = assumes OS support for sparse resources
Page 43
Thanks
•Multi-vendor effort!
•Particular thanks to specification leads
- Pat Brown (NVIDIA)
- Piers Daniell (NVIDIA)
- Slawomir Grajewski (Intel)
- Daniel Koch (NVIDIA)
- Jon Leech (Khronos)
- Timothy Lottes (AMD)
- Daniel Rakos (AMD)
- Graham Sellers (AMD)
- Eric Werness (NVIDIA)
Page 44
How to get OpenGL 2015 drivers now
• NVIDIA developer web site
- https://developer.nvidia.com/opengl-driver
• For Quadro and GeForce
- Windows, version 355.58
- Linux, version 355.00.05
- Newer versions may be available
Supported NVIDIA GPU generations
- Maxwell
- Many extensions in the set, such as ARB_fragment_shader_interlock, need the new
Maxwell 2 GPU generation
- Example: GeForce 9xx, Titan X, Quadro M6000
- Kepler
- Fermi
Page 45
NVIDIA’s driver also includes OpenGL ES 3.2
• Desktop OpenGL driver can create a compliant ES 3.2 context
- Develop on a PC, then move your working ES 3.2 code to a mobile device
- OpenGL ES 3.2 is basically the Android Extension Pack (AEP), now standardized by Khronos
• The extensions below are part of OpenGL ES 3.2 core specification now, but they can
still be used in contexts below OpenGL ES 3.2 as extensions on supported hardware:
- OES_gpu_shader5
- OES_primitive_bounding_box
- OES_shader_io_blocks
- OES_tessellation_shader
- OES_texture_border_clamp
- OES_texture_buffer
- OES_texture_cube_map_array
- OES_draw_elements_base_vertex
- KHR_robustness
- EXT_color_buffer_float
- KHR_debug
- KHR_texture_compression_astc_ldr
- KHR_blend_equation_advanced
- OES_sample_shading
- OES_sample_variables
- OES_shader_image_atomic
- OES_shader_multisample_interpolation
- OES_texture_stencil8
- OES_texture_storage_multisample_2d_array
- OES_copy_image
- OES_draw_buffers_indexed
- OES_geometry_shader
Page 46
Conclusions
•NEW standard OpenGL Extensions announced at SIGGRAPH for 2015
•NVIDIA already shipping support for all these extensions
- Released same day Khronos announced the functionality
•Get a latest-generation Maxwell 2 GPU to access the extensions that
depend on the latest hardware
OpenGL for 2015

Contenu connexe

Tendances

SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3Electronic Arts / DICE
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Johan Andersson
 
Taking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next GenerationTaking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next GenerationGuerrilla
 
Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Tiago Sousa
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
FrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteFrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteElectronic Arts / DICE
 
SIGGRAPH 2010 Water Flow in Portal 2
SIGGRAPH 2010 Water Flow in Portal 2SIGGRAPH 2010 Water Flow in Portal 2
SIGGRAPH 2010 Water Flow in Portal 2Alex Vlachos
 
OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesNarann29
 
Killzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemGuerrilla
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3guest11b095
 
A Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time LightingA Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time LightingSteven Tovey
 
Destruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance FieldsDestruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance FieldsElectronic Arts / DICE
 
Decima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnDecima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnGuerrilla
 
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...Gael Hofemeier
 
OpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadOpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadTristan Lorach
 
Dissecting the Rendering of The Surge
Dissecting the Rendering of The SurgeDissecting the Rendering of The Surge
Dissecting the Rendering of The SurgePhilip Hammer
 
Physically Based Lighting in Unreal Engine 4
Physically Based Lighting in Unreal Engine 4Physically Based Lighting in Unreal Engine 4
Physically Based Lighting in Unreal Engine 4Lukas Lang
 

Tendances (20)

SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
 
Taking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next GenerationTaking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next Generation
 
Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
FrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteFrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in Frostbite
 
SIGGRAPH 2010 Water Flow in Portal 2
SIGGRAPH 2010 Water Flow in Portal 2SIGGRAPH 2010 Water Flow in Portal 2
SIGGRAPH 2010 Water Flow in Portal 2
 
OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering Techniques
 
Bending the Graphics Pipeline
Bending the Graphics PipelineBending the Graphics Pipeline
Bending the Graphics Pipeline
 
Killzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo Postmortem
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3
 
A Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time LightingA Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time Lighting
 
Destruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance FieldsDestruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance Fields
 
Decima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnDecima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero Dawn
 
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
 
OpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadOpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
 
Lighting the City of Glass
Lighting the City of GlassLighting the City of Glass
Lighting the City of Glass
 
Light prepass
Light prepassLight prepass
Light prepass
 
Dissecting the Rendering of The Surge
Dissecting the Rendering of The SurgeDissecting the Rendering of The Surge
Dissecting the Rendering of The Surge
 
Physically Based Lighting in Unreal Engine 4
Physically Based Lighting in Unreal Engine 4Physically Based Lighting in Unreal Engine 4
Physically Based Lighting in Unreal Engine 4
 

OpenGL for 2015

  • 1. Mark Kilgard, Principal System Software Engineer OpenGL for 2015
  • 2. Page 2 Thirteen new standard OpenGL extensions for 2015
    • New ARB extensions
      - New shader, texture, and graphics pipeline functionality
      - Proven standard technology
      - Mostly existed previously as vendor extensions
      - Now officially standardized by Khronos
      - Ensures OpenGL is a proper superset of ES 3.2
    • Not a new core standard update, but
      - Eighth consecutive year of Khronos updates to OpenGL at SIGGRAPH
      - Also did Vulkan this year
      - Core version remains OpenGL 4.5
  • 3. Page 3 Khronos 2015 Announcement for OpenGL
    • August 10, 2015
      - At SIGGRAPH
    • “A set of OpenGL extensions will … expose the very latest capabilities of desktop hardware.”
  • 4. Page 4 Same Day: NVIDIA has driver with full support
    • August 10, 2015
      - Tradition that NVIDIA releases “zero day” driver with full functionality at Khronos announcement
      - Done for past several OpenGL releases
    • Ready today for developers to begin coding against latest standard extensions
      - Technically a “beta” driver but fully functional
      - Intended for developers
      - Official support for end-user drivers coming soon
  • 5. Page 5 Broad Categories of New OpenGL Functionality
    • NEW graphics pipeline operation
    • NEW texture mapping functionality
    • NEW shader functionality
  • 6. Page 6 NEW Graphics Pipeline Operation
    • Fragment shader interlock
      - ARB_fragment_shader_interlock
    • Programmable sample positions for rasterization
      - ARB_sample_locations
    • Post-depth coverage version of sample mask
      - ARB_post_depth_coverage
    • Vertex shader viewport & layer output
      - ARB_shader_viewport_layer_array
    • Tessellation bounding box
      - ARB_ES3_2_compatibility
    Details…
  • 7. Page 7 Fragment Shader Interlock
    • NEW extension: ARB_fragment_shader_interlock
      - Provides reliable means to read/write fragment’s pixel state within a fragment shader
      - GPU managed, no explicit barriers needed
    • Uses
      - Custom blend modes
      - Deferred shading algorithms
      - E.g. screen space decals
    • Adds GLSL functions to begin/end interlock
      - void beginInvocationInterlockARB(void);
      - void endInvocationInterlockARB(void);
    • Why is a fragment shader interlock needed? ...
    [Image: shared exponent (rgb9e5) format blending via fragment shader interlock. Image credit: David Bookout (Intel), Programmable Blend with Pixel Shader Ordering]
  • 8. Page 8 Pixel Update Preserves Primitive Rasterization Order
    • Same pixel, covered by 3 overlapping primitives
    • OpenGL requires stencil/depth/blend operations be observed to match rendering order, so the pixel is updated in primitive rasterization order: rasterized primitive #1, rasterized primitive #2, rasterized primitive #3
  • 9. Page 9 Yet Fragment Shading is Massively Parallel
    • GPU fragment shading: parallel execution of fragment shader threads
    • Conventional approach: batch as many fragments in parallel as possible, for maximum efficiency
    [Diagram: a batch executing in parallel, plus 1000’s of other fragments from scores of other primitives]
  • 10. Page 10 Post-Shader Pixel Updates Respect Rasterization Order
    • Fragment shading: parallel execution of fragment shader threads
    • Shader results feed fixed-function Pixel Update (stencil test, depth test, & blend)
    [Diagram: 1st blend, 2nd blend, 3rd blend applied in order, plus 1000’s of other fragments]
  • 11. Page 11 However, Shader Access to Framebuffer Unsafe!
    • GPU fragment shading: parallel execution of fragment shader threads
    • Pixel updates (imageLoad, imageStore) by fragment shader instances executing in parallel cannot guarantee primitive rasterization order!
    • Exact behavior varies by GPU and is timing dependent for any particular GPU, so it is both undefined & unreliable
  • 12. Page 12 Interlock Guarantees Pixel Ordering of Shading
    • GPU fragment shading: parallel execution of fragment shader threads
    • Interlock approach: batch, but disallow fragments for the same pixel from executing the fragment shader interlock in parallel
    [Diagram: batch #1, batch #2, batch #3, each alongside scores of other primitives]
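The ordering problem described on pages 8 to 12 can be illustrated with a small CPU-side model. This is only a sketch, not GPU code: the `fragments` list, the `blend_over` helper, and the 50% alpha values are assumptions chosen for illustration. It treats each fragment as a read-modify-write of the same pixel and shows that a non-commutative blend gives a different pixel for every completion order, whereas processing the fragments in primitive rasterization order (what the interlock is described as guaranteeing) gives one well-defined result.

```python
from itertools import permutations

def blend_over(dst, src_rgb, src_alpha):
    """Classic 'over' blend of one fragment onto the pixel (order dependent)."""
    return tuple(s * src_alpha + d * (1.0 - src_alpha)
                 for s, d in zip(src_rgb, dst))

# Three fragments covering the same pixel, listed in primitive rasterization
# order. Colors and alphas are made-up illustration values.
fragments = [
    ((1.0, 0.0, 0.0), 0.5),  # primitive #1: red
    ((0.0, 1.0, 0.0), 0.5),  # primitive #2: green
    ((0.0, 0.0, 1.0), 0.5),  # primitive #3: blue
]

def shade_pixel(order):
    """Apply the fragments' read-modify-write updates in the given order."""
    pixel = (0.0, 0.0, 0.0)
    for index in order:
        rgb, alpha = fragments[index]
        pixel = blend_over(pixel, rgb, alpha)
    return pixel

# Without ordering, any completion order may occur: collect every outcome.
unordered_results = {shade_pixel(order) for order in permutations(range(3))}

# With updates applied in primitive rasterization order, there is one outcome.
ordered_result = shade_pixel((0, 1, 2))

print(len(unordered_results), "possible results without ordering")
print("result in rasterization order:", ordered_result)
```

The model only captures ordering. On real hardware, unsynchronized imageLoad/imageStore can additionally lose updates when two invocations read the pixel before either writes it, which this sequential sketch does not reproduce.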
  • 13. Page 13 Fragment Shader Interlock Example
    • We want to draw a grid of Stanford bunnies, stamped with a few brick normal maps, and then bump-map shaded
    Image credit: Jiho Choi (NVIDIA), GameWorks NormalBlendedDecal example
  • 14. Page 14 Motivation: Bullet holes and dynamic scuffs
    • Desire: Dynamically add apparently geometric details as “after effects”
    [Images: normal map and shaded color result, without screen-space decals and with screen-space decals]
    Image credit: Pope Kim, Screen Space Decals in Warhammer 40,000: Space Marine
  • 15. Page 15 Screen Space Decal Approach
    • Draw scene to G-buffer
      - Renders world-space normals to “normal image” framebuffer
    • Draw screen-space box for each screen space decal
      - If pixel’s world-space position in G-buffer isn’t in box, discard fragment
        - Avoids drawing decal on incorrect surface (one too close or too far)
      - Fetch decal’s tangent-space normal from decal’s normal map
      - Within fragment shader interlock
        - Fetch pixel’s world-space normal from “normal image” framebuffer
        - Rotate decal normal to world space
          - Using tangent basis constructed from world-space normal
        - Then blend (and renormalize) decal normal with pixel’s normal
        - Replace pixel’s world-space normal in “normal image” with blended normal
    • Do deferred shading on G-buffer, using “normal image” perturbed by decals
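The discard step of the approach above can be sketched on the CPU. This is a minimal illustration under stated assumptions: the decal box is taken to be axis-aligned in world space (a real decal box would usually be oriented, which would need a transform into box space first), and `inside_box`, `pixels_touched_by_decal`, and the sample G-buffer positions are hypothetical names and values, not part of the talk.

```python
def inside_box(world_pos, box_min, box_max):
    """True when the pixel's world-space position lies inside the decal box."""
    return all(lo <= p <= hi for p, lo, hi in zip(world_pos, box_min, box_max))

def pixels_touched_by_decal(gbuffer_positions, box_min, box_max):
    """Return the pixel coordinates that survive the discard test for one decal."""
    return [xy for xy, pos in sorted(gbuffer_positions.items())
            if inside_box(pos, box_min, box_max)]

# Hypothetical G-buffer: pixel coordinate -> world-space position of the
# surface visible at that pixel.
gbuffer_positions = {
    (0, 0): (0.0, 0.0, 1.0),  # surface inside the box: decal applies
    (1, 0): (0.2, 0.1, 1.1),  # surface inside the box: decal applies
    (2, 0): (0.0, 0.0, 5.0),  # surface too far behind the box: discard
    (3, 0): (0.0, 0.0, 0.2),  # surface too close, in front of the box: discard
}

box_min = (-0.5, -0.5, 0.5)
box_max = (0.5, 0.5, 1.5)

survivors = pixels_touched_by_decal(gbuffer_positions, box_min, box_max)
print(survivors)
```

Only the surviving pixels would go on to the interlocked normal blend; the two discarded pixels are the "too close or too far" cases the slide mentions.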
  • 16. Page 16 Screen Space Decal Approach Visualized Figure panels: visualization of decal boxes overlaid on scene; “normal image” before blended normal decals; “normal image” after blended normal decals; brick pattern normal map decals applied to decal boxes; final shaded color result. Captions: bunny shading includes brick pattern; brick normals blended with fragment shader interlock
  • 17. Page 17 GLSL Fragment Interlock Usage • Fragment interlock portion of screen space decal GLSL fragment shader
      beginInvocationInterlockARB();
      {
        // Read “normal image” framebuffer's world space normal
        vec3 destNormalWS = normalize(imageLoad(uNormalImage, ivec2(gl_FragCoord.xy)).xyz);
        // Read decal's tangent space normal
        vec3 decalNormalTS = normalize(textureLod(uDecalNormalTex, uv, 0.0).xyz * 2 - 1);
        // Rotate decal's normal from tangent space to world space
        vec3 tangentWS = vec3(1, 0, 0);
        vec3 newNormalWS = normalize(mat3x3(tangentWS, cross(destNormalWS, tangentWS), destNormalWS) * decalNormalTS);
        // Blend world space normal vectors
        vec3 destNewNormalWS = normalize(mix(newNormalWS, destNormalWS, uBlendWeight));
        // Write new blended normal into “normal image” framebuffer
        imageStore(uNormalImage, ivec2(gl_FragCoord.xy), vec4(destNewNormalWS, 0));
      }
      endInvocationInterlockARB();
  • 18. Page 18 Blend Equation Advanced vs. Shader Interlock: similar, but different functionality; each extension makes sense in its intended context. Shader Interlock (2015) • Advantages - Arbitrary shading operations allowed - Very powerful & general - No explicit barrier needed • Disadvantages - Requires putting color blending in every fragment shader - Lengthens shader - Not orthogonal to multisampling - Fragment shader responsible for reading/writing every color sample - Unavailable for legacy fixed-function - Needs latest GPU generation. Blend Equation Advanced (2014) • Advantages - Supports established blend modes - Same as Photoshop, PDF, Flash, SVG - Optimized for their numeric precision requirements - Orthogonal to fragment shading - Just like conventional blending - Just works with multisampling & sRGB - Works with fixed-function rendering in compatibility context - Same “KHR” extension for OpenGL ES - Available on older hardware - But needs glBlendBarrierKHR • Disadvantages - Blend modes limited to a pre-defined set - Limited to 1 color attachment
  • 19. Page 19 Programmable Sample Positions • Conventional OpenGL - Multisample rasterization has fixed sample positions • NEW ARB_sample_locations extension - glFramebufferSampleLocationsfvARB specifies sample positions on sub-pixel grid Default 8x multisample pattern Application-specified 8x multisample pattern, oriented for horizontal sampling Same triangle but covers sample patterns differently
  • 20. Page 20 Application: Temporal Antialiasing • Reprogram the sample positions differently every frame and render continuously • Done well, can double effective antialiasing quality “for free” - Needs vertical refresh synchronization - And app must render at rate matching refresh rate (e.g. 60 Hz) Figure: default 2x multisample pattern and alternative 2x multisample pattern combine into temporal virtual 4x antialiasing
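The frame-by-frame alternation described on this slide can be modeled on the CPU. The following is a minimal sketch, not NVIDIA's implementation: the two 2x patterns are hypothetical values (the two diagonals of a pixel), and `pattern_for_frame` and `distinct_positions` are illustrative helper names. In a real renderer, the selected array would be handed to glFramebufferSampleLocationsfvARB before drawing each frame.

```c
/* One 2D sample position inside a pixel, each coordinate in [0, 1]. */
typedef struct { float x, y; } SamplePos;

/* Hypothetical 2x patterns (assumed values): the two diagonals of the pixel. */
static const SamplePos kPatternA[2] = { {0.25f, 0.25f}, {0.75f, 0.75f} };
static const SamplePos kPatternB[2] = { {0.75f, 0.25f}, {0.25f, 0.75f} };

/* Even frames use pattern A, odd frames use pattern B. */
const SamplePos *pattern_for_frame(unsigned frame) {
    return (frame % 2u == 0u) ? kPatternA : kPatternB;
}

/* Count the distinct sample positions covered by two consecutive frames. */
int distinct_positions(unsigned frame) {
    const SamplePos *a = pattern_for_frame(frame);
    const SamplePos *b = pattern_for_frame(frame + 1u);
    int count = 2; /* the two samples within one pattern are distinct */
    for (int i = 0; i < 2; ++i) {
        int seen = 0;
        for (int j = 0; j < 2; ++j) {
            if (b[i].x == a[j].x && b[i].y == a[j].y) seen = 1;
        }
        if (!seen) ++count;
    }
    return count;
}
```

With these assumed patterns, any two consecutive frames together cover four distinct positions, which is the "virtual 4x" effect the slide refers to.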
  • 21. Page 21 Post Depth Coverage • Normally in OpenGL stencil and depth tests are specified to be after fragment shader execution - Allows shader to discard fragments prior to these tests - So avoids the depth and stencil buffer update side-effects of these tests • OpenGL 4.2 added the ability for a fragment shader to force itself to run after the stencil and depth tests - Part of ARB_shader_image_load_store extension - Indicated in GLSL fragment shader by layout(early_fragment_tests) in; • NEW extension ARB_post_depth_coverage - Controls whether fragment shader sample mask gl_SampleMaskIn[] reflects the coverage before or after application of the early depth and stencil tests - Allows shader to know what samples survived stencil & depth tests - What you really want if you are using early fragment tests + sample mask - Indicated in GLSL fragment shader by layout(post_depth_coverage) in;
  • 22. Page 22 Early Fragment Tests & Post Depth Coverage Three pipeline configurations, each with rasterizer, fragment shader, stencil test, depth test, and color blending stages: (1) Default behavior • Late stencil-depth tests • Rasterizer determines gl_SampleMaskIn (2) layout(early_fragment_tests) in; • Early stencil-depth tests • Rasterizer determines gl_SampleMaskIn (3) layout(early_fragment_tests) in; layout(post_depth_coverage) in; • Early stencil-depth tests • Post-depth coverage determines gl_SampleMaskIn
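The difference between the last two configurations comes down to one bitwise AND. Here is a minimal CPU-side model, assuming early fragment tests are enabled and that each bit of the mask stands for one sample; the function name `sample_mask_in` and its parameters are illustrative, not part of any API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of what the fragment shader sees in gl_SampleMaskIn[0].
 * raster_coverage:     samples covered by the primitive (from the rasterizer)
 * depth_stencil_pass:  samples that survived the early stencil and depth tests
 * post_depth_coverage: true when layout(post_depth_coverage) is declared */
uint32_t sample_mask_in(uint32_t raster_coverage,
                        uint32_t depth_stencil_pass,
                        bool post_depth_coverage) {
    if (post_depth_coverage) {
        /* Only samples that are covered AND passed the early tests remain. */
        return raster_coverage & depth_stencil_pass;
    }
    /* Without the layout qualifier, the mask is the raw raster coverage. */
    return raster_coverage;
}
```

For a 4x pixel fully covered by a triangle (mask 0xF) where only samples 0 and 2 pass the depth test (mask 0x5), the shader would see 0xF without the qualifier and 0x5 with it.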
  • 23. Page 23 Vertex Shader Viewport & Layer Output • NEW extension ARB_shader_viewport_layer_array • Previously geometry shader needed to write viewport index and layer - Forced layered rendering to use geometry shaders - Even if a geometry shader wasn’t otherwise needed • New vertex shader (or tessellation evaluation shader) outputs - out int gl_ViewportIndex - out int gl_Layer
  • 24. Page 24 ES 3.2 Compatibility (tessellation, queries) • NEW extension ARB_ES3_2_compatibility • Command to specify bounding box for evaluated tessellated vertices, given as minimum and maximum clip-space corners - glPrimitiveBoundingBoxARB(float minX, float minY, float minZ, float minW, float maxX, float maxY, float maxZ, float maxW) - Initial box accepts entirety of Normalized Device Coordinate (NDC) space (effectively not limiting tessellation) - Implementations may be able to optimize performance, assuming accurate bounds - ES 3.2 added this to make tessellation more friendly to mobile use cases - Hint: Expect today’s desktop GPUs are likely to simply ignore this but API matches ES 3.2 • Bonus: - OpenGL ES 3.2 adds two implementation-dependent constants related to multisample line rasterization - GL_MULTISAMPLE_LINE_WIDTH_RANGE_ARB - GL_MULTISAMPLE_LINE_WIDTH_GRANULARITY_ARB - Same token values as ES 3.2 - These queries supported for completeness (yawn)
  • 25. Page 25 NEW Texture Mapping Functionality • Texture Reduction Modes: Min/Max - ARB_texture_filter_minmax • Sparse Textures, done right - ARB_sparse_texture2 • Sparse Texture Clamping - ARB_sparse_texture_clamp Details…
  • 26. Page 26 New Texture Reduction Modes: Min/Max •Standard texturing behavior - Texture fetch result = weighted average of sampled texel values - What you want for color images, etc. •NEW extension: ARB_texture_filter_minmax - Texture fetch result = minimum or maximum of all sampled texel values •Adds NEW “reduction mode” for texture parameter - Choices: GL_WEIGHTED_AVERAGE_ARB (initial state), GL_MIN, or GL_MAX - Use with glTexParameteri, glSamplerParameteri, etc. •Example applications - Estimating variance or range when sampling data in textures - Conservative texture sampling - E.g. Maximum Intensity Projection for medical imaging
  • 27. Page 27 Application: Maximum Intensity Projection • Radiologists interpret 3D visualizations of CT scans • Volume rendering simulates opacity-attenuated ray casting - Good for visualizing 3D structure • Maximum Intensity Projection (MIP) rendering shows maximum intensity along any ray - Good for highlighting features without regard to occlusion - Avoids missing significant features Volume rendering Maximum Intensity Projection Texture reduction mode GL_WEIGHTED_AVERAGE_ARB Texture reduction mode GL_MAX Image credit: Fishman et al. Volume Rendering versus Maximum Intensity Projection in CT Angiography: What Works Best, When, and Why
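To make the three reduction modes concrete, here is a minimal CPU-side model of a single filtered fetch over a small texel footprint (for example, the 2x2 neighborhood of a bilinear fetch). It is a sketch under stated assumptions: the enum, the function `reduce_footprint`, and the single-channel float texels are all illustrative, and the handling of zero-weight texels in the real extension is not modeled.

```c
/* Illustrative stand-ins for GL_WEIGHTED_AVERAGE_ARB, GL_MIN and GL_MAX. */
typedef enum {
    REDUCE_WEIGHTED_AVERAGE,
    REDUCE_MIN,
    REDUCE_MAX
} ReductionMode;

/* Combine the texels of one filter footprint according to the mode.
 * texels[i] is a single-channel texel value, weights[i] its filter weight
 * (weights are only used by the weighted-average mode). count must be >= 1. */
float reduce_footprint(const float *texels, const float *weights,
                       int count, ReductionMode mode) {
    float result = (mode == REDUCE_WEIGHTED_AVERAGE) ? 0.0f : texels[0];
    for (int i = 0; i < count; ++i) {
        switch (mode) {
        case REDUCE_WEIGHTED_AVERAGE:
            result += weights[i] * texels[i];
            break;
        case REDUCE_MIN:
            if (texels[i] < result) result = texels[i];
            break;
        case REDUCE_MAX:
            if (texels[i] > result) result = texels[i];
            break;
        }
    }
    return result;
}
```

A fetch centered between texels 1, 2, 3 and 4 (equal weights of 0.25) yields 2.5 in the default mode, 1 with the minimum mode, and 4 with the maximum mode; the last is the behavior a Maximum Intensity Projection wants.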
  • 28. Page 28 Maximum Intensity Projection vs. Volume Rendering Visualized Axial view of human middle torso Volume Rendering Maximum Intensity Projection Good at mapping arterial structure, despite occlusion Provides more 3D feel by accounting for occlusion Image credit: Fishman et al. Volume Rendering versus Maximum Intensity Projection in CT Angiography: What Works Best, When, and Why
  • 29. Page 29 Sparse Textures Visualized • Textures can be HUGE - Think of satellite data - Or all the terrain in a huge game level - Or medical or seismic imaging • We never expect to be looking at everything at once! - When textures are huge, can we just make resident what we need? - YES, that’s sparse texture • ARB_sparse_texture standardized in 2013 - Reflected limitations of original sparse texture hardware implementations - Now we can do better… Figure: mipmap chain of a sparse texture; only a limited number of pages are resident. Image credit: AMD
  • 30. Page 30 Sparse Textures, done right • NEW extension ARB_sparse_texture2 - Builds on prior ARB_sparse_texture (2013) extension - Original concept: intended for enormous textures, allows less than the complete set of “pages” of the texture image set to be resident - Primary limitation: - Fetching non-resident data returned undefined results without indication - So no way to know if non-resident data was fetched - This reflected hardware limitations of the time, fixed in newer hardware • Sparse Texture version 2 is friendly to dynamically detecting non-resident access - Fetch of non-resident data now reliably returns zero values - sparseTextureARB GLSL texture fetch functions return residency information integer - And 11 other variations of sparseTexture*ARB GLSL functions as well - sparseTexelsResidentARB GLSL function maps returned integer as Boolean residency - Now supports sparse multisample and multisample texture arrays
  • 31. Page 31 Sparse Texture, done even better • NEW extension ARB_sparse_texture_clamp • Adds new GLSL texture fetch variant functions - Includes an additional level-of-detail (LOD) parameter to provide a per-fetch floor on the hardware-computed LOD - I.e. the minimum lodClamp parameter - Sparse texture variants - sparseTextureClampARB, sparseTextureOffsetClampARB, sparseTextureGradClampARB, sparseTextureGradOffsetClampARB - Non-sparse texture versions too - textureClampARB, textureOffsetClampARB, textureGradClampARB, textureGradOffsetClampARB • Benefit for sparse texture fetches - Shaders can avoid accessing unpopulated portions of high-resolution levels of detail when knowing texture detail is unpopulated - Either from a priori knowledge - Or feedback from previously executed "sparse" texture lookup functions
  • 32. Page 32 Sparse Texture Clamp Example • Naively fetch sparse texture until you get a valid texel
      vec4 texel;
      int code = sparseTextureARB(sparseTexture, uv, texel);
      float minLodClamp = 1.0;
      while (!sparseTexelsResidentARB(code)) {
        code = sparseTextureClampARB(sparseTexture, uv, minLodClamp, texel);
        minLodClamp += 1.0;
      }
    Figure: three cases: 1 fetch; 2 fetches, 1 missed; 3 fetches, 2 missed
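The fetch counts in the figure can be reproduced with a small CPU-side model of the loop. This is only a sketch: it assumes residency is tracked per mip level (real sparse textures track it per page), and `fetch_first_resident` is a hypothetical helper. Unlike the naive shader loop, the model stops at the coarsest level, so a texture with no resident data cannot make it spin forever.

```c
#include <stdbool.h>

/* Walk from start_level (the LOD the hardware would pick) toward coarser
 * levels until a resident one is found, raising the LOD floor by one level
 * per retry as the shader loop does.
 * resident[i] tells whether mip level i holds data.
 * Returns the level that finally produced a valid texel, or -1 if none did;
 * *fetches receives the number of texture fetches issued. */
int fetch_first_resident(const bool *resident, int level_count,
                         int start_level, int *fetches) {
    *fetches = 0;
    for (int level = start_level; level < level_count; ++level) {
        ++*fetches;
        if (resident[level]) {
            return level;
        }
    }
    return -1; /* nothing resident: the caller needs a fallback value */
}
```

With levels 0 and 1 non-resident and level 2 resident, the model issues three fetches and misses twice, matching the rightmost case in the figure.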
  • 33. Page 33 NEW Shader Functionality • OpenGL ES 3.2 Shading Language Compatibility - ARB_ES3_2_compatibility • Parallel Compile & Link of GLSL - ARB_parallel_shader_compile • 64-bit Integer Data Types - ARB_gpu_shader_int64 • Shader Atomic Counter Operations - ARB_shader_atomic_counter_ops • Query Clock Counter - ARB_shader_clock • Shader Ballot and Broadcast - ARB_shader_ballot Details…
  • 34. Page 34 ES 3.2 Compatibility (shader support) • NEW extension ARB_ES3_2_compatibility • Just say #version 320 es in your GLSL shader - Develop and use OpenGL ES 3.2’s GLSL dialect from regular OpenGL - Helps desktop developers target mobile and embedded devices • ES 3.2 GLSL adds functionality already in OpenGL - KHR_blend_equation_advanced, OES_sample_variables, OES_shader_image_atomic, OES_shader_multisample_interpolation, OES_texture_storage_multisample_2d_array, OES_geometry_shader, OES_gpu_shader5, OES_primitive_bounding_box, OES_shader_io_blocks, OES_tessellation_shader, OES_texture_buffer, OES_texture_cube_map_array, KHR_robustness - Notably Shader Model 5.0, geometry & tessellation shaders
  • 35. Page 35 Parallel Compile & Link of GLSL • NEW extension ARB_parallel_shader_compile - Lets OpenGL implementations distribute GLSL shader compilation and program linking across multiple CPU threads to increase compilation throughput - Allows apps to better manage GLSL compilation overheads - Benefit: Faster load time for new shaders and programs on multi-core CPU systems - Good practice: Construct multiple GLSL shaders/programs, then defer querying state or using them for as long as possible, or until completion status is true • Part 1: Tells OpenGL’s GLSL compiler how many CPU threads to use for parallel compilation - void glMaxShaderCompilerThreadsARB(GLuint threadCount) - Initially allows implementation-dependent maximum (initial value 0xFFFFFFFF) - Zero means do not use parallel GLSL compilation • Part 2: Shader and program query if compile or link is complete - Call glGetShaderiv or glGetProgramiv on GL_COMPLETION_STATUS_ARB parameter - Returns true when compile is complete, false if still compiling - Unlike other queries, will not block for compilation to complete.
  • 36. Page 36 64-bit Integer Data Types in GLSL • GLSL has had 32-bit integer and 64-bit floating-point for a while… • Now adds 64-bit integers - NEW extension ARB_gpu_shader_int64 • New data types - Signed: int64_t, i64vec2, i64vec3, i64vec4 - Unsigned: uint64_t, u64vec2, u64vec3, u64vec4 - Supported for uniforms, buffers, transform feedback, and shader input/outputs • Standard library extended to 64-bit integers • Programming interface - Uniform setting - glUniform{1,2,3,4}i64{,v}ARB - glUniform{1,2,3,4}ui64{,v}ARB - Direct state access (DSA) variants as well - glProgramUniform{1,2,3,4}i64{,v}ARB - glProgramUniform{1,2,3,4}ui64{,v}ARB - Queries for 64-bit uniform integer data
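The "good practice" bullet amounts to a polling pattern: kick off every compile and link first, then each frame ask which objects are finished without blocking on the rest. The sketch below models that pattern without an OpenGL context. The callback type `CompletionQuery` is a hypothetical stand-in for calling glGetProgramiv with GL_COMPLETION_STATUS_ARB, and `even_ids_complete` is a fake query that exists only so the example runs on its own.

```c
#include <stddef.h>

/* Non-blocking "is this object done compiling/linking?" query.
 * Assumed stand-in for glGetProgramiv(id, GL_COMPLETION_STATUS_ARB, &done). */
typedef int (*CompletionQuery)(unsigned id);

/* Fake query for demonstration only: pretends even-numbered ids are done. */
int even_ids_complete(unsigned id) {
    return id % 2u == 0u;
}

/* Move every finished id to the front of ids[] and return how many there
 * are. The caller can start using ids[0 .. result-1] right away and poll
 * the remaining ids again on a later frame instead of stalling on them. */
size_t partition_completed(unsigned *ids, size_t count,
                           CompletionQuery is_complete) {
    size_t ready = 0;
    for (size_t i = 0; i < count; ++i) {
        if (is_complete(ids[i])) {
            unsigned tmp = ids[ready];
            ids[ready] = ids[i];
            ids[i] = tmp;
            ++ready;
        }
    }
    return ready;
}
```

The design choice here is to keep the query behind a function pointer so the same loop works for shaders and programs, and so it can be exercised without a driver.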
  • 37. Page 37 Shader Atomic Counter Operations in GLSL • NEW ARB_shader_atomic_counter_ops extension - Builds on ARB_shader_atomic_counters extension (2011, OpenGL 4.2) - Original atomic counters quite limited - Could only increment, decrement, and query • New operations supported on counters - Addition and subtraction: atomicCounterAddARB, atomicCounterSubtractARB - Minimum and maximum: atomicCounterMinARB, atomicCounterMaxARB - Bitwise operators (AND, OR, XOR, etc.) - atomicCounterAndARB, atomicCounterOrARB, atomicCounterXorARB - Exchange: atomicCounterExchangeARB - Compare and Exchange: atomicCounterCompSwapARB
  • 38. Page 38 Query Clock Counter in GLSL • NEW extension ARB_shader_clock • New functions query a free-running “clock” - 64-bit monotonically incrementing shader counter - uint64_t clockARB(void) - uvec2 clock2x32ARB(void) - Avoids requiring 64-bit integers, instead returns two 32-bit unsigned integers • Similar to Win32’s QueryPerformanceCounter - But within the GPU shader complex • Can allow shaders to monitor their performance - Details implementation-dependent
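When only the two-component form is available, the application or shader has to reassemble the 64-bit counter itself. The following CPU-side sketch shows the arithmetic, assuming the first component holds the low 32 bits and the second the high 32 bits; `join_clock` and `elapsed_ticks` are illustrative names, and the tick unit is implementation-dependent, as the slide notes.

```c
#include <stdint.h>

/* Rebuild the 64-bit counter from the two 32-bit halves
 * (assumption: lo = first component, hi = second component). */
uint64_t join_clock(uint32_t lo, uint32_t hi) {
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Ticks elapsed between two readings. Unsigned subtraction handles the
 * case where the low half wrapped around between the readings. */
uint64_t elapsed_ticks(uint32_t start_lo, uint32_t start_hi,
                       uint32_t end_lo, uint32_t end_hi) {
    return join_clock(end_lo, end_hi) - join_clock(start_lo, start_hi);
}
```

Comparing only the low halves would misreport the interval whenever the low word overflows, which is why both halves are joined before subtracting.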
  • 39. Page 39 Shader Ballot and Broadcast • NEW extension ARB_shader_ballot - Assumes 64-bit integers • Concept - A group of invocations (shader threads) that execute in lockstep can do limited forms of cross-invocation communication via a group broadcast of an invocation value, or a broadcast of a bit array representing a predicate value from each invocation in the group - Allows efficient collective decisions within a group of invocations • New built-in variables - Uniform: gl_SubGroupSizeARB - Integer input: gl_SubGroupInvocationARB - Mask input: gl_SubGroupEqMaskARB, gl_SubGroupGeMaskARB, gl_SubGroupGtMaskARB, gl_SubGroupLeMaskARB, gl_SubGroupLtMaskARB • New GLSL functions - uint64_t ballotARB(bool value)
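A ballot is easiest to understand by computing one by hand. The sketch below is a sequential CPU model, not GPU code: it assumes a subgroup of at most 64 invocations, that inactive invocations contribute a zero bit, and that the "less than" mask for an invocation has a bit set for every lower-numbered invocation. All function names are illustrative. It also shows one common collective decision, stream compaction, where each voting invocation finds its output slot by counting the votes below it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of ballotARB: bit i is set when invocation i is active and
 * passed true. */
uint64_t ballot(const bool *value, const bool *active, int subgroup_size) {
    uint64_t mask = 0;
    for (int i = 0; i < subgroup_size && i < 64; ++i) {
        if (active[i] && value[i]) {
            mask |= (uint64_t)1 << i;
        }
    }
    return mask;
}

/* Model of gl_SubGroupLtMaskARB for one invocation (0 <= invocation < 64):
 * bits set for all invocations with a lower index. */
uint64_t lt_mask(int invocation) {
    return ((uint64_t)1 << invocation) - 1;
}

/* Portable population count. */
int popcount64(uint64_t bits) {
    int n = 0;
    while (bits != 0) {
        bits &= bits - 1; /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Output slot of an invocation during compaction: the number of
 * lower-numbered invocations that also voted true. */
int compaction_slot(uint64_t ballot_mask, int invocation) {
    return popcount64(ballot_mask & lt_mask(invocation));
}
```

With four active invocations voting true, false, true, true, the ballot is binary 1101, and invocation 3 lands in slot 2 because two lower-numbered invocations also voted true.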
  • 40. Page 40 GLEW Support Available NOW •GLEW = The OpenGL Extension Wrangler Library - Open source library - http://glew.sourceforge.net/ - Your one-stop-shop for API support for all OpenGL extension APIs •GLEW 1.13.0 provides API support for all 13 extensions NOW •Thanks to Nigel Stewart and Jon Leech for this
  • 41. Page 41 In Review •OpenGL in 2015 has 13 new standard extensions • Graphics pipeline operation •ARB_fragment_shader_interlock •ARB_sample_locations •ARB_post_depth_coverage •ARB_ES3_2_compatibility (tessellation bounding box, multisample line width query) •ARB_shader_viewport_layer_array • Texture mapping functionality •ARB_texture_filter_minmax •ARB_sparse_texture2 •ARB_sparse_texture_clamp • Shader functionality •ARB_ES3_2_compatibility (ES 3.2 shading language support) •ARB_parallel_shader_compile •ARB_gpu_shader_int64 •ARB_shader_atomic_counter_ops •ARB_shader_clock •ARB_shader_ballot
  • 42. Page 42 GPU Hardware Support (table columns: Extension, Fermi, Kepler, Maxwell 1 and K1*, Maxwell 2 and X1*)
      ARB_ES3_2_compatibility ✓ ✓ ✓ ✓
      ARB_parallel_shader_compile ✓ ✓ ✓ ✓
      ARB_gpu_shader_int64 ✓ ✓ ✓ ✓
      ARB_shader_atomic_counter_ops ✓ ✓ ✓ ✓
      ARB_shader_clock ✓ ✓ ✓
      ARB_shader_ballot ✓ ✓ ✓
      ARB_fragment_shader_interlock ✓
      ARB_sample_locations ✓
      ARB_post_depth_coverage ✓
      ARB_shader_viewport_layer_array ✓
      ARB_texture_filter_minmax ✓
      ARB_sparse_texture2 ✓ †
      ARB_sparse_texture_clamp ✓ †
      * = Tegra driver support later
      † = assumes OS support for sparse resources
  • 43. Page 43 Thanks •Multi-vendor effort! •Particular thanks to specification leads - Pat Brown (NVIDIA) - Piers Daniell (NVIDIA) - Slawomir Grajewski (Intel) - Daniel Koch (NVIDIA) - Jon Leech (Khronos) - Timothy Lottes (AMD) - Daniel Rakos (AMD) - Graham Sellers (AMD) - Eric Werness (NVIDIA)
  • 44. Page 44 How to get OpenGL 2015 drivers now • NVIDIA developer web site - https://developer.nvidia.com/opengl-driver • For Quadro and GeForce - Windows, version 355.58 - Linux, version 355.00.05 - Newer versions may be available • Supported NVIDIA GPU generations - Maxwell - Many extensions in the set, such as ARB_fragment_shader_interlock, need the new Maxwell 2 GPU generation - Example: GeForce 9xx, Titan X, Quadro M6000 - Kepler - Fermi
  • 45. Page 45 NVIDIA’s driver also includes OpenGL ES 3.2 • Desktop OpenGL driver can create a compliant ES 3.2 context - Develop on a PC, then move your working ES 3.2 code to a mobile device - OpenGL ES 3.2 is basically Android Extension Pack (AEP), standardized by Khronos now • The extensions below are part of OpenGL ES 3.2 core specification now, but they can still be used in contexts below OpenGL ES 3.2 as extensions on supported hardware: - OES_gpu_shader5 - OES_primitive_bounding_box - OES_shader_io_blocks - OES_tessellation_shader - OES_texture_border_clamp - OES_texture_buffer - OES_texture_cube_map_array - OES_draw_elements_base_vertex - KHR_robustness - EXT_color_buffer_float - KHR_debug - KHR_texture_compression_astc_ldr - KHR_blend_equation_advanced - OES_sample_shading - OES_sample_variables - OES_shader_image_atomic - OES_shader_multisample_interpolation - OES_texture_stencil8 - OES_texture_storage_multisample_2d_array - OES_copy_image - OES_draw_buffers_indexed - OES_geometry_shader
  • 46. Page 46 Conclusions •NEW standard OpenGL Extensions announced at SIGGRAPH for 2015 •NVIDIA already shipping support for all these extensions - Released same day Khronos announced the functionality •Get latest Maxwell 2 generation GPU to access extensions depending on latest hardware