SG4121: OPENGL 4.5 UPDATE FOR
NVIDIA GPUS
Mark Kilgard
Principal System Software Engineer, NVIDIA
Piers Daniell
Senior Graphics Software Engineer, NVIDIA
Mark Kilgard
• Principal System Software Engineer
– OpenGL driver and API evolution
– Cg (“C for graphics”) shading language
– GPU-accelerated path rendering
• OpenGL Utility Toolkit (GLUT) implementer
• Author of OpenGL for the X Window System
• Co-author of Cg Tutorial
• Worked on OpenGL for 20+ years
Piers Daniell
• Senior Graphics Software Engineer
• NVIDIA’s Khronos OpenGL representative
– Since 2010
– Authored numerous OpenGL
extension specifications now core
• Leads OpenGL version updates
– Since OpenGL 4.1
• 10+ years with NVIDIA
NVIDIA’s OpenGL Leverage
Debugging with Nsight, Programmable Graphics, Tegra, Quadro, OptiX, GeForce, Adobe Creative Cloud
Single 3D API for Every Platform
OS X
Linux
FreeBSD
Solaris
Android
Windows
Adobe Creative Cloud:
GPU-accelerated Illustrator
• 27-year-old application
– World’s leading graphics design application
• 6 million users
– Never used the GPU
• Until June 2014
• Adobe and NVIDIA worked to integrate NV_path_rendering into Illustrator CC 2014
OpenGL 4.x Evolution
 Major revision of OpenGL every year since OpenGL 3.0, 2008
 Maintained full backwards compatibility
2010 2011 2012 2013 2014
OpenGL 4.0: Tessellation
OpenGL 4.1: Shader mix-and-match, ES2 compatibility
OpenGL 4.2: GLSL upgrades and shader image load store
OpenGL 4.3: Compute shaders, SSBO, ES3 compatibility
OpenGL 4.4: Persistently mapped buffers, multi bind
???
Big News: OpenGL 4.5 Released Today!
 Direct State Access (DSA) finally!
 Robustness
 OpenGL ES 3.1 compatibility
 Faster MakeCurrent
 DirectX 11 features for porting and emulation
 SubImage variant of GetTexImage
 Texture barriers
 Sparse buffers (ARB extension)
So OpenGL Evolution Through 4.5
 Major revision of OpenGL every year since 2008
 Maintained full backwards compatibility
2010 2011 2012 2013 2014
OpenGL 4.0: Tessellation
OpenGL 4.1: Shader mix-and-match, ES2 compatibility
OpenGL 4.2: GLSL upgrades and shader image load store
OpenGL 4.3: Compute shaders, SSBO, ES3 compatibility
OpenGL 4.4: Persistently mapped buffers, multi bind
OpenGL 4.5: Direct state access, robustness, ES3.1
OpenGL Evolves Modularly
• Each core revision is specified as a set of extensions
– Example: ARB_ES3_1_compatibility
• Puts together all the functionality for ES 3.1 compatibility
• Described in its own text file
– May have dependencies on other extensions
• Dependencies are stated explicitly
• A core OpenGL revision (such as OpenGL 4.5) “bundles” a set of agreed
extensions — and mandates their mutual support
– Note: implementations can also “unbundle” ARB extensions for hardware unable
to support the latest core revision
• So easiest to describe OpenGL 4.5 based on its bundled extensions…
4.5
ARB_direct_state_access
ARB_clip_control
many more …
OpenGL 4.5 as extensions
• All new features in OpenGL 4.5 can be used with GL contexts 4.0 through 4.4 via extensions:
– ARB_clip_control
– ARB_conditional_render_inverted
– ARB_cull_distance
– ARB_shader_texture_image_samples
– ARB_ES3_1_compatibility
– ARB_direct_state_access
– KHR_context_flush_control
– ARB_get_texture_subimage
– KHR_robustness
– ARB_texture_barrier
• Areas covered: API compatibility (Direct3D, OpenGL ES), API improvements, browser security (WebGL), texture and framebuffer memory consistency
 Along with OpenGL 4.5, Khronos has released ARB extensions
 ARB_sparse_buffer
 DirectX 11 features
— ARB_pipeline_statistics_query
— ARB_transform_feedback_overflow_query
 NVIDIA supports the above on all OpenGL 4.x hardware
— Fermi, Kepler and Maxwell
— GeForce, Quadro and Tegra K1
NVIDIA OpenGL 4.5 beta Driver
 Available today!
 https://developer.nvidia.com/opengl-driver
— Or just Google “opengl driver” – it’s the first hit!
— Windows and Linux
 Supports all OpenGL 4.5 features and all ARB/KHR extensions
 Available on Fermi, Kepler and Maxwell GPUs
— GeForce and Quadro
— Desktop and Laptop
Using OpenGL 4.5
• OpenGL 4.5 has 118 new functions. Eek.
• How do you deal with all that? The easy way…
• Use the OpenGL Extension Wrangler (GLEW)
– Release 1.11.0 already has OpenGL 4.5 support
– http://glew.sourceforge.net/
Direct State Access (DSA)
 Read and modify object state directly without bind-to-edit
 Performance benefit in many cases
 Context binding state unmodified
— Convenient for tools and middleware
— Avoids redundant state changes
 Derived from EXT_direct_state_access
More Efficient Middleware
• Before DSA
void Texture2D::SetMagFilter(GLenum filter)
{
    GLint oldTex;  // glGetIntegerv writes a GLint
    glGetIntegerv(GL_TEXTURE_BINDING_2D, &oldTex);
    glBindTexture(GL_TEXTURE_2D, m_tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, filter);
    glBindTexture(GL_TEXTURE_2D, (GLuint)oldTex);  // restore the previous binding
}
• After DSA
void Texture2D::SetMagFilter(GLenum filter)
{
    glTextureParameteri(m_tex, GL_TEXTURE_MAG_FILTER, filter);
}
Simplified Code
 Before DSA
GLuint tex[2];
glGenTextures(2, tex);
glActiveTexture(GL_TEXTURE0 + 0);
glBindTexture(GL_TEXTURE_2D, tex[0]);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 8, 8);
glActiveTexture(GL_TEXTURE0 + 1);
glBindTexture(GL_TEXTURE_2D, tex[1]);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 4, 4);
 After DSA
GLuint tex[2];
glCreateTextures(GL_TEXTURE_2D, 2, tex);
glTextureStorage2D(tex[0], 1, GL_RGBA8, 8, 8);
glTextureStorage2D(tex[1], 1, GL_RGBA8, 4, 4);
glBindTextures(0, 2, tex);
More Direct Framebuffer Access
 Before DSA
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, msFBO);
DrawStuff();
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, nonMsFBO);
glBindFramebuffer(GL_READ_FRAMEBUFFER, msFBO);
glBlitFramebuffer(...);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, msFBO);
 After DSA
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, msFBO);
DrawStuff();
glBlitNamedFramebuffer(msFBO, nonMsFBO, ...);
DSA Create Functions
glCreate*                   Creates
glCreateBuffers             Buffer Objects
glCreateRenderbuffers       Renderbuffer Objects
glCreateTextures(<target>)  Texture Objects of a specific target
glCreateFramebuffers        Framebuffer Objects
glCreateVertexArrays        Vertex Array Objects
glCreateProgramPipelines    Program Pipeline Objects
glCreateSamplers            Sampler Objects
glCreateQueries(<target>)   Query Objects of a specific target
 Generates name AND creates object
 Bind-to-create not needed
DSA Texture Functions
Non-DSA                          DSA
glGenTextures + glBindTexture    glCreateTextures
glTexStorage*                    glTextureStorage*
glTexSubImage*                   glTextureSubImage*
glCopyTexSubImage*               glCopyTextureSubImage*
glGetTexImage                    glGetTextureImage
glCompressedTexSubImage*         glCompressedTextureSubImage*
glGetCompressedTexImage          glGetCompressedTextureImage
glActiveTexture + glBindTexture  glBindTextureUnit
glTexBuffer[Range]               glTextureBuffer[Range]
glGenerateMipmap                 glGenerateTextureMipmap
gl[Get]TexParameter*             gl[Get]TextureParameter*
DSA Renderbuffer Functions
Non-DSA                                  DSA
glGenRenderbuffers + glBindRenderbuffer  glCreateRenderbuffers
glRenderbufferStorage*                   glNamedRenderbufferStorage*
glGetRenderbufferParameteriv             glGetNamedRenderbufferParameteriv
DSA Framebuffer Functions
Non-DSA                                DSA
glGenFramebuffers + glBindFramebuffer  glCreateFramebuffers
glFramebufferRenderbuffer              glNamedFramebufferRenderbuffer
glFramebufferTexture[Layer]            glNamedFramebufferTexture[Layer]
glDrawBuffer[s]                        glNamedFramebufferDrawBuffer[s]
glReadBuffer                           glNamedFramebufferReadBuffer
glInvalidate[Sub]Framebuffer           glInvalidateNamedFramebuffer[Sub]Data
glClearBuffer*                         glClearNamedFramebuffer*
glBlitFramebuffer                      glBlitNamedFramebuffer
glCheckFramebufferStatus               glCheckNamedFramebufferStatus
glFramebufferParameteri                glNamedFramebufferParameteri
glGetFramebuffer*Parameter*            glGetNamedFramebuffer*Parameter*
DSA Buffer Object Functions
Non-DSA                      DSA
glGenBuffers + glBindBuffer  glCreateBuffers
glBufferStorage              glNamedBufferStorage
glBuffer[Sub]Data            glNamedBuffer[Sub]Data
glCopyBufferSubData          glCopyNamedBufferSubData
glClearBuffer[Sub]Data       glClearNamedBuffer[Sub]Data
glMapBuffer[Range]           glMapNamedBuffer[Range]
glUnmapBuffer                glUnmapNamedBuffer
glFlushMappedBufferRange     glFlushMappedNamedBufferRange
glGetBufferParameteri*       glGetNamedBufferParameteri*
glGetBufferPointerv          glGetNamedBufferPointerv
glGetBufferSubData           glGetNamedBufferSubData
DSA Transform Feedback Functions
Non-DSA                            DSA
glGenTransformFeedbacks + glBind   glCreateTransformFeedbacks
glBindBuffer{Base|Range}           glTransformFeedbackBuffer{Base|Range}
glGetInteger*                      glGetTransformFeedbacki*
DSA Vertex Array Object (VAO) Functions
Non-DSA                                DSA
glGenVertexArrays + glBindVertexArray  glCreateVertexArrays
glEnableVertexAttribArray              glEnableVertexArrayAttrib
glDisableVertexAttribArray             glDisableVertexArrayAttrib
glBindBuffer(ELEMENT_ARRAY_BUFFER)     glVertexArrayElementBuffer
glBindVertexBuffer[s]                  glVertexArrayVertexBuffer[s]
glVertexAttrib*Format                  glVertexArrayAttrib*Format
glVertexBindingDivisor                 glVertexArrayBindingDivisor
glGetInteger*                          glGetVertexArray*
EXT_direct_state_access Differences
 Only OpenGL 4.5 core functionality supported
 Some minor name changes to some functions
— Mostly the same, but drops EXT suffix
 TextureParameterfEXT -> TextureParameterf
— VAO function names shortened
 glVertexArrayVertexBindingDivisorEXT -> glVertexArrayBindingDivisor
— Texture functions no longer require a target parameter
 Target comes from glCreateTextures(<target>,)
 Use “3D” functions with CUBE_MAP where z specifies the face
 DSA functions can no longer create objects
— Use glCreate* functions to create name and object at once
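To make the "z specifies the face" point concrete, here is a small sketch of how a cube map face enum could be turned into the zoffset passed to a "3D" DSA call such as glTextureSubImage3D (with depth 1). It is plain C++ with the enum values copied from the GL headers so it compiles alone; `cubeFaceToZOffset` is a hypothetical helper name.

```cpp
#include <cassert>

// Enum values from the OpenGL headers, repeated so the sketch compiles alone.
constexpr unsigned kCubeMapPositiveX = 0x8515;  // GL_TEXTURE_CUBE_MAP_POSITIVE_X
constexpr unsigned kCubeMapNegativeZ = 0x851A;  // GL_TEXTURE_CUBE_MAP_NEGATIVE_Z

// Faces are numbered +X, -X, +Y, -Y, +Z, -Z, so the layer index is the
// distance from the +X enum.
int cubeFaceToZOffset(unsigned faceTarget) {
    assert(faceTarget >= kCubeMapPositiveX && faceTarget <= kCubeMapNegativeZ);
    return static_cast<int>(faceTarget - kCubeMapPositiveX);
}
```

Code ported from the EXT entry points, which took a face target, could use a mapping like this when switching to the target-less core functions.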
Robustness
 ARB_robustness functionality now part of OpenGL 4.5
— Called KHR_robustness for use with OpenGL ES too
— Does not include compatibility functions
 Adds “safe” APIs for queries that return data to user pointers
 Adds mechanism for app to learn about GPU resets
— Due to my app or some other misbehaving app
 Stronger out-of-bounds behavior
— No more undefined behavior
 Used by WebGL implementations to deal with Denial of
Service (DOS) attacks
Robustness API
• Before Robustness
GLubyte tooSmall[NOT_BIG_ENOUGH];
glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, tooSmall);
// CRASH!!
• After Robustness
GLubyte tooSmall[NOT_BIG_ENOUGH];
glReadnPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, sizeof tooSmall, tooSmall);
// No CRASH, glGetError() returns GL_INVALID_OPERATION
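For a sense of what "big enough" means here, the sketch below estimates the bytes a readback needs, so an application could size its buffer before calling glReadnPixels. It is a CPU-only approximation that assumes default pack state apart from the alignment (no GL_PACK_ROW_LENGTH, skips, etc.) and counts the last row at full stride, which is slightly conservative. All function names are hypothetical.

```cpp
#include <cstddef>

// Bytes in one packed row, rounded up to the pack alignment
// (GL_PACK_ALIGNMENT defaults to 4).
std::size_t rowStride(std::size_t width, std::size_t bytesPerPixel,
                      std::size_t packAlignment = 4) {
    std::size_t raw = width * bytesPerPixel;
    return (raw + packAlignment - 1) / packAlignment * packAlignment;
}

// Conservative size: every row, including the last, counted at full stride.
std::size_t readbackBytes(std::size_t width, std::size_t height,
                          std::size_t bytesPerPixel,
                          std::size_t packAlignment = 4) {
    return rowStride(width, bytesPerPixel, packAlignment) * height;
}

// Would a readback of this size fit in a buffer of bufSize bytes?
bool readbackFits(std::size_t bufSize, std::size_t width, std::size_t height,
                  std::size_t bytesPerPixel) {
    return readbackBytes(width, height, bytesPerPixel) <= bufSize;
}
```

With GL_RGBA and GL_UNSIGNED_BYTE the bytesPerPixel argument would be 4; the bufSize parameter of glReadnPixels lets the driver perform the equivalent check itself.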
Robustness Reset Notification
 Typical render loop with reset check
while (!quit) {
DrawStuff();
SwapBuffers();
if (glGetGraphicsResetStatus() != GL_NO_ERROR) {
quit = true;
}
}
DestroyContext(glrc);
 Reset is asynchronous
— GL will behave as normal after a reset event but rendering commands
may not produce the right results
— The GL context should be destroyed
— Notify the user
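The status returned by glGetGraphicsResetStatus distinguishes "my app" from "some other misbehaving app". A minimal sketch of interpreting it follows; the enum values are copied from the GL headers so the code builds without them, and `describeResetStatus` and `mustRecreateContext` are hypothetical helper names, not part of the API.

```cpp
#include <string>

// Values as defined for KHR_robustness / OpenGL 4.5, repeated so that the
// sketch compiles without GL headers.
constexpr unsigned kNoError              = 0;       // GL_NO_ERROR
constexpr unsigned kGuiltyContextReset   = 0x8253;  // GL_GUILTY_CONTEXT_RESET
constexpr unsigned kInnocentContextReset = 0x8254;  // GL_INNOCENT_CONTEXT_RESET
constexpr unsigned kUnknownContextReset  = 0x8255;  // GL_UNKNOWN_CONTEXT_RESET

// Human-readable text for the message shown to the user.
std::string describeResetStatus(unsigned status) {
    switch (status) {
        case kNoError:              return "no reset";
        case kGuiltyContextReset:   return "reset caused by this context";
        case kInnocentContextReset: return "reset caused by another context";
        case kUnknownContextReset:  return "reset with unknown cause";
        default:                    return "unexpected status";
    }
}

// Any status other than GL_NO_ERROR means the context should be destroyed.
bool mustRecreateContext(unsigned status) {
    return status != kNoError;
}
```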
OpenGL ES 3.1 Compatibility
 Adds new ES 3.1 features not already in GL
 Also adds #version 310 es GLSL shader support
 Compatibility profile required for full superset
— ES 3.1 allows client-side vertex arrays
— Allows application generated object names
— Has default Vertex Array Object (VAO)
 Desktop provides great development platform for ES 3.1
content
Desktop features in an ES profile
• NVIDIA GPUs provide all ANDROID_extension_pack_es31a features in an ES profile
– Geometry, Tessellation, Advanced blending, etc.
• Scene from Epic’s “Rivalry” OpenGL ES 3.1 + AEP demo running on Tegra K1
Using OpenGL ES 3.1 on Desktop
 The Windows WGL way
int attribList[] = {
WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
WGL_CONTEXT_MINOR_VERSION_ARB, 1,
WGL_CONTEXT_PROFILE_MASK_ARB, WGL_CONTEXT_ES_PROFILE_BIT_EXT,
0
};
HGLRC hglrc = wglCreateContextAttribsARB(wglGetCurrentDC(), NULL, attribList);
wglMakeCurrent(wglGetCurrentDC(), hglrc);
 On NVIDIA GPUs this is a fully conformant OpenGL ES 3.1
implementation
— http://www.khronos.org/conformance/adopters/conformant-products
New OpenGL ES 3.1 features
• glMemoryBarrierByRegion
– Like glMemoryBarrier, but potentially more efficient on tilers
• GLSL functionality
– imageAtomicExchange() support for float32
– gl_HelperInvocation fragment shader input
  • Know which pixels won’t get output
  • Skip useless cycles or unwanted side-effects
– mix() function now supports int, uint and bool
– gl_MaxSamples
  • Implementation maximum sample count
Faster MakeCurrent
 An implicit glFlush is called on MakeCurrent
— Makes switching contexts slow
 New WGL and GLX extensions allow glFlush to be skipped
— Commands wait in context queue
— App has more control over flush
 Provides 2x MakeCurrent performance boost
StartTimer();
for (int i = 0; i < iterations; ++i) {
DrawSimpleTriangle();
wglMakeCurrent(context[i % 2]);
}
StopTimer();
Disable Implicit glFlush on MakeCurrent
 The Windows way with WGL
int attribList[] = {
WGL_CONTEXT_MAJOR_VERSION_ARB, 4,
WGL_CONTEXT_MINOR_VERSION_ARB, 5,
WGL_CONTEXT_RELEASE_BEHAVIOR_ARB, WGL_CONTEXT_RELEASE_BEHAVIOR_NONE_ARB,
0
};
HGLRC hglrc = wglCreateContextAttribsARB(wglGetCurrentDC(), NULL, attribList);
wglMakeCurrent(wglGetCurrentDC(), hglrc);
DirectX 11 Features
 ARB_clip_control
 ARB_conditional_render_inverted
 ARB_cull_distance
 ARB_derivative_control
 ARB_shader_texture_image_samples
 ARB_pipeline_statistics_query (ARB extension)
 ARB_transform_feedback_overflow_query (ARB extension)
ARB_clip_control
 glClipControl(origin, depthMode);
— y-origin can be flipped during viewport transformation
— Depth clip range can be [0,1] instead of [-1,1]
 depthMode = GL_NEGATIVE_ONE_TO_ONE: Zw = ((f-n)/2) * Zd + (n+f)/2
 depthMode = GL_ZERO_TO_ONE: Zw = (f-n) * Zd + n
— Provides direct mapping of [0,1] depth clip coordinates to [0,1] depth
buffer values when f=1 and n=0
 No precision loss
origin=GL_LOWER_LEFT origin=GL_UPPER_LEFT
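The two depth mappings above can be checked numerically. The functions below are a direct transcription of the slide's formulas into plain C++ (no GL calls); the function names are made up for this sketch, with zd the normalized device depth and n, f the glDepthRange values.

```cpp
// Window-space depth for depthMode = GL_NEGATIVE_ONE_TO_ONE:
// Zw = ((f-n)/2) * Zd + (n+f)/2
double depthNegativeOneToOne(double zd, double n, double f) {
    return ((f - n) / 2.0) * zd + (n + f) / 2.0;
}

// Window-space depth for depthMode = GL_ZERO_TO_ONE:
// Zw = (f-n) * Zd + n
double depthZeroToOne(double zd, double n, double f) {
    return (f - n) * zd + n;
}
```

With n = 0 and f = 1 the zero-to-one mapping reduces to Zw = Zd, so no scale or bias is applied to the depth value, which is the "no precision loss" case on the slide.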
ARB_conditional_render_inverted
• Allows conditional rendering to use the negated query result
• Matches the DX11 ::SetPredication(, PredicateValue) option
• The query result is negated only once it has landed (is available)
– Otherwise rendering takes place
GLuint predicate;
glCreateQueries(GL_SAMPLES_PASSED, 1, &predicate);
glBeginQuery(GL_SAMPLES_PASSED, predicate);
DrawNothing(); // Draws nothing
glEndQuery(GL_SAMPLES_PASSED);
glBeginConditionalRender(predicate, GL_QUERY_WAIT_INVERTED);
DrawStuff(); // Scene is rendered since SAMPLES_PASSED==0
glEndConditionalRender();
• More useful with other query targets like GL_TRANSFORM_FEEDBACK_OVERFLOW
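The decision rule described on this slide can be written out as a tiny CPU-side model. This is an interpretation for illustration, not driver code: `conditionalRenderDraws` is a hypothetical name, and `resultAvailable` stands for whether the query result has landed (always true for the WAIT modes).

```cpp
// CPU-side model of the decision made inside a conditional render block.
//   resultAvailable: the query result has landed
//   result:          the query value (only meaningful when available)
//   inverted:        one of the *_INVERTED modes is in use
bool conditionalRenderDraws(bool resultAvailable, unsigned result, bool inverted) {
    if (!resultAvailable) {
        return true;  // NO_WAIT mode with no result yet: rendering takes place
    }
    bool passed = (result != 0);
    return inverted ? !passed : passed;  // negation applies to a landed result only
}
```

For the example above, the query lands with result 0 and the mode is inverted, so the model returns true and the scene is drawn.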
ARB_cull_distance
• Adds new gl_CullDistance[n] to Vertex, Tessellation, and Geometry shaders (VS, TCS, TES and GS)
• Like gl_ClipDistance, except that the whole primitive is culled when all of its vertices have a negative distance for the same cull plane
• Matches DX11 SV_CullDistance[n]
[Figure: two panels, each with a clipping plane. Left: negative and positive gl_ClipDistance sides, primitive clipped. Right: negative and positive gl_CullDistance sides, primitive culled.]
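The difference between clipping and culling can be modeled for a single plane and a single triangle. The sketch assumes the rule as worded in the OpenGL 4.5 specification, where a primitive is discarded by a cull distance only if every one of its vertices is negative for that plane; the type and function names are invented for the example.

```cpp
#include <array>

// One distance per triangle vertex, all measured against the same plane.
using TriangleDistances = std::array<float, 3>;

// gl_CullDistance model: the triangle is dropped whole when every vertex
// lies on the negative side of the plane.
bool culledByCullDistance(const TriangleDistances& d) {
    return d[0] < 0.0f && d[1] < 0.0f && d[2] < 0.0f;
}

// gl_ClipDistance model: the triangle has to be cut against the plane when
// some, but not all, of its vertices are on the negative side.
bool needsClipping(const TriangleDistances& d) {
    bool anyNegative = d[0] < 0.0f || d[1] < 0.0f || d[2] < 0.0f;
    bool allNegative = d[0] < 0.0f && d[1] < 0.0f && d[2] < 0.0f;
    return anyNegative && !allNegative;
}
```

A triangle that straddles the plane is therefore kept intact under the cull model but trimmed under the clip model.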
ARB_derivative_control
 Adds “coarse” and “fine” variant of GLSL derivative functions
 dFdxCoarse, dFdyCoarse
— Potentially faster performance
 dFdxFine, dFdyFine
— More correct
— Default behavior of old dFdx and dFdy functions
 fwidthCoarse and fwidthFine are also added
[Figure: two 2x2 fragment quads. In the dFdxCoarse quad both rows show the same value; in the dFdxFine quad each row has its own dFdxFine value.]
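A CPU model of a 2x2 fragment quad may help show how the coarse and fine variants can differ. This is a sketch only: the `Quad` type and function names are hypothetical, and which row the coarse variant samples is implementation-dependent, so using row 0 is an assumption of this example.

```cpp
// Values of some expression at the four fragments of a 2x2 quad: v[row][col].
struct Quad {
    float v[2][2];
};

// Fine: each row uses its own horizontal difference.
float dFdxFineModel(const Quad& q, int row) {
    return q.v[row][1] - q.v[row][0];
}

// Coarse: one difference is shared by the whole quad. Row 0 is an
// assumption here; hardware may pick differently.
float dFdxCoarseModel(const Quad& q) {
    return q.v[0][1] - q.v[0][0];
}
```

When the expression varies linearly across the quad both variants agree; they diverge only when the two rows have different horizontal slopes.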
ARB_shader_texture_image_samples
 New GLSL built-ins to query the sample count of multi-sample
texture and image resources
— textureSamples
— imageSamples
 Equivalent to the NumberOfSamples return with the
GetDimensions query in HLSL
#version 450 core
uniform sampler2DMS tex;
out vec4 color;
void main() {
    if (textureSamples(tex) > 2) {
        color = DoFancyDownsample(tex);
    } else {
        color = DoSimpleDownsample(tex);
    }
}
ARB_pipeline_statistics_query
• New queries for profiling and DX11 compatibility
– GL_VERTICES_SUBMITTED
  • Number of vertices submitted to the GL
– GL_PRIMITIVES_SUBMITTED
  • Number of primitives submitted to the GL
– GL_VERTEX_SHADER_INVOCATIONS
  • Number of times the vertex shader has been invoked
– GL_TESS_CONTROL_SHADER_PATCHES
  • Number of patches processed by the tessellation control shader
– GL_TESS_EVALUATION_SHADER_INVOCATIONS
  • Number of times the tessellation evaluation shader has been invoked
ARB_pipeline_statistics_query cont.
• More queries
– GL_GEOMETRY_SHADER_INVOCATIONS
  • Number of times the geometry shader has been invoked
– GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED
  • Total number of primitives emitted by the geometry shader
– GL_FRAGMENT_SHADER_INVOCATIONS
  • Number of times the fragment shader has been invoked
– GL_COMPUTE_SHADER_INVOCATIONS
  • Number of times the compute shader has been invoked
– GL_CLIPPING_INPUT_PRIMITIVES
– GL_CLIPPING_OUTPUT_PRIMITIVES
  • Input and output primitives of the clipping stage
ARB_transform_feedback_overflow_query
• Target queries to indicate Transform Feedback Buffer overflow
– GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB
– GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB
  • Use glBeginQueryIndexed to specify a specific stream
• The result can be used with conditional render
GLuint predicate;
glCreateQueries(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB, 1, & predicate);
glBeginQuery(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB, predicate);
glBeginTransformFeedback(GL_TRIANGLES);
DrawLotsOfStuff();
glEndTransformFeedback();
glEndQuery(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB);
glBeginConditionalRender(predicate, GL_QUERY_NO_WAIT_INVERTED);
DrawStuff(); // Scene not rendered if XFB overflowed buffers
glEndConditionalRender();
… glEnd() // DX11 Features
Texture Barrier
 Allows rendering to a bound texture
— Use glTextureBarrier() to safely read previously written texels
— Behavior is now defined with use of texture barriers
 Allows render-to-texture algorithms to ping-pong without
expensive Framebuffer Object (FBO) changes
— Bind 2D texture array for texturing and as a layered FBO attachment
[Figure: ping-pong sequence on one texture array: draw to gl_Layer=0, glTextureBarrier(), then texture from it while drawing to gl_Layer=1.]
Programmable Blending
• Limited form of programmable blending with non-self-overlapping draw calls
– Bind texture as a render target and for texturing
glBindTexture(GL_TEXTURE_2D, tex);
glFramebufferTexture(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex, 0);
dirtybbox.empty();
foreach (object in scene) {
if (dirtybbox.overlaps(object.bbox())) {
glTextureBarrier();
dirtybbox.empty();
}
object.draw();
dirtybbox = bound(dirtybbox, object.bbox());
}
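The dirty-bounding-box loop above is pseudocode; its bookkeeping can be tried out on the CPU. The sketch below replaces the GL calls with a counter so that the number of barriers for a given draw order can be inspected. `Box`, `overlaps`, `unite` and `countBarriers` are hypothetical names, and boxes are axis-aligned rectangles in screen space.

```cpp
#include <algorithm>
#include <vector>

struct Box {
    float x0, y0, x1, y1;  // min and max corners
};

bool overlaps(const Box& a, const Box& b) {
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

Box unite(const Box& a, const Box& b) {
    return {std::min(a.x0, b.x0), std::min(a.y0, b.y0),
            std::max(a.x1, b.x1), std::max(a.y1, b.y1)};
}

// Returns how many barriers the loop would issue for this draw order.
int countBarriers(const std::vector<Box>& objects) {
    bool haveDirty = false;  // false plays the role of an empty dirtybbox
    Box dirty{};
    int barriers = 0;
    for (const Box& obj : objects) {
        if (haveDirty && overlaps(dirty, obj)) {
            ++barriers;         // where glTextureBarrier() would be called
            haveDirty = false;  // dirtybbox.empty()
        }
        // object.draw() would go here
        dirty = haveDirty ? unite(dirty, obj) : obj;
        haveDirty = true;
    }
    return barriers;
}
```

Because the dirty region is a single growing box, the test is conservative: an object may trigger a barrier by overlapping the union even if it touches none of the earlier objects.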
Advanced Blending
 KHR_blend_equation_advanced created from
NV_blend_equation_advanced
 Supported by NVIDIA since r340 – June, 2014
— GL and ES profiles
 Supported natively on Maxwell and Tegra K1 GPUs
— Otherwise implemented seamlessly with shaders on Fermi and Kepler
 Implements a subset of NV_blend_equation_advanced modes
 Maxwell and Tegra K1 also provide
KHR_blend_equation_advanced_coherent
— Doesn’t require glBlendBarrierKHR between primitives that double-hit
color samples
KHR_blend_equation_advanced Modes
 GL_MULTIPLY_KHR
 GL_SCREEN_KHR
 GL_OVERLAY_KHR
 GL_SOFTLIGHT_KHR
 GL_HARDLIGHT_KHR
 GL_COLORDODGE_KHR
 GL_COLORBURN_KHR
 GL_DARKEN_KHR
 GL_LIGHTEN_KHR
 GL_DIFFERENCE_KHR
 GL_EXCLUSION_KHR
 GL_HSL_HUE_KHR
 GL_HSL_SATURATION_KHR
 GL_HSL_COLOR_KHR
 GL_HSL_LUMINOSITY_KHR
Get Texture Sub Image
 Like glGetTexImage, but now you can read a sub-region
 glGetTextureSubImage
— DSA only variant
void GetTextureSubImage(uint texture, int level,
int xoffset, int yoffset, int zoffset, sizei width,
sizei height, sizei depth, enum format, enum type,
sizei bufSize, void * pixels);
[Figure: sub-region of the texture at (xoffset, yoffset) with size width × height read back into pixels; callouts: Direct State Access, Robustness.]
 For GL_TEXTURE_CUBE_MAP targets zoffset specifies face
ARB_sparse_buffer
 Ability to have large buffer objects without the whole buffer
being resident
— Analogous to ARB_sparse_texture for buffer objects
 Application controls page residency
1) Create uncommitted buffer: glBufferStorage(,SPARSE_STORAGE_BIT_ARB)
2) Make pages resident: glBufferPageCommitmentARB(, offset, size, GL_TRUE);
[Figure: buffer divided into pages of GL_SPARSE_BUFFER_PAGE_SIZE_ARB bytes, with the committed range marked by offset and size.]
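Since residency is controlled per page, the offset and size given to glBufferPageCommitmentARB are expected to line up with the page size. As a rough sketch (assuming the page size has already been queried via GL_SPARSE_BUFFER_PAGE_SIZE_ARB; the 65536 used in the example values is only an assumed figure), a byte range can be widened to whole pages like this. `PageRange` and `alignToPages` are hypothetical names.

```cpp
#include <cstdint>

struct PageRange {
    std::uint64_t offset;  // page-aligned start
    std::uint64_t size;    // whole number of pages
};

// Expand [offset, offset + size) outward to page boundaries.
PageRange alignToPages(std::uint64_t offset, std::uint64_t size,
                       std::uint64_t pageSize) {
    std::uint64_t begin = offset / pageSize * pageSize;
    std::uint64_t end = (offset + size + pageSize - 1) / pageSize * pageSize;
    return {begin, end - begin};
}
```

For instance, a 1000-byte range starting at byte 65000 crosses a page boundary when pages are 65536 bytes, so two pages would have to be committed.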
Summary of GLSL 450 additions
• dFdxFine, dFdxCoarse, dFdyFine, dFdyCoarse
• textureSamples, imageSamples
• gl_CullDistance[gl_MaxCullDistances];
• #version 310 es
• imageAtomicExchange on float
• gl_HelperInvocation
• gl_MaxSamples
• mix() on int, uint and bool
OpenGL Demos on K1
Shield Tablet
• Tegra K1 runs Android
• Kepler GPU hardware in K1 supports the full OpenGL 4.5
feature set
– Today 4.4, expect 4.5 support
– OpenGL 4.5 is all the new stuff, plus tons of proven features
• Tessellation, compute, instancing
– Plus latest features: bindless, path rendering, blend modes
• Demos use GameWorks framework
– Write Android-ready OpenGL code that runs on Windows and Linux too
Programmable Tessellation Demo
on Android
Programmable Tessellation Demo
on Windows
Build, Deploy, and Debug Android Native
OpenGL Code Right in Visual Studio
GameWorks Compute Shader Example
layout(local_size_x = 16, local_size_y = 16) in;
layout(binding=0, rgba8) uniform mediump image2D inputImage;
layout(binding=1, rgba8) uniform mediump image2D resultImage;
void main() {
    float u = float(gl_GlobalInvocationID.x);
    float v = float(gl_GlobalInvocationID.y);
    vec4 inv = 1.0 - imageLoad(inputImage, ivec2(u,v));
    imageStore(resultImage, ivec2(u,v), inv);
}
GLSL compute shader to invert an image
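On the host side, a shader with a 16x16 local size needs enough work groups to cover the image. The helper below sketches that arithmetic in plain C++; `groupCount` is a hypothetical name, and the two results would presumably be passed as the x and y arguments of glDispatchCompute.

```cpp
// Number of work groups needed so that groups of localSize invocations
// cover imageSize pixels along one axis (rounded up).
unsigned groupCount(unsigned imageSize, unsigned localSize) {
    return (imageSize + localSize - 1) / localSize;
}
```

For a 1920x1080 image this gives 120 by 68 groups. The last row of groups then extends past the image edge, so a production shader might add a bounds check that the slide version omits.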
Massive Compute Shader Particle Simulation
Mega Geometry with Instancing
glDrawElementsInstanced + glVertexAttribDivisor
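The effect of glVertexAttribDivisor on which array element each instance reads can be modeled in a few lines. This sketch follows the rule from the OpenGL specification for non-zero divisors, floor(instance / divisor) + baseInstance; `instancedElement` is a hypothetical helper, and a divisor of 0 (per-vertex stepping) is deliberately outside its scope.

```cpp
#include <cassert>

// Index of the attribute array element fetched for a given instance.
// divisor must be non-zero; 0 would mean the attribute advances per vertex.
unsigned instancedElement(unsigned instance, unsigned divisor,
                          unsigned baseInstance = 0) {
    assert(divisor != 0);
    return instance / divisor + baseInstance;
}
```

With a divisor of 1 every instance gets its own element (for example a per-instance transform); with a divisor of 2, pairs of instances share one.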
Getting GameWorks
• Get Tegra Android Development Pack (TADP)
– All the tools you need for Android development
• Windows or Linux
– Includes GameWorks samples
• Samples also available on Github
https://github.com/NVIDIAGameWorks/OpenGLSamples
OpenGL Debug Features
 KHR_debug added to OpenGL 4.3
 App has access to driver “stderr” message stream
— Via Callback function or
— Query of message queue
 Any object can have a meaningful “label”
 Driver can tell app about
— Errors
— Performance warnings
— Hazards
— Usage hints
 App can insert own events into stream for marking
Why is my screen blank?
void DrawTexture()
{
GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_R8, 32, 32, 0, GL_RED, GL_UNSIGNED_BYTE, pixels);
glEnable(GL_TEXTURE_2D);
glBegin(GL_QUADS); {
glTexCoord2f(0.0f, 0.0f); glVertex2f(-1.0f, -1.0f);
glTexCoord2f(1.0f, 0.0f); glVertex2f( 1.0f, -1.0f);
glTexCoord2f(1.0f, 1.0f); glVertex2f( 1.0f, 1.0f);
glTexCoord2f(0.0f, 1.0f); glVertex2f(-1.0f, 1.0f);
} glEnd();
SwapBuffers();
}
Oops – Texture is incomplete!
Enable Debug
 Can be done on-the-fly
void GLAPIENTRY DebugCallback(GLenum source, GLenum type, GLuint id, GLenum severity,
    GLsizei length, const GLchar* message, const void* userParam)
{
    printf("0x%X: %s\n", id, message);
}
void DebugDrawTexture()
{
glDebugMessageCallback(DebugCallback, NULL);
glDebugMessageControl(GL_DONT_CARE, GL_DONT_CARE, GL_DONT_CARE, 0, 0, GL_TRUE);
glEnable(GL_DEBUG_OUTPUT);
DrawTexture();
}
 The callback function outputs:
0x20084: Texture state usage warning: Texture 1 has no mipmaps, while its min
filter requires mipmap.
Works in non-debug context!
Give the texture a name
 Instead of “texture 1” – give it a name
void DrawTexture()
{
GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
GLchar texName[] = "Sky";
glObjectLabel(GL_TEXTURE, tex, -1, texName); // -1: label is null-terminated
...
}
 The callback function outputs:
0x20084: Texture state usage warning: Texture Sky has no mipmaps, while its min
filter requires mipmap.
Organize your debug trace
• Lots of text can get unwieldy
– Which part of my code does this error apply to?
• Use synchronous debug output:
– glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
– Effectively disables dual-core driver
– So your callback goes to your calling application thread
– Instead of a driver internal thread
• Use groups and markers
– App injects markers to annotate debug output
– Push/pop groups to easily control volume
Notating debug with groups
 Use a group
void DebugDrawTexture()
{
    ...
    GLchar groupName[] = "DrawTexture";
    glPushDebugGroup(GL_DEBUG_SOURCE_APPLICATION, 0x1234,
                     -1, groupName); // -1: message is null-terminated
    glDebugMessageControl(...); // Can change volume if needed
    DrawTexture();
    glPopDebugGroup(); // Old debug volume restored
}
 Improved output
0x1234: DrawTexture PUSH
0x20084: Texture state usage warning: Texture Sky has no mipmaps, while its
min filter requires mipmap.
0x1234: DrawTexture POP
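The "old debug volume restored" behavior of push and pop can be pictured as a stack of settings. The class below is a deliberately simplified CPU model with a single on/off flag standing in for the full message-control state; `DebugVolume` and its methods are hypothetical names, not GL API.

```cpp
#include <vector>

// Simplified model of debug-group volume control: one on/off flag per
// group level instead of the full per-source/type/severity filter.
class DebugVolume {
public:
    void setEnabled(bool on) { stack_.back() = on; }    // like glDebugMessageControl
    bool enabled() const { return stack_.back(); }
    void push() { stack_.push_back(stack_.back()); }    // new group inherits the volume
    void pop() {                                        // previous volume comes back
        if (stack_.size() > 1) stack_.pop_back();
    }
private:
    std::vector<bool> stack_{true};  // default group: output enabled
};
```

Muting output inside a pushed group therefore has no effect on messages produced after the matching pop.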
Debug the easy way
Nsight: Interactive OpenGL debugging
 Frame Debugging and Profiling
 Shader Debugging and Pixel History
 Frame Debugging and Dynamic Shader Editing
 OpenGL API & Hardware Trace
 Supports up to OpenGL 4.2 Core
— And a bunch of useful extensions
 https://developer.nvidia.com/nvidia-nsight-visual-studio-edition
OpenGL related Linux improvements
 Support for EGL on desktop Linux within X11 (r331)
 OpenGL-based Framebuffer Capture (NvFBC), for remote
graphics (r331)
 Support for Quad-Buffered stereo + Composite X extension
(GLX_EXT_stereo_tree) (r337)
 Support for G-SYNC (Variable Refresh Rate) (r340)
 Support for Tegra K1: NVIDIA SOC with Kepler graphics core
— Linux Tegra K1 (Jetson) support leverages same X driver, OpenGL
implementation as desktop NVIDIA GPUs
— NVIDIA also contributing to Nouveau for K1 support
 Coming soon: Framebuffer Object creation dramatically
faster!
Beyond OpenGL 4.5: Path Rendering
Path rendering and blend modes
 Resolution-independent 2D rendering
 Not your classic 3D hardware rendering
Earlier Illustrator demo showed this
 NV_path_rendering +
NV_blend_equation_advanced
PostScript Tiger with Perspective Warping
No textures!
Paths rendered from
resolution-independent
2D paths (outlines)
Render Fancy Text from Outlines
Paths + Text + 3D all at once
Web Page Rendering
every glyph from its outlines!
Zoom in
and visualize
glyph outline
control points
Beyond OpenGL 4.5
 Advanced scene rendering with ARB_multi_draw_indirect
— Added to OpenGL 4.3
 Bring even more processing onto the GPU with
NV_bindless_multi_draw_indirect
— Even less work for the CPU – no Vertex Buffer Object (VBO) binds
between draws
 Covered in depth by Christoph Kubisch yesterday
— SG4117: OpenGL Scene Rendering Techniques
NV_bindless_multi_draw_indirect
 DrawIndirect combined with Bindless
struct DrawElementsIndirect {
    GLuint count;
    GLuint instanceCount;
    GLuint firstIndex;
    GLint baseVertex;
    GLuint baseInstance;
};
struct BindlessPtr {
    GLuint index;
    GLuint reserved;
    GLuint64 address; // The GL_BUFFER_GPU_ADDRESS_NV of the buffer object
    GLuint64 length;
};
struct DrawElementsIndirectBindlessCommandNV {
    DrawElementsIndirect cmd;
    GLuint reserved;
    BindlessPtr index;    // Change index buffer per draw iteration!
    BindlessPtr vertex[]; // Change vertex buffers per draw iteration!
};
void MultiDrawElementsIndirectBindlessNV(enum mode, enum type, const void *indirect,
    sizei drawCount, sizei stride, int vertexBufferCount);
Caveat: Does the CPU know the drawCount?
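Because each command carries a variable number of vertex buffer pointers, its size depends on vertexBufferCount. The sketch below mirrors the slide's structures with fixed-width types (so it builds without GL headers) and computes the byte distance between commands stored back to back. `CommandHeader` and `commandStride` are hypothetical names, and the layout checks reflect typical ABIs rather than a guarantee.

```cpp
#include <cstddef>
#include <cstdint>

struct DrawElementsIndirect {
    std::uint32_t count;
    std::uint32_t instanceCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
    std::uint32_t baseInstance;
};

struct BindlessPtr {
    std::uint32_t index;
    std::uint32_t reserved;
    std::uint64_t address;
    std::uint64_t length;
};

// Fixed part of a bindless command; the vertex buffer pointers follow it.
struct CommandHeader {
    DrawElementsIndirect cmd;
    std::uint32_t reserved;
    BindlessPtr index;
};

static_assert(sizeof(DrawElementsIndirect) == 20, "five 32-bit fields");
static_assert(sizeof(BindlessPtr) == 24, "two 32-bit and two 64-bit fields");
static_assert(sizeof(CommandHeader) == 48, "header without vertex pointers");

// Bytes occupied by one command that carries vertexBufferCount pointers.
std::size_t commandStride(int vertexBufferCount) {
    return sizeof(CommandHeader) +
           sizeof(BindlessPtr) * static_cast<std::size_t>(vertexBufferCount);
}
```

With two vertex buffers per draw, each command takes 96 bytes, which is the kind of value the stride parameter would describe for a tightly packed command array.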
NV_bindless_multi_draw_indirect_count
 Source the drawCount from a buffer object
void MultiDrawElementsIndirectBindlessCountNV(
enum mode,
enum type,
const void * indirect,
intptr drawCount,
sizei maxDrawCount,
sizei stride,
int vertexBufferCount
);
drawCount now an offset into the bound GL_PARAMETER_BUFFER_ARB buffer range.
Khronos OpenGL BOF at SIGGRAPH
 Date: Wednesday, August 13 2014
 Venue: Marriott Pinnacle Hotel, next to the Convention
Center
 Website: http://s2014.siggraph.org
 Times: 5pm-7pm OpenGL and OpenGL ES Track
 BOF After-Party: 7:30pm until late
— Rumor: Free beer and door prizes
Questions?

Contenu connexe

Tendances

FrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteFrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteElectronic Arts / DICE
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Graham Wihlidal
 
NVIDIA OpenGL in 2016
NVIDIA OpenGL in 2016NVIDIA OpenGL in 2016
NVIDIA OpenGL in 2016Mark Kilgard
 
Dissecting the Rendering of The Surge
Dissecting the Rendering of The SurgeDissecting the Rendering of The Surge
Dissecting the Rendering of The SurgePhilip Hammer
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)Johan Andersson
 
SIGGRAPH Asia 2008 Modern OpenGL
SIGGRAPH Asia 2008 Modern OpenGLSIGGRAPH Asia 2008 Modern OpenGL
SIGGRAPH Asia 2008 Modern OpenGLMark Kilgard
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14AMD Developer Central
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility bufferWolfgang Engel
 
Siggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsSiggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsTristan Lorach
 
The Guerrilla Guide to Game Code
The Guerrilla Guide to Game CodeThe Guerrilla Guide to Game Code
The Guerrilla Guide to Game CodeGuerrilla
 
Secrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologySecrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologyTiago Sousa
 
Efficient Buffer Management
Efficient Buffer ManagementEfficient Buffer Management
Efficient Buffer Managementbasisspace
 
Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Philip Hammer
 
NVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityNVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityMark Kilgard
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Johan Andersson
 
Deferred Rendering in Killzone 2
Deferred Rendering in Killzone 2Deferred Rendering in Killzone 2
Deferred Rendering in Killzone 2Guerrilla
 
Parallel Futures of a Game Engine (v2.0)
Parallel Futures of a Game Engine (v2.0)Parallel Futures of a Game Engine (v2.0)
Parallel Futures of a Game Engine (v2.0)Johan Andersson
 


OpenGL 4.5 Update for NVIDIA GPUs

  • 1. SG4121: OPENGL 4.5 UPDATE FOR NVIDIA GPUS
    Mark Kilgard, Principal System Software Engineer, NVIDIA
    Piers Daniell, Senior Graphics Software Engineer, NVIDIA
  • 2. Mark Kilgard
    • Principal System Software Engineer
      – OpenGL driver and API evolution
      – Cg (“C for graphics”) shading language
      – GPU-accelerated path rendering
    • OpenGL Utility Toolkit (GLUT) implementer
    • Author of OpenGL for the X Window System
    • Co-author of Cg Tutorial
    • Worked on OpenGL for 20+ years
  • 3. Piers Daniell
    • Senior Graphics Software Engineer
    • NVIDIA’s Khronos OpenGL representative
      – Since 2010
      – Authored numerous OpenGL extension specifications now core
    • Leads OpenGL version updates
      – Since OpenGL 4.1
    • 10+ years with NVIDIA
  • 4. NVIDIA’s OpenGL Leverage
    Diagram labels: Debugging with Nsight, Programmable Graphics, Tegra, Quadro, OptiX, GeForce, Adobe Creative Cloud
  • 5. Single 3D API for Every Platform
    OS X, Linux, FreeBSD, Solaris, Android, Windows
  • 6. Adobe Creative Cloud: GPU-accelerated Illustrator
    • 27-year-old application
      – World’s leading graphics design application
    • 6 million users
      – Never used the GPU
    • Until June 2014
    • Adobe and NVIDIA worked to integrate NV_path_rendering into Illustrator CC 2014
  • 7.
  • 8. OpenGL 4.x Evolution
    • Major revision of OpenGL every year since OpenGL 3.0, 2008
    • Maintained full backwards compatibility
    Timeline (2010 to 2014):
      – OpenGL 4.0: Tessellation
      – OpenGL 4.1: Shader mix-and-match, ES2 compatibility
      – OpenGL 4.2: GLSL upgrades and shader image load store
      – OpenGL 4.3: Compute shaders, SSBO, ES3 compatibility
      – OpenGL 4.4: Persistently mapped buffers, multi bind
      – ???
  • 9. Big News: OpenGL 4.5 Released Today!
    • Direct State Access (DSA) finally!
    • Robustness
    • OpenGL ES 3.1 compatibility
    • Faster MakeCurrent
    • DirectX 11 features for porting and emulation
    • SubImage variant of GetTexImage
    • Texture barriers
    • Sparse buffers (ARB extension)
  • 10. So OpenGL Evolution Through 4.5
    • Major revision of OpenGL every year since 2008
    • Maintained full backwards compatibility
    Timeline (2010 to 2014):
      – OpenGL 4.0: Tessellation
      – OpenGL 4.1: Shader mix-and-match, ES2 compatibility
      – OpenGL 4.2: GLSL upgrades and shader image load store
      – OpenGL 4.3: Compute shaders, SSBO, ES3 compatibility
      – OpenGL 4.4: Persistently mapped buffers, multi bind
      – OpenGL 4.5: Direct state access, robustness, ES3.1
  • 11. OpenGL Evolves Modularly • Each core revision is specified as a set of extensions – Example: ARB_ES3_1_compatibility • Puts together all the functionality for ES 3.1 compatibility • Described in its own text file – May have dependencies on other extensions • Dependencies are stated explicitly • A core OpenGL revision (such as OpenGL 4.5) “bundles” a set of agreed extensions and mandates their mutual support – Note: implementations can also “unbundle” ARB extensions for hardware unable to support the latest core revision • So it is easiest to describe OpenGL 4.5 based on its bundled extensions… [Figure: OpenGL 4.5 bundling ARB_direct_state_access, ARB_clip_control, and many more]
  • 12. OpenGL 4.5 as extensions • All new features to OpenGL 4.5 can be used with GL contexts 4.0 through 4.4 via extensions: – ARB_clip_control – ARB_conditional_render_inverted – ARB_cull_distance – ARB_shader_texture_image_samples – ARB_ES3_1_compatibility – ARB_direct_state_access – KHR_context_flush_control – ARB_get_texture_subimage – KHR_robustness – ARB_texture_barrier • The slide groups these under four labels: API Compatibility (Direct3D, OpenGL ES); API Improvements; Browser security (WebGL); Texture & framebuffer memory consistency
  • 13. Additional ARB extensions  Along with OpenGL 4.5, Khronos has released ARB extensions  ARB_sparse_buffer  DirectX 11 features — ARB_pipeline_statistics_query — ARB_transform_feedback_overflow_query  NVIDIA supports the above on all OpenGL 4.x hardware — Fermi, Kepler and Maxwell — GeForce, Quadro and Tegra K1
  • 14. NVIDIA OpenGL 4.5 beta Driver  Available today!  https://developer.nvidia.com/opengl-driver — Or just Google “opengl driver” – it’s the first hit! — Windows and Linux  Supports all OpenGL 4.5 features and all ARB/KHR extensions  Available on Fermi, Kepler and Maxwell GPUs — GeForce and Quadro — Desktop and Laptop
  • 15. Using OpenGL 4.5  OpenGL 4.5 has 118 New functions. Eek.  How do you deal with all that? The easy way…  Use the OpenGL Extension Wrangler (GLEW) — Release 1.11.0 already has OpenGL 4.5 support — http://glew.sourceforge.net/
  • 16. Direct State Access (DSA)  Read and modify object state directly without bind-to-edit  Performance benefit in many cases  Context binding state unmodified — Convenient for tools and middleware — Avoids redundant state changes  Derived from EXT_direct_state_access
  • 17. More Efficient Middleware • Before DSA: void Texture2D::SetMagFilter(GLenum filter) { GLint oldTex; glGetIntegerv(GL_TEXTURE_BINDING_2D, &oldTex); glBindTexture(GL_TEXTURE_2D, m_tex); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, filter); glBindTexture(GL_TEXTURE_2D, (GLuint)oldTex); } • After DSA: void Texture2D::SetMagFilter(GLenum filter) { glTextureParameteri(m_tex, GL_TEXTURE_MAG_FILTER, filter); }
  • 18. Simplified Code  Before DSA GLuint tex[2]; glGenTextures(2, tex); glActiveTexture(GL_TEXTURE0 + 0); glBindTexture(GL_TEXTURE_2D, tex[0]); glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 8, 8); glActiveTexture(GL_TEXTURE0 + 1); glBindTexture(GL_TEXTURE_2D, tex[1]); glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 4, 4);  After DSA GLuint tex[2]; glCreateTextures(GL_TEXTURE_2D, 2, tex); glTextureStorage2D(tex[0], 1, GL_RGBA8, 8, 8); glTextureStorage2D(tex[1], 1, GL_RGBA8, 4, 4); glBindTextures(0, 2, tex);
  • 19. More Direct Framebuffer Access  Before DSA glBindFramebuffer(GL_DRAW_FRAMEBUFFER, msFBO); DrawStuff(); glBindFramebuffer(GL_DRAW_FRAMEBUFFER, nonMsFBO); glBindFramebuffer(GL_READ_FRAMEBUFFER, msFBO); glBlitFramebuffer(...); glBindFramebuffer(GL_DRAW_FRAMEBUFFER, msFBO);  After DSA glBindFramebuffer(GL_DRAW_FRAMEBUFFER, msFBO); DrawStuff(); glBlitNamedFramebuffer(msFBO, nonMsFBO, ...);
  • 20. DSA Create Functions • glCreateBuffers: buffer objects • glCreateRenderbuffers: renderbuffer objects • glCreateTextures(<target>): texture objects of a specific target • glCreateFramebuffers: framebuffer objects • glCreateVertexArrays: vertex array objects • glCreateProgramPipelines: program pipeline objects • glCreateSamplers: sampler objects • glCreateQueries(<target>): query objects of a specific target • Generates name AND creates object • Bind-to-create not needed
  • 21. DSA Texture Functions (Non-DSA -> DSA) • glGenTextures + glBindTexture -> glCreateTextures • glTexStorage* -> glTextureStorage* • glTexSubImage* -> glTextureSubImage* • glCopyTexSubImage* -> glCopyTextureSubImage* • glGetTexImage -> glGetTextureImage • glCompressedTexSubImage* -> glCompressedTextureSubImage* • glGetCompressedTexImage -> glGetCompressedTextureImage • glActiveTexture + glBindTexture -> glBindTextureUnit • glTexBuffer[Range] -> glTextureBuffer[Range] • glGenerateMipmap -> glGenerateTextureMipmap • gl[Get]TexParameter* -> gl[Get]TextureParameter*
  • 22. DSA Renderbuffer Functions (Non-DSA -> DSA) • glGenRenderbuffers + glBindRenderbuffer -> glCreateRenderbuffers • glRenderbufferStorage* -> glNamedRenderbufferStorage* • glGetRenderbufferParameteriv -> glGetNamedRenderbufferParameteriv
  • 23. DSA Framebuffer Functions (Non-DSA -> DSA) • glGenFramebuffers + glBindFramebuffer -> glCreateFramebuffers • glFramebufferRenderbuffer -> glNamedFramebufferRenderbuffer • glFramebufferTexture[Layer] -> glNamedFramebufferTexture[Layer] • glDrawBuffer[s] -> glNamedFramebufferDrawBuffer[s] • glReadBuffer -> glNamedFramebufferReadBuffer • glInvalidateFramebuffer[Sub]Data -> glInvalidateNamedFramebuffer[Sub]Data • glClearBuffer* -> glClearNamedFramebuffer* • glBlitFramebuffer -> glBlitNamedFramebuffer • glCheckFramebufferStatus -> glCheckNamedFramebufferStatus • glFramebufferParameteri -> glNamedFramebufferParameteri • glGetFramebuffer*Parameter* -> glGetNamedFramebuffer*Parameter*
  • 24. DSA Buffer Object Functions (Non-DSA -> DSA) • glGenBuffers + glBindBuffer -> glCreateBuffers • glBufferStorage -> glNamedBufferStorage • glBuffer[Sub]Data -> glNamedBuffer[Sub]Data • glCopyBufferSubData -> glCopyNamedBufferSubData • glClearBuffer[Sub]Data -> glClearNamedBuffer[Sub]Data • glMapBuffer[Range] -> glMapNamedBuffer[Range] • glUnmapBuffer -> glUnmapNamedBuffer • glFlushMappedBufferRange -> glFlushMappedNamedBufferRange • glGetBufferParameteri* -> glGetNamedBufferParameteri* • glGetBufferPointerv -> glGetNamedBufferPointerv • glGetBufferSubData -> glGetNamedBufferSubData
  • 25. DSA Transform Feedback Functions (Non-DSA -> DSA) • glGenTransformFeedbacks + glBind -> glCreateTransformFeedbacks • glBindBuffer{Base|Range} -> glTransformFeedbackBuffer{Base|Range} • glGetInteger* -> glGetTransformFeedbacki*
  • 26. DSA Vertex Array Object (VAO) Functions (Non-DSA -> DSA) • glGenVertexArrays + glBindVertexArray -> glCreateVertexArrays • glEnableVertexAttribArray -> glEnableVertexArrayAttrib • glDisableVertexAttribArray -> glDisableVertexArrayAttrib • glBindBuffer(ELEMENT_ARRAY_BUFFER) -> glVertexArrayElementBuffer • glBindVertexBuffer[s] -> glVertexArrayVertexBuffer[s] • glVertexAttrib*Format -> glVertexArrayAttrib*Format • glVertexBindingDivisor -> glVertexArrayBindingDivisor • glGetInteger* -> glGetVertexArray*
  • 27. EXT_direct_state_access Differences  Only OpenGL 4.5 core functionality supported  Some minor name changes to some functions — Mostly the same, but drops EXT suffix  TextureParameterfEXT -> TextureParameterf — VAO function names shortened  glVertexArrayVertexBindingDivisorEXT -> glVertexArrayBindingDivisor — Texture functions no longer require a target parameter  Target comes from glCreateTextures(<target>,)  Use “3D” functions with CUBE_MAP where z specifies the face  DSA functions can no longer create objects — Use glCreate* functions to create name and object at once
  • 28. Robustness • ARB_robustness functionality now part of OpenGL 4.5 – Called KHR_robustness for use with OpenGL ES too – Does not include compatibility functions • Adds “safe” APIs for queries that return data to user pointers • Adds mechanism for app to learn about GPU resets – Due to my app or some other misbehaving app • Stronger out-of-bounds behavior – No more undefined behavior • Used by WebGL implementations to deal with Denial of Service (DoS) attacks
  • 29. Robustness API • Before Robustness: GLubyte tooSmall[NOT_BIG_ENOUGH]; glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, tooSmall); // CRASH!! • After Robustness: GLubyte tooSmall[NOT_BIG_ENOUGH]; glReadnPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, sizeof tooSmall, tooSmall); // No CRASH, glGetError() returns INVALID_OPERATION
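The size check that the bufSize argument makes possible can be sketched in plain C++. This is only an illustration of the arithmetic, not driver code: `requiredReadbackBytes` and `readbackFits` are hypothetical helper names, and the row padding rule assumes byte-sized components with the default GL_PACK_ALIGNMENT of 4.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a GL call): conservative byte count for reading
// back a width x height block of pixels, with every row padded up to
// packAlignment bytes (GL_PACK_ALIGNMENT defaults to 4).
std::size_t requiredReadbackBytes(std::size_t width, std::size_t height,
                                  std::size_t bytesPerPixel,
                                  std::size_t packAlignment = 4) {
    assert(packAlignment == 1 || packAlignment == 2 ||
           packAlignment == 4 || packAlignment == 8);
    const std::size_t rowBytes = width * bytesPerPixel;
    const std::size_t paddedRow =
        (rowBytes + packAlignment - 1) / packAlignment * packAlignment;
    return paddedRow * height;
}

// Mirrors the decision described on the slide: the write only happens when
// the caller's buffer is large enough, otherwise an error is reported.
bool readbackFits(std::size_t bufSize, std::size_t width, std::size_t height,
                  std::size_t bytesPerPixel) {
    return requiredReadbackBytes(width, height, bytesPerPixel) <= bufSize;
}
```

An application can run the same arithmetic up front to size its buffer, so the robust entry point never has to reject the call.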
  • 30. Robustness Reset Notification  Typical render loop with reset check while (!quit) { DrawStuff(); SwapBuffers(); if (glGetGraphicsResetStatus() != GL_NO_ERROR) { quit = true; } } DestroyContext(glrc);  Reset is asynchronous — GL will behave as normal after a reset event but rendering commands may not produce the right results — The GL context should be destroyed — Notify the user
  • 31. OpenGL ES 3.1 Compatibility  Adds new ES 3.1 features not already in GL  Also adds #version 310 es GLSL shader support  Compatibility profile required for full superset — ES 3.1 allows client-side vertex arrays — Allows application generated object names — Has default Vertex Array Object (VAO)  Desktop provides great development platform for ES 3.1 content
  • 32. Desktop features in an ES profile • NVIDIA GPUs provide all ANDROID_extension_pack_es31a features in an ES profile – Geometry, Tessellation, Advanced blending, etc. • Scene from Epic’s “Rivalry” OpenGL ES 3.1 + AEP demo running on Tegra K1
  • 33. Using OpenGL ES 3.1 on Desktop  The Windows WGL way int attribList[] = { WGL_CONTEXT_MAJOR_VERSION_ARB, 3, WGL_CONTEXT_MINOR_VERSION_ARB, 1, WGL_CONTEXT_PROFILE_MASK_ARB, WGL_CONTEXT_ES_PROFILE_BIT_EXT, 0 }; HGLRC hglrc = wglCreateContextAttribsARB(wglGetCurrentDC(), NULL, attribList); wglMakeCurrent(wglGetCurrentDC(), hglrc);  On NVIDIA GPUs this is a fully conformant OpenGL ES 3.1 implementation — http://www.khronos.org/conformance/adopters/conformant-products
  • 34. New OpenGL ES 3.1 features • glMemoryBarrierByRegion – Like glMemoryBarrier, but potentially more efficient on tilers • GLSL functionality – imageAtomicExchange() support for float32 – gl_HelperInvocation fragment shader input • Know which pixels won’t get output • Skip useless cycles or unwanted side-effects – mix() function now supports int, uint and bool – gl_MaxSamples • Implementation maximum sample count
  • 35. Faster MakeCurrent • An implicit glFlush is called on MakeCurrent – Makes switching contexts slow • New WGL and GLX extensions allow glFlush to be skipped – Commands wait in context queue – App has more control over flush • Provides 2x MakeCurrent performance boost • Benchmark loop: StartTimer(); for (int i = 0; i < iterations; ++i) { DrawSimpleTriangle(); wglMakeCurrent(wglGetCurrentDC(), context[i % 2]); } StopTimer();
  • 36. Disable Implicit glFlush on MakeCurrent  The Windows way with WGL int attribList[] = { WGL_CONTEXT_MAJOR_VERSION_ARB, 4, WGL_CONTEXT_MINOR_VERSION_ARB, 5, WGL_CONTEXT_RELEASE_BEHAVIOR_ARB, WGL_CONTEXT_RELEASE_BEHAVIOR_NONE_ARB, 0 }; HGLRC hglrc = wglCreateContextAttribsARB(wglGetCurrentDC(), NULL, attribList); wglMakeCurrent(wglGetCurrentDC(), hglrc);
  • 37. DirectX 11 Features  ARB_clip_control  ARB_conditional_render_inverted  ARB_cull_distance  ARB_derivative_control  ARB_shader_texture_image_samples  ARB_pipeline_statistics_query (ARB extension)  ARB_transform_feedback_overflow_query (ARB extension)
  • 38. ARB_clip_control • glClipControl(origin, depthMode); – y-origin can be flipped during viewport transformation (origin = GL_LOWER_LEFT or origin = GL_UPPER_LEFT) – Depth clip range can be [0,1] instead of [-1,1] • depthMode = GL_NEGATIVE_ONE_TO_ONE: Zw = ((f-n)/2) * Zd + (n+f)/2 • depthMode = GL_ZERO_TO_ONE: Zw = (f-n) * Zd + n – Provides direct mapping of [0,1] depth clip coordinates to [0,1] depth buffer values when f=1 and n=0 • No precision loss
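The two depth mappings on this slide can be checked with a few lines of plain C++. This is a minimal sketch of the formulas only; `DepthMode` and `windowDepth` are illustrative names, not part of OpenGL, and n and f stand for the glDepthRange near and far values.

```cpp
#include <cassert>

// Illustrative stand-in for the glClipControl depthMode argument.
enum class DepthMode { NegativeOneToOne, ZeroToOne };

// Maps normalized device depth zd to window depth, following the slide:
//   NegativeOneToOne: Zw = ((f-n)/2) * Zd + (n+f)/2
//   ZeroToOne:        Zw = (f-n) * Zd + n
double windowDepth(DepthMode mode, double zd, double n, double f) {
    assert(n >= 0.0 && n <= 1.0 && f >= 0.0 && f <= 1.0);
    if (mode == DepthMode::NegativeOneToOne) {
        return ((f - n) / 2.0) * zd + (n + f) / 2.0;
    }
    return (f - n) * zd + n;
}
```

With n = 0 and f = 1 the ZeroToOne branch reduces to Zw = Zd, which is the direct mapping the slide refers to; the NegativeOneToOne branch still applies a scale and bias of 0.5.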
  • 39. ARB_conditional_render_inverted • Allow conditional render to use the negated query result • Matches the DX11 ::SetPredication(, PredicateValue) option • Query result negation only happens to landed result – Otherwise rendering takes place • Example: GLuint predicate; glCreateQueries(GL_SAMPLES_PASSED, 1, &predicate); glBeginQuery(GL_SAMPLES_PASSED, predicate); DrawNothing(); // Draws nothing glEndQuery(GL_SAMPLES_PASSED); glBeginConditionalRender(predicate, GL_QUERY_WAIT_INVERTED); DrawStuff(); // Scene is rendered since SAMPLES_PASSED==0 glEndConditionalRender(); • More useful with other query targets like GL_TRANSFORM_FEEDBACK_OVERFLOW
  • 40. ARB_cull_distance • Adds new gl_CullDistance[n] to Vertex, Tessellation, and Geometry shaders (VS, TCS, TES and GS) • Like gl_ClipDistance, except that the whole primitive is culled when all of its vertices have a negative distance for the same plane • Matches DX11 SV_CullDistance[n] [Figure: two panels, each showing a clipping plane. (1) Negative and positive gl_ClipDistance sides, primitive clipped. (2) Negative and positive gl_CullDistance sides, primitive culled.]
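The difference between clipping and culling can be sketched for a single plane and a single triangle. The code below is a CPU-side model only, with made-up names (`Outcome`, `applyClipDistance`, `applyCullDistance`); it assumes the rule that a primitive is culled when every one of its vertices has a negative cull distance for the same plane, whereas clip distances trim the primitive where it crosses the plane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

enum class Outcome { Kept, PartiallyClipped, Discarded };

// One distance per triangle vertex, all measured against the same plane.
using TriangleDistances = std::array<float, 3>;

static std::size_t countNegative(const TriangleDistances& d) {
    std::size_t negatives = 0;
    for (float v : d) {
        if (v < 0.0f) ++negatives;
    }
    return negatives;
}

// gl_ClipDistance model: geometry on the negative side is trimmed away.
Outcome applyClipDistance(const TriangleDistances& d) {
    const std::size_t negatives = countNegative(d);
    if (negatives == 0) return Outcome::Kept;
    if (negatives == d.size()) return Outcome::Discarded;
    return Outcome::PartiallyClipped;
}

// gl_CullDistance model: all-or-nothing, no trimming.
Outcome applyCullDistance(const TriangleDistances& d) {
    assert(!d.empty());
    return countNegative(d) == d.size() ? Outcome::Discarded : Outcome::Kept;
}
```

A triangle with one vertex behind the plane is therefore trimmed under the clip model but passes through untouched under the cull model.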
  • 41. ARB_derivative_control • Adds “coarse” and “fine” variant of GLSL derivative functions • dFdxCoarse, dFdyCoarse – Potentially faster performance • dFdxFine, dFdyFine – More correct – Default behavior of old dFdx and dFdy functions • fwidthCoarse and fwidthFine are also added [Figure: two 2x2 quad fragment diagrams comparing dFdxCoarse and dFdxFine]
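A small CPU-side model may help show why the fine variant is "more correct". It is a sketch under an assumption: the coarse variant is modeled as one difference taken from row 0 and shared by the whole 2x2 quad, which is one behavior an implementation may choose, not the only one. `Quad`, `dFdxFineModel` and `dFdxCoarseModel` are illustrative names.

```cpp
#include <array>
#include <cassert>

// Values of some shader expression at the four fragments of a 2x2 quad,
// indexed as quad[row][column].
using Quad = std::array<std::array<float, 2>, 2>;

// Fine: the difference is taken along the row that contains the fragment,
// so the two rows of the quad can report different derivatives.
float dFdxFineModel(const Quad& q, int row) {
    assert(row == 0 || row == 1);
    return q[row][1] - q[row][0];
}

// Coarse (assumed behavior for this sketch): a single difference from
// row 0 is reused for all four fragments of the quad.
float dFdxCoarseModel(const Quad& q) {
    return q[0][1] - q[0][0];
}
```

For a quad whose rows change at different rates, the fine model returns a distinct value per row while the coarse model returns the row 0 value everywhere, trading accuracy for fewer computed differences.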
  • 42. ARB_shader_texture_image_samples • New GLSL built-ins to query the sample count of multi-sample texture and image resources – textureSamples – imageSamples • Equivalent to the NumberOfSamples value returned by the GetDimensions query in HLSL • Example: #version 450 core uniform sampler2DMS tex; out vec4 color; void main() { if (textureSamples(tex) > 2) { color = DoFancyDownsample(tex); } else { color = DoSimpleDownsample(tex); } }
  • 43. ARB_pipeline_statistics_query • New queries for profiling and DX11 compatibility – GL_VERTICES_SUBMITTED • Number of vertices submitted to the GL – GL_PRIMITIVES_SUBMITTED • Number of primitives submitted to the GL – GL_VERTEX_SHADER_INVOCATIONS • Number of times the vertex shader has been invoked – GL_TESS_CONTROL_SHADER_PATCHES • Number of patches processed by the tessellation control shader – GL_TESS_EVALUATION_SHADER_INVOCATIONS • Number of times the tessellation evaluation shader has been invoked
  • 44. ARB_pipeline_statistics_query cont. • More queries – GL_GEOMETRY_SHADER_INVOCATIONS • Number of times the geometry shader has been invoked – GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED • Total number of primitives emitted by geometry shader – GL_FRAGMENT_SHADER_INVOCATIONS • Number of times the fragment shader has been invoked – GL_COMPUTE_SHADER_INVOCATIONS • Number of times the compute shader has been invoked – GL_CLIPPING_INPUT_PRIMITIVES – GL_CLIPPING_OUTPUT_PRIMITIVES • Input and output primitives of the clipping stage
  • 45. ARB_transform_feedback_overflow_query • Target queries to indicate Transform Feedback Buffer overflow – GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB – GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB • Use glBeginQueryIndexed to specify a specific stream • The result can be used with conditional render: GLuint predicate; glCreateQueries(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB, 1, &predicate); glBeginQuery(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB, predicate); glBeginTransformFeedback(GL_TRIANGLES); DrawLotsOfStuff(); glEndTransformFeedback(); glEndQuery(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB); glBeginConditionalRender(predicate, GL_QUERY_NO_WAIT_INVERTED); DrawStuff(); // Scene not rendered if XFB overflowed buffers glEndConditionalRender();
  • 46. … glEnd() // DX11 Features
  • 47. Texture Barrier  Allows rendering to a bound texture — Use glTextureBarrier() to safely read previously written texels — Behavior is now defined with use of texture barriers  Allows render-to-texture algorithms to ping-pong without expensive Framebuffer Object (FBO) changes — Bind 2D texture array for texturing and as a layered FBO attachment Draw gl_Layer=0 glTextureBarrier() texture Draw gl_Layer=1 texture
  • 48. Programmable Blending • Limited form of programmable blending with non-self-overlapping draw calls – Bind texture as a render target and for texturing • Example: glBindTexture(GL_TEXTURE_2D, tex); glFramebufferTexture(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex, 0); dirtybbox.empty(); foreach (object in scene) { if (dirtybbox.overlaps(object.bbox())) { glTextureBarrier(); dirtybbox.empty(); } object.draw(); dirtybbox = bound(dirtybbox, object.bbox()); }
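The dirty bounding box loop on this slide can be exercised without a GL context. The sketch below replaces the draw and glTextureBarrier() calls with a counter, so it only shows when a barrier would be issued; `Box`, `overlaps`, `unite` and `countBarriers` are hypothetical names introduced for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Axis-aligned screen-space bounding box of one drawn object.
struct Box {
    float minX, minY, maxX, maxY;
};

bool overlaps(const Box& a, const Box& b) {
    return a.minX < b.maxX && b.minX < a.maxX &&
           a.minY < b.maxY && b.minY < a.maxY;
}

Box unite(const Box& a, const Box& b) {
    return {std::min(a.minX, b.minX), std::min(a.minY, b.minY),
            std::max(a.maxX, b.maxX), std::max(a.maxY, b.maxY)};
}

// Counts how many barriers the slide's loop would issue for a draw order.
int countBarriers(const std::vector<Box>& objects) {
    bool dirtyEmpty = true;  // plays the role of dirtybbox.empty()
    Box dirty{};
    int barriers = 0;
    for (const Box& obj : objects) {
        assert(obj.minX <= obj.maxX && obj.minY <= obj.maxY);
        if (!dirtyEmpty && overlaps(dirty, obj)) {
            ++barriers;      // glTextureBarrier() would be called here
            dirtyEmpty = true;
        }
        // object.draw() would be called here
        dirty = dirtyEmpty ? obj : unite(dirty, obj);
        dirtyEmpty = false;
    }
    return barriers;
}
```

Because the accumulated box is a union, the test is conservative: an object lying in the gap between two earlier objects still triggers a barrier even though it touches neither of them.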
  • 49. Advanced Blending  KHR_blend_equation_advanced created from NV_blend_equation_advanced  Supported by NVIDIA since r340 – June, 2014 — GL and ES profiles  Supported natively on Maxwell and Tegra K1 GPUs — Otherwise implemented seamlessly with shaders on Fermi and Kepler  Implements a subset of NV_blend_equation_advanced modes  Maxwell and Tegra K1 also provide KHR_blend_equation_advanced_coherent — Doesn’t require glBlendBarrierKHR between primitives that double-hit color samples
  • 50. KHR_blend_equation_advanced Modes  GL_MULTIPLY_KHR  GL_SCREEN_KHR  GL_OVERLAY_KHR  GL_SOFTLIGHT_KHR  GL_HARDLIGHT_KHR  GL_COLORDODGE_KHR  GL_COLORBURN_KHR  GL_DARKEN_KHR  GL_LIGHTEN_KHR  GL_DIFFERENCE_KHR  GL_EXCLUSION_KHR  GL_HSL_HUE_KHR  GL_HSL_SATURATION_KHR  GL_HSL_COLOR_KHR  GL_HSL_LUMINOSITY_KHR
  • 51. Get Texture Sub Image • Like glGetTexImage, but now you can read a sub-region • glGetTextureSubImage – DSA only variant • void GetTextureSubImage(uint texture, int level, int xoffset, int yoffset, int zoffset, sizei width, sizei height, sizei depth, enum format, enum type, sizei bufSize, void * pixels); • For GL_TEXTURE_CUBE_MAP targets zoffset specifies face [Figure: a sub-region of the texture, located by xoffset and yoffset and sized by width and height, is copied to pixels; callouts on the signature read “Direct State Access” and “Robustness”]
  • 52. ARB_sparse_buffer • Ability to have large buffer objects without the whole buffer being resident – Analogous to ARB_sparse_texture for buffer objects • Application controls page residency 1) Create uncommitted buffer: glBufferStorage(,SPARSE_STORAGE_BIT_ARB) 2) Make pages resident: glBufferPageCommitmentARB(, offset, size, GL_TRUE); [Figure: buffer divided into pages of GL_SPARSE_BUFFER_PAGE_SIZE_ARB bytes, with offset and size marking the committed range]
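Commit ranges are expressed in whole pages, so an application usually has to widen a byte range to page boundaries before committing it. The sketch below shows that arithmetic only; `PageRange` and `alignToPages` are hypothetical names, and the page size is passed in as a parameter because the real value has to be queried from the implementation (GL_SPARSE_BUFFER_PAGE_SIZE_ARB).

```cpp
#include <cassert>
#include <cstdint>

// Byte range suitable for a page commitment call.
struct PageRange {
    std::uint64_t offset;
    std::uint64_t size;
};

// Expands [offset, offset + size) outward to the enclosing page boundaries.
PageRange alignToPages(std::uint64_t offset, std::uint64_t size,
                       std::uint64_t pageSize) {
    assert(pageSize > 0);
    const std::uint64_t begin = offset / pageSize * pageSize;
    const std::uint64_t end =
        (offset + size + pageSize - 1) / pageSize * pageSize;
    return {begin, end - begin};
}
```

For example, with an assumed 64 KiB page size, a 1000-byte write starting at byte 70000 needs the single page that begins at byte 65536.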
  • 53. Summary of GLSL 450 additions • dFdxFine, dFdxCoarse, dFdyFine, dFdyCoarse • textureSamples, imageSamples • gl_CullDistance[gl_MaxCullDistances]; • #version 310 es • imageAtomicExchange on float • gl_HelperInvocation • gl_MaxSamples • mix() on int, uint and bool
  • 54. OpenGL Demos on K1 Shield Tablet • Tegra K1 runs Android • Kepler GPU hardware in K1 supports the full OpenGL 4.5 feature set – Today 4.4, expect 4.5 support – OpenGL 4.5 is all the new stuff, plus tons of proven features • Tessellation, compute, instancing – Plus latest features: bindless, path rendering, blend modes • Demos use GameWorks framework – Write Android-ready OpenGL code that runs on Windows and Linux too
  • 57. Build, Deploy, and Debug Android Native OpenGL Code Right in Visual Studio
  • 58. GameWorks Compute Shader Example layout (local_size_x =16, local_size_y = 16) in; layout(binding=0, rgba8) uniform mediump image2D inputImage; layout(binding=1, rgba8) uniform mediump image2D resultImage; void main() { float u = float(gl_GlobalInvocationID.x); float v = float(gl_GlobalInvocationID.y); vec4 inv = 1.0 - imageLoad(inputImage, ivec2(u,v)); imageStore(resultImage, ivec2(u,v), inv); } GLSL Compute Shader to invert an image
  • 59. Massive Compute Shader Particle Simulation
  • 60. Mega Geometry with Instancing glDrawElementsInstanced + glVertexAttribDivisor
  • 61. Getting GameWorks • Get Tegra Android Development Pack (TADP) – All the tools you need for Android development • Windows or Linux – Includes GameWorks samples • Samples also available on Github https://github.com/NVIDIAGameWorks/OpenGLSamples
  • 62. OpenGL Debug Features  KHR_debug added to OpenGL 4.3  App has access to driver “stderr” message stream — Via Callback function or — Query of message queue  Any object can have a meaningful “label”  Driver can tell app about — Errors — Performance warnings — Hazards — Usage hints  App can insert own events into stream for marking
  • 63. Why is my screen blank? void DrawTexture() { GLuint tex; glGenTextures(1, &tex); glBindTexture(GL_TEXTURE_2D, tex); glTexImage2D(GL_TEXTURE_2D, 0, GL_R8, 32, 32, 0, GL_RED, GL_UNSIGNED_BYTE, pixels); glEnable(GL_TEXTURE_2D); glBegin(GL_QUADS); { glTexCoord2f(0.0f, 0.0f); glVertex2f(-1.0f, -1.0f); glTexCoord2f(1.0f, 0.0f); glVertex2f( 1.0f, -1.0f); glTexCoord2f(1.0f, 1.0f); glVertex2f( 1.0f, 1.0f); glTexCoord2f(0.0f, 1.0f); glVertex2f(-1.0f, 1.0f); } glEnd(); SwapBuffers(); } Oops – Texture is incomplete!
  • 64. Enable Debug • Can be done on-the-fly: void GLAPIENTRY DebugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message, const void* userParam) { printf("0x%X: %s\n", id, message); } void DebugDrawTexture() { glDebugMessageCallback(DebugCallback, NULL); glDebugMessageControl(GL_DONT_CARE, GL_DONT_CARE, GL_DONT_CARE, 0, 0, GL_TRUE); glEnable(GL_DEBUG_OUTPUT); DrawTexture(); } • The callback function outputs: 0x20084: Texture state usage warning: Texture 1 has no mipmaps, while its min filter requires mipmap. • Works in non-debug context!
  • 65. Give the texture a name  Instead of “texture 1” – give it a name void DrawTexture() { GLuint tex; glGenTextures(1, &tex); glBindTexture(GL_TEXTURE_2D, tex); GLchar texName[] = "Sky"; glObjectLabel(GL_TEXTURE, tex, sizeof texName, texName); ... }  The callback function outputs: 0x20084: Texture state usage warning: Texture Sky has no mipmaps, while its min filter requires mipmap.
  • 66. Organize your debug trace • Lots of text can get unwieldy – Which part of my code does this error apply to? • Use synchronous debug output: – glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS); – Effectively disables dual-core driver – So your callback goes to your calling application thread – Instead of a driver internal thread • Use groups and markers – App injects markers to annotate debug output – Push/pop groups to easily control volume
  • 67. Notating debug with groups • Use a group: void DebugDrawTexture() { ... GLchar groupName[] = "DrawTexture"; glPushDebugGroup(GL_DEBUG_SOURCE_APPLICATION, 0x1234, sizeof groupName, groupName); glDebugMessageControl(...); // Can change volume if needed DrawTexture(); glPopDebugGroup(); // Old debug volume restored } • Improved output: 0x1234: DrawTexture PUSH 0x20084: Texture state usage warning: Texture Sky has no mipmaps, while its min filter requires mipmap. 0x1234: DrawTexture POP
  • 69. Nsight: Interactive OpenGL debugging  Frame Debugging and Profiling  Shader Debugging and Pixel History  Frame Debugging and Dynamic Shader Editing  OpenGL API & Hardware Trace  Supports up to OpenGL 4.2 Core — And a bunch of useful extensions  https://developer.nvidia.com/nvidia-nsight-visual-studio-edition
  • 70. OpenGL related Linux improvements  Support for EGL on desktop Linux within X11 (r331)  OpenGL-based Framebuffer Capture (NvFBC), for remote graphics (r331)  Support for Quad-Buffered stereo + Composite X extension (GLX_EXT_stereo_tree) (r337)  Support for G-SYNC (Variable Refresh Rate) (r340)  Support for Tegra K1: NVIDIA SOC with Kepler graphics core — Linux Tegra K1 (Jetson) support leverages same X driver, OpenGL implementation as desktop NVIDIA GPUs — NVIDIA also contributing to Nouveau for K1 support  Coming soon: Framebuffer Object creation dramatically faster!
  • 71. Beyond OpenGL 4.5  Path Rendering Path rendering and blend modes  Resolution-independent 2D rendering  Not your classic 3D hardware rendering Earlier Illustrator demo showed this  NV_path_rendering + NV_blend_equation_advanced
  • 72. PostScript Tiger with Perspective Warping No textures! Paths rendered from resolution-independent 2D paths (outlines)
  • 73. Render Fancy Text from Outlines
  • 74. Paths + Text + 3D all at once
  • 75. Web Page Rendering every glyph from its outlines!
  • 76. Zoom in and visualize glyph outline control points
  • 77. Beyond OpenGL 4.5  Advanced scene rendering with ARB_multi_draw_indirect — Added to OpenGL 4.3  Bring even more processing onto the GPU with NV_bindless_multi_draw_indirect — Even less work for the CPU – no Vertex Buffer Object (VBO) binds between draws  Covered in depth by Christoph Kubisch yesterday — SG4117: OpenGL Scene Rendering Techniques
  • 78. NV_bindless_multi_draw_indirect • DrawIndirect combined with Bindless • struct DrawElementsIndirect { GLuint count; GLuint instanceCount; GLuint firstIndex; GLint baseVertex; GLuint baseInstance; }; struct BindlessPtr { GLuint index; GLuint reserved; GLuint64 address; GLuint64 length; }; struct DrawElementsIndirectBindlessCommandNV { DrawElementsIndirect cmd; GLuint reserved; BindlessPtr index; BindlessPtr vertex[]; }; • MultiDrawElementsIndirectBindlessNV(enum mode, enum type, const void *indirect, sizei drawCount, sizei stride, int vertexBufferCount); • Callouts: Change vertex buffers per draw iteration! Change index buffer per draw iteration! The GL_BUFFER_GPU_ADDRESS_NV of the buffer object. • Caveat: Does the CPU know the drawCount?
  • 79. NV_bindless_multi_draw_indirect_count  Source the drawCount from a buffer object void MultiDrawElementsIndirectBindlessCountNV( enum mode, enum type, const void * indirect, intptr drawCount, sizei maxDrawCount, sizei stride, int vertexBufferCount ); drawCount now an offset into the bound GL_PARAMETER_BUFFER_ARB buffer range.
  • 80. Khronos OpenGL BOF at SIGGRAPH  Date: Wednesday, August 13 2014  Venue: Marriott Pinnacle Hotel, next to the Convention Center  Website: http://s2014.siggraph.org  Times: 5pm-7pm OpenGL and OpenGL ES Track  BOF After-Party: 7:30pm until late — Rumor: Free beer and door prizes

Editor's notes

  1. 2009: 2 versions (OpenGL 3.1 and 3.2). 2010: 3 versions (OpenGL 3.3, 4.0 and 4.1). Can incrementally use new features without breaking your app. What about 2014?
  2. I’ll go through all of these in more detail
  3. This is how the picture looks now… move on. Still
  4. OpenGL is made up of extensions…
  5. Here are all the extensions – grouped.
  6. Some OpenGL 4.x hardware can’t support these, so they aren’t in 4.5 core spec.
  7. Can I play with it today? Yes you can. GeForce 400 Series and up.
  8. GLEW with OpenGL 4.5 released today.
  9. First and biggest new feature of OpenGL 4.5 is DSA. Finally. Some binding is slow, especially things like framebuffer object.
  10. GetInteger may cause a stall – a client/server sync. The Bind has cost.
  11. Shorter, more concise code. No longer need to use an active texture selector.
  12. Just easier and nicer to program.
  13. The new DSA commands don’t create objects. They only work on created objects. Convenient way to generate name and create object in one shot.
  14. The new texture functions don’t require a target parameter. All covered; texture creation, compressed textures, texture update and readback, state sets and gets, texbo and mipmap generation. Can all be done without binding.
  15. New to DSA. Can also manipulate framebuffer objects. Modify attachments. You can even clear the framebuffer and copy from one to another.
  16. All the buffer objects functions have DSA equivalents.
  17. All functions that operate on a vertex array object are now available in DSA. You can even modify the element array binding.
  18. Functions that dealt with compatibility like the ARB_image and matrix functions were left out. EXT_DSA functions can create objects. 4.5 DSA functions can’t.
  19. Accesses to out-of-bounds data are no longer undefined. GL termination is no longer an option. Writes are discarded and reads contain undefined data. WEB BROWSER
  20. Example of a safe API.
  21. Example of polling for a reset notification. Provide some user feedback if the GPU got reset.
  22. OpenGL ES 3.1 released in March. OpenGL 4.5 includes the new functionality.
  23. NVIDIA GPUs support OpenGL ES 3.1 and all the recent extensions.
  24. OpenGL 4.5 is a super-set of OpenGL ES 3.1. But if you want to only use OpenGL ES 3.1 and the ES extensions, then create an ES profile. Our OpenGL ES 3.1 is fully conformant and listed at Khronos.
  25. We measured a 2x performance boost in MakeCurrent time with this simple implementation.
  26. Disabling implicit flush requires a new context.
  27. DX zero y-coordinate is top left – GL is bottom left. If you can’t modify the transform matrices in the shader, simply flip the y-origin with glClipControl. Most useful when rendering to a texture when you want to keep the texcoords DX-like. The depth range when transforming normalized device coordinates to window coordinates is modified from [-1,1] to [0,1]. Can provide direct mapping of your near/far planes. So there is no potential for numerical loss.
  28. Doesn’t make much sense for SAMPLES_PASSED. But good for some of the new query targets like GL_TRANSFORM_FEEDBACK_OVERFLOW.
  29. Using coarse may take fewer cycles than fine on some implementations.
  30. In-shader query of the number of samples in a texture or image.
  31. New ARB extension.
  32. Compute invocations includes both dispatch and indirect dispatching.
  33. New ARB extension. REHEARSE
  34. Texture can be bound to both FBO and for texturing. Before this was an error. Rehearse
  35. Object must not self-intersect. Keep drawing objects until you hit an overlap. Issue the texture barrier and continue.
  36. Supported for a couple of months. Subset of NV_blend_equation_advanced which can be supported by more vendors. Maxwell and TK1 allow intersecting draws without having to use glBlendBarrier since it’s implemented native. We expose the _coherent extension for this.
  37. Not sure how this escaped GL for so long. Now you can finally read a sub image of a texture without having to read the whole thing.
  38. Sparse buffer, similar to sparse texture. Not whole buffer has to be resident. Writes to uncommitted pages are discarded and reads return undefined data.
  39. Flash slide. For your reference.
  40. Mark
  41. How do you debug GL in a platform agnostic way? KHR_debug Shipped for a couple of years now, but some folks haven’t heard of this. Got a question yesterday that can be answered by KHR_debug. All kinds of useful data. Like buffer placement and eviction, performance warnings.
  42. THIS ONE ANIMATES. – What’s wrong with this code? Uses compatibility is not the right answer!
  43. You don’t need a debug context to do this.
  44. Hard to tie debug output with the app. Use labels, markers and groups.
  45. Step frame by frame. Inspect pixels – where did they come from, what shaders were involved. Breakpoints and inspection of shaders. Interactive debugging.
  46. Linux improvements over the last year. EGL, Framebuffer capture, quad-buffered stereo, G-Sync monitors and Linux for Tegra K1. Order your Jetson development kit today!
  47. Finally one last feature we released recently. First – we already support this. Code on next page.
  48. What if you don’t know the drawCount – doing culling in the GPU for example?
  49. drawCount comes from a buffer object.